You are applying for a quant job at an investment firm that specializes in high-frequency trading of equities. The take-home interview questions deal with predictive analytics on large datasets. The interviewer wants to test your theoretical and practical understanding of estimation and your ability to apply prediction models. The interviewer gives you very specific instructions because there are many applicants and he is too busy to waste time.
1) AR(1) is a basic model for predicting returns. Its parameters can be estimated either analytically or numerically. Analytic estimation is more challenging because it requires deriving formulas, but the resulting estimator is also faster to run, and speed is important. Derive the MLE estimates of its parameters.
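As a way to check a derived formula numerically, the following is a minimal sketch of the closed-form conditional MLE for AR(1). It assumes the model r_t = c + phi * r_{t-1} + eps_t with Gaussian errors and conditions on the first observation, in which case the estimates coincide with least squares. The function names, the simulated data, and the parameter values are illustrative assumptions, not part of the assignment.

```python
import random


def ar1_conditional_mle(r):
    """Closed-form conditional MLE for r_t = c + phi * r_{t-1} + eps_t.

    Assumes Gaussian errors and conditions on the first observation,
    so the estimates coincide with ordinary least squares.
    """
    x = r[:-1]  # lagged values r_{t-1}
    y = r[1:]   # current values r_t
    n = len(y)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    phi = sxy / sxx
    c = y_bar - phi * x_bar
    # The MLE of the error variance divides by n, not by n - 2.
    sigma2 = sum((yi - c - phi * xi) ** 2 for xi, yi in zip(x, y)) / n
    return c, phi, sigma2


def simulate_ar1(c, phi, sigma, n, seed=0):
    """Simulate an AR(1) path (hypothetical helper for the check below)."""
    rng = random.Random(seed)
    r = [c / (1 - phi)]  # start at the unconditional mean
    for _ in range(n - 1):
        r.append(c + phi * r[-1] + rng.gauss(0.0, sigma))
    return r


sample = simulate_ar1(c=0.05, phi=0.3, sigma=1.0, n=20000, seed=42)
c_hat, phi_hat, sigma2_hat = ar1_conditional_mle(sample)
print(c_hat, phi_hat, sigma2_hat)
```

With a long simulated sample the estimates should land close to the values used in the simulation. The exact (unconditional) MLE also uses the density of the first observation and has no equally simple closed form.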
2) Volatility modeling is very important in trading because volatility is often used for the position sizing of directional trades and for option pricing. Since estimation is based on a large dataset, approximate MLE estimation can be very effective because it is much simpler while still accurate. Derive the first-order condition (FOC) equations for the approximate MLE estimation of the parameters of ARCH(q).
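For orientation, one common setup for this derivation is sketched below. It assumes zero-mean (or demeaned) returns \varepsilon_t with Gaussian innovations, and it takes "approximate" to mean conditioning on the first q observations so that their joint density is dropped from the likelihood. The notation (\alpha_0, \alpha_i, \sigma_t^2) is an assumption, not given in the assignment.

```latex
% ARCH(q) model (assumed notation)
\varepsilon_t = \sigma_t z_t, \qquad z_t \sim \text{i.i.d. } N(0,1), \qquad
\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2

% Approximate (conditional) log-likelihood, conditioning on the first q observations
\ell(\alpha_0, \ldots, \alpha_q)
  = -\frac{1}{2} \sum_{t=q+1}^{T}
    \left[ \ln(2\pi) + \ln \sigma_t^2 + \frac{\varepsilon_t^2}{\sigma_t^2} \right]

% First-order conditions, using
% \partial\sigma_t^2 / \partial\alpha_0 = 1 and
% \partial\sigma_t^2 / \partial\alpha_j = \varepsilon_{t-j}^2
\frac{\partial \ell}{\partial \alpha_0}
  = \frac{1}{2} \sum_{t=q+1}^{T} \frac{1}{\sigma_t^2}
    \left( \frac{\varepsilon_t^2}{\sigma_t^2} - 1 \right) = 0

\frac{\partial \ell}{\partial \alpha_j}
  = \frac{1}{2} \sum_{t=q+1}^{T} \frac{\varepsilon_{t-j}^2}{\sigma_t^2}
    \left( \frac{\varepsilon_t^2}{\sigma_t^2} - 1 \right) = 0,
  \qquad j = 1, \ldots, q
```

These q + 1 equations are nonlinear in the parameters because \sigma_t^2 depends on all of them, so in practice they are solved numerically.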
3) Although the production implementation of code is done in C++ for speed, most prototype development is done in Python. Use the included Jupyter Notebook to perform the following:
• Install the necessary libraries.
• Download Dow Jones data from the St. Louis Fed FRED database (djia = web.get_data_fred('DJIA')) and process the data to calculate returns, dropping missing values. Multiply the returns by 100; this will help with estimation convergence later.
• Plot the returns. Do you see any evidence of volatility clustering?
• Fit GARCH(1,1) and view the summary of the results. Copy and paste it below. What is the value of the maximum log-likelihood? What are the estimated parameter values?
• Plot standardized residuals and the annualized conditional volatility. What can you say about the standardized residuals? What about the conditional volatility? Export the figure. Also copy and paste it here.
• Forecasting. Start forecasting at the end of your complete dataset using a 10-day horizon. Then repeat the process with the time series truncated at 3/1/2020, so that forecasting starts in a high-volatility regime.
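To build intuition for the forecasting step, the sketch below implements the standard GARCH(1,1) multi-step variance recursion by hand, using only the Python standard library. The parameter values and the two starting states are hypothetical, chosen for illustration; they are not estimates from the DJIA data, and in the notebook the fitted model's own forecasting method would normally be used instead.

```python
import math


def garch11_variance_forecast(omega, alpha, beta, last_resid, last_var, horizon=10):
    """Return h-step-ahead conditional variance forecasts for h = 1..horizon.

    One step ahead uses the last residual and last conditional variance:
        var_{t+1} = omega + alpha * resid_t**2 + beta * var_t
    Further steps replace the unknown squared residual by its expectation:
        var_{t+h} = omega + (alpha + beta) * var_{t+h-1}
    """
    var_next = omega + alpha * last_resid ** 2 + beta * last_var
    forecasts = [var_next]
    for _ in range(horizon - 1):
        var_next = omega + (alpha + beta) * var_next
        forecasts.append(var_next)
    return forecasts


def annualized_vol(daily_var, trading_days=252):
    """Convert a daily variance into an annualized volatility (252 days assumed)."""
    return math.sqrt(trading_days * daily_var)


# Hypothetical parameters for returns measured in percent.
omega, alpha, beta = 0.02, 0.10, 0.88
long_run_var = omega / (1 - alpha - beta)

# Hypothetical starting states: a calm period and a stressed period.
calm = garch11_variance_forecast(omega, alpha, beta, last_resid=0.5, last_var=0.6)
stressed = garch11_variance_forecast(omega, alpha, beta, last_resid=-6.0, last_var=9.0)

for h, (c, s) in enumerate(zip(calm, stressed), start=1):
    print(h, round(annualized_vol(c), 2), round(annualized_vol(s), 2))
```

Under these assumptions both forecast paths move toward the long-run variance omega / (1 - alpha - beta): the calm path rises and the stressed path decays, at a speed governed by alpha + beta. That contrast is what the two forecasting exercises are set up to reveal.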