
Problem 8: In this exercise, we will generate simulated data, and will then use this data to

perform best subset selection.

(a) Use the rnorm() function to generate a predictor X of length n = 100, as well as a noise vector $\epsilon$ of

length n=100.

(b) Generate a response vector Y of length n=100 according to the model

$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \epsilon$

where $\beta_0$, $\beta_1$, $\beta_2$, and $\beta_3$ are constants of your choice.
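Parts (a) and (b) can be sketched in R as follows. This is only a minimal illustration: the seed and the four coefficient values are arbitrary assumptions, since the problem leaves the constants to the reader, and the variable names x, eps, and y are chosen here for convenience.

```r
# Fix the random seed so the simulation is reproducible (the value 1 is an arbitrary assumption)
set.seed(1)

n   <- 100
x   <- rnorm(n)   # predictor X, standard normal draws
eps <- rnorm(n)   # noise vector, standard normal draws

# Coefficients are free choices in the problem; these values are assumptions
beta0 <- 1
beta1 <- 2
beta2 <- -3
beta3 <- 0.5

# Response generated from the cubic model in part (b)
y <- beta0 + beta1 * x + beta2 * x^2 + beta3 * x^3 + eps
```

Because the true model is known, the selection methods in the later parts can be judged by whether they recover the terms in X, X^2, and X^3.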

(c) Use the regsubsets() function to perform best subset selection in order to choose the best model

containing the predictors $X, X^2, \ldots, X^{10}$. What is the best model obtained according to $C_p$, BIC, and

adjusted $R^2$? Show some plots to provide evidence for your answer, and report the coefficients of the

best model obtained. Note you will need to use the data.frame() function to create a single data set

containing both X and Y.
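One possible approach to part (c) is sketched below, assuming the leaps package (which provides regsubsets()) is installed. The seed, the coefficient values, and the object names dat, fit, and fit_summary are assumptions made for illustration. The block regenerates the data so that it runs on its own.

```r
library(leaps)  # provides regsubsets()

# Simulated data; the seed and coefficient values are arbitrary assumptions
set.seed(1)
x   <- rnorm(100)
eps <- rnorm(100)
y   <- 1 + 2 * x - 3 * x^2 + 0.5 * x^3 + eps

# Single data set containing both X and Y, as the problem requires
dat <- data.frame(x = x, y = y)

# Best subset selection over X, X^2, ..., X^10 (raw, non-orthogonal powers)
fit <- regsubsets(y ~ poly(x, 10, raw = TRUE), data = dat, nvmax = 10)
fit_summary <- summary(fit)

# Model size preferred by each criterion
best_cp    <- which.min(fit_summary$cp)
best_bic   <- which.min(fit_summary$bic)
best_adjr2 <- which.max(fit_summary$adjr2)

# One panel per criterion, with the chosen size marked in red
par(mfrow = c(1, 3))
plot(fit_summary$cp, type = "b", xlab = "Number of predictors", ylab = "Cp")
points(best_cp, fit_summary$cp[best_cp], col = "red", pch = 19)
plot(fit_summary$bic, type = "b", xlab = "Number of predictors", ylab = "BIC")
points(best_bic, fit_summary$bic[best_bic], col = "red", pch = 19)
plot(fit_summary$adjr2, type = "b", xlab = "Number of predictors",
     ylab = "Adjusted R^2")
points(best_adjr2, fit_summary$adjr2[best_adjr2], col = "red", pch = 19)

# Coefficients of the model chosen by each criterion
coef(fit, best_cp)
coef(fit, best_bic)
coef(fit, best_adjr2)
```

Cp and BIC are minimized while adjusted R^2 is maximized, which is why the sketch uses which.min() for the first two and which.max() for the third. The three criteria need not agree on the model size.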

(d) Repeat (c), using forward stepwise selection and also using backwards stepwise selection. How

does your answer compare to the results in (c)?
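For part (d), regsubsets() accepts a method argument, so the same workflow can be repeated with "forward" and "backward". The sketch below again assumes the leaps package and the same illustrative seed and coefficients. The helper pick_sizes() is a hypothetical convenience function defined here, not part of leaps.

```r
library(leaps)  # provides regsubsets()

# Simulated data; the seed and coefficient values are arbitrary assumptions
set.seed(1)
x   <- rnorm(100)
eps <- rnorm(100)
y   <- 1 + 2 * x - 3 * x^2 + 0.5 * x^3 + eps
dat <- data.frame(x = x, y = y)

# Forward and backward stepwise selection over X, X^2, ..., X^10
fit_fwd <- regsubsets(y ~ poly(x, 10, raw = TRUE), data = dat,
                      nvmax = 10, method = "forward")
fit_bwd <- regsubsets(y ~ poly(x, 10, raw = TRUE), data = dat,
                      nvmax = 10, method = "backward")

# Hypothetical helper: model size preferred by each criterion
pick_sizes <- function(fit) {
  s <- summary(fit)
  c(cp    = which.min(s$cp),
    bic   = which.min(s$bic),
    adjr2 = which.max(s$adjr2))
}

sizes_fwd <- pick_sizes(fit_fwd)
sizes_bwd <- pick_sizes(fit_bwd)
sizes_fwd
sizes_bwd

# Coefficients of the BIC-preferred model under each search strategy
coef(fit_fwd, sizes_fwd["bic"])
coef(fit_bwd, sizes_bwd["bic"])
```

Comparing these sizes and coefficients with the best subset results from (c) shows whether the greedy searches reach the same model. Stepwise methods examine far fewer candidate models, so they are not guaranteed to match best subset selection.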

Fig: 1