The Boston data set is used throughout this exercise and is available in the ISLR2 package.
We are interested in predicting medv from a single input variable chosen among the three
variables crim, rm and ptratio. We ran three regressions in R, each using one of the three
variables as the only input. For each regression, we then computed the average squared error
between the predictions for medv and the observed values of medv over the whole sample. We
obtained the following figure.
[Plot: mse (vertical axis) against Index (horizontal axis), with points labelled by input variable, e.g. crim, rm]
Figure 4. Model Comparison
a) Write the R code to reproduce that plot. Explain the details of any calculations required.
b) Based on this plot, does it make sense to say that the best model among the three is
the one that uses rm as input? Or should it be the one that uses crim instead? Or
something else?
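
As a starting point for part a), the following is a minimal sketch of one way such a plot could be produced, not necessarily the code behind Figure 4. For each input variable it fits a simple linear regression of medv on that variable with lm(), then takes the mean of the squared differences between the observed and fitted values of medv over the whole sample. The names boston, predictors and mse, the plotting choices, and the fallback to MASS::Boston when ISLR2 is not installed are assumptions made for illustration.

```r
# Load Boston from ISLR2, as the exercise states. The MASS fallback is an
# assumption added here so the sketch still runs if ISLR2 is not installed.
boston <- if (requireNamespace("ISLR2", quietly = TRUE)) {
  ISLR2::Boston
} else {
  MASS::Boston
}

# The three candidate input variables named in the exercise.
predictors <- c("crim", "rm", "ptratio")

# For each variable: fit medv ~ variable, then compute the in-sample
# mean squared error, i.e. mean((observed - fitted)^2) over all rows.
mse <- sapply(predictors, function(v) {
  fit <- lm(reformulate(v, response = "medv"), data = boston)
  mean((boston$medv - fitted(fit))^2)
})

# Plot the three MSE values against their index (1, 2, 3) and label
# each point with the name of the variable used as input.
plot(mse, type = "n", xaxt = "n", xlab = "Index", ylab = "mse",
     xlim = c(0.5, length(mse) + 0.5))
axis(1, at = seq_along(mse))
text(seq_along(mse), mse, labels = names(mse))
```

Printing mse shows the three values behind the plot. They are in-sample errors, computed on the same observations used to fit each model, which is worth keeping in mind when answering part b).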