Assignment 1

Big Data and Machine Learning for Economics and Finance

Provide your answers in a document generated by RMarkdown. For each answer,

provide the R code, the R output and your comments on the output. Comment each

line of your R code as well. Give thorough explanations throughout.

Exercise 1. (10 points) For this exercise, the only extra package allowed is ISLR2. The dataset

Default will be used throughout the exercise and is accessible through the ISLR2 package.

I. Consider the following figure constructed from the dataset Default.

[Figure 1. Two box plots: balance on the vertical axis (ticks from 500 to 2500), one box for each value of default (No, Yes).]

a) Write the R code to reproduce that plot.

b) What is the conditioning variable in that plot? Give a thorough interpretation.
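
A minimal sketch of how a plot like Figure 1 could be produced with base R graphics. It assumes the figure shows balance split by the two levels of default, which is what the surviving axis labels suggest; the original image is not available here, so details such as colours and titles are guesses. The object name `bp` is introduced only for illustration.

```r
# Load the only extra package allowed; it provides the Default data set
library(ISLR2)

# Draw one box of balance for each level of default (formula interface: response ~ group)
bp <- boxplot(balance ~ default, data = Default,
              xlab = "default", ylab = "balance")

# Median balance in each group, as computed by boxplot(); handy when commenting on the plot
bp$stats[3, ]
```

In a plot of this kind the grouping variable on the right of the formula plays the role of the conditioning variable: each box summarises the distribution of balance given one value of default.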

II. Consider another figure constructed from the same dataset.

[Scatter plot: balance on the vertical axis against default on the horizontal axis, with ticks from 1.0 to 2.0.]

a) Write the R code to reproduce that plot.

Figure 2. A scatter plot.

b) Carry out a regression exercise where you are attempting to predict balance

given only the variable default.

1. Write the R code to train that model.

2. Modify the plot in Figure 2 to add the predicted regression line.

3. Give predictions of balance for all possible values of default. Show how

to do the calculations directly in R and by using the regression output.
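
One possible way to approach part II is sketched below with base R only. It assumes that Figure 2 plots balance against default converted to its integer codes (No = 1, Yes = 2), which would explain a horizontal axis running from 1.0 to 2.0; that coding, the data frame `d` and the column name `default_num` are assumptions made for illustration, not part of the assignment.

```r
# Load the package that provides the Default data set
library(ISLR2)

# Working copy with default converted to its integer codes: "No" -> 1, "Yes" -> 2
d <- data.frame(balance = Default$balance,
                default_num = as.numeric(Default$default))

# Scatter plot of balance against the numeric coding of default
plot(d$default_num, d$balance, xlab = "default", ylab = "balance")

# Train a simple linear regression of balance on the numeric coding
fit <- lm(balance ~ default_num, data = d)

# Add the fitted regression line to the existing scatter plot
abline(fit, col = "red", lwd = 2)

# Direct calculation: the mean balance within each value of default
group_means <- tapply(d$balance, d$default_num, mean)

# Same predictions from the regression output: intercept + slope * x for x = 1 and x = 2
b <- coef(fit)
pred_manual <- b[1] + b[2] * c(1, 2)

# Same predictions again through predict()
pred <- predict(fit, newdata = data.frame(default_num = c(1, 2)))

# Show the three sets of predictions side by side
cbind(group_means, pred_manual, pred)
```

With a regressor that takes only two values, the least squares line passes through the two group means, so the three columns should agree up to rounding.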

III. Consider another figure from the same dataset.

[Scatter plot: balance on the vertical axis; horizontal axis with ticks between 0 and 1.0.]

Figure 3. Another scatter plot

a) What are the differences between this plot and the previous one?

b) Would you obtain the same regression results as with the previous figure? Illustrate

everything with R code and conceptual justifications if necessary.
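
The comparison asked for in part III can be sketched as follows, under the assumption that Figure 3 differs from Figure 2 only in coding default as 0/1 instead of 1/2 (the surviving axis ticks stop at 1.0, but the image itself is not available). The column names `x12` and `x01` are hypothetical.

```r
# Load the package that provides the Default data set
library(ISLR2)

# Two numeric codings of the same variable: 1/2 (integer codes) and 0/1 (shifted by one)
d <- data.frame(balance = Default$balance,
                x12 = as.numeric(Default$default),
                x01 = as.numeric(Default$default) - 1)

# Fit the same simple regression under each coding
fit12 <- lm(balance ~ x12, data = d)
fit01 <- lm(balance ~ x01, data = d)

# Compare the coefficients: intercepts may differ, slopes should not
coef(fit12)
coef(fit01)

# Compare the fitted values observation by observation
all.equal(unname(fitted(fit12)), unname(fitted(fit01)))
```

Shifting the regressor by a constant leaves the slope and the fitted values unchanged and moves only the intercept: under the 0/1 coding the intercept is the mean balance of the No group, while under the 1/2 coding it is that mean minus the slope.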
