Linear regression assignment
The dataset contains information about several major liquor brands, their advertising revenue
information and sales. We are interested in the impact of different types of advertising expenses,
sales and market share on current-year sales.
For each question, paste visualization plots where necessary. Paste the R code at the end of the
paper. Discuss the statistics in your own words when asked.
1) Compute a new variable by adding magazine, newspaper, outdoor, broadcast and print advertising
expenditure, and name it TotalAds. (10 pts).
2) Check histogram distributions of the following variables:
• TotalSales
• TotalAds
• PriceRerUnit
• MarketShare
Do the histograms resemble a normal distribution? Submit your visualization plots, and discuss. (10 pts)
3) Create a correlation plot with all the variables. Submit your visualization plots, and describe the
relationships among the variables that you observe in the plot. (20 pts)
4) We are interested in the impact of different types of advertising expenses on total sales. First
create scatterplots of the predictor and output variables to understand the patterns. Then
conduct a regression analysis with the following variables:
Output variable: TotalSales
Predictor variables: Mag, News, Outdoor, Broad, Print
Which advertising media are significant predictors of sales? Report the relevant statistics
from the regression output, including the beta coefficients, the p-values and the R-squared. (30 pts)
5) We are interested in the impact of total ad expenses, market share and pricing on
sales. First create scatterplots of the predictor and output variables to understand the patterns.
Then conduct a regression analysis with the following variables:
Output variable: TotalSales
Predictor variables: TotalAds, PriceRerUnit, Marketshare
Which advertising mediums are significant predictors of sales? Report the relevant statistics
from the regression output, including the beta coefficients, the p-values and the R-squared. (30 pts)
Helping R code
#See column names of the file.
colnames(data)
#Compute new data columns (dplyr provides %>% and mutate)
library(dplyr)
data <- data %>%
  mutate(TotalAds = Mag + News + Outdoor + Broad + Print)
#Please use the previous code from week 1 to do data exploration and visualization.
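The week 1 code is not reproduced here, so the following is only a minimal base-R sketch of the histogram and scatterplot steps asked for in questions 2, 4 and 5. The data frame below is simulated stand-in data (an assumption made purely so the sketch runs); only the column names come from the assignment, and the real dataset should be used in its place.

```r
# Hypothetical stand-in data: simulated values, NOT the real liquor dataset.
set.seed(1)
data <- data.frame(
  TotalSales   = rlnorm(50, meanlog = 6, sdlog = 0.8),
  TotalAds     = rlnorm(50, meanlog = 9, sdlog = 1.0),
  PriceRerUnit = rnorm(50, mean = 150, sd = 30),
  MarketShare  = runif(50, min = 0, max = 10)
)

vars <- c("TotalSales", "TotalAds", "PriceRerUnit", "MarketShare")

# Draw the four histograms in a 2 x 2 grid and keep the returned objects
old_par <- par(mfrow = c(2, 2))
hists <- lapply(vars, function(v) {
  hist(data[[v]], main = paste("Histogram of", v), xlab = v)
})
par(old_par)  # restore the previous plotting layout

# Scatterplot of one predictor against the output variable
plot(TotalSales ~ TotalAds, data = data,
     main = "TotalSales vs TotalAds")
```

A roughly symmetric, bell-shaped histogram suggests a distribution close to normal, while a long tail on one side suggests skew. The scatterplot call can be repeated for each predictor by swapping the variable on the right-hand side of the formula.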
#Create and visualize correlation plot (cor() accepts numeric columns only)
library(corrplot)
M <- cor(data[sapply(data, is.numeric)])
corrplot(M, method = 'number')
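#Turn off scientific notation in printed output
options(scipen = 999)
#Create regression equation (replace y and x1 to x5 with the column names in your data)
model <- lm(y ~ x1 + x2 + x3 + x4 + x5, data = data)
summary(model)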
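Questions 4 and 5 ask for the beta, the p-value and the R-squared from the regression output. As a hedged illustration of where these values sit in the `summary()` object, the sketch below fits the question 4 model on simulated stand-in data. The data values and the 0.05 significance cutoff are assumptions, not part of the assignment. The `Estimate` column holds unstandardized coefficients.

```r
# Hypothetical stand-in data: simulated values, NOT the real liquor dataset.
set.seed(42)
n <- 60
data <- data.frame(
  Mag     = runif(n, 0, 100),
  News    = runif(n, 0, 50),
  Outdoor = runif(n, 0, 80),
  Broad   = runif(n, 0, 200),
  Print   = runif(n, 0, 60)
)
# In this simulation only Mag and Broad truly affect sales
data$TotalSales <- 500 + 3 * data$Mag + 1.5 * data$Broad + rnorm(n, sd = 40)

model <- lm(TotalSales ~ Mag + News + Outdoor + Broad + Print, data = data)
s <- summary(model)

# Coefficient table columns: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- s$coefficients
betas      <- coef_table[, "Estimate"]   # unstandardized coefficients
p_values   <- coef_table[, "Pr(>|t|)"]   # p-value for each coefficient
r_squared  <- s$r.squared                # share of variance explained
adj_r_sq   <- s$adj.r.squared            # adjusted for number of predictors

# Predictors with p < 0.05 (the cutoff is an assumption), excluding the intercept
significant <- setdiff(names(p_values)[p_values < 0.05], "(Intercept)")

print(round(coef_table, 4))
print(significant)
```

The same pattern applies to question 5 by changing the formula to `TotalSales ~ TotalAds + PriceRerUnit + Marketshare`, with the column names spelled exactly as `colnames(data)` reports them.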