R Homework

This assignment must be completed in R Markdown, and the .Rmd and .pdf files must be submitted. Your conclusions must be clearly explained.

Exercise 1. Use the 'cancerProstate' database,

available on Moodle, to do this exercise. We want to study the relationship between the volume of a prostate cancer (x, measurement unit: log(cm³)) and the level of antigen (Y, measurement unit: mg/ml). The database contains observations of these variables measured on n = 97 men.

a) Make a scatter plot of the data. Does the relationship between Y and x appear to be linear?
b) Fit a linear regression model to the data. What is the equation of the estimated regression line? Plot this line in red on the graph produced in a).
c) Create a graph of the standardized residuals as a function of the explanatory variable. Do the assumptions of linearity and homoscedasticity seem to be met? Do there appear to be any outliers?
d) Make a graph to check the assumption related to the normality of the data. Does this assumption seem to be respected?
e) Use the Box-Cox method to confirm that a transformation of the response variable does not seem necessary for these data.
f) Find a 90
g) What is the predicted level of antigen for a man with a prostate cancer whose log volume is x = 3.1? Obtain a 95

Exercise 2. Use the 'Satisfaction' database, available on Moodle, to do this exercise. A hospital administrator wishes to study the relationship between a patient's satisfaction (Y) and the patient's age (x₁, measured in years), the severity of their illness (x₂), and their level of anxiety (x₃). The administrator has collected data for n = 46 patients, where high values of Y, x₂, and x₃ are respectively associated with high satisfaction, increased severity of illness, and high anxiety level.

a) Make a scatter plot matrix with the data and determine the correlation coefficients between each of the model's variables.
b) Fit a linear regression model to the data.
   i) What is the equation of the estimated hyperplane, i.e., f(x₁, x₂, x₃) = Ê[Y]?
   ii) What is the estimated value of the parameter σ²?
   iii) What are the values of R² and R²_adj?
c) Given x₀ = (35, 4.3, 2.1), what is Ê[Y|x₀]? Provide an 85
d) Conduct the three hypothesis tests below (indicate for each test whether you reject the null hypothesis or not) so that the overall Type I error probability is less than or equal to α = 0.12:
   H₀: β₁ = 0 vs. H₁: β₁ ≠ 0,
   H₀: β₂ = 0 vs. H₁: β₂ ≠ 0,
   H₀: β₃ = 0 vs. H₁: β₃ ≠ 0.
e) Fit a linear regression model to the data, but this time omitting the explanatory variable x₂, i.e., consider a model with only 2 explanatory variables: x₁ and x₃. What are the values of R² and R²_adj for this new model?
f) Compare the values of R² and R²_adj obtained in b)-iii) and in e), then explain in your own words the results obtained. Based on these values, which model seems more appropriate between the one fitted in b) and the one fitted in e)? Justify very briefly.
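
The two exercises follow a fairly standard regression workflow in R, which can be sketched as below. This is only an illustration under stated assumptions: the Moodle files and their column names are not shown here, so the data are simulated and every name (`prostate`, `sat`, `x`, `Y`, `x1`, `x2`, `x3`) and every simulation coefficient is hypothetical. The Bonferroni split of α = 0.12 across the three tests is one common way to bound the overall Type I error, not necessarily the method the course expects, and the interval type and level passed to `predict()` are illustrative choices.

```r
# Simulated stand-in data; all data frame names, column names, and
# simulation coefficients below are hypothetical.
set.seed(1)

## Simple linear regression (Exercise 1 style)
n <- 97
prostate <- data.frame(x = runif(n, 0, 4))              # log volume
prostate$Y <- 10 + 2 * prostate$x + rnorm(n, sd = 1)    # antigen level, kept positive

fit1 <- lm(Y ~ x, data = prostate)
coef(fit1)                                  # estimated intercept and slope

plot(Y ~ x, data = prostate)                # scatter plot
abline(fit1, col = "red")                   # fitted line in red

r_std <- rstandard(fit1)                    # standardized residuals
plot(prostate$x, r_std, xlab = "x", ylab = "Standardized residuals")
abline(h = c(-2, 0, 2), lty = 2)            # rough guide lines for spotting outliers

qqnorm(r_std); qqline(r_std)                # normal Q-Q plot of the residuals

bc <- MASS::boxcox(fit1, plotit = FALSE)    # Box-Cox profile log-likelihood
lambda_hat <- bc$x[which.max(bc$y)]         # a value near 1 suggests no transformation

# Interval type and level are illustrative choices.
pred <- predict(fit1, newdata = data.frame(x = 3.1),
                interval = "prediction", level = 0.95)

## Multiple regression (Exercise 2 style)
m <- 46
sat <- data.frame(x1 = runif(m, 20, 60), x2 = rnorm(m, 50, 5),
                  x3 = rnorm(m, 2.3, 0.3))
sat$Y <- 150 - 1.1 * sat$x1 - 0.4 * sat$x2 - 13 * sat$x3 + rnorm(m, sd = 10)

pairs(sat)                                  # scatter plot matrix
round(cor(sat), 2)                          # pairwise correlation coefficients

fit2 <- lm(Y ~ x1 + x2 + x3, data = sat)
s2 <- summary(fit2)
sigma2_hat <- s2$sigma^2                    # estimate of the error variance
c(R2 = s2$r.squared, R2_adj = s2$adj.r.squared)

# Bonferroni: three tests at 0.12 / 3 = 0.04 each keep the overall level <= 0.12
alpha_each <- 0.12 / 3
p_vals <- s2$coefficients[c("x1", "x2", "x3"), "Pr(>|t|)"]
reject <- p_vals < alpha_each               # TRUE where the null hypothesis is rejected

fit3 <- lm(Y ~ x1 + x3, data = sat)         # same model without x2
s3 <- summary(fit3)
c(R2 = s3$r.squared, R2_adj = s3$adj.r.squared)
```

When comparing the two fits, note that R² can never decrease when an explanatory variable is added to a least-squares model, whereas the adjusted R² penalizes the extra parameter and may go either way.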
