Search for question
Question

R Project 1. (20 Points) The time required in minute for 26 students to finish a 50 minute history exam is recorded as the following: 46.23713, 52.58143, 42.89946, 50.66941, 28.42386, 50.77394, 42.65797, 50.42930, 59.45894, 64.33043, 44.48510, 48.43943, 45.10024, 48.38066, 49.75874, 43.07601, 48.27409, 55.65475, 50.56731, 48.17550, 52.46376, 53.82756, 56.37297, 59.33533, 51.23410, 47.14756 (a) (5 Points) Draw a histogram of the data and describe the shape of it. (b) (5 Points) Draw a qqplot to check whether the normality assumption of the data is valid [Hint: Use "qqnorm” command] (c) (10 Points) Perform a hypothesis test to check whether we can say that students can finish the exam in time. 2. (30 Points) Vital capacity is a measure of the amount of air that someone can exhale after taking a deep breath, Data was collected on brass players and a control group. Assume the population distributions were normal. Table 1: Table for Question 2 Brass Player Control Group 4.7 4.2 4.6 4.7 4.3 5.1 4.5 4.7 5.5 5.0 4.9 5.3 (a) (10 Points) Conduct a t test to determine whether the population mean for brass is larger than that for control. (b) (5 Points) Provide the 95% confidence interval for the difference in the two population means. (c) (15 Points) A researcher claims that in theory the "spread/variance” in the two popu- lations is the same. Repeat step 1 (and 2) utilizing this assumption with the argument "var.equal" within the "t.test" function. 3. (50 Points) Consider the data file uploaded in this project named “mydata.csv". Based on the data, do the followings: (a) (3 Points) Create a scatterplot of y vs x 1 (b) (6 Points) Fit a simple linear regression model using y as the response and plot the regression line (with the data) (c) (8 Points) Test whether x is a significant predictor and create a 95% CI around the slope coefficient. (d) (3 Points) Report and interpret the coefficient of determination. (e) (3 Points) For x=20, create a CI for E(Y│X = 20). = (f) (2 Points) For x=150, can you use the model to estimate E(Y|X 150)? Discuss. (g) (25 Points) Does the model appear to be linear with respect to x? Discuss, and if not, provide alternative model and repeat the previous steps. Justify your answer. [Hint: Did you transform the response or the predictor only? Explain your choice. Recall independent that Yi N(Bo+B1Xi, σ) so transforming Y may not only impact fit (Bo+ẞ₁Xi), but also normality, independence, homogeneity of variance as well. If a better fit such as (ßo + ß1Xi + 2X power to some power, i.e. a transformation only on the predictor addresses the issue that is preferred.] 21