
Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and

do not remove the asterisks.

Part (i) [2 marks]

We can also find confidence intervals based on this model, using the confint function.

In particular, entering confint(MODEL, level = ...), where MODEL is our ANOVA model and level is our confidence level, returns a set of I intervals, where I is our number of groups.

The first interval will be a confidence interval for the true mean of the "reference group",

which is whatever the first group is when we run the aggregate function. For this dataset,

the reference group is PC.

The remaining intervals will be confidence intervals for the difference in means between the reference group and each of the other groups.
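As a rough illustration of the output layout described above, here is a minimal sketch on a made-up dataset. The data frame toy, its columns score and group, and the group labels A, B and C are all hypothetical and are not part of the Games dataset.

```r
# Hypothetical toy data: three groups of three scores each (not the Games data).
toy <- data.frame(
  score = c(60, 70, 80, 65, 75, 85, 50, 60, 70),
  group = factor(rep(c("A", "B", "C"), each = 3))
)

# Fit the one-way ANOVA model, then ask for 90% confidence intervals.
toy_model <- aov(score ~ group, data = toy)
ci <- confint(toy_model, level = 0.90)

# Row 1, "(Intercept)": interval for the mean of the reference group A.
# Rows "groupB" and "groupC": intervals for the differences B - A and C - A.
print(ci)
```

Each row of the result is one interval, so a single interval can be pulled out by row name, for example ci["groupB", ].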

Below, use confint to produce a 90% confidence interval for the true mean metascore on

the PC platform, as well as a 90% confidence interval for the difference in mean metascores

between the PC and Switch platforms.

#Delete this line (including the # symbol) and place your code here.

Below, print out the 90% confidence interval for the mean metascore of the PC games:

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and

do not remove the asterisks.

Below, print out the 90% confidence interval for the difference in mean metascores

Part (h) [1 mark]

For ANOVA to be valid, we have two main conditions that we check for. They are listed

below:

1. Is each group Normally distributed?

2. Does each group have the same population standard deviation?

Note that the ANOVA procedure is quite robust, which means that minor violations of these

assumptions are permissible.

First, regarding Normality, we were given in the statement of the problem that Metascores

are Normal for each platform, so we will not check that condition.

To check the second assumption, we will look at the standard deviations of each group. So

long as no standard deviation is twice the size of another, this will be good enough for us.

#Delete this line (including the # symbol) and place your code here.

Note that this line of code only creates the ANOVA model.

To create the ANOVA table from the ANOVA model, we use the function summary. The syntax of this function is summary(MODEL), where MODEL is your ANOVA model.
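The two-step pattern (build the model, then summarise it) might look like the sketch below. It uses a made-up dataset; the names toy, score, group, toy_model and toy_table are hypothetical.

```r
# Hypothetical toy data: three groups of three scores each (not the Games data).
toy <- data.frame(
  score = c(60, 70, 80, 65, 75, 85, 50, 60, 70),
  group = factor(rep(c("A", "B", "C"), each = 3))
)

# Step 1: aov only builds the ANOVA model.
toy_model <- aov(score ~ group, data = toy)

# Step 2: summary turns the model into the ANOVA table
# (Df, Sum Sq, Mean Sq, F value and Pr(>F) columns).
toy_table <- summary(toy_model)
print(toy_table)
```

For this toy data the table shows sums of squares of 350 (groups) and 600 (residuals), giving an F value of 1.75.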

Below, use summary to create the ANOVA table.

#Delete this line (including the # symbol) and place your code here.

Do these results match your calculations in Parts (b) - (d)?

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and

do not remove the asterisks.

Part (g) [2 marks]

To find critical values of the $F_{df_1, df_2}$ distribution, we use the function qf. In particular, to find the value x such that $P(F_{df_1, df_2} \le x) = p$, we use qf(p, df1 = ..., df2 = ...).
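A minimal sketch of a critical value lookup with qf follows. The degrees of freedom (2 and 6) are hypothetical values chosen for illustration, not those of the Games data. By default qf works with lower-tail probabilities, so an upper-tail critical value at significance level alpha uses 1 - alpha.

```r
# Hypothetical settings for illustration only.
alpha <- 0.01
df1 <- 2
df2 <- 6

# Upper-tail critical value: the x with P(F <= x) = 1 - alpha.
crit <- qf(1 - alpha, df1 = df1, df2 = df2)

# Equivalent call that asks for the upper tail directly.
crit_upper <- qf(alpha, df1 = df1, df2 = df2, lower.tail = FALSE)

print(crit)        # about 10.92
print(crit_upper)  # same value
```

With the critical value method, the null hypothesis is rejected when the observed F statistic exceeds this value.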

Use qf to determine the critical value for this test. [1 mark]

#Delete this line (including the # symbol) and place your code here.

Using the critical value method, what would your decision regarding $H_0$ be? Give your

complete reasoning.

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and

do not remove the asterisks.

Part (e) [2 marks]

To calculate probabilities of the form $P(F_{df_1, df_2} \le x)$, we use the function pf. The syntax of this function is pf(x, df1 = ..., df2 = ...).
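As a hedged illustration of pf, the sketch below turns an F statistic into an upper-tail probability. The statistic 1.75 and the degrees of freedom 2 and 6 are hypothetical values, not results from the Games data.

```r
# Hypothetical values for illustration only.
F_stat <- 1.75
df1 <- 2
df2 <- 6

# pf returns the lower-tail probability P(F <= x) by default.
lower_tail <- pf(F_stat, df1 = df1, df2 = df2)

# The P-value of an ANOVA F test is the upper-tail probability P(F > x).
p_value <- 1 - lower_tail

# Equivalent call that asks for the upper tail directly.
p_value_upper <- pf(F_stat, df1 = df1, df2 = df2, lower.tail = FALSE)

print(p_value)  # about 0.25
```

The P-value is then compared with the chosen significance level to reach a conclusion.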

Use the pf function to find the P-value for this test.

#Delete this line (including the # symbol) and place your code here.

Based on this P-value, make a fully-worded conclusion to this test.

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and

do not remove the asterisks.

Part (f) [2 marks]

We can perform a much faster ANOVA test with the aov function.

To create the ANOVA table, the function we use in R is called aov. The syntax of this function is aov(formula, data = ...). In this function:

Part (c) [3 marks]

Remember that our test statistic in an ANOVA test is the F-statistic, defined as

$$F = \frac{MSG}{MSE},$$

where

$$MSG = \frac{SSG}{I - 1} \quad \text{and} \quad MSE = \frac{SSE}{N - I}.$$
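The arithmetic behind the F-statistic can be sketched from per-group summary statistics. The group means, sizes and standard deviations below are made up for illustration. The sums of squares use the standard one-way ANOVA formulas, where SSG adds up n_i times the squared distance of each group mean from the overall mean, and SSE adds up (n_i - 1) times each group variance.

```r
# Hypothetical per-group summaries (not taken from the Games data).
means <- c(A = 70, B = 75, C = 60)
sizes <- c(A = 3, B = 3, C = 3)
sds   <- c(A = 10, B = 10, C = 10)

N <- sum(sizes)     # total sample size
I <- length(means)  # number of groups

# Overall mean, weighting each group mean by its sample size.
grand_mean <- sum(sizes * means) / N

# Between-group and within-group sums of squares.
SSG <- sum(sizes * (means - grand_mean)^2)
SSE <- sum((sizes - 1) * sds^2)

# Mean squares and the F statistic.
MSG <- SSG / (I - 1)
MSE <- SSE / (N - I)
F_stat <- MSG / MSE

print(c(SSG = SSG, SSE = SSE, MSG = MSG, MSE = MSE, F = F_stat))
```

For these toy numbers SSG is 350, SSE is 600, and the F statistic is 175 / 100 = 1.75.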

Use the output from Part (b) to calculate and print out the SSG and SSE.

#Delete this line (including the # symbol) and place your code here.

Now, calculate and print out the MSG and MSE.

#Delete this line (including the # symbol) and place your code here.

Finally, calculate and print out the F statistic for this test.

#Delete this line (including the # symbol) and place your code here.

Under $H_0$, this test statistic will follow an F distribution with $df_1$ and $df_2$ degrees of

freedom. What are these degrees of freedom?

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and

do not remove the asterisks.

Part (a) [1 mark]

Make a boxplot comparing the metascores for each platform. Remember to use the format

boxplot(y ~ x, data = DATA), where y is your quantitative variable of interest, x is

your vector of groups, and DATA is your dataset.
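The formula format above might be used as in the following sketch, which draws side-by-side boxplots for a made-up dataset. The names toy, score and group are hypothetical.

```r
# Hypothetical toy data: three groups of three scores each (not the Games data).
toy <- data.frame(
  score = c(60, 70, 80, 65, 75, 85, 50, 60, 70),
  group = factor(rep(c("A", "B", "C"), each = 3))
)

# One box per group: quantitative variable on the left of ~, groups on the right.
bp <- boxplot(score ~ group, data = toy,
              xlab = "Group", ylab = "Score")

# boxplot also returns the plotted statistics; row 3 holds the group medians.
print(bp$stats[3, ])
```

Comparing the heights of the median lines gives a first impression of which group has the lowest and highest centre.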

#Delete this line (including the # symbol) and place your code here.

Comment on what you see above. Do you notice a difference in the centre of each dataset?

Which dataset appears to have the lowest centre, and which dataset appears to have the

highest centre?

Delete me; type your answer here. Do not copy-paste any symbols from outside sources, and

do not remove the asterisks.

Part (b) [1 mark]

In Parts (b)-(d), we will use R to help us conduct an ANOVA test to determine if there is a

significant difference in the mean Metascore for each platform, at the 1% level of

significance.

First, use aggregate with FUN = mean to calculate the mean metascore of each group.

Also, use aggregate with FUN = length to calculate the sample size of each group, and

use aggregate with FUN = sd to calculate the standard deviation of each group.
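A minimal sketch of the three aggregate calls follows, again on made-up data. The names toy, score and group are hypothetical, and the formula interface shown here is only one of the ways aggregate can be called.

```r
# Hypothetical toy data: three groups of three scores each (not the Games data).
toy <- data.frame(
  score = c(60, 70, 80, 65, 75, 85, 50, 60, 70),
  group = factor(rep(c("A", "B", "C"), each = 3))
)

# Apply one summary function to score within each level of group.
group_means <- aggregate(score ~ group, data = toy, FUN = mean)
group_sizes <- aggregate(score ~ group, data = toy, FUN = length)
group_sds   <- aggregate(score ~ group, data = toy, FUN = sd)

print(group_means)
print(group_sizes)
print(group_sds)
```

Each call returns a small data frame with one row per group, listed in the order of the factor levels.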

#Delete this line (including the # symbol) and place your code here.

Questions [25 marks]

Question 1 [14 Marks]

Import the Games 200 dataset. This dataset contains a random sample of 200 games

released in 2019, along with the metascore (average critic review), the userscore (average

user review), and platform of release.

#Delete this line (including the # symbol) and place your code here.

Our goal is to determine whether each video game platform receives the same metascore

on average, or not, based on this sample. We will assume that Metascores are Normally

distributed for each platform.

Our hypotheses will be:

$H_0: \mu_{PC} = \mu_{PS4} = \mu_{Switch} = \mu_{Xbox}$ vs. $H_a$: At least one differs,

where $\mu_{platform}$ refers to the mean metascore for all games on that platform.

As usual, we will start with some exploratory data analysis.
