Please make sure your final output file is a PDF document. You can submit handwritten solutions for non-programming exercises or type them using R Markdown, LaTeX, or a word processor. All programming exercises must be done in R, typed up clearly and with ALL code attached as an appendix. Submissions should be made on Gradescope: go to Assignments \(\rightarrow\) Homework 1.
Game of Thrones (8 points)
The handling of female characters in the American series Game of Thrones, and indeed whether it is feminist or misogynistic, has been hotly debated. The dataset gotscreen.RData contains information on the number of seconds of screentime for members of each gender in each episode of seven seasons of Game of Thrones. Using descriptive statistics as well as a two-way ANOVA model with interaction, explore whether an actor's screentime differs by gender (male, female, or unspecified) and whether there are any differences in potential gender effects across the 7 seasons of the show. Your answer should include the following details.
OLS Estimation (7 points)
Part (a). Using the scalar formulation of the ANOVA model \(y_{ij} \sim N\left(\mu_j,\sigma^2 \right)\), with \(\mu=(\mu_1,\cdots,\mu_J)\), show that \(\widehat{\mu}_{OLS}=(\overline{y}_1,\cdots,\overline{y}_J)\), where \(\overline{y}_j\) is the sample mean in group \(j\).
Part (b). Using the scalar formulation of the ANOVA model \(y_{ij} \sim N\left(\mu+\alpha_j,\sigma^2 \right)\) with the constraint \(\sum_j \alpha_j=0\), assume \(n_j=n\) and show
i. \(\widehat{\mu}=\overline{y}_{\cdot \cdot}\), where \(\overline{y}_{\cdot \cdot}\) is the grand mean over all observations
ii. \(\widehat{\mu}=\frac{1}{J}\sum_j \widehat{\mu}_j\)
iii. \(\widehat{\alpha}_j=\widehat{\mu}_j-\widehat{\mu}=\overline{y}_{j}-\overline{y}_{\cdot \cdot}\)
Part (c). Write the model in part (a) in matrix form assuming you have 3 groups with 3 observations each and show that \((X'X)^{-1}X'y\) yields the same estimates as in the scalar formulation. Note the parameter vector corresponding to this model should contain the three mean parameters \((\mu_1, \mu_2, \mu_3)\).
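As a numerical sanity check (not a substitute for the derivation), the sketch below builds a cell-means design matrix for 3 groups with 3 observations each and compares \((X'X)^{-1}X'y\) with the group sample means. The data are simulated and the object names (`g`, `y`, `X`, `beta_hat`, `group_means`) are hypothetical; nothing here comes from gotscreen.RData.

```r
# Simulated data: 3 groups with 3 observations each (values are made up).
set.seed(1)
g <- factor(rep(1:3, each = 3))
y <- rnorm(9, mean = c(2, 5, 8)[as.integer(g)], sd = 1)

# Cell-means design matrix: one indicator column per group, no intercept.
X <- model.matrix(~ 0 + g)

# OLS estimate computed directly from the normal equations.
beta_hat <- drop(solve(t(X) %*% X) %*% t(X) %*% y)

# Group sample means for comparison.
group_means <- tapply(y, g, mean)

# TRUE if the two sets of estimates agree up to numerical tolerance.
all.equal(unname(beta_hat), as.vector(group_means))
```

Printing `X` shows the block structure of the design matrix, which may help when writing out the matrix form by hand.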
Contrasts (5 points)
Suppose you fit an ANOVA model to responses collected from \(J=3\) groups. Consider the following two ANOVA model parameterizations: \[\begin{eqnarray} y_{ij}&=&\mu_j + \varepsilon_{ij} ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ (1) \\ y_{ij}&=& \mu + \alpha_1I(j=1)+\alpha_2I(j=2)+\varepsilon_{ij} ~~~~ (2). \end{eqnarray}\]
Part (a). Find the linear combinations of parameters in model (2) that are equivalent to \(\mu_1-\mu_2\), \(\mu_1-\mu_3\), and \(\mu_2-\mu_3\) in model (1).
Part (b). Show that the estimates of these mean differences are identical regardless of the coding scheme (1) or (2) used, either theoretically or by analyzing the Game of Thrones data using a one-way ANOVA model with gender as the only predictor.
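For the empirical route, one possible setup is sketched below on simulated data. It fits both codings with `lm()`, recovers the fitted group means from each fit with `predict()`, and compares the three pairwise differences. The data frame `d`, the indicator columns `z1` and `z2`, and the helper `pairwise()` are hypothetical names introduced for illustration; with the Game of Thrones data, the response and gender variables would take the place of `y` and `g`.

```r
# Simulated one-way layout with 3 groups (values and group sizes are made up).
set.seed(2)
g <- factor(rep(1:3, times = c(4, 6, 5)))
y <- rnorm(length(g), mean = c(10, 12, 9)[as.integer(g)], sd = 2)
d <- data.frame(y = y, g = g)

# Coding (1): cell means, no intercept.
fit1 <- lm(y ~ 0 + g, data = d)

# Coding (2): intercept plus indicators for groups 1 and 2,
# so group 3 acts as the reference group.
d$z1 <- as.numeric(d$g == 1)
d$z2 <- as.numeric(d$g == 2)
fit2 <- lm(y ~ z1 + z2, data = d)

# One row per group, carrying the predictors that each fit needs.
new <- data.frame(g  = factor(1:3, levels = levels(d$g)),
                  z1 = c(1, 0, 0),
                  z2 = c(0, 1, 0))
m1 <- predict(fit1, newdata = new)
m2 <- predict(fit2, newdata = new)

# Pairwise differences between fitted group means.
pairwise <- function(m) {
  c("1-2" = m[[1]] - m[[2]],
    "1-3" = m[[1]] - m[[3]],
    "2-3" = m[[2]] - m[[3]])
}

# TRUE if both codings give the same estimated mean differences.
all.equal(pairwise(m1), pairwise(m2))
```

Comparing `coef(fit1)` and `coef(fit2)` side by side shows how the two parameter vectors differ even though the fitted means coincide.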
Total: 20 points.