Lab 2: Hospital Rankings (Part I)

Multilevel and hierarchical models

Due: 11:59pm, Sunday, September 5



You should all have R and RStudio installed on your computers by now. If you do not, first install the latest version of R (remember to select the right installer for your operating system). Next, install the latest version of RStudio: scroll down to the “Installers for Supported Platforms” section of the download page and find the right installer for your operating system.


You MUST submit both your .Rmd and .pdf files to the course site on Gradescope. Do NOT create a zipped folder with the two documents; instead, upload them as two separate documents. Also, be sure to knit to PDF and NOT HTML; ask the TA about knitting to PDF if you cannot figure it out. Be sure to submit under the right assignment entry. Finally, when submitting your files, please select the corresponding pages for each exercise.


You will analyze this data across three labs. This first lab is an individual lab, and each student must submit a separate preliminary report by the due date. In the next lab, the TA will walk you through how to extend your preliminary model. In the final lab, you will work within your teams and submit one final report by that lab's due date.

The Data

We will consider data from the Centers for Medicare and Medicaid Services on hospital costs and profit from the 2014 fiscal year. Our interest is in examining variability of net hospital income across states.

One primary item of interest is the ranking of states by net income of hospitals.

It is important to control for the potential influence of hospital ownership (in the variable called control) and of the number of beds (a proxy for hospital size). The ownership categories include Voluntary Nonprofit-Church, Voluntary Nonprofit-Other, Proprietary-Individual, Proprietary-Corporation, Proprietary-Partnership, Proprietary-Other, Governmental-Federal, Governmental-City-County, Governmental-County, Governmental-State, Governmental-Hospital District, Governmental-City, and Governmental-Other.


For this lab, ignore the potential effects of the number of beds and of hospital ownership. Develop and fit a random effects ANOVA model for netincome (your response variable) using state as your grouping variable. Obtain preliminary results on the rankings of the states.
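
For reference, a one-way random effects ANOVA of the kind described above is commonly written as in the sketch below. The notation is an assumption made here and does not come from the lab handout: y_{ij} is the net income of hospital i in state j, n_j is the number of hospitals in state j, and \bar{y}_{\cdot j} is the state-level sample mean. The second line is the standard shrinkage form of the state effect when \mu, \tau^2 and \sigma^2 are treated as known; ranking states typically means ordering estimates of this kind.

```latex
\begin{align}
y_{ij} &= \mu + \alpha_j + \varepsilon_{ij},
  \qquad \alpha_j \sim N(0, \tau^2),
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2), \\
E[\alpha_j \mid y] &= \frac{n_j \tau^2}{n_j \tau^2 + \sigma^2}
  \left( \bar{y}_{\cdot j} - \mu \right).
\end{align}
```

States with few hospitals (small n_j) are pulled more strongly toward zero, so their rankings are less extreme than their raw means would suggest.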

Quick note: without including any other predictors, it turns out that there isn’t enough variation across states to properly estimate the parameters under the treatment effects parameterization. This is a very common problem, especially when there are many groups. The scale of the response variable here also doesn’t help. One easy fix is to exclude the overall intercept from your model specification, so that you have a treatment means parameterization instead. You should consider doing that when fitting your model in R.
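
To make the two parameterizations concrete, here is a minimal sketch using base R's lm on simulated placeholder data. It is a plain fixed-effects fit, not the random effects model the lab asks for, and the data frame toy and its values are hypothetical. It only shows what dropping the overall intercept does to the coefficients. Mixed-model packages that follow R's formula conventions are assumed to treat the `0 +` term the same way; check the documentation of whichever package you use.

```r
# Simulated placeholder data: 3 hypothetical groups, 4 observations each.
set.seed(1)
toy <- data.frame(
  state = factor(rep(c("A", "B", "C"), each = 4)),
  netincome = rnorm(12, mean = rep(c(10, 20, 30), each = 4), sd = 2)
)

# Treatment effects parameterization (R's default treatment contrasts):
# the intercept is the mean of baseline group A, and the remaining
# coefficients are differences from that baseline.
fit_effects <- lm(netincome ~ state, data = toy)

# Treatment means parameterization: "0 +" drops the overall intercept,
# so each coefficient is that group's own mean.
fit_means <- lm(netincome ~ 0 + state, data = toy)

# Sample means per group, for comparison with the coefficients above.
group_means <- tapply(toy$netincome, toy$state, mean)

coef(fit_effects)
coef(fit_means)
group_means
```

The coefficients of fit_means match group_means, while the intercept of fit_effects matches only the mean of group A.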

There will be a page limit on your final report next week but there is NO page limit for this preliminary report. Your preliminary report should include the following.

DO NOT INCLUDE R CODE OR OUTPUT IN YOUR REPORTS. All R outputs should be converted to nicely formatted tables. Feel free to use R packages such as knitr (for its kable function), xtable, stargazer, etc.
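
As one possible way to produce such a table, the sketch below builds a small ranking table with knitr::kable. The numbers are random placeholders, NOT results from the hospital data, and the object names (rank_tab, tab) and column labels are hypothetical.

```r
library(knitr)

# Hypothetical placeholder values, NOT results from the hospital data.
set.seed(2)
rank_tab <- data.frame(
  state = state.abb[1:5],
  estimate = round(rnorm(5, mean = 0, sd = 1e6))
)

# Order from largest to smallest estimate and attach a rank column.
rank_tab <- rank_tab[order(rank_tab$estimate, decreasing = TRUE), ]
rank_tab$rank <- seq_len(nrow(rank_tab))

# Format as a table with readable column names and thousands separators.
tab <- kable(
  rank_tab[, c("rank", "state", "estimate")],
  row.names = FALSE,
  col.names = c("Rank", "State", "Estimated state effect"),
  format.args = list(big.mark = ","),
  caption = "Illustrative ranking table (placeholder numbers)."
)
tab
```

Setting the chunk option echo = FALSE in the .Rmd file keeps the code itself out of the knitted PDF, so only the formatted table appears.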


Total: 10 points.