class: center, middle, inverse, title-slide # STA 610L: Module 4.2b ## Poststratification and weighting (Part II) ### Dr. Olanrewaju Michael Akande --- ## Multilevel model As mentioned in the previous module, we will fit a multilevel model for individual survey responses on gay marriage rights given demographics and geography, i.e. each individual's response will be a function of their demographics and state. Let `\(i\)` index each individual, `\(j\)` index the race-gender combination, `\(k\)` index the age-education combination, `\(s\)` index each state, and `\(r\)` index region. We denote `\(y_{ijksr}=1\)` for supporters of same-sex marriage and `\(y_{ijksr}=0\)` for opponents and those with no opinion. We model the mean for the state effect as a function of 3 state level variables: the region into which the state falls, the state's conservative (defined as evangelical+LDS) religious percentage, and its Democratic 2004 presidential vote share. --- ## Model We will not do any model selection here; this model is based on the questions of interest. ```r #run individual-level opinion model ml.mod0 <- glmer(yes.of.all ~ race.female + age.edu.cat + p.relig + kerry.04 + (1|state) + (1|region), data=marriage.data, family=binomial(link="logit"), control=glmerControl(optimizer="bobyqa",optCtrl=list(maxfun=2e5))) summary(ml.mod0) ``` --- ## Model .small[ ``` ## Generalized linear mixed model fit by maximum likelihood (Laplace ## Approximation) [glmerMod] ## Family: binomial ( logit ) ## Formula: yes.of.all ~ race.female + age.edu.cat + p.relig + kerry.04 + ## (1 | state) + (1 | region) ## Data: marriage.data ## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05)) ## ## AIC BIC logLik deviance df.resid ## 7461.8 7630.7 -3705.9 7411.8 6316 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -1.9235 -0.7099 -0.4745 0.9889 4.1065 ## ## Random effects: ## Groups Name Variance Std.Dev. ## state (Intercept) 2.891e-09 5.377e-05 ## region (Intercept) 2.559e-02 1.600e-01 ## Number of obs: 6341, groups: state, 49; region, 5 ## ## Fixed effects: ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) -1.267524 0.447975 -2.829 0.004663 ## race.femaleBlMale 0.065898 0.158329 0.416 0.677260 ## race.femaleHMale 0.267396 0.161471 1.656 0.097721 ## race.femaleWhFem 0.449626 0.061674 7.290 3.09e-13 ## race.femaleBlFem -0.092411 0.135863 -0.680 0.496392 ## race.femaleHFem 0.537123 0.164535 3.264 0.001097 ## age.edu.cat18-29,HS 0.037365 0.240087 0.156 0.876323 ## age.edu.cat18-29,SC 0.405350 0.248459 1.631 0.102794 ## age.edu.cat18-29,CG 0.560073 0.252118 2.221 0.026319 ## age.edu.cat30-44,<HS -0.643885 0.329296 -1.955 0.050543 ## age.edu.cat30-44,HS -0.536826 0.237163 -2.264 0.023603 ## age.edu.cat30-44,SC 0.026699 0.239138 0.112 0.911105 ## age.edu.cat30-44,CG 0.139527 0.226531 0.616 0.537942 ## age.edu.cat45-64,<HS -1.142234 0.337787 -3.382 0.000721 ## age.edu.cat45-64,HS -0.792911 0.230721 -3.437 0.000589 ## age.edu.cat45-64,SC -0.608716 0.235038 -2.590 0.009602 ## age.edu.cat45-64,CG -0.016679 0.224501 -0.074 0.940778 ## age.edu.cat65+,<HS -1.619029 0.326969 -4.952 7.36e-07 ## age.edu.cat65+,HS -1.532324 0.251011 -6.105 1.03e-09 ## age.edu.cat65+,SC -1.099368 0.268163 -4.100 4.14e-05 ## age.edu.cat65+,CG -0.568061 0.247758 -2.293 0.021859 ## p.relig -0.014821 0.004895 -3.027 0.002466 ## kerry.04 0.019578 0.006778 2.889 0.003870 ## optimizer (bobyqa) convergence code: 0 (OK) ## boundary (singular) fit: see ?isSingular ``` ] --- ## Model I am a bit concerned about the amount of information available to estimate the race-gender and age-education combinations. So, let's actually treat the race-gender and age-education combinations as random effects to borrow information across their levels. We therefore fit the following model. $$ `\begin{split} \mathrm{logit}\left[\mathrm{Pr}(y_{ijksr} = 1)\right] & = \beta_0 + \beta^{relig}\cdot relig_{s} + \beta^{vote} \cdot vote_s\\ & + \alpha^{region}_{r} + \alpha^{state}_s + \alpha^{race,gender}_{j} + \alpha^{age,edu}_{k};\\ \\ \alpha^{region}_r & \sim N(0,\sigma^2_{region}), ~~~\mathrm{r = 1, ..., 5}; \\ \alpha^{state}_s & \sim N(0,\sigma^2_{state}), ~~~\mathrm{s = 1, ..., 51}; \\ \alpha^{race,gender}_j & \sim N(0, \sigma^2_{race,gender}), ~~~\mathrm{j=1, ..., 6}; \\ \alpha^{age,edu}_k & \sim N(0, \sigma^2_{age,edu}), ~~~\mathrm{k=1,...,16}. \\ \end{split}` $$ --- ## Model Using a slightly different notation, we can also write the model as `$$\mathrm{logit}\left(\mathrm{Pr}(y_i = 1)\right) = \beta_0 + \alpha^{race,gender}_{j[i]} + \alpha^{age,edu}_{k[i]} + \gamma^{state}_{s[i]}.$$` That is, `\begin{eqnarray*} \alpha^{race,gender}_j &\sim& N(0, \sigma^2_{race,gender}), ~~~\mathrm{j=1, ..., 6}, \\ \alpha^{age,edu}_k &\sim& N(0, \sigma^2_{age,edu}), ~~~\mathrm{k=1,...,16}, \end{eqnarray*}` and `\begin{eqnarray*} \gamma^{state}_s &\sim& N(\alpha^{state}_s + \alpha^{region}_{r[s]} + \beta^{relig}\cdot relig_s + \beta^{vote} \cdot vote_s, \sigma^2_{state}), \\ \alpha^{region}_r &\sim& N(0,\sigma^2_{region}), \end{eqnarray*}` where `\(\mathrm{r = 1, ..., 5}\)` and `\(\mathrm{s = 1, ..., 51}\)`. --- ## Model coding ```r #run individual-level opinion model ml.mod <- glmer(yes.of.all ~ p.relig + kerry.04 + (1|race.female) + (1|age.edu.cat) + (1|state) + (1|region), data=marriage.data, family=binomial(link="logit"), control=glmerControl(optimizer="bobyqa",optCtrl=list(maxfun=2e5))) # just checking scale of these proportions summary(marriage.data$p.relig) ``` ``` ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 1.839 8.718 12.823 16.287 25.012 68.090 ``` ```r summary(marriage.data$kerry.04) ``` ``` ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## 26.0 42.2 48.7 47.7 54.3 89.2 ``` --- ## Model results .small[ ```r summary(ml.mod) ``` ``` ## Generalized linear mixed model fit by maximum likelihood (Laplace ## Approximation) [glmerMod] ## Family: binomial ( logit ) ## Formula: ## yes.of.all ~ p.relig + kerry.04 + (1 | race.female) + (1 | age.edu.cat) + ## (1 | state) + (1 | region) ## Data: marriage.data ## Control: glmerControl(optimizer = "bobyqa", optCtrl = list(maxfun = 2e+05)) ## ## AIC BIC logLik deviance df.resid ## 7504.8 7552.1 -3745.4 7490.8 6334 ## ## Scaled residuals: ## Min 1Q Median 3Q Max ## -1.8404 -0.7100 -0.4845 0.9989 3.8023 ## ## Random effects: ## Groups Name Variance Std.Dev. ## state (Intercept) 1.284e-08 0.0001133 ## age.edu.cat (Intercept) 3.945e-01 0.6280828 ## race.female (Intercept) 4.959e-02 0.2226868 ## region (Intercept) 3.519e-02 0.1875976 ## Number of obs: 6341, groups: ## state, 49; age.edu.cat, 16; race.female, 6; region, 5 ## ## Fixed effects: ## Estimate Std. Error z value Pr(>|z|) ## (Intercept) -1.497284 0.436674 -3.429 0.000606 ## p.relig -0.014779 0.004886 -3.025 0.002487 ## kerry.04 0.019112 0.006747 2.833 0.004617 ## ## Correlation of Fixed Effects: ## (Intr) p.relg ## p.relig -0.660 ## kerry.04 -0.868 0.660 ``` ] --- ## Model results Note we have no responses from AK or HI. ```r # note nobody from AK or HI in survey marriage.data %>% filter(state=="AK",state=="HI") ``` ``` ## [1] state p.evang p.mormon ## [4] kerry.04 poll poll.firm ## [7] poll.year id statenum ## [10] statename region.cat female ## [13] race.wbh edu.cat age.cat ## [16] age.cat6 age.edu.cat6 educ ## [19] age democrat republican ## [22] black hispanic weight ## [25] yes.of.opinion.holders yes.of.all state.initnum ## [28] region no.of.all no.of.opinion.holders ## [31] race.female age.edu.cat p.relig ## <0 rows> (or 0-length row.names) ``` --- ## Predictions We make predictions in states, broken out by the demographic groups of interest, which will allow us to poststratify down the road. For now we calculate the predictions, and we'll examine them closely later. ```r ps.ml.mod <- Census %>% mutate(support=predict(ml.mod,newdata=.,allow.new.levels=TRUE,type='response')) %>% mutate(support=support*cpercent.state) %>% group_by(state) %>% summarize(support=sum(support)) ``` --- ## Bayesian model Now we fit a fully Bayesian model, with same data model as the ML model but with default priors to help some more with borrowing of information and convergence. ```r bayes.mod <- brm(yes.of.all~(1|race.female)+(1|age.edu.cat)+(1|state)+(1|region)+ p.relig+kerry.04, data=marriage.data, control = list(adapt_delta = 0.99,max_treedepth = 12), family=bernoulli()) ``` --- ## Bayesian model results .small[ ```r summary(bayes.mod) ``` ``` ## Family: bernoulli ## Links: mu = logit ## Formula: yes.of.all ~ (1 | race.female) + (1 | age.edu.cat) + (1 | state) + (1 | region) + p.relig + kerry.04 ## Data: marriage.data (Number of observations: 6341) ## Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; ## total post-warmup draws = 4000 ## ## Group-Level Effects: ## ~age.edu.cat (Number of levels: 16) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 0.70 0.15 0.47 1.05 1.00 936 1884 ## ## ~race.female (Number of levels: 6) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 0.31 0.16 0.13 0.71 1.00 1662 2588 ## ## ~region (Number of levels: 5) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 0.37 0.29 0.11 1.20 1.00 1519 2344 ## ## ~state (Number of levels: 49) ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## sd(Intercept) 0.07 0.04 0.00 0.16 1.00 1549 2023 ## ## Population-Level Effects: ## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS ## Intercept -1.52 0.53 -2.60 -0.51 1.00 2199 2764 ## p.relig -0.01 0.01 -0.02 -0.00 1.00 4960 3140 ## kerry.04 0.02 0.01 0.01 0.03 1.00 4802 3061 ## ## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS ## and Tail_ESS are effective sample size measures, and Rhat is the potential ## scale reduction factor on split chains (at convergence, Rhat = 1). ``` ] --- ## Benefits of Bayesian approach The most obvious benefit of a Bayesian approach is the total accounting of uncertainty, as we can easily see by plotting the estimated SD's of the group-level intercepts in the frequentist model against the posteriors from the Bayesian model. ```r library(broom.mixed) ml_sd <- broom::tidy(ml.mod) %>% filter(stringr::str_detect(term,"sd_")) bayes.mod %>% gather_draws(`sd.*`,regex=TRUE) %>% ungroup() %>% mutate(group=stringr::str_replace_all(.variable,c("sd_"="","__Intercept"="")), estimate=.value) %>% ggplot(aes(y=group,x=estimate)) + ggridges::geom_density_ridges(aes(height=..density..), rel_min_height=0.01,stat="density", scale=1.5) + geom_point(data=ml_sd) ``` --- ## Benefits of Bayesian approach <img src="4-2b-poststratification-II_files/figure-html/bayesbenefits2-1.png" style="display: block; margin: auto;" /> The dots are the point estimates from the frequentist model, but the Bayesian model gives you an idea of the full posterior distribution of values, from which we can sample. --- ## Poststratifying Bayes ```r #next let's get the point estimate and poststratify from the Bayesian model ps.bayes.mod <- bayes.mod %>% add_predicted_samples(newdata=Census, allow_new_levels=TRUE) %>% rename(support = pred) %>% mean_qi() %>% mutate(support = support * cpercent.state) %>% group_by(state) %>% summarize(support = sum(support)) ``` --- ## Comparing results Now we consider comparisons across the 3 approaches. ```r mod.disag[nrow(mod.disag) + 1,] = list("AK", mean(mod.disag$support), "no_ps") mod.disag[nrow(mod.disag) + 1,] = list("HI", mean(mod.disag$support), "no_ps") disag.ml <- bind_cols(mod.disag[,1:2], ps.ml.mod[,2]) %>% compare_scat() + xlab("disag model") + ylab("freq model") disag.bayes <- bind_cols(mod.disag[,1:2], ps.bayes.mod[,2]) %>% compare_scat() + xlab("disag model") + ylab("bayes model") ml.bayes <- bind_cols(ps.ml.mod[,1:2], ps.bayes.mod[,2]) %>% compare_scat() + xlab("freq model") + ylab("bayes model") ``` --- ## Plots ```r plot_grid(disag.ml) ``` <img src="4-2b-poststratification-II_files/figure-html/unnamed-chunk-4-1.png" style="display: block; margin: auto;" /> --- ## Plots ```r plot_grid(disag.bayes) ``` <img src="4-2b-poststratification-II_files/figure-html/unnamed-chunk-5-1.png" style="display: block; margin: auto;" /> --- ## Plots ```r plot_grid(ml.bayes) ``` <img src="4-2b-poststratification-II_files/figure-html/unnamed-chunk-6-1.png" style="display: block; margin: auto;" /> Note our predictions from the frequentist and Bayesian approaches are similar, and the models disagree with the disaggregated model from the last module, which does not borrow information. --- ## Prediction Now we can evaluate predictions, taking advantage of the uncertainty quantification advantages of the Bayesian approach. We will sample from the posterior to get predicted probabilities for each group of interest based on proportions obtained from the Census data. ```r predict_val <- predict(bayes.mod, newdata=Census, allow_new_levels=TRUE, nsamples=500, summary=FALSE) ``` --- ## Prediction .small[ ```r dim(Census) ``` ``` ## [1] 4896 15 ``` ```r head(Census) ``` ``` ## state p.evang p.mormon kerry.04 crace.WBH age.cat edu.cat cfemale .freq ## 1 AK 12.44 3.003126 35.5 1 1 1 0 467 ## 2 AK 12.44 3.003126 35.5 1 2 1 0 377 ## 3 AK 12.44 3.003126 35.5 1 3 1 0 419 ## 4 AK 12.44 3.003126 35.5 1 4 1 0 343 ## 5 AK 12.44 3.003126 35.5 1 1 2 0 958 ## 6 AK 12.44 3.003126 35.5 1 2 2 0 1359 ## cfreq.state cpercent.state region race.female age.edu.cat p.relig ## 1 21222 0.02200547 west WhMale 18-29,<HS 15.44313 ## 2 21222 0.01776458 west WhMale 30-44,<HS 15.44313 ## 3 21222 0.01974366 west WhMale 45-64,<HS 15.44313 ## 4 21222 0.01616247 west WhMale 65+,<HS 15.44313 ## 5 21222 0.04514183 west WhMale 18-29,HS 15.44313 ## 6 21222 0.06403732 west WhMale 30-44,HS 15.44313 ``` ] We'll focus on the first four subgroups: white Alaskan men with `\(<\)`HS education in the 4 age groups (18-29, 30-44, 45-64, 65+). The first 6 sampled support values for those men are in columns 1-4 here.... ```r dim(predict_val) ``` ``` ## [1] 500 4896 ``` ```r head(predict_val) ``` --- ## Poststratification again We could then use these predicted probabilities to estimate public opinion under a variety of assumptions (opinion of all residents, or applying other data on how frequently people in each demographic group vote, to get opinions of likely voters). These predictions based on data from the Census can be combined with information on how often people in each group vote to predict election outcomes. --- class: center, middle # What's next? ### Move on to the readings for the next module!