Edit 11/25/20: Another relevant graph (see here for context).
I caught a small error in the data going into the above graphs. The variable I thought was total votes cast in a county in 2020 did not actually equal the sum of all individual candidate votes (there’s no codebook or anything similar to check this). Fortunately, the two total-votes variables were almost identical (correlated at > 0.999), so this didn’t change the overall trend in results (two outlier ratio values dropped out of the data, so the y-axis span changed). I reproduce the above graphs with the correct measure, along with updated voting data, below.
Edit 11/19/20: Here’s another graph looking at how turnout and vote correlate at the county level, but this time measuring both in terms of changes from 2016 to 2020.
Note: This post and analysis emerged from discussion with others on Twitter. Many thanks to them for engaging and for their thoughts and work; the exchange is in a thread starting here.
A growing theory for polling error in the 2016 and 2020 elections is nonresponse bias along social trust levels. Low-trust Americans have long been underrepresented in surveys, but historically this hasn’t mattered for pre-election polls because social trust was uncorrelated with the outcome of interest in elections: presidential vote choice. This new theory posits that the trust/vote relationship changed once Donald Trump was on the ballot in 2016, and thus lower response rates among low-trust Americans now present a serious problem for pollsters.
One important check of this theory is whether the relationship between social trust and voting did in fact change starting in 2016. I used the 2012 and 2016 American National Election Studies (ANES) surveys to evaluate this question, regressing Republican vote choice in each year (Mitt Romney and Trump) on the typical measure of general social trust–“Generally speaking, how often can you trust other people?”–coded such that higher values indicate more distrust. The first panel of regression coefficients in the below graph shows the results.
The coefficient in the first row shows a small association between social distrust and Trump vote that is not statistically significant. However, after adding four standard demographic controls in the second row, the relationship becomes strong, significant, and in the expected direction (positive). Moving from the most to the least trusting respondent is associated with a 0.12 increase in the probability of voting for Trump (12 percentage points). This indicates one of the demographic controls was a “suppressor variable”–a variable that, when uncontrolled for, conceals a real relationship between the independent and dependent variables of interest (and correlates with the IV and DV in opposite directions). It turns out that the addition of race is what changes the results here, so race acts as the suppressor variable (indeed, nonwhite racial identities are positively correlated with distrust but negatively correlated with Trump vote, i.e. in opposite directions).
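To make the suppressor logic concrete, here is a minimal simulation sketch. It is not the ANES analysis: the data are synthetic, and the variable names, the 30% nonwhite share, and every parameter other than the 0.12 distrust effect are assumptions picked only so that race correlates positively with distrust and negatively with Trump vote. The helper `ols_coefs` is a hypothetical convenience wrapper around NumPy's least-squares solver, fitting a linear probability model.

```python
import numpy as np


def ols_coefs(y, *columns):
    """Return OLS coefficients (intercept first) for y regressed on the columns."""
    X = np.column_stack([np.ones(len(y)), *columns])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


rng = np.random.default_rng(0)
n = 50_000

# Synthetic respondents (all parameters are illustrative assumptions).
nonwhite = rng.binomial(1, 0.3, size=n)

# Distrust is higher on average among nonwhite respondents.
distrust = 0.2 * nonwhite + rng.uniform(0.0, 1.0, size=n)

# True model: distrust raises Trump vote probability (0.12 per unit),
# while nonwhite identity lowers it.
p_trump = 0.45 + 0.12 * distrust - 0.30 * nonwhite
trump_vote = rng.binomial(1, p_trump).astype(float)

# Linear probability models without and with the race control.
bivariate = ols_coefs(trump_vote, distrust)[1]
controlled = ols_coefs(trump_vote, distrust, nonwhite)[1]

print(f"distrust coefficient, no controls:   {bivariate:.3f}")
print(f"distrust coefficient, race included: {controlled:.3f}")
```

Under these assumed parameters, the uncontrolled coefficient should sit close to zero while the controlled one recovers something near the true 0.12, which is the same qualitative pattern as the first two rows of the graph.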
The bottom two coefficients in the first panel are similar to the top two, except that the outcome is now voting for Romney, not Trump. Here, we first see a strong negative relationship between distrust and Republican vote choice in 2012. When demographic covariates are accounted for (fourth row), the relationship remains negative but is smaller and insignificant. Most importantly, this confirms the changing nature of the relationship between social trust and vote choice. In the election before Trump, there was no strong relationship between trust and vote once controls are included, and importantly, its direction was the opposite of the positive one that now makes us worry about survey nonresponse along trust levels. But in the first year with Trump on the ballot, that relationship changes, becoming strong and positive (once proper controls are accounted for).
There is yet another wrinkle in this story: in 2016, this trust/vote relationship varies by the mode of ANES interview–face-to-face (FTF, conducted in person) or online (web). The bottom panel in the above graph focuses on modeling 2016 Trump vote and shows results (again, with and without controls) separately for these two interview-mode samples. The top two rows contain results for FTF interviews only, while the bottom two rows contain results for web interviews only.
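The split-sample approach described here can be sketched as follows: fit the same controlled model once per interview mode and compare the distrust coefficients. Again, this uses synthetic data rather than the ANES files. The mode split, the sample size, and the assumption that the true distrust effect is 0.17 for web respondents and zero for FTF respondents are all illustrative choices, and `distrust_slope` is a hypothetical helper.

```python
import numpy as np


def distrust_slope(vote, distrust, nonwhite):
    """Distrust coefficient from a linear probability model controlling for race."""
    X = np.column_stack([np.ones(len(vote)), distrust, nonwhite])
    beta, *_ = np.linalg.lstsq(X, vote, rcond=None)
    return beta[1]


rng = np.random.default_rng(1)
n = 60_000

# Synthetic respondents; half are assigned to the web mode (assumption).
web = rng.binomial(1, 0.5, size=n).astype(bool)
nonwhite = rng.binomial(1, 0.3, size=n)
distrust = 0.2 * nonwhite + rng.uniform(0.0, 1.0, size=n)

# Assumed mode-specific effect of distrust on Trump vote probability.
true_slope = np.where(web, 0.17, 0.0)
p_trump = 0.45 + true_slope * distrust - 0.30 * nonwhite
trump_vote = rng.binomial(1, p_trump).astype(float)

# Estimate the same model separately within each mode.
slopes = {}
for label, mask in (("FTF", ~web), ("web", web)):
    slopes[label] = distrust_slope(trump_vote[mask], distrust[mask], nonwhite[mask])
    print(f"{label}: distrust coefficient = {slopes[label]:.3f}")
```

An alternative design would pool both modes and add a distrust-by-mode interaction term; the split-sample version is used here because it matches how the panel reports one set of rows per mode.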
For FTF interviews, there is no strong relationship between trust and vote, and it’s far from the expected positive direction. However, for web interviews, a positive relationship emerges (third row, 0.08 coefficient) and becomes strong and significant (fourth row, 0.17 coefficient) once demographic controls–including the important suppressor variable, race–are included in the models. This mirrors findings from the same type of analysis using data from the General Social Survey (GSS), which only conducts surveys in person and similarly fails to find a significant positive relationship between distrust and 2016 Trump vote choice. Meanwhile, analyses using other online surveys find a significant positive relationship between social trust and 2016 Clinton vote choice–the same pattern the 2016 ANES web interviews revealed (just with the coding of both variables reversed). Given the ubiquity of online surveys and the paucity of face-to-face surveys in the current pre-election polling landscape, this difference by interview mode–as well as the over-time change in how trust and vote correlate–is very much consistent with surging concern over polling error in the last two elections.