How Demographics, Identities, and Policy/Group Attitudes Divide 2020 Democratic Primary Vote Preferences




More here.

9/7/2019 edit: I recently discussed some of these findings in an article for the Washington Post, “How Joe Biden attracts both black voters and racially ‘resentful’ voters.” I wanted to add a few things about this and related analysis.


Something I don’t discuss as much in the piece is the range of interpretations one can draw from the regression coefficients I present. This applies to all results, but I’ll discuss the case of Biden voting and hostile sexism as an illustrative example. The article ended up stressing a certain interpretation for this relationship: “Democratic voters who score high on a scale that measures sexism, for example, gravitate toward Biden,” as I say in the article. But because the dependent variable, voting for Biden (1) or not (0), is a relative measure, there is another viable interpretation for the observed correlation: Democratic voters who score low on sexism are repelled by Biden. It might not be that Biden has successfully appealed to sexists and that these individuals with negative views about women have been especially compelled to support him, but rather that those most liberal in their gender views have decidedly rejected him. Or it could be a mix of both. A single regression coefficient cannot distinguish between these possibilities, however, so this should be kept in mind when interpreting this and other results.
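To make the point concrete, here is a minimal sketch in Python (not the R code behind the analysis). Everything in it is hypothetical: the support rates, the two-level sexism score, and the helper names `ols_slope` and `make_sample` are made up for illustration. It builds one sample where high scorers are drawn toward a candidate and another where low scorers are pushed away, then fits a one-predictor linear probability model to each.

```python
def ols_slope(x, y):
    """Slope from a one-predictor OLS fit: cov(x, y) / var(x)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var_x = sum((xi - mean_x) ** 2 for xi in x)
    return cov_xy / var_x


def make_sample(support_low, support_high, n_per_group=100):
    """Build hypothetical voters: sexism score 0 (low) or 1 (high),
    with the given share of each group supporting the candidate."""
    x, y = [], []
    for score, share in ((0, support_low), (1, support_high)):
        supporters = round(share * n_per_group)
        x += [score] * n_per_group
        y += [1] * supporters + [0] * (n_per_group - supporters)
    return x, y


# Assumed baseline: 30% support if sexism played no role (made-up number).
# Scenario A ("attraction"): high scorers rise to 40%, low scorers stay at 30%.
attract = ols_slope(*make_sample(support_low=0.30, support_high=0.40))
# Scenario B ("repulsion"): low scorers fall to 20%, high scorers stay at 30%.
repel = ols_slope(*make_sample(support_low=0.20, support_high=0.30))

print(round(attract, 3), round(repel, 3))
```

Both scenarios produce the same slope of 0.10, because the coefficient only records the gap in support between high and low scorers. Telling the two stories apart would take outside information, such as what support would have looked like absent any role for sexism.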

Oftentimes, accounts of voting focus on a single side of this story, such as racially resentful Americans voting for Donald Trump in the 2016 election. In that case, in trying to explain 2016 vote choice, many overlooked racial liberals doubling down on voting for Clinton (the comparison group for that outcome) as part of what explains the relationship between racial resentment and voting. Andrew Engelhardt has a great new article in the journal QJPS that discusses these issues in the context of the 2016 election.


Notes on the methods I employ in the analysis are a little sparse, but the caption at the bottom of the original graphs I posted above contains a lot of relevant information. I’ll reiterate and expand on a few things here:

  • All the variables I use are scaled to be 0 or 1 (dummy/categorical variables) or to run from 0 to 1 (continuous variables). I do this for better comparison across variables: the coefficient on a dummy/categorical variable captures the full effect of the associated measure, whereas the coefficient on an unadjusted continuous variable would capture only a one-unit change. I have also tested results using a different scaling approach, dividing unadjusted continuous variables by two times their standard deviation; thanks to James Conran for bringing this up. When I re-ran results with this other scaling approach, the relationships between prejudice and voting decreased a bit, but were still sizable, and the Biden paradox mentioned in the article still held.
  • The basic model involves using OLS (this becomes a linear probability model) to regress a binary outcome–support for a candidate (1) or not (0)–on several demographics and political identities: four-category age group, female dummy, three-category race, college degree dummy, three-category income, strong Democrat dummy, and three-category self-described ideology. Results from this regression are shown in the first plot above, with the “[Demographics and Political Identities]” subtitle. For the second main graph–with the “[Policy and Group Attitudes]” subtitle–the coefficients for each predictor come from separate models that include the basic model plus one additional predictor. This is done to avoid collinearity, increased missingness, kitchen sink modeling, etc. Two other small notes: in the regressions, I use 1) survey weights and 2) robust standard errors (using lm_robust() from the estimatr package in R).
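The two scaling approaches from the first bullet can be sketched as follows. This is an illustration in Python rather than the R code used for the analysis; the function names and the 1-5 item are hypothetical, and I am assuming the sample standard deviation and no centering, since the notes above only mention dividing by two standard deviations.

```python
import statistics


def rescale_unit(values):
    """Min-max rescale so the variable runs from 0 to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def rescale_two_sd(values):
    """Divide by two sample standard deviations (assumption: no centering)."""
    two_sd = 2 * statistics.stdev(values)
    return [v / two_sd for v in values]


# Hypothetical 1-5 agreement item; not data from the survey.
raw = [1, 2, 2, 3, 4, 5, 5]

unit = rescale_unit(raw)      # coefficient = minimum-to-maximum change
two_sd = rescale_two_sd(raw)  # coefficient = two-standard-deviation change

print([round(v, 2) for v in unit])
print([round(v, 2) for v in two_sd])
```

Under the 0 to 1 scaling, a coefficient describes moving from the lowest to the highest observed value. Under the two-standard-deviation scaling it describes a smaller move whenever the variable's range spans more than two standard deviations, which is one plausible reason the prejudice coefficients shrank a bit in the re-run.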

Leftover Results

Lastly, below is a discussion of some other interesting results, as well as caveats to keep in mind.

How people think of themselves in terms of ideological labels (moderate or liberal) matters more than specific policy views (such as on the minimum wage, Medicare for All, and immigration policy). Even though the first few primary debates have seen a heavy policy focus and opened policy cleavages between candidates, policy positions have, for the most part, yet to divide Democratic voters. But there are exceptions: Warren voters care more about clean and renewable energy and strongly oppose deportation of illegal immigrants, whereas Medicare for All supporters have latched onto Sanders.

Socioeconomic status has some role in dividing voters; gender has little. Wealthier individuals lean towards Harris whereas Sanders repels them, and Warren performs well among college-educated voters. Men and women hardly vary in their preferences. But for predicting vote choice, these demographics pale in comparison to age, which creates yawning gaps in support. Biden draws considerably from the older crowd, while Sanders pulls heavily from younger Democrats, perhaps showing he has kept many of his 2016 primary supporters, who had a strong youth bent.

This analysis and the results do not come without important caveats. First, the causal arrow between attitudes and vote choice is murky, as voters may decide on candidates first and update their views accordingly; the relationship between Sanders support and Medicare for All priority looks like a prime candidate for reverse causation. Second, some variables, like the policy questions, might be measured less precisely than prejudice and demographics, limiting straightforward comparisons.
