Blog Post 3 - Polling | 2020 Presidential Election Analysis

Blog Post 3 - Polling

September 28th, 2020

In this week’s blog, I wanted to explore how we can use polling data to predict the outcome of elections. FiveThirtyEight has developed a model that weights polls based on their quality, size, and recency. They determine the quality of a poll based on how the pollster behind it has performed recently. The data available this week (which can be found on my GitHub) included the weights assigned to the polls for the 2016 election, which was helpful in determining how to evaluate the polling data and draw conclusions from it. This week, I first examined how polls performed for incumbent and non-incumbent party candidates in presidential elections from 1968 to 2016. I then explored the polling data for the 2016 election on a state-by-state basis to understand how closely the polls matched the actual results, and I found that they were close in almost every state. I also looked into the swing states of Arizona, Florida, Michigan, Ohio, Pennsylvania, & Wisconsin to see how each candidate performed against the polls in these crucial states. Finally, I explored some similarities and differences between FiveThirtyEight’s election model and The Economist’s election model.

Key takeaways

Polls are better at predicting the vote share for the incumbent than for the challenger. Polls done closer to the election seem to be marginally more informative than ones done farther away.
In general, the polls in 2016 were fairly accurate. In the 6 swing states I explored, there were some discrepancies, and they tended to benefit Trump. On average, he outperformed the polls by 6 percentage points in these states, while Clinton outperformed them by only 1.6 percentage points.
The FiveThirtyEight and Economist models are relatively similar in terms of their statistical background. By and large, the FiveThirtyEight model seems a bit more robust and I tend to prefer it, but it could also be the case that it incorporates too many unnecessary variables, which could make the result less meaningful.

Past National Poll Performance

I examined the polling accuracy for polls conducted two weeks or less before election day for both the incumbent and non-incumbent parties, as well as the polling accuracy for polls conducted more than two weeks before the election. It was interesting to see that the error bands around the best fit line were much narrower for the incumbent party than for the non-incumbent party. I would have expected a relatively similar trend because there tend to be only two candidates, so naturally one candidate performing well would lead to the other candidate underperforming, but there have been cases of third parties influencing elections (1992, 1996 & 2000 being the most notable examples in recent history). Additionally, some respondents do not give a binary answer to the poll. Because there tends to be more uncertainty around the non-incumbent party’s nominee, it could be the case that we then see more uncertainty in the predictive power of polling about this nominee, which is what this data set seems to show.

One thing that stuck out to me in the explanation of FiveThirtyEight’s model is that they weight polls more heavily the closer they are to election day (The Economist does as well, but not to the same extent, it appears). I compared the predictive power of polls conducted within two weeks of the election with that of polls conducted more than two weeks out. For both the incumbent and non-incumbent parties, the general trend was the same. The uncertainty was somewhat larger for the earlier polls, though the increase did not seem to be an especially meaningful one.

2016 Polling

The accuracy of the 2016 polls was a controversial topic, and although they performed well in general, they did not perform well in the crucial swing states that gave Trump the election. I looked at the national trend first, where the relationship between the polls and the actual result was fairly close. This may be because I used the weights that were part of the data set to weight the polls properly, though these weights could have been determined retroactively, thus negating the importance of the result. I took the weight on each poll as a fraction of the total weight placed on all of the polls and multiplied it by the share of the vote won by Trump and Clinton in that poll, then summed these products for each state, thus getting a weighted average of the polls for the 2016 election cycle.
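
To make the weighted average concrete, here is a minimal sketch in Python of the calculation described above. It is not the code I used for the analysis: the field names (`state`, `weight`, `trump`, `clinton`) are hypothetical, the example polls are made up, and I am assuming the weights are normalized within each state rather than across the whole data set.

```python
from collections import defaultdict

CANDIDATES = ("trump", "clinton")


def weighted_state_averages(polls):
    """Return {state: {candidate: weighted average poll share}}.

    Each poll is a dict with hypothetical keys 'state', 'weight',
    'trump', and 'clinton' (shares in percentage points).
    Assumption: weights are normalized within each state.
    """
    total_weight = defaultdict(float)
    weighted_sum = defaultdict(lambda: dict.fromkeys(CANDIDATES, 0.0))

    for poll in polls:
        state = poll["state"]
        total_weight[state] += poll["weight"]
        for cand in CANDIDATES:
            weighted_sum[state][cand] += poll["weight"] * poll[cand]

    averages = {}
    for state, total in total_weight.items():
        if total == 0:
            continue  # no usable weight in this state, so skip it
        averages[state] = {
            cand: weighted_sum[state][cand] / total for cand in CANDIDATES
        }
    return averages


# Made-up numbers for illustration only, not real 2016 polls.
example_polls = [
    {"state": "OH", "weight": 1.0, "trump": 44.0, "clinton": 46.0},
    {"state": "OH", "weight": 3.0, "trump": 48.0, "clinton": 44.0},
    {"state": "WI", "weight": 2.0, "trump": 40.0, "clinton": 47.0},
]
averages = weighted_state_averages(example_polls)
print(averages["OH"])
```

Dividing the weighted sum by the total weight is the same as multiplying each poll by its fraction of the total weight and adding the results, so a poll with three times the weight pulls the state average three times as hard.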

The states that I explored were Arizona, Florida, Michigan, Ohio, Pennsylvania, & Wisconsin, which are all important swing states this cycle and (with the possible exception of Arizona) have consistently been important swing states. In Michigan, Ohio, Pennsylvania, and Wisconsin, Trump outperformed the weighted polling average by over 6 percentage points in each state, with the polls underrating Trump by as much as 7.26 percentage points in Ohio, a frequent swing state that went to Obama twice and then was won comfortably by Trump. This will be interesting to monitor for this election and may require the weights of certain polls to be adjusted.

FiveThirtyEight vs. The Economist

Both models use both polling data and fundamentals data (economic and other macro indicators), but they approach this balance in different ways. FiveThirtyEight’s model focuses more on polls and uses the fundamentals data to supplement their findings, which makes sense, as they argue that “fundamentals-only” models are not particularly precise. The Economist’s model starts with fundamentals and their perception of each state’s partisan swing, then uses a Bayesian framework to update that prior prediction based on data from state polls. There are merits to this approach, but I wonder if it ends up leading to overcorrections and to being overly informed by the most recent poll, even if another poll from only a few days earlier presents a different result that may in fact be closer to reality.

Both models take COVID-19 data into consideration, which seems like it would throw off modeling precision because of how unprecedented this election is. I was intrigued by how The Economist chose to cut off outliers in economic data, which makes sense but seems to have been done fairly arbitrarily. FiveThirtyEight’s model is built around polls and, as a result, requires a very strict poll grading system to develop weights, which The Economist does not seem to have. The grading system also seems somewhat arbitrary, but it is based on the data that they have and seems to be done thoughtfully, which makes me think that it is a useful tool that The Economist’s model is missing.

Another intriguing difference is how they adjust to new information. In the FiveThirtyEight model, the fundamentals fade as the election nears, to the point where they vanish from the model on election day. The Economist’s model, because it takes an explicitly Bayesian approach, will always, to some extent, factor in these fundamentals and the data that was fed into the model at all previous times. Older information is weighted relatively less strongly, but it still plays a role. I think the FiveThirtyEight approach makes more sense because once the election nears, the information presented by the fundamentals has been internalized by the voters, consciously or unconsciously, and should therefore be reflected in the polls. The ultimate goal is to predict what voters will end up doing on election day, so it does not necessarily make sense to weight the fundamentals data so strongly when it is already reflected in the polling data.

FiveThirtyEight, unlike The Economist, has the advantage of having already built out a model during past elections. Although The Economist was able to train its model on past data, the FiveThirtyEight model has been updated in real time for past elections, making it easier for FiveThirtyEight to build upon their past work. However, they chose not to do this, as they felt that the 2020 election was particularly unique, which is a fair statement to make. I wonder, however, if the FiveThirtyEight model is overfitting to this election cycle, and also whether it is too specific. It takes COVID-19 issues into account to a greater extent and also tries to predict how voter suppression might impact the election. Though these are important things to condition on, it seems hard to control for them when testing the model, so it may be the case that they have made faulty assumptions that will then be reflected in their ultimate model. Maybe a simpler (though still very nuanced) model like The Economist’s could perform better, but it is not possible to know that at this point. It may also be the case that by engaging in simulation, both FiveThirtyEight’s and The Economist’s models are able to determine which of their variables are relevant and which are not; I tend to believe that this is probably true, so I trust the robustness of the FiveThirtyEight model.
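
To illustrate the difference in how fundamentals are treated as election day approaches, here is a toy sketch in Python. Neither function is the actual FiveThirtyEight or Economist model. `bayesian_update` is a standard normal-normal conjugate update that I am using as a stand-in for "start from a fundamentals prior and update on polls", and `decaying_blend` is a hypothetical linear schedule in which the weight on fundamentals reaches zero on election day. The function names, the 100-day `horizon`, and all of the numbers are my own assumptions for illustration.

```python
def bayesian_update(prior_mean, prior_var, poll_mean, poll_var):
    """Normal-normal conjugate update: combine a prior with poll evidence.

    The posterior mean is a precision-weighted average, so the prior
    keeps some influence as long as its variance is finite.
    """
    prior_precision = 1.0 / prior_var
    poll_precision = 1.0 / poll_var
    post_var = 1.0 / (prior_precision + poll_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + poll_precision * poll_mean)
    return post_mean, post_var


def decaying_blend(fundamentals, poll_average, days_left, horizon=100):
    """Blend fundamentals and polls with a weight that hits zero on election day.

    Hypothetical linear schedule: full weight on fundamentals at `horizon`
    days out or more, no weight at zero days left.
    """
    w = max(0.0, min(1.0, days_left / horizon))
    return w * fundamentals + (1.0 - w) * poll_average


# Illustrative inputs only: a fundamentals forecast of 52% and polls at 48%.
post_mean, post_var = bayesian_update(52.0, 9.0, 48.0, 1.0)
blend_election_day = decaying_blend(52.0, 48.0, days_left=0)
blend_midway = decaying_blend(52.0, 48.0, days_left=50)
print(post_mean, blend_election_day, blend_midway)
```

With these inputs the Bayesian estimate stays slightly above the poll average because the prior still carries some weight, while the decaying blend equals the poll average exactly once there are zero days left.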