Posted by Aaron Strauss

Every morning one of my friends IMs me to say either, “We’re winning!!!!” or “We’re losing :( :(.” He bases these conclusions on websites such as electoral-vote.com, which aggregate polls and produce a point estimate of the presidential electoral vote. While I’m sure my friend enjoys following the ups and downs of tracking polls (as sports fans enjoy watching games instead of just reading box scores), his ultimate question is who will win the election, not who is currently ahead.

But how well do current polls predict the outcome of the election? Professors Andrew Gelman and Gary King addressed this question in their 1993 work and found that polls are very poor predictors of the final vote. They even went so far as to call voters’ responses to pollsters not “rational.” Wild swings in polling returns during conventions or after candidate gaffes often do not translate into long-term effects.

A recent survey by the University of New Hampshire raised the possibility that pollsters, not citizens, are to blame for these poll fluctuations. UNH asked *half* of respondents the standard preference question:

“Suppose the 2008 presidential election was being held today and the candidates were John McCain and Sarah Palin, the Republicans and Barack Obama and Joe Biden, the Democrats. Who would you vote for?”

These voters slightly supported Obama (46% to 45%), with 8% undecided (and 1% not responding). The other half of respondents were asked a seemingly similar question:

“Thinking about the presidential election in November, would you vote for: Republicans John McCain and Sarah Palin, Democrats Barack Obama and Joe Biden, someone else, or haven’t you decided yet?”

While the vote margin for these respondents was approximately the same (McCain +1), the percent of undecideds more than doubled, to 20%. This finding underscores the facts that **citizens realize they might change their mind** before the election, and that polls are **just snapshots in time, limited in their ability to predict outcomes.**

The mantra that “polls are just a snapshot” has been repeated often. And often, polls are used appropriately. For example, snapshot polling is helpful in determining who is benefiting from current events, or the effectiveness of a shift in campaign messaging (examples here, here, and here).

On the other hand, websites (some examples here) have no business tallying the electoral vote before the election. Who has “won” the September 25th electoral vote has no bearing on public policy, and such tallies provide noisier estimates of candidate momentum than the current snapshots of the popular vote do.

One of the smartest applications of polling (from a campaign junkie’s perspective) is to analyze which states are “pivotal” (i.e., the closest state that tips the election: FL in 2000 and OH in 2004). If you have followed Nate Silver’s analyses on 538, you may have noticed how Obama’s win percentage (top left of the homepage) fluctuated during the conventions, while the list of pivotal states (middle right) barely moved. The analysis of pivotal states has two great features: (1) it provides campaigns with actionable intelligence about where to direct their resources, and (2) it is much less vulnerable to the minute-to-minute fluctuations of the news cycle.
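To make the idea of a pivotal state concrete, here is a minimal Python sketch. It is not Nate Silver’s method. It assumes the simplest possible definition: sort the states from the candidate’s best to worst margin and find the one that pushes the running electoral-vote total past a majority. The function name `pivotal_state` and the four toy states with their margins are invented purely for illustration.

```python
def pivotal_state(states):
    """Return the state that tips the election for the candidate.

    `states` is a list of (name, electoral_votes, margin) tuples, where
    margin is the candidate's lead in percentage points (negative = behind).
    """
    total_ev = sum(ev for _, ev, _ in states)
    needed = total_ev // 2 + 1  # bare majority of electoral votes

    running = 0
    # Walk from the candidate's strongest state to the weakest.
    for name, ev, _margin in sorted(states, key=lambda s: s[2], reverse=True):
        running += ev
        if running >= needed:
            return name
    return None


# Hypothetical toy map (not real polling data): 31 electoral votes, 16 to win.
toy_states = [
    ("A", 10, 12.0),
    ("B", 7, 3.0),
    ("C", 5, -1.0),
    ("D", 9, -8.0),
]

# A uniform five-point national swing against the candidate.
swung_states = [(name, ev, margin - 5.0) for name, ev, margin in toy_states]

before = pivotal_state(toy_states)
after = pivotal_state(swung_states)
```

In this toy map the same state is pivotal before and after the swing, because a uniform shift changes every margin but not the ordering of the states. That is one simple way to see why a list of pivotal states can sit still while the headline win percentage bounces around.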

So the next time your preferred candidate is behind by more than the percent of undecideds, instead of freaking out, remember that a large chunk of the electorate is still persuadable. And then channel any remaining nervous energy into volunteering in the nearest battleground state.

It seems worth noting, though, that the calculations identifying "pivotal states" are based on the same assumptions about the variance and covariance of state movements over time as the calculations of Electoral College win probabilities. In both cases, those assumptions are crucial to the answer.
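A rough Python sketch of why the covariance assumption matters so much. This is not the model used by any of the sites discussed here; it is a toy Monte Carlo with ten made-up, equally sized states in which the candidate leads by two points everywhere. The function name `win_probability`, the parameter `rho` (the share of each state's error variance that comes from a common national shock), and all the numbers are assumptions chosen only for illustration. The total error variance per state is held fixed, so the only thing that changes between the two runs is how much of the error is shared.

```python
import math
import random


def win_probability(margins, ev, sd, rho, n_sims=5000, seed=0):
    """Estimate the chance of an electoral-vote majority by simulation.

    Each state's simulated margin is its polled margin plus a shared
    national shock plus independent state noise. `rho` is the fraction of
    the per-state error variance (sd**2) assigned to the national shock.
    """
    rng = random.Random(seed)
    needed = sum(ev) // 2 + 1
    national_sd = sd * math.sqrt(rho)
    state_sd = sd * math.sqrt(1.0 - rho)

    wins = 0
    for _ in range(n_sims):
        shock = rng.gauss(0.0, national_sd)  # moves every state together
        won_ev = sum(
            votes
            for margin, votes in zip(margins, ev)
            if margin + shock + rng.gauss(0.0, state_sd) > 0
        )
        if won_ev >= needed:
            wins += 1
    return wins / n_sims


# Hypothetical toy map: ten states, ten electoral votes each, +2 lead in all.
toy_margins = [2.0] * 10
toy_ev = [10] * 10

p_independent = win_probability(toy_margins, toy_ev, sd=3.0, rho=0.0)
p_correlated = win_probability(toy_margins, toy_ev, sd=3.0, rho=0.8)
```

With independent errors, a bad draw in one state tends to be offset by a good draw in another, so the leader's win probability comes out high. When most of the error is shared, one bad national shock sinks every state at once, and the same polls should give a win probability noticeably closer to 50%.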

Nate Silver at fivethirtyeight has a rather vague and convoluted explanation of what he's assuming (http://www.fivethirtyeight.com/2008/03/frequently-asked-questions-last-revised.html), with a promise of more detail leading to a dead link. The current Obama "win percentage" is 83%, which seems to me to imply that the real uncertainty in the estimates is greatly understated.

Andrew Gelman recently posted a new paper (http://www.stat.columbia.edu/~gelman/research/unpublished/election5.pdf) using past election results and this year's polls to forecast the Electoral College outcome. No mention of covariances in the simulations, and an 84% chance of Obama winning based on polls from _February_, increasing to 89% based on more recent polls. Again, that just seems silly.

Finally, Princeton's own Sam Wang produces Electoral College projections assuming no covariance in state movements, but then declines to translate them into win probabilities. (He helpfully points out that you can integrate the histograms yourself if you're so inclined.) He has a lengthy discussion of the issue (http://election.princeton.edu/2008/08/28/technical-note-correlated-change-among-states-revisited/) that seems to boil down to this: If the projection is viewed as a "snapshot" of the current situation, state-by-state "errors" will be uncorrelated (not clear why); and if the projection is viewed as a forecast of the actual election outcome, there are lots of complicated uncertainties that would need to be modeled somehow.

Uh, yeah.

I completely agree that the variance of the state estimates (relative to the national) is crucial. But, the variance of the national estimate is not crucial, which is what hinders many other models. For the former estimate of variance, we have a good chunk of data to use: 51 points per modern election. For the estimate of how the national vote varies over time, we only have a handful of cases.

I have an email chain with Prof. Gelman about his paper on this very point. He and Lock account for the variance of public polling in their estimate of the national vote but include nothing for potential shocks between today and Election Day.

After reading Wang's site, I agree with him. Why should polling errors for the states be correlated today (unless polls are collectively missing young people because of cell phones, for instance)? Correlated errors are a much bigger factor when thinking about the state of the race in the future--for example, on Friday, after the VP debate moves all the states together in one direction.