After Clinton’s “win from behind” in New Hampshire, those who weren’t claiming Diebold conspiracies were left wondering how the polls could get it so wrong. The media often report polling data using the Real Clear Politics average. I see two problems with that average that contributed to the discrepancy between the polls and the final result: it doesn’t report the proportion of undecideds, and when there are many polls, it covers only a very short period of time. Case in point: the final Real Clear Politics average for New Hampshire was drawn from polls taken over a span of just two days (1/5 – 1/7). The problem is that all these polls were taken after Iowa, when Obama was riding a wave of hype after his caucus victory. Here’s a look at how the average of polls taken before Iowa (and after November) and the average of polls taken after Iowa compare to the final primary results. (NB: the data is from Real Clear Politics, but unlike them I’ve included an Other/Undecided column.)
|New Hampshire Polls Dec 2007 – Jan 2008|
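For readers who want to repeat the before-and-after comparison, here is a minimal sketch of the arithmetic in Python. The poll figures in it are made-up placeholders, not the Real Clear Politics numbers, and the names `POLLS` and `average_with_remainder` are my own for illustration. The only real fact in it is the date of the Iowa caucuses (3 January 2008). The Other/Undecided column is simply whatever is left once the named candidates' averages are subtracted from 100.

```python
from datetime import date
from statistics import mean

# HYPOTHETICAL poll records: invented for illustration, not real survey data.
POLLS = [
    {"end": date(2007, 12, 20), "Obama": 30, "Clinton": 34, "Edwards": 16},
    {"end": date(2007, 12, 29), "Obama": 28, "Clinton": 36, "Edwards": 18},
    {"end": date(2008, 1, 6), "Obama": 38, "Clinton": 30, "Edwards": 18},
    {"end": date(2008, 1, 7), "Obama": 40, "Clinton": 30, "Edwards": 16},
]
CANDIDATES = ["Obama", "Clinton", "Edwards"]
IOWA = date(2008, 1, 3)  # date of the Iowa caucuses


def average_with_remainder(polls):
    """Average each candidate's share and report what is left over."""
    avg = {name: mean(p[name] for p in polls) for name in CANDIDATES}
    # Everything not claimed by a named candidate is Other/Undecided.
    avg["Other/Undecided"] = 100 - sum(avg.values())
    return avg


# Split the polls at the Iowa caucuses and average each group separately.
before = average_with_remainder([p for p in POLLS if p["end"] < IOWA])
after = average_with_remainder([p for p in POLLS if p["end"] > IOWA])

for label, avg in (("Before Iowa", before), ("After Iowa", after)):
    print(label, avg)
```

Splitting at a fixed event rather than taking the most recent handful of polls makes it visible when an average is drawn entirely from one side of that event.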
The polls taken before Iowa, therefore, picked the margin between Clinton and Obama far more accurately than the polls taken after Obama’s success. But it’s the level of Undecideds that tells the story. Undecided is a very nebulous, poorly defined category. Ignoring the minor candidates, an “undecided” voter sounds like one who is equally likely to vote for either candidate. In reality, this group could look very different. Many, probably most, of the undecideds would in fact have a preference (be it slight or major) for either Obama or Clinton. What I suspect happened is that those tending towards Obama were convinced after his success in Iowa, and those tending towards Clinton either voted for her anyway or were convinced the night before, when she showed us she wasn’t just a robot programmed to become President. That would explain Obama’s increase (slightly overstated, perhaps due to hype), the undecideds’ decrease after Iowa, and why Clinton picked up the undecideds on polling day.
What is clear is that pollsters need to start examining the undecideds more closely. I propose that pollsters split the Undecided column into the categories discussed above: “tending towards candidate x”. Had they done that, the above table might have looked like this:
|Obama||Clinton||Edwards||Richardson||Other||Tend Clinton||Tend Obama|
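To show how the extra columns would be read, here is a small Python sketch that folds each “Tend” column back into its candidate. The percentages are invented placeholders rather than real poll results, and `with_leaners` is a hypothetical helper name of my own.

```python
# HYPOTHETICAL single poll: percentages invented for illustration only.
poll = {
    "Obama": 38,
    "Clinton": 30,
    "Edwards": 17,
    "Richardson": 6,
    "Other": 2,
    "Tend Clinton": 5,
    "Tend Obama": 2,
}


def with_leaners(poll, candidates=("Obama", "Clinton")):
    """Add each candidate's 'Tend' column to their decided support."""
    return {
        name: poll[name] + poll.get(f"Tend {name}", 0)
        for name in candidates
    }


projected = with_leaners(poll)
print(projected)
```

The headline margin uses only the decided columns. The projected margin shows how far it could move if the leaners all break the way they say they are tending.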
Ultimately, the same polls that seem so inaccurate could have told the story of the primary if they had had a couple more categories.
I have a love-hate relationship with polls. As a politics junkie I find them fascinating, but they also contribute to the treatment of politics as a game. Voters want to back a winner, so minor candidates (who may have interesting, novel ideas) are marginalized not because of these ideas, but because they “can’t win”.
To be useful, polls have to be designed well and reported well. A design flaw in the polls (no tending x category) and deficiencies in reporting (ignoring undecideds) both contributed to the controversy over New Hampshire.