Are the polls systematically underestimating Obama’s support?

Those of you who followed these posts during the nomination campaign will remember my constant refrain that not all surveys are alike and my determination to make you look at their sampling techniques in order to evaluate their accuracy. We should have the same concerns when looking at polling numbers in the general election. For much of the summer, most polling organizations simply polled registered voters. This is appropriate for understanding the potential outcome of a presidential race. But in trying to forecast the likely outcome, many pollsters believe it makes little sense to rely on a random sample of all eligible voters, or even registered voters, since we know that only 6 or so out of every 10 such voters (or even fewer) will actually cast a presidential ballot (and yes, I’m not one of them). As a result, at this stage of the race, most of the national polling organizations switch their approach and begin surveying likely voters. They want to randomly sample from that subset of voters who are actually going to vote, as opposed to those who are simply eligible to vote. As Garrett Saito suggested in an email to me, however, that raises an important question: how do they determine the likely voters? Garrett, and others, have suggested that these polling organizations might be underestimating Obama’s support. Why might this be?

Some of you might recall that earlier this year, in June, Gallup ran two polls – one of likely voters, and one of registered voters. Obama led McCain among registered voters, 47% to 44%, but McCain led among likely voters, 49% to 44%. And, in fact, historically (at least as far back as we have polling data), Republicans have been more likely to be included in likely voter models despite usually lagging behind Democrats in terms of registered voters. In actual presidential elections, the Republican candidate has often won despite the fact that there are usually more registered Democrats, which appears to validate this sampling approach. So Gallup and other organizations typically end up with more Republicans in their likely voter samples than the Republican share of registered voters would suggest. This is usually appropriate, but the key question is whether that dynamic still holds true in this election: is the traditional means by which pollsters identify likely voters still valid, or are there reasons to believe that the prevailing methodology needs to be reassessed? Some observers suggest that because Obama is attracting such strong support among younger people and others who have never voted before, the likely voter samples might be overestimating McCain supporters and underestimating Obama voters. In a close election, the argument goes, these newly registered voters might swing the election to Obama, but the polls will miss this. As evidence, they point to party registration figures in a number of states which indicate that enthusiasm for the Democratic ticket is much higher than in past elections.

Is there any evidence to support this argument?

To answer that question, we need to understand how polling organizations determine likely voters. Unfortunately, each polling organization has its own method for determining likely voters, and not all of them reveal how they do so. (If you are interested in looking at this topic in more detail, here’s a link to a discussion from 2004 by polling expert Mark Blumenthal, which is the one I rely on most frequently. Keep in mind, however, that some organizations may have changed their methodology since then):

http://www.mysterypollster.com/main/2004/11/likely_voters_v.html

In examining how polling organizations determine likely voters and whether they are underestimating Obama’s support, I look at the following issues.

1.      Does the polling organization use previous voting as one of its indicators of a likely voter?  This could potentially lead to underestimating Obama’s support if most of these newly registered voters are likely to vote Democratic. (In 2004, survey organizations using previous voting as part of their likely voter screen included ABC/Washington Post, AP-IPSOS, ARG, CBS/New York Times, Democracy Corps, FOX/Opinion Dynamics, Gallup, Harris, LA Times, Newsweek, Pew, Quinnipiac, Rasmussen and Time.)

2.      Does the polling organization weight its sample of likely voters by party – that is, does it assume a priori that a certain percentage of voters will be Republican, Democratic, Independent, etc., and adjust its survey results accordingly? If so, and this weighting underestimates the percentage of Democrats who are likely to vote, then again the poll could underestimate Obama’s support, assuming most Democrats vote for Obama.

3.      Does the polling organization attempt to calibrate its likely voter sample according to expected turnout?  For example, Gallup historically chooses an expected cutoff figure for turnout – say 60% – and uses that to adjust its likely voter model. Again, if turnout is much higher due to an influx of Obama supporters, that could skew the likely voter model.  In 2004, seven survey organizations – ABC/Washington Post, Gallup, LA Times, Newsweek, Pew, Quinnipiac and Time – used this method.
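To make the first and third issues concrete, here is a minimal sketch of a likely voter screen with a turnout cutoff. Everything in it is an assumption for illustration: the ten respondents are made up, the 0-to-4 interest scale and the two-point bonus for past voting are hypothetical, and no polling organization's actual screen is being reproduced.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Respondent:
    candidate: str      # stated preference
    voted_before: bool  # says they voted in a previous election
    interest: int       # self-reported interest, 0 (none) to 4 (high); hypothetical scale


# Ten made-up registered voters: 4 for McCain, 6 for Obama.
SAMPLE = [
    Respondent("McCain", True, 4),
    Respondent("McCain", True, 2),
    Respondent("McCain", True, 2),
    Respondent("McCain", False, 0),
    Respondent("Obama", True, 4),
    Respondent("Obama", False, 4),
    Respondent("Obama", False, 4),
    Respondent("Obama", False, 3),
    Respondent("Obama", False, 3),
    Respondent("Obama", False, 1),
]


def likelihood_score(r, use_past_vote):
    """Higher score = judged more likely to vote (hypothetical scoring rule)."""
    bonus = 2 if (use_past_vote and r.voted_before) else 0
    return r.interest + bonus


def likely_voters(sample, expected_turnout, use_past_vote=True):
    """Keep the top-scoring share of the sample equal to the expected turnout."""
    ranked = sorted(
        sample,
        key=lambda r: likelihood_score(r, use_past_vote),
        reverse=True,
    )
    keep = round(len(ranked) * expected_turnout)
    return ranked[:keep]


def shares(sample):
    """Fraction of the sample supporting each candidate."""
    counts = Counter(r.candidate for r in sample)
    return {name: n / len(sample) for name, n in counts.items()}


registered = shares(SAMPLE)
with_past_vote = shares(likely_voters(SAMPLE, 0.6, use_past_vote=True))
without_past_vote = shares(likely_voters(SAMPLE, 0.6, use_past_vote=False))
print(registered, with_past_vote, without_past_vote)
```

In this toy sample Obama leads 60% to 40% among registered voters, is tied 50-50 once past voting counts toward the screen, and leads five to one when it does not. The numbers are invented to exaggerate the effect, but the mechanism is the one at issue: a screen that rewards past voting will push first-time voters below the turnout cutoff.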

So, what does this mean for polling results in this election cycle?  Is there reason to believe the polls are systematically underestimating Obama’s support?  There is no single answer to this question, in part because each polling outfit uses slightly different methods for determining a likely voter. Certainly there is the potential for bias, and the polling organizations are well aware of this.
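The party weighting issue can be illustrated with some simple arithmetic. The sketch below assumes made-up levels of Obama support within each party group and two hypothetical sets of party shares; none of these figures comes from an actual poll. It shows only how much the topline moves when the assumed partisan mix of the electorate changes.

```python
def weighted_topline(support_by_party, party_shares):
    """Candidate's overall share, given support within each party group
    and the assumed share of the electorate in each group."""
    total = sum(party_shares.values())
    return sum(support_by_party[p] * party_shares[p] for p in party_shares) / total


# Hypothetical share of each group supporting Obama.
obama_support = {"Democrat": 0.88, "Republican": 0.08, "Independent": 0.46}

# Two hypothetical assumptions about the partisan makeup of the electorate.
even_split = {"Democrat": 0.37, "Republican": 0.37, "Independent": 0.26}
dem_surge = {"Democrat": 0.40, "Republican": 0.33, "Independent": 0.27}

topline_even = weighted_topline(obama_support, even_split)
topline_surge = weighted_topline(obama_support, dem_surge)
print(f"even split: {topline_even:.1%}, Democratic surge: {topline_surge:.1%}")
```

With these invented inputs, the same survey responses yield about 47.5% for Obama under the even split and about 50.3% under the Democratic surge. A gap of nearly three points comes entirely from the weighting assumption, which is why it matters whether a pollster weights by party and what shares it assumes.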

One way to counter potential bias is not to rely on any single methodology for determining likely voters. In previous posts I have cautioned about relying on the RCP average of the polls because it is a rolling average that includes polls from different time periods in its estimate, and so can be slow to react to trends. On the flip side, however, because it averages polls that use different methodologies, it is less likely to be biased toward any one method for determining the likely voter and hence is less likely to be biased against Obama.  So the RCP average may be less biased in terms of likely voter models.
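A minimal sketch of the averaging logic, using made-up pollster names and margins (these are not real poll results, and this is not how RCP actually computes its average):

```python
from statistics import mean

# Hypothetical Obama-minus-McCain margins, in percentage points.
# Each pollster is assumed to use a different likely voter model.
margins = {
    "Pollster A": -4.0,
    "Pollster B": 1.0,
    "Pollster C": -1.0,
    "Pollster D": 2.0,
}

average = mean(margins.values())
spread = max(margins.values()) - min(margins.values())

# How far each pollster sits from the average of all of them.
deviation = {name: m - average for name, m in margins.items()}

print(f"average margin: {average:+.1f}, spread across pollsters: {spread:.1f}")
for name, d in deviation.items():
    print(f"{name}: {d:+.1f} relative to the average")
```

Here the individual polls range over six points, while the average sits at McCain +0.5. Averaging dilutes the quirks of any one likely voter model, but note the limit of the argument: if every pollster's model errs in the same direction, the average inherits that error.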

More generally, keep the following points in mind when considering the issue of bias in the polls. First, and most importantly, we know, based on past elections, that 80% of voters have likely already determined for whom they will vote in the 2008 election. From this perspective, the ups and downs in tracking polls can be understood as partly statistical noise reflecting errors in estimation, and partly a measure of which candidate, in voters’ minds, is getting the better of the argument or media coverage at the time.  These variations aren’t necessarily capturing changes in how people are planning to vote.  The exception is that subset of voters – particularly independents – who tend to make up their minds very late in the election cycle. Historically, likely voter models become more accurate as the campaign winds down, for the simple reason that as more people become interested in the election, it is easier to choose a likely voter sample.  Second, every four years we hear about how younger voters will make a difference in the election, and every four years they don’t.  In 2004, voting among 18-29 year olds was up, but the increase was not as large as among older voters, and Bush benefited from the overall increase in turnout more than Kerry did, and expanded his support from 2000.  So, until voting patterns change, it is not unreasonable for pollsters to rely on models that have worked well in previous elections. Third, while it is true that enrollments are up disproportionately among Democrats in many states, we have to see if the Sarah Palin factor begins to counteract this – it may be that likely voter models are underestimating her impact as well.  In particular, if Palin’s support is drawn disproportionately from the bitter, religious-leaning, gun-toting blue collar workers who supported Clinton, then likely voter models based on party could be skewed against the Republicans.
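To put a rough number on the "statistical noise" in tracking polls, here is a sketch using the textbook margin-of-error formula. It assumes a simple random sample and two independent polls of 1,000 respondents each; real tracking polls are weighted and use overlapping rolling samples, so treat this as an approximation rather than any pollster's reported error. The function names and the sample figures are my own.

```python
import math


def margin_of_error(share, sample_size, z=1.96):
    """Approximate 95% margin of error for one candidate's share,
    assuming a simple random sample."""
    return z * math.sqrt(share * (1 - share) / sample_size)


def change_is_within_noise(share_before, share_after, sample_size, z=1.96):
    """True if the change between two independent polls of the same size
    is smaller than the sampling error on their difference."""
    variance = (
        share_before * (1 - share_before) / sample_size
        + share_after * (1 - share_after) / sample_size
    )
    return abs(share_after - share_before) < z * math.sqrt(variance)


moe = margin_of_error(0.5, 1000)
print(f"margin of error at 50% with n=1000: {moe:.1%}")
print(change_is_within_noise(0.47, 0.45, 1000))  # a two-point dip
print(change_is_within_noise(0.50, 0.44, 1000))  # a six-point drop
```

With samples of this size, the margin of error on a single candidate's share is about three points, so a two-point move from one poll to the next cannot be distinguished from sampling error, while a six-point move can.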

The bottom line is that this is a precedent-breaking election.  This means that at least some likely voter models will be off this time around. However, until we have conclusive evidence that voting patterns are systematically different, it is difficult to say which models are wrong or why.  At best, then, we need to read these poll results with caution and avoid relying on any single result.  Pay attention to the fine print describing how they determine a likely voter. Taken collectively, however, these polls remain the best source of data we have regarding voter opinion during the campaign.  And – at this point – I don’t see any evidence that they are systematically underestimating Obama’s support.  But we can’t be sure of this.  What we can do, however, is compare polls and understand why they differ – it almost always has to do with how they construct their samples.

Having spent all this time discussing polling data, I am now going to explain why political scientists don’t need any of it to predict who will win the 2008 presidential election.  In fact, their predictions are already in!

A final thought: I have received some excellent comments from many of you in my email inbox.  In particular, many of you have taken issue with some of my observations, or provided your own election analysis.  Everyone would benefit from these comments, and so I urge you to post them on my blog (you need not attach your name).  We all can benefit from a broader discussion of the issues that I am raising. So join in!

3 comments

  1. OK, no second invitation needed! My concerns with polling aside (here’s an extreme: to a passive/politically apathetic public, does an opinion poll inform or create consent?), I still continue to be amazed by the weightage given to personality over policy in the US media and electioneering process (being of Indian descent, I had seen and peripherally participated in lengthy (albeit boring) issue-based discussions in India, designed to inform/enlighten rather than entertain). I’m beginning to think that if the Repubs continue on a campaign of ‘winning the daily news cycle’ and keep Obama on the back foot, and if Obama continues to base his campaign on issues, he may lose. Facts take a back seat to the delicious scandal of the one-liner.

    By the way, Matt, when is your prediction going to come out? When we took your class a few weeks ago, I thought you said you would post it after the conventions?!

  2. One additional undersampling issue – telephone polls only call listed landline telephones, thus excluding voters who only use cell phones, Skype, etc. Common sense would suggest that such voters are more urban and young, thus skewing toward Obama. How would you account for this technological data bias?

    Nice to see you in the blogosphere, Matt!

  3. If for the most part we are to trust the polls, then the huge disparity between what they report and what other election models predict must be examined. You’ve mentioned the election fundamentals this year favor the generic Democratic candidate over the generic Republican candidate. As evidence of this, the Abramowitz election model as of Sept. 8 puts Obama ahead 54.3% to 45.7%. In contrast, Real Clear Politics’ national average has McCain up 47.5% to 45.2%, a far closer race, with McCain in the lead.

    The Abramowitz model assumes elections are a referendum on the current administration. Does the fact that the race is so much closer than it should be according to this model suggest that McCain really has succeeded in separating himself from the Bush administration, despite Obama’s attempts to link him to it?

    The Republican Convention was marked by the absence of George W. Bush and Dick Cheney, conveniently stationed in Washington or across the globe. However, the substance of the convention in many ways did not depart from the policies of the Bush administration. “Drill, baby, drill” hardly fits the theme of change.

    Since candidates have few opportunities to affect their own destiny, as Professor Dickinson pointed out, any success McCain has had in separating himself from the current Republican administration can be attributed to McCain’s pick of Sarah Palin. Importantly, this was an opportunity entirely handed to the Republicans by the Obama campaign’s failure to choose Hillary Clinton as VP. During the primaries, these posts often pointed to small ways in which the Obama campaign was well organized and perhaps better run than the Clinton campaign (for example, the delay of reporting from Lake County in Indiana and the rollout of the Edwards endorsement). Has the campaign fallen apart in recent weeks? Are other factors, such as Obama’s race or inexperience, significant enough to change the face of the election?

    Perhaps a more interesting disparity than the one between the polls and non-survey based election models is the disparity between how the media and punditry are reacting to the Palin pick and the reality on the ground. A great number of writers are focusing on her perceived incompetence in national and international affairs and her socially conservative views on issues considered important to women, and in doing so have blamed McCain for running a purely “political” campaign, or as Thomas Friedman put it in “Making America Stupid,” selling his soul (http://www.nytimes.com/2008/09/14/opinion/14friedman.html?ref=opinion). However, none of the pundits’ disparaging of Sarah Palin has altered the fact that she remains hugely popular with the public, making her pick a brilliant move by McCain. Furthermore, I have yet to see anyone in the mainstream take on this side of the Palin topic seriously, other than to call the American public stupid.

    Is no press bad press or do we live in a far more culturally divided nation than I thought?
