I want to follow up on my last post regarding how variations in poll results are often due to differences in how pollsters construct their samples. The previous post talked primarily about whether pollsters were sampling likely or registered voters. Obama, I suggested, polled better among registered voters. Today I want to look at another decision pollsters must make: whether to weight their sample by party identification and, if so, what weights to use. We know that whether one considers oneself a Democrat or a Republican is the biggest single determinant of how someone will vote. Not surprisingly, people tend to vote for the candidate who shares their party identification. So a poll that includes 40% Democrats in its sample is likely to have more favorable results for Obama than one that includes 35% Democrats, all other things being equal. Ditto for McCain and variations in the number of Republicans sampled.
To see how this makes a difference, consider two respected national polls that came out yesterday. CBS/NY Times came out with their monthly national poll that has Obama up 49-44, with 6 undecided.
Rasmussen, meanwhile, has the race tied, 48-48% in its latest tracking poll.
That is a 5-point difference in Obama’s margin. Both polls illustrate the importance of how samples are defined. Most pollsters will weight their sample so that it matches the overall U.S. population of registered voters (or likely voters, as the case may be) along major demographic variables: gender, race, and income. For example, if a pollster’s initial sample of 1300 people included 57% women – which is higher than the share of women among those eligible to vote according to the U.S. Census – then the pollster would typically count each woman’s response a bit less heavily to bring the sample closer in line with the population figures. That is called weighting the final sample.
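For readers who like to see the arithmetic, here is a minimal sketch in Python of that kind of adjustment for a single variable. The 1300 respondents and 57% women come from the example above; the 52% population target is a made-up placeholder, not an actual Census figure, and real pollsters adjust several variables at once rather than one at a time.

```python
# Illustrative weighting on one demographic variable (gender).
SAMPLE_SIZE = 1300                            # from the example above
sample_share = {"women": 0.57, "men": 0.43}   # 57% women, from the example above

# ASSUMPTION: hypothetical population targets, used only for illustration.
target_share = {"women": 0.52, "men": 0.48}


def group_weights(sample, target):
    """Weight per respondent in each group = target share / sample share."""
    return {group: target[group] / sample[group] for group in sample}


def weighted_shares(sample, weights):
    """Share of each group in the sample after the weights are applied."""
    totals = {group: sample[group] * weights[group] for group in sample}
    grand_total = sum(totals.values())
    return {group: totals[group] / grand_total for group in totals}


weights = group_weights(sample_share, target_share)
after = weighted_shares(sample_share, weights)

for group in sample_share:
    respondents = round(SAMPLE_SIZE * sample_share[group])
    print(f"{group}: {respondents} respondents, weight {weights[group]:.3f}, "
          f"weighted share {after[group]:.1%}")
```

Note that nobody is thrown out of the sample here: each woman simply counts for a little less than one respondent and each man for a little more, so the weighted totals match the targets.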
However, not all pollsters weight their sample by party. That is, if 38% of registered voters in the U.S. are Democrats, many pollsters will not try to weight their sample to get the same proportion of Democrats. Instead, they believe that by weighting by other demographics, the party percentages should come out pretty close to the actual totals in the population as a whole. And they worry that if they try to fix the party weight at a particular percentage, they may skew results, particularly if party support seems to be very volatile. In other words, when it comes to partisan identification, some pollsters let the sample speak for itself, rather than impose their own weight to ensure a particular percentage of party members. Historically, CBS has done this; in the last CBS/NYTimes poll taken a month ago, CBS did NOT weight by party. That poll had Obama up 45-42%, with 6% undecided (and 7% “other”).
However, the latest CBS poll DID weight by party. They averaged the share of Democrats polled in the three previous CBS/NYTimes polls, and made sure that today’s poll included that same average share of Democrats (and Republicans and independents). Just to give you an idea of what this means, let me provide the “raw” and weighted figures for both the initial sample and the smaller sample of registered voters.
I can’t paste the actual table on this blog (I can send the actual table by email if you are interested), but looking at all 1133 respondents – the “raw” initial sample – we see that 28.8% of them are Republicans. This total is almost identical to their weighted sample of voters; when they “weight” the raw sample, they reduce the number of Republicans by only 4, to 28.4% of their sample.
Looking only at the 1004 registered voters, we see that the initial raw sample includes 30.4% Republicans, but this is increased to 31.6% Republicans in the weighted sample of registered voters. Similarly, looking only at registered voters, the percent of Democrats in the raw sample versus final weighted sample doesn’t change much at all – 40% in the unweighted sample versus 40.6% in the weighted sample. The biggest difference is a reduction in independents among the registered voters, from 29.6% in the “raw” sample versus 27.8% in the weighted sample of registered voters.
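To put those registered-voter figures in one place, here is a rough Python sketch that works out how much each party group was scaled up or down. The percentages are the ones quoted above; the idea of a single "multiplier" per party is my simplification, since the published weights reflect adjustments on several variables at once.

```python
# Party shares among the 1004 registered voters, in percent (figures quoted above).
raw_pct = {"Republican": 30.4, "Democrat": 40.0, "Independent": 29.6}
weighted_pct = {"Republican": 31.6, "Democrat": 40.6, "Independent": 27.8}


def implied_multipliers(raw, weighted):
    """Net scaling applied to each group: weighted share / raw share.

    This is a simplification: it shows the net effect of the weighting on
    each party group, not the pollster's actual weighting procedure.
    """
    return {party: weighted[party] / raw[party] for party in raw}


multipliers = implied_multipliers(raw_pct, weighted_pct)

for party, m in multipliers.items():
    change = weighted_pct[party] - raw_pct[party]
    print(f"{party}: {raw_pct[party]:.1f}% -> {weighted_pct[party]:.1f}% "
          f"({change:+.1f} points, multiplier {m:.3f})")
```

A multiplier above 1 means the group counts for more in the final numbers than in the raw sample; below 1 means it counts for less.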
I show you these numbers to give you an idea of what it means to weight by party. But why does it matter? Compare the CBS weighting to what Rasmussen calculates when they weight by party.
Note that Rasmussen’s tracking polling has always weighted by party, using a “dynamic” system in which they adjust the weight assigned to each party based on the trends revealed by previous surveys. In their latest national tracking poll, they weighted their poll to include 38.7% Democrats, 33.6% Republicans, and 27.7% unaffiliated. (That’s a change from the weights they used in their tracking polls for the first thirteen days of September, when the targets were 39.7% Democrat, 32.1% Republican, and 28.2% unaffiliated.)
We see, then, that Rasmussen’s weights include a higher proportion of Republicans than does the CBS/NYTimes poll of registered voters. The gap between Democrats and Republicans in the CBS weighted poll of registered voters is 9 percentage points (40.6% versus 31.6%). For Rasmussen it is only 5.1 points (38.7% versus 33.6%). Given the difference, it is perhaps not surprising that Rasmussen has the race a dead heat, while CBS gives Obama a 5-point lead. Which is more accurate? I have no idea. And, in all candor, neither do the pollsters. But both CBS and Rasmussen recognize that the partisan distribution of voters is a changing target, and to their credit they are trying to make sure their samples reflect these changes.
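Here is a small Python sketch of why the party mix matters so much. The party weights are the CBS and Rasmussen figures quoted above. The candidate support rates within each party are purely hypothetical numbers I made up for illustration; they are not taken from either poll, so only the direction of the result means anything.

```python
# Party weights quoted above, as fractions of the weighted sample.
cbs_weights = {"Democrat": 0.406, "Republican": 0.316, "Independent": 0.278}
rasmussen_weights = {"Democrat": 0.387, "Republican": 0.336, "Independent": 0.277}

# ASSUMPTION: hypothetical support within each party group (NOT poll data).
support = {
    "Democrat":    {"Obama": 0.88, "McCain": 0.08},
    "Republican":  {"Obama": 0.07, "McCain": 0.90},
    "Independent": {"Obama": 0.43, "McCain": 0.45},
}


def party_gap(weights):
    """Democratic share minus Republican share, in percentage points."""
    return (weights["Democrat"] - weights["Republican"]) * 100


def topline(weights, support_by_party):
    """Overall candidate shares: within-party support averaged by party weight."""
    result = {"Obama": 0.0, "McCain": 0.0}
    for party, weight in weights.items():
        for candidate in result:
            result[candidate] += weight * support_by_party[party][candidate]
    return result


def margin(weights, support_by_party):
    """Obama share minus McCain share, in percentage points."""
    shares = topline(weights, support_by_party)
    return (shares["Obama"] - shares["McCain"]) * 100


for name, weights in [("CBS", cbs_weights), ("Rasmussen", rasmussen_weights)]:
    print(f"{name}: party gap {party_gap(weights):.1f} points, "
          f"Obama margin {margin(weights, support):.1f} points")
```

With identical voter preferences inside each party, the sample with the larger Democratic edge produces the larger Obama margin. That is the whole effect in miniature: the only thing that differs between the two runs is the party weights.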
The important point, however, is that the assumptions they make regarding the likely distribution of partisan identification among likely and registered voters have a big impact on the numbers they report. And it means we need to be careful not to impute too much importance to small changes in these polls, or to differences across polls that may say more about the pollsters’ decisions on how to weight by party than they do about any changes in voters’ political preferences.
A final thought: in recent elections, trends in partisan affiliation among survey respondents have been a very good predictor of who won the election; in 2004, the proportion of people calling themselves Republicans in the raw sample data went up in the latter half of the campaign, presaging the Bush victory. That may be a more important number than any of the actually reported results. I will keep an eye on this figure and report the trends later in the campaign.
I have been postponing a discussion of the results from the political science forecasting models, but they are all in. I’ll try to get to them this weekend. But you might be interested to know that they all – with one exception – agree regarding who will win the popular vote this election. As far as these political scientists are concerned, the race is over.