How to grade the polls

Byline: | Category: 2012 | Posted at: Tuesday, 6 November 2012

Let me begin by saying that whoever calls it closest doesn’t necessarily win.  Sure, I trumpet Rasmussen’s 2008 prediction, but that is simply to demonstrate to those who think that he is a partisan hack that he has a pretty decent track record and cannot be discounted for partisan leanings without being guilty of partisanship yourself.

Before we can answer who grades this election right, we first have to revisit a little basic statistics.  Let’s first assume that we have populations large enough to approximate infinity (we do).  Theoretically then, if we want to release polls within a 90% confidence interval (most are 90 or 95 percent, but often they don’t say, so I assume the lower percent), then we need a representative sample of about 1,200 voters to get a margin of error of about 3%.  That means that if you gather 100 different representative samples of 1,200 respondents each, on average, 90 of them will give you the right answer to within a range of about +/- 3 points.
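For readers who want to check the sample-size arithmetic, here is a minimal sketch.  It assumes simple random sampling, a worst-case 50-50 split, and the normal approximation; the function name `margin_of_error` is an illustrative choice, not anything a pollster publishes.  Under those assumptions, 1,200 respondents work out to roughly 2.4 points at 90% confidence and 2.8 points at 95%, so “about 3” is a rounded, slightly conservative figure.

```python
import math
from statistics import NormalDist


def margin_of_error(n, confidence, p=0.5):
    """Half-width, in percentage points, of a normal-approximation
    confidence interval for a proportion p estimated from n respondents."""
    # Two-sided critical value: about 1.645 for 90%, about 1.960 for 95%.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 100 * z * math.sqrt(p * (1 - p) / n)


moe_90 = margin_of_error(1200, 0.90)
moe_95 = margin_of_error(1200, 0.95)
print(f"n=1200: +/- {moe_90:.1f} at 90%, +/- {moe_95:.1f} at 95%")
```

Real polls weight their samples, which usually pushes the effective margin of error above these textbook values.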

Who gets closest, then, even among pollsters who do everything right in gathering their samples, is as much a matter of chance as of skill.  If the final result is a 50-50 tie, Acme Polling’s November 6th estimate of a 2-point Obama lead is not necessarily worse than Pollco’s prediction that same day of a tie.  Let’s look at those two polling companies and see who better tracked the ultimate result.

pollco_2.jpg

So who did the better job polling this race?  If you go by final result, you would say Pollco because their last poll predicted the ultimate result of a tied 50-50 race.  However, if you look at each poll over the last ten days, you see that Pollco averaged an Obama lead of 6 points while Acme Polling’s average was a tie.  It just so happened that on the final day of polling Pollco got a lucky break with their sample, while Acme Polling got a sample that gave them a result within their margin of error.  It just happened to be the one-day-in-ten when Pollco was outside their four-point margin of error.  However, Pollco’s methodology overall was clearly biased in Obama’s favor.  (In statistics “bias” is a neutral term that has nothing to do with ideology.)
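The difference between grading on the final poll and grading on the track record can be sketched in a few lines.  The daily numbers below are invented for illustration: they are not read off the chart, and were chosen only so that Pollco averages Obama +6 with a final poll of a tie, and Acme averages a tie with a final poll of Obama +2, as described above.  The function name `grade` is likewise hypothetical.

```python
from statistics import mean


def grade(daily_margins, actual_margin):
    """Score a pollster's last polls (Obama lead, in points) against the result."""
    errors = [m - actual_margin for m in daily_margins]
    return {
        "final_error": abs(errors[-1]),       # how close the last poll was
        "average_bias": mean(errors),         # systematic lean over the period
        "mean_abs_error": mean(abs(e) for e in errors),
    }


# Hypothetical ten-day series, NOT the real chart data.
pollco_days = [7, 6, 8, 7, 6, 7, 7, 6, 6, 0]
acme_days = [0, -1, 1, 0, -2, 1, 0, -1, 0, 2]

pollco = grade(pollco_days, actual_margin=0)
acme = grade(acme_days, actual_margin=0)
print("Pollco:", pollco)
print("Acme:  ", acme)
```

Pollco wins on `final_error` alone, but Acme wins on both `average_bias` and `mean_abs_error`, which is the track-record view argued for here.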

When it comes to determining who did the best job of estimating today’s election, we’re going to have to look at not just the overall result, but their track record as well.  If Romney wins by five points, for example, then there is an argument to be made that Gallup did a better job than Rasmussen, because for a longer period of time Gallup was closer to the final result, even though both organizations will have missed it by four.  If Obama wins by three, then we’re probably looking at giving the laurels to IBD/TIPP. 

Bottom line:  you can’t just look at the final number; you have to consider the track record to know which polling company did the best job.


And the prediction is . . .

Byline: | Category: 2012 | Posted at: Tuesday, 6 November 2012

Before I tell you that, I’m going to let you peer inside the gonculator that spat out the results.  This isn’t a fancy machine of myriad weights and measures.  Instead, it’s just a little bit of applied statistics to a problem of high incertitude.

About two weeks ago I noticed that there appeared to be a correlation between the strength of a poll’s likely voter screen and Mitt Romney’s level of support.  Another way of saying that is that the greater the percentage of adults included in the likely voter portion of a poll, the worse Romney’s result.  At the time, I only had a few data points and only one pairwise comparison.  That is, only one of the polls told me the results of its sample of likely voters and the results of its larger sample of registered voters.

Since then I have a few more data points, a few more pairwise comparisons, and a corroboration of my thesis that higher turnouts significantly aid President Obama’s poll numbers.  This evidence came in the form of the Pew poll released Sunday, which gave the presidential preference for unlikely voters.  By a margin of 65-23, the small portion of their sample that they identified as unlikely to vote wanted Barack Obama to win.  Pew’s overall sample of likely voters gave Obama only a three-point edge.  Additionally, the poll indicated that “Romney’s supporters continue to be more engaged in the election and interested in election news than Obama supporters, and are more committed to voting.”  In other words, the lower the turnout, the better the result for the Republican.  No, that’s not an earth-shaking revelation, but I think that I’ve figured out a way to estimate by how much turnout affects Romney’s support.

Let me start by saying that politics is not baseball.  I admire Nate Silver’s analytical skills, but taking models that work well in one forum does not necessarily mean that they will translate to another.  The biggest difference between the ballpark and the political arena is the human element.  There are never more than 13 players on a field at any time.  If you add in the 4 umpires (2 more if it’s the post-season), the managers, and the first and third base coaches, plus the official scorer, there are a total of no more than two dozen people involved in determining what happens at every plate appearance.  In presidential politics, that number is in the tens of millions.

Furthermore, in baseball there is near certitude about what actually did happen.  Sure, occasionally an umpire blows a call, or an official scorer counts an error as a hit.  But it’s actually quite rare.  In politics there is far more uncertainty, not just about what is about to happen, but even about what did happen.  The last polls in 2000, for example, were agnostic as to the effect of the late revelations about Bush’s decades-old DUI arrest; so extrapolating conclusions from those last poll results was and is dangerous.

Polling, which is the entry point that goes into Nate Silver’s model, is not like a box score–which is a near exact representation of an historical event.  Instead, it is an educated guess about what the current landscape is.  Not what it is going to look like.  And definitely not what it is certain to be.  That is very different from the data that we have with which to evaluate baseball.

Let me make an analogy that may help to explain the logic by which I have arrived at my prediction.  Imagine that, instead of having a database full of actual baseball records, your historical data were a collection of estimates pooled from baseball experts who were somehow able to consider the running speed of the batter and his teammates on base, the positioning and fielding ability of the fielders, and, in that split second after the batter made contact, predict what the outcome of the plate appearance was going to be.  Obviously, if we could gather enough data–speed and trajectory of the ball when it left the bat, for instance–we could probably guess more often than not if the hit ball was an easy out or a home run.  The model that Nate Silver and many other sabermetricians use brilliantly considers exactly these variables.

But there’s one more variable:  imagine that those making their predictions don’t know the ballpark in which the game is being played.  They have each assumed a different ballpark when making their predictions, but usually they don’t tell you which one it is.  The expert who has assumed that the game is played at Wrigley with the wind blowing out, is likely to record a sharply hit ball to left field as a probable home run.  On the other hand, if another expert assumes that the batter is at Fenway, he may conclude that the ball will bounce off the Green Monster and yield only a long single.

Every one of the polling companies is assuming a different ballpark.  And when you have a 39-point gap between honest unlikely voters and the members of your sample, the inclusion of dishonest unlikely voters is likely to significantly skew your result.  While all the polls in the RCP average are currently using likely voters in their models, they have each defined likely voters differently.

So what I’ve done is to try to estimate the effect of an unlikely voter in their sample.  To do that I’ve taken the number of likely voters in each poll and listed it as a percentage of either registered voters or adults in their sample, if either or both numbers are given.

polls_final_table.jpg*

If we were able to know the entire random sample of adults from which poll respondents were chosen, we could estimate implied turnout percentages in the polls.  In other words, we can estimate the size of the political ballpark that each of the polling companies is using to get their results.  Four polls gave us enough data to calculate the implied turnout in their samples.  But  three of them  also told us the results of their poll for all registered voters.  Across the four different polls for which we can gather data, turnout ranges from 68.6 to 73.2 percent of the voting age population.  As a percentage of registered voters, it ranges from 75.5 to 88.5 percent.  That gave us two different data points for each of these polls, and from there we could estimate the straight line effect of increasing turnout percentages on presidential preference within individual polls. 

Below are two charts, showing the results of these polls.  The first chart is expressed as a percentage of all registered voters.  The x-axis is the percentage of registered voters in their sample.  The left-most point on each line is the poll result.  The right-most point shows what happened to the result  when they polled all registered voters from their sample.  The second chart is as a percentage of the voting age population.  Where the number of registered voters in the sample was given by the polling company, it is shown as a percentage of all adults contacted in the sample.

poll_chart_rv2.jpg

poll_chart_vap2.jpg 

I wanted to see if the slope of each line was similar enough to use for projection.  Admittedly, this is a small sample, but all slopes were negative and ranged between -0.09 and -0.35, which was similar enough for me to use for the purposes of this estimate.  The average of the five slopes was -0.220, meaning that for every one percentage point of unlikely voters included in a sample, Mitt Romney’s lead decreased by 0.22 points.
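Each slope is just a straight line through two points from the same poll: the likely-voter result and the result for the wider sample.  A minimal sketch of that calculation follows.  The function name `turnout_slope` and the example poll are hypothetical; the real inputs are in the table and charts above and are not reproduced here.

```python
def turnout_slope(lv_share, lv_romney_lead, wider_share, wider_romney_lead):
    """Change in Romney's lead (points) per one-point increase in the share
    of the wider sample that is counted in the result.

    lv_share / wider_share: percent of registered voters (or of adults)
    included at each of the two points on the line.
    """
    if wider_share == lv_share:
        raise ValueError("the two points must differ in sample share")
    return (wider_romney_lead - lv_romney_lead) / (wider_share - lv_share)


# Hypothetical poll, numbers invented for illustration only:
# likely voters are 80% of registered voters and Romney leads by 1;
# among all registered voters (100%) Romney trails by 3.
example_slope = turnout_slope(80.0, 1.0, 100.0, -3.0)
print(f"slope = {example_slope:.2f} points of lead per point of sample share")
```

A negative slope means that loosening the likely voter screen costs Romney support, which is the pattern all of the lines showed.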

Knowing the slope of our estimate of the influence of oversampling unlikely voters, all I needed to do was to figure out where to place the line and project what turnout is likely to be.  I chose to calculate turnout as a percentage of VAP.  That is a less subjective measure, as the number of registrations, and thus the size of the denominator, often fluctuates a great deal.  (In Cleveland’s Cuyahoga County, for example, the purging of old data scrubbed 200 thousand names from voter rolls in the last four years, even though Cleveland’s population didn’t fall nearly that much.)  Turnout is usually in the mid 50s, and since 1972, has never exceeded 62.6% of the VAP.  I have chosen to use an estimate of 60 percent.

As for where to place the line, I began with the RCP average* of Obama over Romney by 48.3 to 47.8 percent.  And I decided to place that at 70% on the VAP scale, since our three known data points on that scale are between 68.6 and 73.2 percent turnout.  That means that an oversample of 10% unlikely voters in our samples gives Barack Obama an advantage of 2.2 points above where he would sit if turnout is 60% instead of 70%.  Taking 1.1 points away from Obama and giving 1.1 points to Romney gives us a new estimate of Romney over Obama by 48.9% to 47.2%.

Finally, to account for 3.9% that are undecideds or others, I just split them proportionately while leaving 1% for other candidates.  That gives us a final prediction of Romney over Obama by 50.4% to 48.6%.  In other words, I am projecting a result similar to Scenario 3, giving Mitt Romney approximately 295 electoral votes.
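The whole chain of arithmetic, from the RCP average to the final prediction, can be reproduced with the sketch below.  It uses only numbers stated in the text (the -0.22 slope, the 70% and 60% turnout levels, and the 1% reserved for other candidates); the function name `project` and its parameter names are illustrative choices, and splitting the lead shift evenly between the two candidates follows the 1.1-and-1.1 step described above.

```python
def project(obama, romney, poll_turnout, expected_turnout,
            slope=-0.22, other=1.0):
    """Shift a poll average to a different turnout level, then allocate
    undecideds proportionately.  All inputs are percentages."""
    # Change in Romney's lead: a negative slope times a drop in turnout
    # is a gain for Romney.
    lead_shift = slope * (expected_turnout - poll_turnout)
    obama_adj = obama - lead_shift / 2
    romney_adj = romney + lead_shift / 2
    # What remains after reserving a share for other candidates is split
    # in proportion to each candidate's adjusted support.
    undecided = 100 - obama_adj - romney_adj - other
    two_party = obama_adj + romney_adj
    obama_final = obama_adj + undecided * obama_adj / two_party
    romney_final = romney_adj + undecided * romney_adj / two_party
    return obama_final, romney_final


obama_60, romney_60 = project(48.3, 47.8, poll_turnout=70, expected_turnout=60)
print(f"Romney {romney_60:.1f}, Obama {obama_60:.1f}")
```

Calling the same function with `expected_turnout=55` widens Romney’s margin by a bit more than one additional point under the same assumptions.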

Let me make some caveats: 

– I’m not sure that there is going to be any effect on the election from Hurricane Sandy.  If there is, it is likely to be contained to two states that I don’t expect to be in play (NJ and NY).  I do expect that there has been an effect on polling, but having no means of predicting what it is, I’ve ignored it.

– I’m not really certain that there is a last-minute surge to Obama that some are seeing.  It would be an anomaly if there was a move in the incumbent’s direction.  Instead, I think that we’re seeing people automatically included in the likely voter pool because they said that they have already voted, when in fact they did not.  Routinely I’m seeing polls reporting early voters above what actual numbers from secretaries of state are indicating.  Part of this is the social acceptability bias that causes people to say that they are going to vote, but then they don’t.  Part of this is a respondent saying that he has already voted in the hope that the guy on the other end of the line with yet another political call will hang up.  Even before the existence of large numbers of early voters, predicting turnout was always difficult.  Predicting it now in the midst of early voting is even more difficult. 

– If there is a last-minute break to the challenger beyond the proportionate breakout of undecideds that I’ve assumed, then expect Scenario 4.  We’ll know that when we see Ohio’s returns tomorrow.  I hope to do a detailed analysis of the Buckeye State describing what to look for in order to get a sense of how things are breaking.  (UPDATE: posted here)  Bottom line:  if Ohio breaks hard one way or another: expect somewhere between Scenario 1 and Scenario 2 if it goes hard for Obama early, or Scenario 4 (or even Scenario 5) if it goes Romney’s way.

– If turnout falls to the recent historical average of about 55%, this model would give Mitt Romney another one-point advantage.

– As for the individual states:  I fully expect that, because of his investment in the Buckeye State, Barack Obama does better in Ohio than he does nationally.  The same happened with John McCain when the entire nation shifted about ten points from four years before, but Ohio only moved about 7 points in the Democratic direction.  With a popular vote win of slightly under 2 points, it is not inconceivable that Mitt Romney could still lose Ohio.  However, by winning by that much nationally, he will have put away Colorado, Florida, and Virginia, while Iowa, New Hampshire, Pennsylvania, Wisconsin, and even Michigan will be teetering on the edge of tilting red.  That’s too much territory for Obama to defend and to expect them all to go his way.  (Thanks to alert reader Trent Telenko for bringing this to my attention; it may explain why polling in Michigan indicates that the Wolverine State appears more red than its PVI would lead us to believe.)

– Finally, if I had to give you my margin of error on the spread of 1.8 points, I’d swag it at +/- 2 points.  In other words, somewhere between a modified Scenario 2 (an Obama squeaker) and a version of Scenario 4 (a solid Romney win).  And yes, if you’re keeping score, that means that I’m already projecting these states:  Florida and North Carolina for Romney (minimum 235 EV), and Connecticut, Maine (less the 2nd district), New Jersey, New Mexico, and Oregon for Obama (minimum 190 EV).  Only 113 electoral votes are still in play (bluest to reddest: Nevada, Minnesota, Michigan, Maine 2nd, Pennsylvania, Wisconsin, Iowa, Ohio, New Hampshire, Virginia, and Colorado).

– If you’re wondering why I’m so confident about Florida:  Early voting is not going well there for Obama.  Not at all.  By the same token I could probably chance a call on Colorado, but not quite.

* The top line on the Pew poll was Obama over Romney 50-47.  However, the poll result was 47-45; they then allocated undecideds to arrive at their projected result.  I used their raw numbers without undecideds allocated.  All numbers are calculated from the RCP average at approximately 1800 Central Monday 5 November.  With the change in the Pew poll, the RCP average at that time was Obama over Romney 48.3 to 47.8 percent.


The scenarios: 5-Gallup electorate poll is right

Byline: | Category: 2012 | Posted at: Monday, 5 November 2012

(With just one day left before tomorrow’s election, I foresee five possible scenarios.  Each day leading up to Election Day, we have explored one of the scenarios.  This post is the fifth and last installment of the series.  Tomorrow morning we will predict which one will be the result.)

In a sense, Nate Silver was right all along.  The New York Times prognosticator assured us that elections were the sum of their parts.  It’s just that he focused on the wrong parts:  instead of looking at state races, he should have watched the underlying issues and demographics that were pulling Obama down.  The clues were out there:  13% of the President’s 2008 voters in one comprehensive poll defected the Republican way.  And there wasn’t much give back from the other side.  The “undertow” effect is what some people called it.  But after this performance, most just called it (and by extension, President Obama) “the Big Suck.”

Those expressing enthusiastic support for President Obama had rarely been even within ten points of their enthusiastic opposites who expressed strong disapproval for the incumbent.  With 90% of his base locked up as “broken-glass voters,” Mitt Romney was free to fight for the vital middle ground as early as the first debate.  There he increased his lead.  Almost every poll leading up to the election found him winning independents by between 6 and 20 points.  As Dan McLaughlin noted a week before the race:

“If you averaged Obama’s standing in all the internals, you’d capture a profile of a candidate that looks an awful lot like a whole lot of people who have gone down to defeat in the past, and nearly nobody who has won.”

Meanwhile, in what should have sent up alarm bells across the land, Barack Obama’s campaign was diminished to arguing that it would win on the strength of its get-out-the-vote efforts with youth and minorities.  Second to mentioning Harry Truman, this is the last refuge of the losing side.  Almost never is depending on sporadic voters a recipe for success.  Even in 2008, Obama’s leads with those demographics only padded the margin he had already won because of his eight-point lead with independents.  Falling precipitously from his earlier indy numbers, he and his acolytes should have known that calamity lay ahead.

Most pollsters took a beating in predicting the race.  But that should have been expected too.  The last two times that a Republican challenged a Democratic incumbent (1996 and 1980) the polls overestimated Democratic support by 5.1 and 7.2 points.  And ’96 was not even in bad economic times.  

The biggest gap was not one of gender, but between the opinions of those who voted and those who did not.  Those most likely to vote–whites, property owners, investor class, church-goers, married, and the elderly–despised where Obama had taken the country by margins of sometimes more than 20 points.   Singles, minorities, and the youth still backed the President, but with diminished support and turnout from four years before.

The entire outcome hinged on who turned out to vote, and when only nine percent of the electorate wished to talk to pollsters, projecting who that was likely to be became a fool’s errand.  But statisticians are not above being fooled.  A look at any one of those demographics–even the ones that still supported him–would have shown the President’s downward shift from four years before.  But it was too easy to weight bad samples toward a guesstimate of a turnout model instead.  Only Gallup got it right, and the result was nasty for Democrats.  They went from a 12-point advantage in party identification among likely voters in 2008, to a one-point disadvantage in just four years.

As for the outcome:  Romney by 53 to 46.  Wave or undertow, it didn’t matter.  Barack Obama won only 11 states and DC.  They were the few lone blue islands awash in a red sea.  The effect downticket was just as bad for Democrats.  They returned to the Senate holding only 43 seats (including 2 Independents).  Bob Menendez, weighed down by a prostitution scandal, was further hamstrung by a Democratic electorate that, in the wake of Hurricane Sandy, had things to do other than vote while, fooled by the pollsters, they complacently expected a Democratic win.  In the House, the GOP returned to Washington with gains and a total of 250 seats.

Gallup was right.  A week before the race, early voter polling and actual early voter numbers showed a massive plunge in Democratic support–and this on top of Democratic expectations that they would lose on Election Day but win enough early votes to carry the election.  The fall was so dramatic that after the election, Democrats openly contemplated the notion that it was their ideas that were out of touch.

scenario_5.jpg

See all the scenarios:

Scenario 1:  Nate Silver is right

Scenario 2:  RCP is right

Scenario 3:  Rasmussen is right

Scenario 4:  Gallup tracking poll is right

Scenario 5:  Gallup electorate poll is right

And the prediction is . . .


Mind the gap

Byline: | Category: 2012 | Posted at: Sunday, 4 November 2012

The Pew organization has just posted its final poll before the election.  It shows an Obama lead of 50-47. 

Here is one astounding result from the Pew poll that I think best explains why it is so difficult for polls to project this race:   Among those not planning to vote, Barack Obama leads 65-23, a margin of 42 points (Q: 8a).  With likely voters, this same poll has Obama up by only 3.  That’s a gap of 39 points, and therefore where a polling company draws the line between those who will vote and those who will not is the single greatest determinant of the poll’s result.

(BTW, I had already written this gap into my fifth scenario that I will post tomorrow since I had already determined that there was a gap; I just didn’t know how large that gap really was until today.  You can see all five scenarios here.)

Looking at Obama’s enormous lead with this demographic, I can certainly understand why he has focused his campaign on turning out the vote instead of winning over the independents.  But will it work?

Here is a reason for the Obama campaign to be concerned:  of the registered voters interviewed for this poll, 97% of them said that they have already voted or will do so (Q: PLAN1).  That would make 2012 one of the highest turnout elections in the history of the United States.  Undoubtedly, some of that 97% is lying.  If those not telling the truth are more like their politically inactive but honest counterparts who support Obama by 42 points, then their absence in the voting booth will leave Barack Obama a one-term president.
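To see how sensitive the topline is to that “if,” here is a rough sketch.  It assumes that some share of the likely voter sample will not actually vote, and that those hidden non-voters split like the admitted non-voters (Obama +42).  The 5% and 10% shares are hypothetical, picked only to show the direction and size of the effect; the function name `voter_only_margin` is an illustrative choice.  Margins are Obama minus Romney, in points.

```python
def voter_only_margin(reported_margin, nonvoter_margin, nonvoter_share):
    """Back out the margin among people who really vote, given that a
    fraction of the 'likely voter' sample will stay home."""
    if not 0 <= nonvoter_share < 1:
        raise ValueError("nonvoter_share must be in [0, 1)")
    voter_share = 1 - nonvoter_share
    # reported = voter_share * true_margin + nonvoter_share * nonvoter_margin
    return (reported_margin - nonvoter_share * nonvoter_margin) / voter_share


for share in (0.0, 0.05, 0.10):  # hypothetical shares of hidden non-voters
    margin = voter_only_margin(3, 42, share)
    print(f"{share:.0%} hidden non-voters -> Obama margin {margin:+.1f}")
```

Under these assumptions, hidden non-voters making up about 7% of the likely voter sample would be enough to erase a three-point lead.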


The scenarios: 4-Gallup tracking poll is right

Byline: | Category: 2012 | Posted at: Sunday, 4 November 2012

(With just two days left before the election, I foresee five possible scenarios.  Each day leading up to Election Day, we’re going to explore one of the scenarios.  This post is the fourth installment of the series.)

In a karmic performance, Barack Obama loses the election with 47% of the vote.

For two weeks Gallup predicted this outcome.  For two weeks that prediction received scoffs in return from the left.  Never before in the history of the nation’s most venerable polling outfit had a presidential candidate led for so long and so strong so late in the race and still lost.  Aside from a few days when Rasmussen showed the race at its most open, Gallup sat alone out on a limb.

In the end, it wasn’t close.  A five point national victory was enough to sweep every swing state Romney’s way.  The margin was large enough–a twelve-point swing since 2008 in the Republican direction–that two of the three “reach” states (Democrats should never have disregarded the Great Lake State) voted for Romney, as did a lone elector in Maine.  The margin of victory in Nevada was enough even to overcome the margin of fraud in Clark County.  Sixteen states went to Obama.  It could have been worse.  Four of his states were within a point of going the other way.

Almost every demographic voted for Obama in smaller percentages.  Even those that still supported him in high percentages–blacks, Hispanics, and youth–didn’t turn out to vote.  And because they didn’t turn out to vote, the effect on Democrats downstream was wide and deep.  Incumbents Sherrod Brown and Bob Casey went down to defeat.  The Senate ends up in Republican hands with 53 seats.  The GOP even picks up a net of a few seats in the House to bring them to 245.

Recriminations against Obama for his underwhelming performance begin right away.  “Lazy” isn’t racist, apparently, when liberals cudgel the President with the adjective to describe how he approached the end of this race.  For to risk being called racist is better than to admit that their ideas had . . .  They can’t even formulate the word in their heads.  Must. Not. Allow.

Final result:  Romney over Obama 52 to 47, while the Republican candidate secures 328 electoral votes.

scenario_4.jpg



We remember Reagan

Byline: | Category: 2012, Culture | Posted at: Saturday, 3 November 2012

One of the more interesting graphical representations of poll results is produced by the ABC/Washington Post daily tracking poll.  It shows the level of support for the President among various demographics and compares that with where it stood four years ago. 

abc_wp_graph.jpg

Not surprisingly, support for President Obama across virtually every demographic is down from four years before.  But there is one demographic that stands out with the sharpest decline–and it is mine.

Americans age 40-49 went from a 49-49 split in 2008 to give Romney a 59-38 advantage today.*  Not only is this 21-point swing the largest on the chart, it makes this age cohort the most conservative of any chronological group.

People in their 40s today were born between 1963 and 1972 and we are a product of that age.  Our childhood political memories include gas lines, inflation, long recessions, unemployment, a weak American military, and a fearsome USSR.  Those were the issues of the late 1970s, when we were between 7 and 17 years old.  Along with polaroids of ourselves in plaid polyester and set to a soundtrack of disco (undoubtedly played on an 8-track with an annoying “jump” in the middle of a song), they weren’t very memorable years.

In our adolescence we had Ataris and Apples, music on CDs, improving economies, stable gas prices, a strong dollar for overseas travel, the death of the Soviet Union, and jobs when we finished college.  Dressed in khakis and Izods, our lives under Reagan were significantly better and that was when we came of age.

Consider also what we didn’t have because we either weren’t born or we were too young:  the Tet Offensive, Watergate, Martin Luther King’s assassination, urban riots, and discrimination against women and blacks when they searched for jobs.  We are more conservative than both our elders who remember the injustice of the 60s and our juniors who never knew how bad Carter was. 

But it is not just our cohort that makes us conservative; it is our age.  We are in our forties:  too young to be old and too old to be young. We are too young to be near enough to Social Security to hope that we can preserve it and too far away from it to think that we can.  We are entering our years of peak earnings but we fear that our peaks are in the rear.  Our children are bound for college and we wonder how we’ll pay the bills.  We are too young for the complacency of nostalgia and too old to live without care. 

It is no wonder that we favor Mitt Romney.  We have seen better days and worse ones and we know that we can choose which ones to have.  We are young enough still to look forward, but not so young that we live only day to day.  We are old enough to have acquired some wisdom but not so old to have given some of it back.  And we see dismal echoes of Carter in Obama and we remember Reagan ending those years. 

* NOTE:  The cohort of 40-49 year olds today is not exactly the same people as four years ago; only those born between 1963 and 1968 exist in the 40-49 age group in both years.


The scenarios: 3-Rasmussen is right

Byline: | Category: 2012 | Posted at: Saturday, 3 November 2012

(With three days left before the election, I foresee five possible scenarios.  Each day leading up to Election Day, we’re going to explore one of the scenarios.  This post is the third installment of the series.)

If this scenario comes to pass, it shouldn’t really be a surprise.  Not since 1936 has an incumbent’s attempt at re-election succeeded against the scale of the job losses we’ve seen these last four years.  Abroad, America has struggled as well.  While the war in Iraq is over, in Afghanistan it lingers on with no prospect for success in sight.  The Middle East is in full-scale collapse, while economic woes in Europe and China risk dragging us deeper down.

In light of all that and more, it was a testament to how well liked Barack Obama is that he could keep this race as close as it was.  Or was it?  After the election, exit polls reveal that women and Jews weren’t as enamored with abortion as they were of jobs.  Coal miners and electricians didn’t care as much about the collective bargaining rights of government employees as they cared about jobs.  Hispanics cared less about immigration reform if the man promising to bring the reform couldn’t let them stay in a country with jobs.  Unemployed recent college graduates didn’t care as much about being cool as they cared about something as banal as jobs.  Upon closer inspection of turnout data, the election’s results are really a testament to how well liked Obama is by black Americans that he was able to keep his numbers as high as they were.  In spite of bearing the greatest brunt of the recession, Obama lost not a bit of their vote.  In every other demographic, support and enthusiasm for the President were markedly down.

In the end, Ohio didn’t matter.  Yes, Mitt Romney won it, but with wins in Colorado, Iowa, New Hampshire, Virginia, and Wisconsin, Ohio was just the Buckeye-flavored frosting on Romney’s already baked election day cake.  Even Mondale’s Minnesota teetered on the edge of breaking the GOP’s way.  Alone among most lists of swing states, Nevada chose Obama.  Final Results:  Romney wins 51% to 48% and takes 295 electoral votes.

scenario_3.jpg

The President’s poorer-than-expected performance lost his party the Senate as well.  Virginia’s George Allen returns after a six-year absence.  Bill Nelson can’t resist the tide.  The Upper House ends up 51-49 while the Lower House sees only two net Republicans go away, leaving it 240-195.



State numbers that should give you some doubt

Byline: | Category: 2012 | Posted at: Friday, 2 November 2012

What follows is a departure from my usual skepticism regarding state polling, but if you follow me to the end, I think that you’ll understand why.

Real Clear Politics compiled poll results today from the three states below that are not on anyone’s list of a contested race.

swings.jpg 

In comparison with how he did four years ago, in two of those states Obama is losing support by double-digit amounts.  In the rough average of our small sample, he has dropped 9.  Keep in mind that four years ago he won by a little over 7 points, so a drop of 9 nationally puts him a point or two behind Romney.

So what?  The outcome in none of these three states is in doubt.  (Actually, there has been some parrying over Nebraska’s 2nd District’s electoral vote, and that may explain the lack of a shift in the Cornhusker State from four years before.)  That’s true, but as I learned four years ago, when the swing is big enough, it hits states that are off of most people’s lists.  We saw that in 2008 when Barack Obama opened up enough of a lead nationally to pick up Indiana, North Carolina, and Virginia–states that weren’t on earlier target lists.  If the swing outside the swing states is about 9 points, then Minnesota (Obama 2008 by 10.2%) and Pennsylvania (Obama 2008 by 10.3%) are well within reach.
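For anyone who wants to check the arithmetic, here is a rough sketch in Python.  It assumes the swing is uniform (the same 9 points everywhere), which is a simplification, and it rounds the 2008 national margin of "a little over 7 points" down to 7.0.  The function name is made up for illustration.

```python
# Uniform-swing sketch: subtract the same swing from each 2008 margin.
# Margins are Obama's 2008 winning margins in points, as quoted in the post.
# ASSUMPTION: the national margin is rounded to 7.0 ("a little over 7").
MARGINS_2008 = {"National": 7.0, "Minnesota": 10.2, "Pennsylvania": 10.3}


def apply_uniform_swing(margins, swing):
    """Return projected Obama margins after a uniform swing toward Romney."""
    return {place: round(margin - swing, 1) for place, margin in margins.items()}


projected = apply_uniform_swing(MARGINS_2008, swing=9.0)
for place, margin in projected.items():
    leader = "Obama" if margin > 0 else "Romney" if margin < 0 else "Tie"
    print(f"{place}: {leader} by {abs(margin):.1f}")
```

Under those assumptions the national race flips to Romney by about two points, while Minnesota and Pennsylvania stay with Obama by only a point and change, which is well inside a typical poll's margin of error.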

Which brings me to the Keystone State.  For a week, Democrats have said that Romney’s late move there is either a head fake or desperation, while Republicans say that they think that they have a realistic shot.  I tend to the latter view, but even if it’s just a small chance, why not?  Romney has no shortage of money the way McCain did when he was outspent $600 million to $80 million at the end.  The report I heard the other day is that the campaigns and their proxies have spent a combined total of $160 million in Ohio alone.  What difference will another two or even twenty million make at this late date?  I contend that in the swing states, both candidates have reached the point of diminishing marginal returns.  One more phone call, one more mailer, one more ad is just going to piss people off.  If it is trench warfare in the Buckeye State, why send more troops to Maginot when they can march unscathed through the Ardennes?

Finally, if I am such a skeptic toward state polling data done by small-track-record firms, why do I cherry-pick this data to bolster my case?  Actually, I don’t.  But if you are one of those who has built your case on numbers of this type, this latest polling ought to give you some doubt.


Why you don’t weight for party

Byline: | Category: 2012, Culture | Posted at: Friday, 2 November 2012

In 2008 I was running a few monthly polls in Iraq.  One of the polls was a huge survey with many thousands of respondents so that we could dig down deep to the provincial level and below.  The contractor was Gallup.

I was engaged in a running commentary with Gallup’s analysts about how they weighted their sample results.  They weighted for factors like age, sex, and urbanicity.  But I was concerned that they didn’t weight for the one characteristic likely to produce the most divergent answers throughout the rest of the poll:  religious sect.  I argued that whether one was Sunni or Shia wasn’t something that changed from time to time, and that therefore, we could weight samples to account for it.

I was wrong.

A few months later, one of the provinces, which had always reported a Sunni percentage of the population that was within the margin of error of zero percent, all of a sudden jumped to about 15% Sunni.  Something was wrong with the poll!  Then the next month it happened again.  And again.

There was nothing wrong with the poll.  It was simply a reflection of a change in the security situation on the ground.  Iraqi government forces had crushed the local Shia mafia and the Sunni who lived among them were no longer afraid.  Prior to then, a social desirability bias was in place, which caused Sunni respondents to tell pollsters that they were of the other sect.  (There was also the cultural effect of taqiya, which worked as a form of political correctness.)

Three months ago Sean Trende explored a similar issue when he explained why it was wrong to weight for party ID:

The problem is that party identification is not an immutable characteristic, such as race, age, or gender. It fluctuates. Even the wording of the question can elicit very different answers. 

Trende notes that party ID isn’t usually a factor as long as it doesn’t fluctuate wildly from poll to poll when conducted by the same organization. 

I would broaden that to include what I learned from my example above.  When there is a big divergence in party preference, it should make you go “hmmmm.”  Changes may be a result of exogenous factors.  Republican-leaning independents in Missouri, for example, might not have been so inclined to identify with the GOP after Todd Akin made his idiotic remarks.  Change may also be indicative of a bad sample overly filled with unlikely voters (particularly if unlikely voters are more likely to identify with one side).  And if one polling company continuously reports different party identification numbers from another, seeing that difference, rather than disguising it through weighting, immediately alerts us to the different methodologies and assumptions employed by the different organizations.

If polling organizations weighted for party ID, we would have fewer clues that something is weird with a poll or that things might have changed on the ground.  So while you should watch the party breakdown number, polling companies are right not to weight for it.
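A small sketch in Python shows how weighting to a fixed party target erases the clue.  Everything here is hypothetical: the 100-respondent sample, the candidates "A" and "B", and the 35/35/30 party target are all made-up numbers, and the weighting is a bare-bones version of post-stratification (target share divided by observed share), not any particular pollster's method.

```python
from collections import Counter

# Hypothetical sample of 100 respondents as (party, vote) pairs; all numbers are made up.
SAMPLE = (
    [("Dem", "A")] * 40
    + [("Rep", "B")] * 28
    + [("Ind", "A")] * 16
    + [("Ind", "B")] * 16
)

# ASSUMPTION: a fixed party-ID target that a pollster might weight to (also made up).
TARGET = {"Dem": 0.35, "Rep": 0.35, "Ind": 0.30}


def party_shares(sample, weights=None):
    """Share of each party in the sample, optionally weighted."""
    weights = weights or [1.0] * len(sample)
    totals = Counter()
    for (party, _), w in zip(sample, weights):
        totals[party] += w
    grand_total = sum(totals.values())
    return {party: total / grand_total for party, total in totals.items()}


def weights_for_target(sample, target):
    """One weight per respondent: target party share / observed party share."""
    observed = party_shares(sample)
    return [target[party] / observed[party] for party, _ in sample]


def candidate_share(sample, candidate, weights=None):
    """Share of the (optionally weighted) sample backing the given candidate."""
    weights = weights or [1.0] * len(sample)
    backing = sum(w for (_, vote), w in zip(sample, weights) if vote == candidate)
    return backing / sum(weights)


weights = weights_for_target(SAMPLE, TARGET)
raw_a = candidate_share(SAMPLE, "A")
weighted_a = candidate_share(SAMPLE, "A", weights)
print("Raw party shares:     ", party_shares(SAMPLE))
print("Weighted party shares:", party_shares(SAMPLE, weights))
print(f"Candidate A: raw {raw_a:.1%}, weighted {weighted_a:.1%}")
```

In this made-up sample the raw numbers show a 40/28/32 party split and candidate A at 56%.  After weighting, the party split is forced back to 35/35/30 and A drops to 50%.  Whether the raw split reflects a real shift or a bad sample, the weighted release would hide the very number that should have made you go “hmmmm.”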


What if the polls are really wrong?

Byline: | Category: 2012, Culture | Posted at: Friday, 2 November 2012

Without getting actual location data from telephone companies or sampling cellphones nationwide, pollsters are forced to ignore people who move into a given state with an out-of-state cell number. Consider the manifestly unrepresentative sample that resides in my cellphone. Of the roughly 100 cellphone numbers I have saved, 49% are owned by people living (and presumably voting) in states that do not match their cellphone’s area code.

Back in the old days of reliable landline polling, area codes didn’t cross state boundaries, which made predicting state races relatively simple.  But when it came to predicting House races, the patchwork of area codes made for complications that could skew a poll’s result.

We may have a similar phenomenon at the state level today.  The above quotation comes from Dan Hopkins, who wondered a couple of weeks ago about the geographic reliability of cell phone numbers, since area codes effectively no longer stop at state lines.  Undoubtedly this situation exists.  However, it only affects polling if the population of out-of-state cell phone users is politically different from the population of geographically consistent cell phone users.  Not knowing who they are or how to contact them, we simply don’t know.
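The point about political difference can be put in a few lines of Python.  This is only a sketch with hypothetical inputs: the 10% excluded share and the support levels are invented, and the function name is mine.  The error from polling only the reachable group is the reachable group's support minus the true support of the whole population.

```python
def coverage_error(excluded_share, support_reached, support_excluded):
    """Poll estimate minus true support when only the reachable group is polled."""
    true_support = (
        (1 - excluded_share) * support_reached
        + excluded_share * support_excluded
    )
    return support_reached - true_support


# ASSUMPTION: 10% of a state's cell phone users carry out-of-state numbers (made up).
same_politics = coverage_error(0.10, support_reached=0.50, support_excluded=0.50)
different_politics = coverage_error(0.10, support_reached=0.50, support_excluded=0.60)
print(f"Same politics:      {same_politics:+.3f}")
print(f"Different politics: {different_politics:+.3f}")
```

If the missing voters look like everyone else, the error is zero no matter how many of them there are.  If they differ by ten points and make up a tenth of the population, the poll is off by a point, and the error grows as either number grows.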

Ten years ago, even a Luddite like me had a Tennessee area code but lived in Virginia.  As more households go wireless and as labor portability continues to spread, this polling problem is likely to increase from year to year.

I wonder if some day, in order to get a representative sample, we’ll go back to actual in-person polling?  Or perhaps we’ll go the full Rasmussen and use only internet polls?
