I am in Cork, Ireland, attending the Irish Conference on Artificial Intelligence and Cognitive Science (I gave a talk on sports scheduling and three themes of modern integer programming: complicated variables, large scale local search, and logical Benders constraints). Conversation here is on the US Presidential Election, at least when an American is in the group (presumably, without an American, conversation is about hurling or something). Some of the historical anomalies are a bit confusing. Why is it only now that Barack Obama “accepts” the nomination from the Democratic Party: shouldn’t he have decided on this long, long ago? What if he didn’t accept the nomination?
The most confusing aspect of the election process is our use of the Electoral College to elect the President. Rather than directly electing the President, voters vote for electors, with each state being given a set number of electors. For most states, all of the state’s electors are given over to just one candidate. This makes interpreting the polls quite difficult. One recent poll had Obama (the now-nominee of the Democrats) and McCain (the presumptive Republican) tied at 47% support each. A natural leap was to then assume that the election is a toss-up. But it is really the distribution of support that counts. It is possible to win the election for President of the United States with far less than 0.001% of the vote. For instance, suppose only one voter shows up in each of 49 states, and those voters vote for Obama, while 10,000,000 Republicans vote for McCain in New York. Then Obama would lose the national popular vote 10,000,000 to 49 (about 0.0005% of the vote) but he would have an overwhelming majority in the Electoral College. While the results would never be that extreme, it is certainly possible (and has happened) to win the national popular vote and lose the electoral vote.
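To make the arithmetic in that extreme example concrete, here is a quick sketch in Python. The elector counts are assumptions added for illustration: 538 electors in total, 31 for New York (its 2008 allocation), and the District of Columbia's 3 left unassigned, since the example only mentions states.

```python
# Hypothetical scenario: one Obama voter in each of 49 states,
# ten million McCain voters in New York, nobody else votes.
TOTAL_ELECTORS = 538
NY_ELECTORS = 31   # assumption: New York's allocation in 2008
DC_ELECTORS = 3    # assumption: left unassigned, the example covers states only
NEEDED_TO_WIN = 270

obama_votes = 49
mccain_votes = 10_000_000

# Winner-take-all in every state (ignoring the Maine/Nebraska district rule).
obama_electors = TOTAL_ELECTORS - NY_ELECTORS - DC_ELECTORS
mccain_electors = NY_ELECTORS

obama_share = obama_votes / (obama_votes + mccain_votes)

print(f"Obama popular vote share: {obama_share:.6%}")
print(f"Electors: Obama {obama_electors}, McCain {mccain_electors}")
print(f"Obama wins the Electoral College: {obama_electors >= NEEDED_TO_WIN}")
```

Under these assumptions Obama ends up with 504 electors on roughly 0.0005% of the popular vote.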
Interpreting polls gets more complicated when you try to address the uncertainties in the polls. For instance, the 47% results above are only for those in the survey who had a preference. There are a huge number of “undecided” voters who do not yet have a preference. How should they be handled as we try to figure out who is ahead (I hate this idea of elections as a “horse race”, but if the media is going to see it as a race, they could at least accurately represent the real race)?
Sheldon Jacobson (University of Illinois), Steven Rigdon, and Ed Sewell (both of Southern Illinois University Edwardsville) are addressing this issue by taking the current poll data and determining the probability of winning the election for each candidate. They have a fascinating website that is being constantly updated.
It is worthwhile to read their methodology section.
The mathematical model employs Bayesian estimators that use available state poll results (at present, this is being taken from Rasmussen, SurveyUSA, and Quinnipiac, among others) to determine the probability that each candidate will win each of the states. These state-by-state probabilities are then used in a dynamic programming algorithm to determine a probability distribution for the number of Electoral College votes that each candidate will win in the 2008 presidential election.
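The dynamic programming step is easy to sketch. What follows is only a minimal illustration in Python, not the authors' code: it assumes each state is won or lost independently with a given probability and is winner-take-all, and the function names and the toy states with their probabilities are made up for the example. The Bayesian estimation of the per-state probabilities is not shown.

```python
def electoral_vote_distribution(states):
    """Return dist where dist[k] is the probability of winning exactly k electors.

    states: list of (electoral_votes, win_probability) pairs.
    Assumes states are independent and winner-take-all (an assumption
    of this sketch, not a claim about the authors' model).
    """
    total = sum(ev for ev, _ in states)
    dist = [0.0] * (total + 1)
    dist[0] = 1.0  # before any state is counted, we surely hold 0 electors
    for ev, p in states:
        new = [0.0] * (total + 1)
        for k, prob in enumerate(dist):
            if prob == 0.0:
                continue
            new[k] += prob * (1.0 - p)   # candidate loses this state
            new[k + ev] += prob * p      # candidate wins this state
        dist = new
    return dist


def win_probability(dist, needed):
    """Probability of winning at least `needed` electors."""
    return sum(dist[needed:])


# Toy example with three hypothetical states (not real poll numbers).
toy_states = [(55, 0.9), (34, 0.2), (27, 0.5)]
toy_dist = electoral_vote_distribution(toy_states)
toy_total = sum(ev for ev, _ in toy_states)
print(win_probability(toy_dist, toy_total // 2 + 1))
```

With the real map one would pass in all the states plus the District of Columbia and ask for at least 270 electors. The work grows with the number of states times the number of electors, rather than with the 2^51 possible win/loss combinations.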
There is a full paper by the above authors along with Christopher Rigdon.
They point out a few limitations of their approach. Of course, the results are only as good as the poll data: if the poll data is off, then their results are meaningless. Further, they are not (currently) treating Maine and Nebraska correctly: those two states award one elector to the winner of each congressional district and two to the statewide winner, while every other state is all-or-nothing.
Currently, they have Barack Obama with an 89% chance of winning, which is pretty high, but down from the 96% chance they had him at on July 31.