The Appeal of Operations Research and Sports

For a more recent comment on MLB scheduling and the Stephensons see my response to the 30 for 30 video.

The relationship between operations research and sports is one topic that I return to often on this site.  This is not surprising:  I am co-owner of a small sports scheduling company that provides schedules to Major League Baseball and their umpires, to many college-level conferences, and even to my local kids’ soccer league.  Sports has also been a big part of my research career.  Checking my vita, I see that about 30% of my journal or book chapter papers are on sports or games, and almost 50% of my competitive conference publications are in those fields.  Twenty years ago, my advisor, Don Ratliff, when looking over my somewhat eclectic vita at the time (everything from polymatroidal flows to voting systems to optimization implementation) told me that while it was great to work in lots of fields, it is important to be known for something.  To the extent that I am known for something at this point, it is either for online stuff like this blog and or-exchange, or for my part in the great increase of operations research in sports, and sports scheduling in particular.

This started, as most things in life do, by accident.  I was talking to one of my MBA students after class (I was younger then, and childless, so I generally took my class out for drinks a couple times a semester after class) and it turned out he worked for the Pittsburgh Pirates (the local baseball team).  We started discussing how the baseball schedule was created, and I mentioned that I thought the operations research techniques I was teaching (like integer programming) might be useful in creating the schedule.  Next thing I know, I get a call from Doug Bureman, who had recently worked for the Pirates and was embarking on a consulting career.  Doug knew a lot about what Major League Baseball might look for in a schedule, and thought we would make a good team in putting together a schedule.  That was in 1996.  It took until 2005 for MLB to accept one of our schedules for play.  Why the wait?  It turned out that the incumbent schedulers, Henry and Holly Stephenson, were very good at what they did.  And, at the time, the understanding of how to create good schedules didn’t go much beyond work on minimizing breaks (consecutive home games or away games) in schedules, work done by de Werra and a few others.  Over the decade from 1996 to 2005, we learned what does and doesn’t work in sports scheduling, so we got better on the algorithmic side.  But even more important was the vast increase in speed in solving linear and integer programs.  Between improvements in codes like CPLEX and increases in the speed of computers, my models were solving millions of times faster in 2005 than they did in 1996.  So finally we were able to create very good schedules quickly and predictably.

In those intervening years, I didn’t spend all of my time on Major League Baseball of course.  I hooked up with George Nemhauser, and we scheduled Atlantic Coast Conference basketball for years.  George and I co-advised a great doctoral student, Kelly Easton, who worked with us after graduation and began doing more and more scheduling, particularly after we combined the baseball activities (with Doug) and the college stuff (with George).

After fifteen years, I still find the area of sports scheduling fascinating.  Patricia Randall, in a recent blog post (part of the INFORMS Monthly Blog Challenge, as is this post) addressed the question of why sports is such a popular area of application.  She points to the way many of us know at least something about sports:

I think the answer lies in the accessibility of the data and results of a sports application of OR. Often only a handful of people know enough about an OR problem to be able to fully understand the problem’s data and judge the quality of potential solutions. For instance, in an airline’s crew scheduling problem, few people may be able to look at a sequence of flights and immediately realize the sequence won’t work because it exceeds the crew’s available duty hours or the plane’s fuel capacity. The group of people who do have this expertise are probably heavily involved in the airline industry. It’s unlikely that an outsider could come in and immediately understand the intricacies  of the problem and its solution.

But many people, of all ages and occupations, are sports fans. They are familiar with the rules of various sports, the teams that comprise a professional league, and the major players or superstars. This working knowledge of sports makes it easier to understand the data that would go into an optimization model as well as analyze the solutions it produces.

I agree that this is a big reason for popularity. When I give a sports scheduling talk, I know I can simply put up the schedule of the local team, and the audience will be immediately engaged and interested in how it was put together. In fact, the hard part is to get people to stop talking about the schedule so I can get on talking about Benders’ approaches or large scale local search or whatever is the real content of my talk.

But let me add to Patricia’s comments: there are lots of reasons why sports is so popular in OR (or at least for me).

First, we shouldn’t ignore the fact that sports is big business. Forbes puts the value of the teams of Major League Baseball at over $15 billion, with the Yankees alone worth $1.7 billion. With values like that, it is not surprising that there is interest in using data to make better decisions. Lots of sports leagues around the world also have high economic effects, making the overall sports economy a significant part of the overall economy.

Second, there are a tremendous number of issues in sports, making it applicable and of interest to a wide variety of researchers. I do essentially all my work in scheduling, but there are lots of other areas of research. If you check out the MIT Sports Analytics conference, you can see the range of topics covered. By covering statistics, optimization, marketing, economics, competition and lots of other areas, sports can attract interest from a variety of perspectives, making it richer and more interesting.

A third reason that sports has a strong appeal, at least in my subarea of scheduling, is the close match between what can be solved and what needs to be solved. For some problems, we can solve far larger problems than would routinely occur in practice. An example of this might be the Traveling Salesman Problem. Are there real instances of the TSP that people want to solve to optimality that cannot be solved by Concorde? We have advanced so far in solving the problem that the vast majority of practical applications are now handled. Conversely, there are problems where our ability to solve problems is dwarfed by the size of problem that occurs in practice. We would like to understand, say, optimal poker play for Texas Hold’em (a game where each player works with seven cards, five of them in common with other players). Current research is on Rhode Island Hold’em, where there are three cards and strong limitations on betting strategy. We are a long way from optimal poker play.

Sports scheduling is right in the middle. A decade ago, my coauthors and I created a problem called the Traveling Tournament Problem. This problem abstracts out the key issues of baseball scheduling but provides instances of any size. The current state of the art can solve the 10 team instances to optimality, but cannot solve the 12 team instances. There are lots of sports scheduling problems where 10-20 teams is challenging. Many real sports leagues are, of course, also in the 10-20 team range. This confluence of theoretical challenge and practical interest clearly adds to the research enthusiasm in the area.

Finally, there is an immediacy and directness of sports scheduling that makes it personally rewarding. In much of what I do, waiting is a big aspect: I need to wait a year or two for a paper to be accepted, or for a research agenda to come to fruition. It is gratifying to see people play sports, whether it is my son in his kids’ soccer game, or Derek Jeter in Yankee Stadium, and know not only that they are there because programs on my computer told them to be, but that the time from scheduling to play is measured in months or weeks.

This entry is part of the March INFORMS Blog Challenge on Operations Research and Sports.

Learn Operations Research, Make Millions

Russ Ohlendorf received his bachelor’s degree in 2006 and now, a mere five years later, he will be making $2.025 million in 2011.  The degree was in operations research and financial engineering at Princeton.  It just goes to show how far you can go in operations research:  salaries in the millions are possible!

Perhaps Russ has some skills beyond operations research, since he is a starting pitcher for the Pittsburgh Pirates which, against all rules of logic or fairness, is part of Major League Baseball.  Still, I’m marching right into my dean’s office and asking for a raise to match the highest operations research salary in Pittsburgh!

World Cup Forecast Pool, with a Twist

The Brazilian Society of Operations Research is organizing a competition for predicting the results of the group stage at the upcoming World Cup.  If you have to ask for which sport, you probably aren’t the target audience:  it is for football (aka soccer).  Many sites have such pools for many sports:  for US college basketball, the OR blogs are practically given over to the topic every March.

This competition is a bit different though:  you aren’t allowed to simply guess the winners of each game.  First, you need to give probabilities of win, loss or tie, with scoring based on squared errors.  Second, you need to use some sort of model to generate the predictions, and be willing to describe that model.  And no fair modifying your results to better fit your own ideas!  You need to stick to the model’s predictions.    There are three subcompetitions, with track A allowing fewer types of information than track B, and track C limited to members of the Brazilian OR society.
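To make the "squared errors" idea concrete, here is a minimal sketch of how a single game might be scored.  The exact rule used by the competition is not given above, so the function name, the outcome labels, and the choice to sum the errors over win, tie and loss are all my assumptions.

```python
# Hypothetical scoring sketch: the real competition rule may differ.
OUTCOMES = ("win", "tie", "loss")


def squared_error_score(forecast, actual):
    """Sum of squared errors between a forecast and what happened (lower is better).

    forecast: dict mapping each of "win", "tie", "loss" to a probability.
    actual:   the outcome that occurred, one of the same three labels.
    """
    if set(forecast) != set(OUTCOMES):
        raise ValueError("forecast needs a probability for win, tie and loss")
    if abs(sum(forecast.values()) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    if actual not in OUTCOMES:
        raise ValueError("unknown outcome: %r" % (actual,))
    # The "truth" is 1 for the outcome that happened and 0 for the other two.
    return sum((forecast[o] - (1.0 if o == actual else 0.0)) ** 2 for o in OUTCOMES)


if __name__ == "__main__":
    hedged = {"win": 0.5, "tie": 0.3, "loss": 0.2}
    certain = {"win": 1.0, "tie": 0.0, "loss": 0.0}
    print(squared_error_score(hedged, "win"))
    print(squared_error_score(certain, "loss"))
```

Under a rule like this, a confident wrong pick is punished much more heavily than a hedged one, which is presumably why simply guessing winners is not enough.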

There is quite a bit of past data on the players and teams, so perhaps it is possible to create useful models.  I look forward to seeing the results.  Deadline for entries is June 7, with games starting June 11.

Journalists Should Be Required to Pass an Exam on Conditional Probability

There is nothing more grating than having a journalist toss around numbers showing no understanding of conditional probability (actually, there are 12 more grating things, but this ranks right up there).  In a nice story from NBC Chicago, journalists Dick Johnson and Andrew Greiner write about an autistic teen who has picked the first two rounds of the NCAA tournament correctly:

An autistic teenager from the Chicago area has done something almost impossible.

Nearly 48 games into an upset-filled NCAA tournament, 17-year-old Alex Herrmann is perfect.

“It’s amazing,” he says. Truly.

Yes it is amazing. But the writers get tripped up when trying to project the future:

There are still four rounds remaining, so it could fall apart — the odds of a perfect wire to wire bracket is about 1 in 35,360,000 by some measures or 1 in 1,000,000,000,000 by others.

Aaargh! Let’s let pass the factor of 28,000 or so difference in estimates. THIS IS NOT THE RELEVANT STATISTIC! We already know that Alex has picked the first two rounds correctly. We are interested in the probability he has a perfect bracket given he picked the first 48 games correctly. This is about the simplest version of conditional probability you can get.

If all he did was flip a coin for each of the remaining 15 games, he would have a one in 32,768 chance of having a perfect bracket, given where he is now. Not great odds, certainly, but nothing like the probabilities given in the quote. You can argue whether 50/50 on each of the fifteen remaining games is the right probability to use (Purdue as champion?), but there is absolutely no justification for bringing in the overall probability of a perfect bracket.  By quoting the unconditional probability (and who knows where those estimates come from), the writers vastly underestimate Alex’s chance of having a perfect bracket.
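The arithmetic is short enough to check in a few lines.  This is only a sketch of the coin-flip case: the 63-game total is my assumption (48 games played plus the 15 remaining), and the names are mine.

```python
from fractions import Fraction

TOTAL_GAMES = 63          # assumed: 48 games already played plus 15 to go
GAMES_ALREADY_RIGHT = 48  # the first two rounds, all picked correctly


def coin_flip_chance(games):
    """Chance of calling every one of `games` independent 50/50 games correctly."""
    return Fraction(1, 2 ** games)


remaining = TOTAL_GAMES - GAMES_ALREADY_RIGHT
conditional = coin_flip_chance(remaining)      # given the 48 correct picks
unconditional = coin_flip_chance(TOTAL_GAMES)  # before any game was played

if __name__ == "__main__":
    print("given 48 correct picks: 1 in", conditional.denominator)
    print("before the tournament:  1 in", unconditional.denominator)
    # The conditional chance is larger by the factor already "banked" in 48 games.
    print("ratio:", conditional / unconditional)
```

The ratio between the two numbers is exactly the probability of having picked the first 48 games by coin flip, which is the part of the calculation that no longer matters.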

I swear I see the confusion between unconditional probabilities and conditional probabilities twice a day. I suspect the stroke that will finally get me will be caused by this sort of error.

Edit.  9:06PM March 23. From the comments on the original post, two further points:

  1. The writers also seem confused about joint probabilities:

    One in 13,460,000, according to It’s easier to win the lottery. Twice.

    No… not unless your lottery has a probability of winning of one in the square root of 13,460,000, or about one in 3669. While there are “lotteries” with such odds, the payoffs tend to be 1000 to 1, not millions to 1. I bet they thought winning one lottery might be 1 in 7,000,000 so two lotteries “must be” 1 in 14,000,000. No, that is not the way it works.

  2. It appears that if you manage a pool on, you can edit the picks after the games. That might be a more reasonable explanation for picking 48 games, but it is hard to tell.
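For the joint probability point, a quick check, assuming the two lottery wins are independent and equally likely (the function names are mine):

```python
import math


def single_odds_needed(double_odds):
    """If winning twice is '1 in double_odds', each single win must be 1 in this."""
    return math.sqrt(double_odds)


def double_odds(single_odds):
    """Odds ('1 in N') of winning two independent lotteries, each 1 in single_odds."""
    return single_odds ** 2


if __name__ == "__main__":
    print("needed per lottery: 1 in", round(single_odds_needed(13_460_000)))
    print("two 1-in-7,000,000 lotteries: 1 in", double_odds(7_000_000))
```

Independent probabilities multiply, so the odds square rather than double.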

So, to enumerate what journalists should be tested on, let’s go with:

  1. Conditional Probability
  2. Joint Probabilities
  3. Online editing possibilities

You are welcome to add to the certification test requirements in the comments.

Update on LRMC after first round

Sokol and team’s Logistic Regression/Markov Chain approach had a pretty good first round in the NCAA tournament.  It correctly picked 24 of the 32 games.  On the plus side, it picked the following upsets (NCAA seeds in parens):

Northern Iowa (9) over UNLV (8), Georgia Tech (10) over Oklahoma State (7), Murray State (13) over Vanderbilt (4), Old Dominion (11) over Notre Dame (6), St. Mary’s (10) over Richmond (7)

It incorrectly predicted upsets

San Diego State (11) over Tennessee (6), Florida State (9) over Gonzaga (8), Utah State (12) over Texas A&M (5)

It missed the upsets

Ohio (14) over Georgetown (3), Missouri (10) over Clemson (7), Washington (11) over Marquette (6), Cornell (12) over Temple (5), Wake Forest (9) over Texas (8)

Overall, favorites (higher seeds) only won 22 of the 32 first round games, so in that sense LRMC is doing better than the Selection Committee’s rankings.  Three of LRMC’s “Sweet Sixteen” have been eliminated but they are still fine for the round of eight on.

March Madness and Operations Research, 2010 Edition

Normally I do a long post on operations research and predicting the NCAA tournament.  I did so in 2009, 2008, 2007 and even in 2006 (when I think I made blog entries with an IBM Selectric typewriter).  This year, I will cede the ground to Laura McLay of Punk Rock Operations Research, who has a very nice series of OR related entries on the NCAA tournament (like this post and the ones just previous to it).  I’d rather read her work than write my own posts.

That said, perhaps just a comment or two on Joel Sokol (and his team)’s LRMC picks, as covered in the Atlanta Business Chronicle.  Joel’s ranking (LRMC stands for Logistic Regression Markov Chain) can be used to predict winners.  They have a page for their tournament picks.  Some notable predictions:

1) Their final 4 consists of 3 number 1’s (Kansas, Syracuse, and Duke) and one number 2 (West Virginia).  The remaining number 1 is Kentucky, ranked only number 9 overall by LRMC.

2)  Kansas beating Duke is the predicted final game.

3) 7th seeded BYU is ranked 4th overall, so makes the Elite Eight until knocked off by Syracuse (3).

4) 12th seeded Utah State is predicted to beat 5th seeded Texas A&M and 4th seeded Purdue.

5) 13th seeded Murray State over 4th seeded Vanderbilt is the biggest predicted first round upset.

Let’s see how things turn out.

Final Olympic Results: Canada owns 45.25% of the Podium

Further to yesterday’s entry, we can now determine exactly how much of the podium Canada owns.  To determine the “winner” of the Olympics, you need to determine the relative values of gold, silver, and bronze medals (with the assumption that non-medalers do not count, which is arguably false, but necessary in order to stop me from spending the night compiling broader lists).  The final medal standings are:

Country         Gold  Silver  Bronze  Total
United States      9      15      13     37
Germany           10      13       7     30
Canada            14       7       5     26
Norway             9       8       6     23
Austria            4       6       6     16

So, if you count every medal equally, then the USA won; if you only count gold, Canada won. But what if you count things 5 for a gold, 3 for a silver, and 1 for a bronze? Then the USA wins. How about 10, 5, 1? That would be Canada. Is there a set of points for Germany to win? It turns out there is not: anyone with operations research training would fiddle around for a while and figure out that 3/4 of the US medals plus 1/4 of the Canadian medals dominates the German medal counts.  Everyone else is dominated by the USA:  only Canada and the USA might win for a given set of medal weights.
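The "fiddling around" can be replayed in a few lines.  This is just a sketch using the final medal counts for the three leading countries; the helper names are mine.

```python
from fractions import Fraction

# Final (gold, silver, bronze) counts for the three leading countries.
FINAL = {
    "USA": (9, 15, 13),
    "Germany": (10, 13, 7),
    "Canada": (14, 7, 5),
}


def score(medals, weights):
    """Weighted medal score: wg*ng + ws*ns + wb*nb."""
    return sum(n * w for n, w in zip(medals, weights))


def winner(standings, weights):
    """Country with the highest weighted score (first listed wins a tie)."""
    return max(standings, key=lambda country: score(standings[country], weights))


def blend(a, b, share_a):
    """Medal counts of an imaginary country made of share_a of a and the rest of b."""
    return tuple(share_a * x + (1 - share_a) * y for x, y in zip(a, b))


def dominates(a, b):
    """True if a has at least as many medals as b of every colour."""
    return all(x >= y for x, y in zip(a, b))


if __name__ == "__main__":
    for weights in [(1, 1, 1), (1, 0, 0), (5, 3, 1), (10, 5, 1)]:
        print(weights, "->", winner(FINAL, weights))
    mix = blend(FINAL["USA"], FINAL["Canada"], Fraction(3, 4))
    print("3/4 USA + 1/4 Canada:", mix, "dominates Germany:", dominates(mix, FINAL["Germany"]))
```

Because the blend has at least as many medals of every colour as Germany, no set of non-negative weights can put Germany ahead of both the USA and Canada at once.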

Now not every point system makes sense. Giving 10 points for a bronze and 1 point for a gold might match up with certain egalitarian views, but would not really be in keeping with a competition. So we can limit ourselves to point systems with gold >= silver >= bronze. Further, we can normalize things by making the weights add up to 1 (since multiplying a weighting by a constant number across the scores doesn’t change the ordering) and having the weights be non-negative (since getting a medal shouldn’t hurt your score).

This gives a base set of linear equalities/inequalities. If we let wg, ws, and wb be the weights for gold, silver and bronze, we are interested in weights which satisfy

wg >= ws >= wb
wg+ws+wb = 1
wg, ws, wb >= 0

Now, which weights favor Canada? It turns out that, with some basic algebra, you can deduce (using the medal counts above) that Canada wins whenever wg > 8/13 (and ties with wg=8/13). So as long as you put more than 61.5385% of the weight on gold, Canada wins. This amounts to about 45.25% of the feasible region. USA wins on the remaining 54.75% of the region. If Canada had won one more silver medal, they would have prevailed on more than half the reasonable region.
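Here is a sketch of the "basic algebra", plus one way to measure the region.  Canada minus the USA is (+5, -8, -8) in gold, silver and bronze, so with wg+ws+wb=1 the gap is 5wg - 8(1 - wg) = 13wg - 8.  For the share, I am assuming the feasible region is measured by plain area in the (gold, silver) plane; measured that way the share works out to 75/169, or about 44.4%, close to but not exactly the figure quoted above.  The difference presumably comes from how the region is discretized or how ties are counted, but that is a guess.

```python
from fractions import Fraction

USA = (9, 15, 13)
CANADA = (14, 7, 5)


def gold_threshold(leader, rival):
    """Gold weight at which `leader` and `rival` tie, when weights sum to 1.

    Only valid when the silver and bronze gaps are equal, so that the
    comparison depends on the gold weight alone.
    """
    dg, ds, db = (a - b for a, b in zip(leader, rival))
    if ds != db:
        raise ValueError("silver and bronze gaps differ; gold alone does not decide")
    # dg*wg + ds*(1 - wg) = 0  ->  wg = -ds / (dg - ds)
    return Fraction(-ds, dg - ds)


def area_share_above(threshold):
    """Share of the region wg >= ws >= wb >= 0, wg+ws+wb = 1 with wg > threshold.

    Assumes a uniform (area) measure and threshold >= 1/2.  For wg >= 1/2 the
    silver weight runs from (1 - wg)/2 to 1 - wg, a strip of width (1 - wg)/2;
    integrating from the threshold to 1 gives (1 - t)^2 / 4.  The whole
    region has area 1/12.
    """
    if not Fraction(1, 2) <= threshold <= 1:
        raise ValueError("formula only covers thresholds between 1/2 and 1")
    return (1 - threshold) ** 2 / 4 / Fraction(1, 12)


if __name__ == "__main__":
    t = gold_threshold(CANADA, USA)
    share = area_share_above(t)
    print("Canada wins when wg >", t)
    print("area share:", share, "=", float(share))
```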

The diagram illustrates the weights for the USA and Canada, giving only the weights for gold and silver (the weight for bronze is 1-gold-silver). The red region are the weights where Canada wins; the blue is for the USA. Point A is “all medals are equal”; Point B is “count only gold and silver”; Point C is “Count only gold”.  The yellow line corresponds to the weight on gold equaling 8/13.

Bottom line: on this measure, the USA won the Olympics in an extraordinarily close race.  Canada may not have “Owned the Podium” but they came darn close.

Canada owns 40% of the Olympic podium

In this year’s Olympics, much has been made of the Canadian efforts to “own the podium”.  Canada has spent $118 million in training its athletes, far more than the US has spent ($55 million over four years).  Since it seems that, despite a late rush, the Canadian goal of winning more medals than any other country will not be met, the Own the Podium effort appears to be a failure.  But perhaps operations research can come to the rescue here.

The problem is, perhaps, in defining the goal.  By defining the goal in terms of overall medals, the Canadians were perhaps too modest.  If they had simply strived for excellence and defined their goal in terms of “Most gold medals”, then they would have succeeded:  they have 13 gold compared to Germany’s 10 gold with two events to go.

It does seem kind of strange to define winning as “Most Medals”:  a bronze is not the same as a gold!  But it also seems pretty strange to only count gold:  the others seem to have some value.

Rather than look at any particular weighting of the medals, perhaps we should look at any reasonable weighting and see who wins.  If we give weights wg, ws, and wb to each of gold, silver, and bronze, and let ng, ns, and nb be the number of such medals won, then the score of a country is wg*ng+ws*ns+wb*nb.  The stated Canadian goal had (wg,ws,wb)= (1,1,1).  Counting gold only has (wg,ws,wb) = (1,0,0).  What other weights would be reasonable?

Clearly, gold is at least as valuable as silver which is at least as valuable as bronze, so we want wg>=ws>=wb.  Also, we can normalize so that wg+ws+wb=1 (since, for instance, (2,2,2) is the same as (1,1,1) which is the same as (1/3, 1/3, 1/3)).  With these requirements, there are only three teams that might be considered ahead at this point.  Consider the leading countries:

Country         Gold  Silver  Bronze  Total
United States      9      14      13     36
Germany           10      12       7     29
Canada            13       7       5     25
Norway             8       8       6     22
Austria            4       6       6     16

Norway, Austria, and every other country (other than Germany and Canada) are dominated by the USA, so none of them can be the winner, no matter the weights.

There are many weights other than (1,0,0) for which Canada is the winner.  For instance (.68, .16, .16) is also a win for Canada.  Even (.64, .32, .04) leaves Canada level with the USA at the top (both score 10.76).

Going through the grid of possible values, it seems that Canada is currently in first in about 40% of the cases;  the USA is in first in the remaining 60% of the weights.  Germany is never in first, being dominated by the combination of 74% USA and 26% Canada.  So perhaps it is fair to say that Canada owns 40% of the podium, trailing the USA with 60%.
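A grid search of this kind might look like the sketch below.  The grid spacing (weights in steps of 1/100) and the decision to report ties separately are my assumptions, so the percentages it prints need not match the ones above exactly.

```python
# Medal counts (gold, silver, bronze) with two events to go.
MEDALS = {
    "USA": (9, 14, 13),
    "Germany": (10, 12, 7),
    "Canada": (13, 7, 5),
}


def first_place_shares(medals, steps=100):
    """Share of grid weightings on which each country is alone in first place.

    Weights are (g, s, b) / steps with integers g >= s >= b >= 0 summing to
    `steps`; integer arithmetic keeps ties exact.  Ties are counted under
    the extra key "tie".
    """
    counts = {country: 0 for country in medals}
    counts["tie"] = 0
    total = 0
    for g in range(steps + 1):
        for s in range(min(g, steps - g) + 1):
            b = steps - g - s
            if b > s:
                continue  # bronze may not outweigh silver
            total += 1
            scores = {c: g * ng + s * ns + b * nb for c, (ng, ns, nb) in medals.items()}
            best = max(scores.values())
            leaders = [c for c, value in scores.items() if value == best]
            counts[leaders[0] if len(leaders) == 1 else "tie"] += 1
    return {key: value / total for key, value in counts.items()}


if __name__ == "__main__":
    for key, share in first_place_shares(MEDALS).items():
        print("%-8s %.1f%%" % (key, 100 * share))
```

A finer grid (a larger steps value) moves the shares a little, which is one reason to treat figures like these as approximate.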

If Canada were to beat the USA in hockey on Sunday, they would go up to 45%.  This assumes no further medals in the men’s 50km cross country.  But if a Canadian could also win the cross country, then the fraction of weights for which Canada wins goes up to 54.7%.  There is still a chance for Canada to “Own the Podium!”.

Winston, Sports, Statistics, and Decision Making

Wayne Winston, author of famous textbooks in operations research and a new book on math and sports, and sports statistics/decision making guru, has a column in the Huffington Post, which certainly catapults him to rock-star status in the operations research world.  The entries are also posted on his personal blog, where he posts additional material.

His recent post is on a controversial decision that Bill Belichick, coach of the New England Patriots (US football), made yesterday.  With just a couple of minutes left to play, Belichick decided to try for a first down on 4th and 2 deep in his own territory, at his own 28 yard line.  If the Patriots had made the first down, the game would be over with a Patriots win.  If they failed (which they did), Indianapolis would need to move the ball about 30 yards in two minutes to score and win (which they did).  The alternative would have been to punt, which would then require Indy to move perhaps 60 or 70 yards in that time to score.

The vast majority of coaches in this situation would punt.  Winston suggests Belichick made the right move, given that Indianapolis had a high probability of scoring even from 60 or 70 yards (Indianapolis has the quarterback and team to do so).  The result is pretty clear:  as long as you believe that Indianapolis had at least a 50-50 shot of scoring after the punt (and in many cases with a lower probability than that), you should go for it.  Advanced NFL Stats has a slightly different take on this, with the same conclusion.
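The comparison can be written down in a few lines.  This is a simplified sketch, not Winston's actual model: I assume that making the first down wins the game, that Indianapolis wins if and only if it scores, and the probabilities in the example are purely illustrative.

```python
def win_prob_go_for_it(p_convert, p_score_short):
    """Patriots win by converting, or by failing and then stopping the short drive."""
    return p_convert + (1 - p_convert) * (1 - p_score_short)


def win_prob_punt(p_score_long):
    """Patriots win after a punt only if Indianapolis fails on the long drive."""
    return 1 - p_score_long


def break_even_long_score_prob(p_convert, p_score_short):
    """Going for it is better whenever Indy's long-drive scoring chance exceeds this."""
    return (1 - p_convert) * p_score_short


if __name__ == "__main__":
    # Illustrative numbers only, not taken from Winston's column.
    p_convert, p_short, p_long = 0.6, 0.8, 0.5
    print("go for it:", win_prob_go_for_it(p_convert, p_short))
    print("punt:     ", win_prob_punt(p_long))
    print("break-even long-drive chance:", break_even_long_score_prob(p_convert, p_short))
```

The break-even value is at most 1 - p_convert, so under these assumptions, if the conversion succeeds at least half the time, any long-drive scoring chance of 50% or more favors going for it, and often a lower one does too.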

I think it is important to note that Winston doesn’t just do statistics.  He combines it with decision making.  Sometimes that decision making is reasonably straightforward but unintuitive (like the above), and sometimes it is more complicated.

Winston has done a lot to bring clarity to the complicated world of basketball statistics and decision making.  I look forward to seeing what he has to say to Huffington’s huge audience.  And maybe have him sneak in the phrase “operations research” once in a while.

Scheduling the US Open

The New York Times has a nice article on what goes into scheduling the (tennis) US Open. You would think that most of the scheduling is done once the brackets are determined, but that is not the case. While the brackets determine who plays on each day, the assignment of matches to courts and to times of the day is done live, and depends on the outcomes of the matches. Venus Williams wins, and her match goes into a big court at a time best for TV. Venus loses, and her conqueror may be exiled to an outer court at 11AM.

There are lots of things that go into the schedule:

The task is to balance the often conflicting desires of players (who submit match-time preferences before the tournament), coaches (who often have more than one pupil and prefer they play at different times), broadcasters (including three in the United States and a litany of others around the world, each hoping to boost ratings with well-timed slots for particular players) and ticket holders (some holding passes for daytime matches, others with tickets to the prime-time show, all wanting compelling tennis spread evenly throughout their stay).

This is, of course, a great opportunity for operations research: our models are really good at doing this sort of scheduling. The hard part is doing the balancing: what should the objective be?

It appears that the system they have in place primarily tracks things as people hand-schedule:

To demonstrate, Curley and Crossland moved Friday’s matches around, cutting and pasting from one court to another, from one time to another. A matchup, outlined in blue for men and pink for women, would turn red if the computer recognized a problem.

In one case, a player had a doubles match before her singles match — a no-no. In others, the computer flagged too little rest for players. Several noted that coaches had players playing simultaneously.

Clicking on a command called “pairs” showed the two matches whose winners would meet in the next round. Ideally, the matches take place at the same time, giving each winning player the same amount of rest leading into the next match.

This is not the sort of interface or description you would expect if the system was optimizing the schedule.

I have seen a very nice paper on tennis umpire scheduling which talks about scheduling the umpires for the US Open, but there the constraints and requirements are a lot clearer. It would be quite challenging to put together a system for scheduling the matches that would allow for the sort of tradeoffs the hand-scheduling provides. But I would love to try to do so!