NL8 Solved!

Almost 10 years ago, Kelly Easton, George  Nemhauser and I created something called the Traveling Tournament Problem.  The name is George’s:  I wanted to call it the Problem where Teams Play a Round Robin and Travel and Stuff, but George’s name is catchier.  The problem came from work we had done with Major League Baseball in creating their schedule.  Rather than try to post 150 pages of requests and requirements (which MLB would never let us do anyway), we abstracted out the key aspects of the schedule.  Those were:

  1. Travel:  a wish to keep the total travel down, and
  2. Flow: a wish not to be home or away for more than 3 series consecutively.

I also added a “no repeat” constraint:  if A is at B, then B cannot be at A in the next time slot.

When we put this together, we also put together the website http://mat.tepper.cmu.edu/TOURN to keep track of results and that turned out to be a great idea.  By having one site with results and instances, it has been easy to keep track of “current records”.

When I first worked on this, my goal was to show how some of our methods that we use for MLB could handle problems of reasonable size for the Traveling Tournament Problem.  That turned out to be impossible to do.  The Traveling Tournament Problem is hard!  Even solving the eight team problem turned out to be nontrivial.  Six teams is doable with a bit of work.

Kelly Easton, in her dissertation, put a lot of effort into the TTP, and even solved the eight team problem.  She used a network of 20 computers over the course of a week to prove optimality.  Unfortunately, she did not include the “no repeat” constraint, so we didn’t have the result for exactly the right problem.  While we believed that adding the “no repeat” constraint wouldn’t affect things too much, Kelly graduated, began working (for my small sports scheduling business) and we never solved the real problem.

Despite lots and lots of people working on the TTP, proving the optimality of a known solution to NL8 with value 39721 has been elusive.  In fact, relatively little (but some, thanks to Melo, Ribeiro, and Urrutia) work has been done on improving lower bounds.

I was thrilled yesterday to get an email Stefan Irnich of Aachen who, together with his master student Ulrich Schrempp, claims to have proved the optimality of 39721.  But here is when it gets hard.  How do I know?  It is easy to check feasible solutions, but unless there is a simply calculated lower bound, it it hard to confirm optimality.   Fortunately Stefan sent the slides of his work, and it seems clear that the approach they are taking is one that would work, and would be stronger than what others have tried.  So I have no hesitation in proclaiming that NL8 is now solved!  I am sure there are easily dozens (OK, half-dozens … OK, a half-dozen) people who are excited about this, and I am one of them (and it is my blog).

On to NL10!

Arnoff Lecture

I am just back from Cincinnati, where I gave the 17th E. Leonard Arnoff Memorial Lecture on the Practice of Management Science. When I look at the list of presenters, with my name on it, I am reminded of the Sesame Street tune “One of These Things (Is Not Like the Others)“. I was honored to be asked, and very happy to give the talk. There were about 85 in the audience, despite armaggeddon-like lightning and thunder outside.

The lecture is named after Len Arnoff, best known for his 1957 book “Introduction to Operations Research” with Churchman and Ackoff, one of the first (or the first) operations research textbook. He had a varied career: faculty member at Case and others, partner at Ernst and Ernst, and Dean of the Business School at the University of Cincinnati, before retiring to Florida in 1988. He died in 1991. (David Rogers is preparing a short biography of Arnoff, and I am grateful for the early version he sent me).

I was very pleased that his wife Ann flew up to attend the lecture, and that his daughter Susan could also attend.

The title of my talk was “Sports Scheduling and the Practice of Operations Research”. In addition to telling some stories about working with various sports leagues (you will have to attend one of my talks to hear those!), I spend some time on some of the technical issues. The main theme was that if you only knew operations research from ten years ago, you probably don’t have the techniques needed to solve real-world sport scheduling problems. In particular, the following methods don’t work very well:

  1. “Routine” integer programming: let x[i,j,t] = 1 if i plays at j in time t, etc. etc. Weak formulation that does very poorly
  2. Greedy heuristics. Generally do not lead to anything close to feasible.
  3. Local search. Simple exchange generally leads to infeasible solutions, making it hard to use simulated annealing, tabu search or other metaheuristic approaches.

The following do make a big difference (and are much more recent ideas):

  1. Complicated variables (like in branch and price): Let a variable represent a road trip, a schedule section, or a whole schedule for a team.
  2. Large neighborhood search. Relax large pieces of a schedule and reoptimize keeping the rest fixed.
  3. Constraint programming, ideally combined with integer programming, in order to quickly find feasible solutions.

More success for OR and sports

ILOG has a press release on using constraint programming for the Japanese Football (soccer for Americans) League.  Seems like a pretty big league:

J. LEAGUE is a top professional football (soccer) league in Japan and one of the most successful leagues in the Asian Football Confederation. The organization recognized a need to automate and streamline its complex match scheduling process, involving 33 teams and 79 schedules, with a total of 682 matches over a 10-month playing season. With no packaged application available for complex scheduling, J. LEAGUE decided on a custom approach based on optimization technology from ILOG. Optimization improves business decision making speed and efficiency by allowing organizations to calculate the best utilization of existing resources — in this case, team personnel, stadiums and spectators.

It is not clear why 33 teams need 79 schedules!  I find it hard enough to find a schedule for every team in a league.

Final Four based on Salaries

Payscale.com has the college basketball bracket based on median salaries of its graduates (5-10 years after graduation). Stanford ($113,000) beats Notre Dame ($99,100) in the final, with Duke ($96,800) and Georgetown ($92,500) the other two in the final four.

I always do my picks based on the quality and quantity of operations research done at the school. My final four this year were George Mason, Wisconsin, Cornell (in a tough division, beating Stanford, Kentucky, Texas, and Pittsburgh along the way) and UCLA, with Cornell winning it all. Needless to say, I won’t be getting paid this year: it appears that OR quality has, at best, a weak correlation with basketball prowess.

Use OR for your NCAA Picks

Joel Sokol and his LRMC method have made their picks for this year’s NCAA College Basketball Tournament. Joel’s method is based on work he did with Paul Kvam on a “Logistic Regression/Markov Chain Model for NCAA Basketball” published in Naval Research Logistics. This approach agrees in part with the NCAA selectors in that the predicted final four are the four number one seeds. It disagrees strongly with some of the rankings though. It would not have put UNC as a number one seed at all, let alone the top number one seed, preferring Duke over UNC. The number six seeds, who the NCAA ranks, presumably, 24-27 or so, are ranked 16 (Marquette), 24, 26, and 34 (Oklahoma). My city’s Pitt Panthers, an NCAA 4, seed are only ranked 23rd by Sokol’s method. Perhaps the biggest difference is 11th seed Kansas State, who is 19th (4th seed) under LRMC.  Kansas State over USC is also the big first round upset.

In Praise of a … Yankee?

Since I live in Pittsburgh and attend many Pittsburgh Pirate baseball games, it is tough to be very positive about the Yankees. For European readers, it is kinda like a Barça fan saying something nice about Real Madrid. But I have a new favorite player: Russ Ohlendorf, relief pitcher for the Yankees!

Why is he my favorite player? It has to do with operations research, of course. Russ received his undergraduate degree from Princeton (he is the third Princeton graduate to play in the majors) in operations research. From the New York Times article on him:

Ohlendorf graduated with a 3.75 grade-point average and a degree in operations research and financial engineering. His curiosity extends widely. Someday, Ohlendorf said, he may want to try investment banking or entrepreneurship. He has also thought about pursuing a front-office job in baseball.

“He seems to have a real interest in people from all walks of life,” Curtis Ohlendorf said. “He’s always been pretty active. He needs to be involved in something.”

For his senior thesis, Ohlendorf studied the value of draft picks for major league teams. His conclusion — which the Yankees now embrace — was that teams generally double their investment in the draft based on the production of players in their first few major league seasons.

I can’t find the thesis on the web: I would love to see the approaches used. Statistics and baseball go way back, of course, but real economic analysis or operations research modeling is a lot rarer.

Simulation and the NHL playoffs

Growing up in Winnipeg, Canada (city motto: “At least it is a dry cold”), I had a short and rather forgettable hockey career (though getting a shutout as a goalie at 12 years old remains one of my favorite memories). I have been greatly outdone by my nephew Mathieu, who actually looks like he knows what he is doing when facing a shot. Since then, I follow hockey mainly through my local team, the Pittsburgh Penguins. I have been lucky to see a number of amazing players on the Penguins: Lemieux, Jagr, and “Sid the Kid” Crosby perhaps best of all. I keep hoping to provide the schedule for the NHL, which I think is the only thing that would impress my buddies back in Winnipeg.

Armann Ingolfsson of the University of Alberta has appeared in an article in the Toronto Sun on using simulation to predict who will make the playoffs this year. You can read a more detailed description of what he does in an article published in the INFORMS Transactions on Education in 2004. The main idea is to use simulation to generate 500 possible continuations of the regular season and to determine how often every team makes the playoffs. A critical feature of this is the ability to assign probabilities of wins for every game.

I am very happy to see that the Penguins currently have a 92% chance of making the playoffs. But the system doesn’t take into account that Crosby has a “high ankle sprain” keeping him out for two months.  That might knock off a percentage point or two.

Operations Research and Fantasy Football

After a very successful year, my fantasy football teams are crashing and burning in the playoffs.  For those who do not know fantasy sports, fantasy football involves a group (8-12 people) drafting NFL players at the beginning of a season.  Each week, my team gets points based on the success (or lack thereof) of the players in their “real” games.  If my players get more points than my opponent’s, then I win.  After a regular season,  the best fantasy teams in the league then face off in the playoffs.  Some fantasy sports work a bit differently:  most fantasy baseball leagues collect statistics from the entire year and give points in the final year standings on the categories, without the head-to-head matchups.

One of my teams this year was the best I ever had.  I had the top three wide receivers in the league (Moss, Owens, and Edwards), the second best quarterback (Romo), a top running back (Addai), a great defense (Patriots), and a top kicker (Folk).  My tightend was not great, but that was the only weakness.   Over the regular season, I outscored the second best team in the league by an average of 30 points in a game (teams typically score 60-120 points in a game).  But, sure enough, the playoffs come around, my team goes cold, and I lose the first playoff round.   So now I am just struggling to get third place in the league.

I am not the best fantasy player around.  To be so would require much, much more time than my three-year-old son will allow me.  But I do try to use a bit of operations research thinking in my play.  Some (but not all) key decisions come in the initial draft.  Players are taken in turns in a serpentine fashion:  if there are 10 fantasy players, then the person with the first pick will next get the 20th pick;  the person with the 10th pick also gets the 11th pick).  Given the projected player values (which is the real key to the problem: data!), the “best pick” depends on what you expect others to do, and the relative value of the alternatives.  For instance, if you need a quarterback, and the best quarterback is worth, say, 280 points, with the next best being worth only 200 points (a huge difference), then you better pick that quarterback (unless you are absolutely sure that quarterback will be available when you pick next).  But if there are five quarterbacks in the 280 range, you can afford to wait, since it is more likely that one of them will still be available when you pick next.

Mike Fry, Andrew Lundberg (both of the University of Cincinnati) and Jeffrey Ohlmann (University of Iowa) analyzed this issue in depth in an article published in the new Journal of Quantitative Analysis in Sport.  Their work got a fair amount of press a few months ago, including a writeup in USA Today (sorry I missed it earlier:  New Zealand doesn’t cover football well!).  I just went through the article (while waiting for my son to wake up and enjoy Christmas), and it is terrific, going well beyond the obvious points.  I particularly liked the analysis of the value of each of the draft positions.  There is a view that drafting early is best, but the serpentine nature of the draft evens things out.  With the data they looked at, it is true that the first position is best, but the differences are quite slight, and the value is not monotonic in draft position.

At the end, though it comes down to player projections.  As the article quotes:

“If you value Ryan Leaf as the best quarterback, it’s going to tell you take him when it’s time to take a quarterback,” Ohlmann says. “If you give me bad projections, I’m going to give you some very bad advice.”

Sports Scheduling and Rubik’s Cube

The October 6, 2007 edition of The Spectator (an English weekly with a right-of-center, Olde England bias but wonderful cartoons, chess column, and commentary once the bias is accounted for) contains a “review” of a book The Blind Eye: A Book of Late Advice by the poet Don Paterson (I don’t think the book is available in the US). The review is actually just a collection of 30 or so extracts from the book. The review alone left me chuckling all morning. Some quotes:

I can see exactly what not to do at the moment. No doubt through the usual process of elimination, I’ll arrive at my favourite strategy of total paralysis.

I enjoyed L.’s creeping senility. I could have him repeat my favorite stories as often as I wanted, sometimes several times in the space of the same afternoon. X’s sudden lurch into his anecdotage, on the other hand, was a disaster: until then, his shyness had prevented our discovering what a bore he was.

and

You’ve made a blog … clever boy! Next: flushing.

One really hit home for me:

A poem with one line wrong is like a Rubik’s Cube with one square wrong: what it is precisely not is one move away from completion.

I have been looking for an intuitive explanation to various sports leagues on why a schedule that is perfect except for one flaw is likely a long way from a perfect schedule. I will proceed to rip off this analogy and pretend it is my own.  I admit that comparing a schedule to Rubik’s cube is not as clever as comparing a poem to the cube:  I wonder if I should compare a schedule to a poem?