Pittsburgh: Hotbed of Operations Research and Baseball

Pittsburgh is becoming the center of the universe when it comes to combining baseball with operations research.  First, there is … well, me! … a Professor of Operations Research whose company provides Major League Baseball with their player and umpire schedules.  And, beginning last year, Pittsburgh has had Ross Ohlendorf, who has converted his Princeton degree in Operations Research and Financial Engineering into 5 wins and a 4.82 ERA (this year) as a starting pitcher for the Pittsburgh Pirates.

Ross seems a serious OR guy.  He did his undergraduate thesis on the financial return of players from the draft.  Overall, his conclusion was that teams get a pretty good return from the money they put into top draft picks.  ESPN has a nice article on Ross’s OR side.

In his thesis, Ross looked at drafts from 1989-1993.  Some players offered tremendous return:

Ohlendorf determined that the average signing bonus during those years was $210,236, and the average return was $2,468,127. Here are the top 10 players from his study.

“So based on the assumptions I made in my paper, the A’s signing Giambi was the biggest winner in top-100 picks of the 1989 through 1993 drafts because he played extremely well in his first six years of major league service,” Ohlendorf said. “The White Sox did the best job in these drafts, with an internal rate of return of 217 percent. Their best signing was Frank Thomas.”
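For readers who have not run into the term, an internal rate of return is the discount rate at which the net present value of a stream of cash flows is zero. Here is a minimal sketch of that calculation by bisection. The cash flows in the example are made up for illustration and are not from Ross's thesis, and the function names `npv` and `irr` are my own.

```python
def npv(rate, cash_flows):
    """Net present value of cash_flows, where cash_flows[t] occurs at year t."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=100.0, tol=1e-9):
    """Find the rate where NPV is zero by bisection.

    Assumes NPV changes sign exactly once between low and high, which holds
    for a single up-front payment followed by non-negative returns.
    """
    f_low = npv(low, cash_flows)
    f_high = npv(high, cash_flows)
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign on the search interval")
    mid = (low + high) / 2.0
    for _ in range(200):
        mid = (low + high) / 2.0
        f_mid = npv(mid, cash_flows)
        if abs(f_mid) < tol or (high - low) < tol:
            break
        if f_low * f_mid < 0:
            high = mid
        else:
            low, f_low = mid, f_mid
    return mid


# Hypothetical example: pay a bonus of 100 now, get 800 of value in year 3.
# The rate r solving -100 + 800 / (1 + r)**3 = 0 is r = 1.0, that is, 100 percent.
example_flows = [-100.0, 0.0, 0.0, 800.0]
print(f"IRR: {irr(example_flows):.2%}")
```

A return of 217 percent means the value from the draft picks arrived fast enough, and in large enough amounts, that it balances the bonuses only when discounted at more than triple per year.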

It is nice to see that Ross’s intelligence does not come at the expense of collegiality:

Ohlendorf is also a popular guy in the Pirates’ clubhouse. “He is so smart,” said Pirates shortstop Jack Wilson. “We give him a hard time about how smart he is, and he’ll come right back at us. We’ll say, ‘Ross, what is the percentage chance of this or that happening?’ and he’ll say, ‘The percentage chance of you winning that game of Pluck [a card game] is 65.678 percent, not 65.667 percent.’”

Starting pitcher might not be a standard job with an OR degree, but with a 2009 salary of $491,000, Ross may have found one of the more lucrative outcomes.

Ross:  if you read this, I’ll be the guy in the stands with a sign “Operations Researchers for Ohlendorf”!

Baseball and Operations Research

Blogged at the INFORMS Practice site on how to make a trip to a baseball game a legitimate business expense.

I just arrived in Phoenix, and I’m off to this evening’s game between the Giants and the Diamondbacks.  There is an operations research connection, of course:  both the teams and the umpires are scheduled with operations research.  So this is kinda like a site visit:  I’m there to be sure exactly two teams show up, along with four umpires!

More serious posts tomorrow when I attend some of the Technology Workshops.

Sports Scheduling Woes

Being involved with sports scheduling (though not yet for the National Football League), I sympathize with the schedulers, who missed an issue.  From espn.com:

The NFL has moved up the start time of the New York Jets’ game against the Tennessee Titans on September 27 after the team complained to the league about having to play home games on consecutive Jewish holidays.

The league made the change Friday, rescheduling the 4:15 start to 1 p.m. a day after Jets owner Woody Johnson sent a letter to commissioner Roger Goodell suggesting the switch to allow fans to arrive home before sundown on Yom Kippur, the Jewish day of atonement.

The Jets’ home opener against New England is 1 p.m. on Sept. 20, which falls during Rosh Hashanah, the Jewish New Year.

It is stunning the number of issues a sports league needs to address.

Time for Baseball

The baseball season started a few minutes ago with Atlanta playing Philadelphia.  I’ve been working with Major League Baseball for more than a dozen years, and my company (along with partners, of course), The Sports Scheduling Group, produces the schedules for MLB (our chief scheduler Kelly Easton does all the hard work, but I do the final day assignments), as well as for the umpires (which I do, based on some fantastic work done a few years ago in a Tepper School MBA project, further developed in Hakan Yildiz’s dissertation).  The start of the season is always a time of anxiety for me (not strong anxiety, but a gnawing fear):  what if I forgot to put in a game?  What if Philadelphia shows up tonight, but Atlanta’s schedule has them in Los Angeles?  It is a rather silly worry, since thousands of people have looked at the schedule at this point, so it is unlikely that anything particularly egregious is happening.

Still, I was happy tonight to see Brett Myers toss the first pitch to Kelly Johnson (a ball).

And know that he did so because of operations research.

Much more on the NCAA Tournament

In the hyper-competitive world of operations research blogging, needing to teach a class can put you hopelessly behind.  The Blog-OR-sphere is abuzz with pointers to the CNN article on computer models for predicting success in the upcoming NCAA tournament featuring Joel Sokol (see the video here).  See the blog entry at Punk Rock Operations Research as well as previous entries by me and Laura on LRMC (Logistic Regression/Markov Chain) and Laura’s article on the work of Sheldon Jacobson and Doug King.  We previously saw Sheldon in articles on predicting the US Presidential election.

Getting back to the CNN article, it is a good illustration of how hard it is to write about models:

At their cores, the computer models all operate like question machines, said Jeff Sagarin, who has been doing computer ratings for USA Today since 1985.

Different people come up with different brackets because they’re asking different questions.

Sagarin’s equations ask three questions: “Who did you play, where did you play and what was the result of each specific game?” The computer keeps repeating those questions in an “infinite loop” until it comes up with a solid answer, he said.

Sagarin has arranged the formula as such partly because he thinks home-court advantage is a big deal in college basketball.

Other models ask different questions or give the questions different weights. Sokol, of Georgia Tech, for example, cares more about the win-margin than where the game was played.

Well… kinda.  It is not that Joel has a philosophical belief in win-margin versus home court.  It is simply that his models include win-margin and the resulting predictions are more accurate because they do so.  Joel didn’t go in and say “Win margin is more important than home court”:  it is the accuracy of the resulting predictions that gives that result.  Some of his models don’t include win margin at all!

I also loved the quote:

Dan Shanoff, who blogs on sports at danshanoff.com, said gut feeling is more important than statistics, but taking a look at the numbers can never hurt.

Followup question:  “So how do you know that gut feeling is more important than statistics, Dan?”  Response (presumably): “Well, it is really my gut feeling, you know, since I really haven’t looked at the numbers”. [Followup added:  Dan isn’t sure he really said what he was quoted as saying.]

Be sure to check out Laura and me, and any other OR people twittering the tournament with tag #ncaa-or, starting noon Thursday.

Tweeting the Tournament

Following up on a post from Punk Rock Operations Research, let’s use a hashtag for OR people twittering about the tournament.  I think “#ncaa-or” should work nicely.  Follow that tag at http://search.twitter.com or directly here.  And start your tweets with #ncaa-or if you want to be part of the group. Thanks to twitterers hakmem and nanoturkiye for instructions on how to set this up!

Are you ready for some College Basketball?

Joel Sokol, Paul Kvam, and George Nemhauser have a ranking called LRMC (Logistic Regression/Markov Chain) for college basketball.  This weekend is when the NCAA selects teams for its championship.  You can check out the current rankings to see whether your favorite team deserves to be in the tournament.  And, once the bracket is published, LRMC provides a guideline for predicting who will win each game.  In the past, LRMC has done very well, but I am still going to go with Pittsburgh over UNC, despite the rankings.

Back at the IMA

I am at the Institute for Mathematics and its Applications at the University of Minnesota.  This brings back very fond memories.  I was a postdoc here 21 years ago at the start of my career when they had a Special Year on Applied Combinatorics.  As I recall there were 10 postdocs that year:  nine combinatorialists and me, trained in operations research.  The combinatorialists were all scary smart and many (including Bernd Sturmfels) went on to sparkling careers.   Doing my two postdocs (in addition to the IMA, I spent a year in Bonn, Germany at Prof. Korte’s Institute) was the best thing I have done in my career.  The postdoctoral time gave me the opportunity to move past my doctoral work and to start new research directions even before I took a permanent position.  And, given I met my wife during my postdoc in Bonn, the social aspects were also wonderful.

I am speaking tonight in the IMA’s Math Matters series.  My topic is “Sports Scheduling and the Practice of Operations Research”.   The talk is open to everyone,  so if you are in the Minneapolis area, feel free to come on by!  There has already been some press on this.

Bugs and Modeling

The web was all abuzz on December 31 as the 30Gig version of the Microsoft Zune players all stopped working.  What was up?  Was it a terrorist attack?  Solar flares?  A weird Y2K bug almost a decade later?

The truth is a bit prosaic:  there was simply a bug related to leap years.  Since the Zune was not around four years ago, 2008 was the first time for the bug to show itself.  There are descriptions of the bug in numerous places:  here is one if you haven’t seen it yet.  Bottom line:  a section of code for converting “days since January 1, 1980” (when the universe was created) to years, months, and days didn’t correctly handle a leap year, leading to an infinite loop.
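To make the failure concrete, here is a sketch in Python of the kind of loop involved. It follows the shape of the code as it was widely reported, but it is not the actual Zune firmware: the function names are mine, I am assuming a 1-based day count (day 1 is January 1, 1980), and the iteration cap on the buggy version exists only so the demonstration terminates.

```python
from datetime import date

ORIGIN_YEAR = 1980


def is_leap_year(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def year_and_day_buggy(days, max_iterations=1000):
    """Reported loop shape. Returns None if it would spin forever."""
    year = ORIGIN_YEAR
    iterations = 0
    while days > 365:
        iterations += 1
        if iterations > max_iterations:
            return None  # stand-in for the infinite loop
        if is_leap_year(year):
            if days > 366:
                days -= 366
                year += 1
            # When days == 366 in a leap year, nothing changes
            # and the while condition stays true.
        else:
            days -= 365
            year += 1
    return year, days


def year_and_day_fixed(days):
    """Subtract whole years only while a full year actually fits."""
    year = ORIGIN_YEAR
    while days > (366 if is_leap_year(year) else 365):
        days -= 366 if is_leap_year(year) else 365
        year += 1
    return year, days


# Day number of December 31, 2008, counting January 1, 1980 as day 1.
last_day_2008 = (date(2008, 12, 31) - date(ORIGIN_YEAR, 1, 1)).days + 1
print(year_and_day_buggy(last_day_2008))  # None: the loop never exits
print(year_and_day_fixed(last_day_2008))  # (2008, 366)
```

The buggy version gives the right answer on every day except the last day of a leap year, which is why a handful of spot checks would never have caught it.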

It is easy to laugh at such a mistake:  why didn’t a code review or unit testing catch it?  But such “simple” parts of the code seem to be the ones least likely to get tested.  When you have to test all sorts of complicated things like checking authorization, playing music, handling the interface and so on, who expects problems in a date calculation?  And hence a zillion Zunes fail for a day.

Ooops!

I experienced something similar when I was reviewing some code I use to create a sports schedule.  Never mind the sport:  it doesn’t matter.  But the model I created aimed to have a large number of a particular type of game on a particular week.  And, for the last few years, we didn’t get a lot of those games on that week.  This didn’t particularly worry me:  there are a lot of constraints that interact in complicated ways, so I assumed the optimization was right in claiming the best number was the one we got (and this was one of the less important parts of the objective).  But when I looked at the code recently, I realized there was an “off by one” error in my logic, and sure enough the previous week had a full slate of the preferred games.  Right optimization, wrong week.  Dang!

So one of my goals this week, before class starts, is to look at the code again with fresh eyes and see what other errors I can find.  There are some things I can do to help find such bugs, like trying it on small instances and turning on and off various constraint types, but one difficult aspect of optimization is that knowing the optimal solution requires … optimization, making it very hard to find these sorts of bugs.

Who Knows Where Operations Research Will Lead You?

One of the nice aspects of working in operations research is that you can end up working in practically any field. I know a lot about the United States Postal Service, Major League Baseball, auction design, voting systems, and many other areas because that is where my research and reading in operations research took me.

Compared to Ronald Johnson, however, I am hopelessly narrow in my skills and interests. Major General Johnson was, until recently, the number two engineer in the US Army, as reported in the New York Times (thanks to Barry List from INFORMS for the pointer). His responsibilities were described as follows:

Before retiring from the military, Johnson was the deputy commanding general of the Army Corps of Engineers, the second-highest-ranking engineer in the Army. He supervised $18 billion of reconstruction projects in Iraq from 2003 to 2004 and commanded the 130th Combat Engineer Brigade in Bosnia from 1996 to 1998.

Now, however, he has a new job: he was hired by the National Basketball Association to be in charge of their referees.

While Johnson readily acknowledges that he does not know anything about refereeing, he knows quite a bit about difficult rebuilding efforts.

Why was he able to make this sort of career change?

N.B.A. officials are highlighting Johnson’s management and analytical skills.

And where did he get those analytical skills? Operations research, of course.

Unquestionably, Johnson did not take the typical career path to the N.B.A.’s executive suites. The commissioner’s office has generally been populated by lawyers and basketball people. Johnson, a 1976 graduate of West Point, studied mathematics and mechanical engineering. He later earned a master’s degree in operations research and systems analysis from Georgia Tech’s School of Industrial Engineering, and a master’s degree in strategy from the Army’s School of Advanced Military Studies.

This is a great example of the flexibility analytical skills provide in one’s career.