
Complete Enumeration Arguments Deemed Harmful…

… or “The Traveling Salesman Problem is Not That Hard”.

When I talk to people about what I do for a living, I often face blank stares (or, worse, rapidly retreating backs) when I describe problems like the Traveling Salesman Problem.

Me: “Well, suppose the dots on this napkin represent cities, and you want to find the shortest route that visits them all. How could you do it?”

Them: Quick scribble through the dots. “There! So you spend the day drawing lines through dots?”

Me: “No, no! Suppose there are 1000 cities. Then… well, there are 1000*999*998*…*2*1 ways through all the dots, and that is more than the number of freckles on a dalmatian’s back, so I look for better ways to draw lines through dots.”

And I fall into the same old trap.  I try to convince people the Traveling Salesman problem is hard by saying there are a lot of solutions to look through.  And that is an extremely misleading argument that I am now convinced does more harm than good.

Imagine if I had said one of the following:

  1. “Sorting a bunch of numbers!  Hoo-wee, that’s a hard one.  If I had 1000 numbers, I would have to check all orderings of the numbers to find the sorted order.  That is 1000*999*998*…*2*1 orders and that is more than the hairs on a Hobbit’s toes.”
  2. “Maximize a linear function on n variables over m linear constraints?  Well, I happen to know some theory here and know all I need to do is look at subsets of m variables and take a matrix inverse and do some checking.  Unfortunately, with 1000 variables and 700 constraints, that is 1000 choose 700, which is more than the number of atoms in a brick.”
  3. “Find a shortest path from here to the Walmart?  That’s a tough one.  There are a lot of roads from here to there and I need to check all the paths to find the shortest one.  That would take longer than the wait in a Comcast line!”

Of course, all of these are nonsense:  having lots of possibilities may be necessary for a problem to be hard, but it is certainly not sufficient.  Sorting is a problem that anyone can find a reasonable solution for.  Shortest path is a bit harder, and linear programming (problem 2) is harder still.  But all can be solved in time much, much less than the approaches above suggest.
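To put rough numbers on this, here is a small Python sketch (my illustration, not part of the original argument). It computes the size of the two "search spaces" quoted above and then simply sorts 1000 numbers. The `n * log2(n)` figure is only the usual order-of-magnitude estimate for comparison sorting, not an exact count.

```python
import math
import random

# Sizes of the "search spaces" quoted in examples 1 and 2 above.
orderings = math.factorial(1000)   # orderings of 1000 numbers
bases = math.comb(1000, 700)       # ways to pick 700 of 1000 variables

print(len(str(orderings)), "digits in 1000!")
print(len(str(bases)), "digits in C(1000, 700)")

# Sorting 1000 numbers never looks at all those orderings.
numbers = [random.random() for _ in range(1000)]
result = sorted(numbers)

# Rough estimate of the comparisons a good sort needs: about n * log2(n).
comparisons_estimate = round(1000 * math.log2(1000))
print(comparisons_estimate, "comparisons, roughly")
```

The enumeration counts have hundreds or thousands of digits; the work actually done is on the order of ten thousand comparisons.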

So why do we use this argument for the Traveling Salesman problem?  We know from complexity theory that the problem is NP-Hard, so most of us believe that there is not now a known polynomial time algorithm (there are still some who believe they have such an algorithm, but they are in the very, very tiny minority), and many of us believe that no such algorithm exists.

But that does not mean complete enumeration is necessary:  it is easy to come up with approaches that go through less than the full n! solutions to the Traveling Salesman problem (see the postscript for one example).  This is not surprising.  Complete enumeration for Satisfiability is 2^n but there are methods known that take time proportional to something like 1.4^n [Edit: as per a comment, I meant 3-Sat].  Still exponential but not complete enumeration.  And there is a big difference between 2^n and 1.00001^n (or 2^n/10^10) even if complexity theory likes to call them all “exponential”.
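One standard illustration of this point is the Held-Karp dynamic program, which solves the TSP exactly using on the order of n^2 * 2^n steps instead of n!. The sketch below is a textbook version written for this post's discussion, not the method from the postscript and certainly not what Concorde does. It assumes a full distance matrix `D` with at least two cities, and the function name is my own.

```python
import itertools


def held_karp(D):
    """Length of an optimal tour for distance matrix D (n >= 2 cities)."""
    n = len(D)
    # best[(mask, j)]: shortest path that starts at city 0, visits exactly
    # the cities whose bits are set in mask, and ends at city j.
    best = {(1 << j, j): D[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in itertools.combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask & ~(1 << j)  # same set without the end city j
                best[(mask, j)] = min(
                    best[(prev, k)] + D[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2  # every city except city 0
    # Close the tour by returning to city 0.
    return min(best[(full, j)] + D[j][0] for j in range(1, n))
```

This is still exponential, and it needs a lot of memory, but for 20 cities it examines tens of millions of states rather than roughly 10^17 orderings.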

But the “complete enumeration” statement is even more harmful since it glosses over a much more important issue: many “hard” problems (in the complexity theory meaning) are not particularly hard (in the English sense), if you limit yourself to instances of practical interest.  Due to the tremendous amount of work that has been done on the Traveling Salesman Problem, the TSP is one such problem.  If you have an instance of the TSP of practical interest (i.e. it makes a difference to your life if you solve it or not, and it is really a TSP, not the result of some weird set of reductions from some other problem), then I bet you the Concorde program of Bill Cook and others will get you a provably optimal solution in a reasonable amount of time.  In fact, I would be really interested in knowing the smallest “real” instance that Concorde cannot solve in, say, one day, on a normal laptop computer.

Being hard-to-solve in the complexity theory sense does not mean being hard-to-solve in the common language sense.  And neither has much to do with the number of possible solutions that complete enumeration might go through.

As an example of this confusion, here is a paragraph from Discover on work done by Randy Olson:

Planning a vacation is a daunting task, so why not let big data take the reins?

That’s exactly what data scientist Randy Olson did. Using specialized algorithms and Google Maps, Olson computed an optimized road trip that minimizes backtracking while hitting 45 of Business Insider’s “50 Places in Europe You Need to Visit in Your Lifetime.” (Five locations were omitted simply because you can’t reach them by car.)

Mapping software has come a long way, but for this kind of challenge it’s woefully inadequate. Google Maps’ software can optimize a trip of up to 10 waypoints, and the best free route optimization software can help with 20. But when you’re looking to hit 45 or 50 landmarks, things get complicated.

According to Olson, a computer would need to evaluate 3 x 10^64 possible routes to find the best one. With current computing power, it would take 9.64 x 10^52 years to find the optimal route to hit all your desired locations — Earth would have long been consumed by the Sun before you left the house. So Olson used a clever workaround, based on the assumption that we don’t need to find the absolute best route, but one that’s pretty darn good.

Now, all this is based on an article Olson wrote months ago, and this has all been hashed over on twitter and in blogs (see here and here for example), but let me add (or repeat) a few points:

  1. 45 points, particularly geographically based, is a tiny problem that can be solved to optimality in seconds.  The number of possible routes is a red herring, as per the above.
  2. The “based on the assumption that we don’t need to find the absolute best route, but one that’s pretty good” is not a new idea.  Techniques known as “heuristics” have been around for millennia, and are an important part of the operations research toolkit.
  3. “Minimize backtracking” is not exactly what the problem is.
  4. The computer does not need to evaluate all possible routes.
  5. With current computing power, it takes seconds (at most) to find the optimal route.
  6. I don’t expect the sun to consume the earth in the next minute.
  7. Olson’s “approach” is fine, but there are a zillion heuristics for the TSP, and it would take a very poor heuristic (maybe 2-opt by itself) not to do as well as Olson’s on this particular instance.
  8. Seriously, bringing in “big data”?  That term really doesn’t mean anything, does it?
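For readers who have not seen the 2-opt heuristic mentioned in point 7, here is a minimal Python sketch: build a tour with nearest neighbor, then repeatedly replace two edges with the two that reconnect the tour whenever that shortens it. This is a plain textbook version on random points in the unit square, which are an assumption of mine standing in for real locations. It is a heuristic with no optimality guarantee; for provably optimal tours, use something like Concorde.

```python
import math
import random


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


def tour_length(tour, pts):
    n = len(tour)
    return sum(dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))


def nearest_neighbor(pts):
    """Start at point 0 and always move to the closest unvisited point."""
    unvisited = set(range(1, len(pts)))
    tour = [0]
    while unvisited:
        last = pts[tour[-1]]
        nxt = min(unvisited, key=lambda c: dist(last, pts[c]))
        unvisited.remove(nxt)
        tour.append(nxt)
    return tour


def two_opt(tour, pts):
    """Improve tour in place until no 2-opt exchange shortens it."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share city tour[0]
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                # Change in length if edges (a,b),(c,d) become (a,c),(b,d).
                delta = dist(a, c) + dist(b, d) - dist(a, b) - dist(c, d)
                if delta < -1e-12:
                    tour[i + 1:j + 1] = tour[i + 1:j + 1][::-1]
                    improved = True
    return tour


random.seed(0)
points = [(random.random(), random.random()) for _ in range(45)]
start = nearest_neighbor(points)
better = two_opt(list(start), points)
print(tour_length(start, points), tour_length(better, points))
```

On a 45-point instance this runs in well under a second on an ordinary laptop.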

In Olson’s defense, let me make a few other points:

  1. Few of the rest of us researchers in this area are covered in Discover.
  2. Journalists, even science journalists, are not particularly attuned to nuance on technical issues, and Olson is not the author here.
  3. The instances he created are provocative and interesting.
  4. Anything that brings interest and attention to our field is great! Even misleading articles.

But the bottom line is that real instances of the TSP can generally be solved to optimality.  If, for whatever reason, “close to optimal” is desired, there are zillions of heuristics that can do that.  The world is not eagerly awaiting a new TSP heuristic. Overall, the TSP is not a particularly good problem to be looking at (unless you are Bill Cook or a few other specialists):  there is just too much known out there.  If you do look at it, don’t look at 50 point instances: the problem is too easy to be a challenge at that size.  And if you want to state that what you learn is relevant to the TSP, please read this (or at least this, for a more accessible narrative) first.  There are other problems for which small instances remain challenging: can I suggest the Traveling Tournament Problem?

Finally, let’s drop the “number of solutions” argument:  it makes 50 point TSPs look hard, and that is just misleading.


 

 

Postscript: an example of not needing to do complete enumeration for the Traveling Salesman Problem

For instance, take four cities 1, 2, 3, and 4, and denote the distance matrix by D (assumed to be symmetric, but similar arguments work for the asymmetric case).  Then one of D(1,2)+D(3,4), D(1,3)+D(2,4), or D(1,4)+D(2,3) is maximum (if there are ties choose one of the maximums), say D(1,2)+D(3,4).  It is easy to show that some optimal tour does not use both of the edges (1,2) and (3,4), since otherwise a simple exchange would give a tour that is no worse.  If you want to be convinced, consider the following diagram.  If a tour used (1,2) and (3,4), then swapping them for either the dashed edges (1,3) and (2,4) or the dotted edges (1,4) and (2,3), whichever pair reconnects the pieces into a single tour, would give a tour no longer than the (1,2) (3,4) tour.  As drawn, the dashed edges give a shorter tour.

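[Figure: four cities with solid edges (1,2) and (3,4), dashed edges (1,3) and (2,4), and dotted edges (1,4) and (2,3)]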

 

If you are convinced of that, then here is an algorithm that doesn’t enumerate all the tours: enumerate your tours building up city by city, but start 1, 2, 3, 4.  At this point you can stop (and not move on to 1,2,3,4,5) and move on to 1,2,3,5 and so on.  In doing so, we avoided enumerating (n-4)! tours.  Ta da!  Of course, you can do this for all foursomes of points, and don’t need to only have the foursome in the initial slots.  All this can be checked quickly at each point of the enumeration tree.  How many tours are we left with to go through?  I don’t know:  it is still exponential, but it is a heck of a lot less than n! (and is guaranteed to be less than that for any instance).  Maybe even less than the time it takes for the sun to consume the earth.
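Here is a rough Python sketch of that idea, compared against plain enumeration. Two choices are mine rather than part of the argument above: the function names, and pruning only when a pair of edges is strictly longer than both alternative pairings (which sidesteps the tie-breaking question). Distances are rounded to integers so the comparisons are exact.

```python
import itertools
import math


def brute_force(D):
    """Check every tour that starts at city 0; return (best length, tours seen)."""
    n = len(D)
    best, count = math.inf, 0
    for perm in itertools.permutations(range(1, n)):
        tour = (0,) + perm
        count += 1
        best = min(best, sum(D[tour[i]][tour[(i + 1) % n]] for i in range(n)))
    return best, count


def dominated(e, f, D):
    """True if edges e and f are strictly longer than both other pairings."""
    (a, b), (c, d) = e, f
    if len({a, b, c, d}) < 4:
        return False  # the edges share a city, so no exchange applies
    pair = D[a][b] + D[c][d]
    return pair > D[a][c] + D[b][d] and pair > D[a][d] + D[b][c]


def pruned_enumeration(D):
    """Build tours city by city, abandoning any partial tour that already
    contains a dominated pair of edges. Returns (best length, tours seen)."""
    n = len(D)
    best, leaves = math.inf, 0

    def extend(path, edges, length):
        nonlocal best, leaves
        if len(path) == n:
            closing = (path[-1], path[0])
            if any(dominated(closing, e, D) for e in edges):
                return
            leaves += 1
            best = min(best, length + D[path[-1]][path[0]])
            return
        for c in range(1, n):
            if c in path:
                continue
            new_edge = (path[-1], c)
            if any(dominated(new_edge, e, D) for e in edges):
                continue  # no completion of this partial tour can be optimal
            extend(path + [c], edges + [new_edge], length + D[path[-1]][c])

    extend([0], [], 0)
    return best, leaves
```

Calling both functions on the same small distance matrix (eight or nine cities, say) should give the same optimal length, with the pruned version reaching far fewer complete tours.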

 

 

State of Operations Research Blogging

It has been almost a year since I had a blog entry here.  Not that I don’t have a lot to say!  I am using twitter more, and I do have ideas for blog entries in cases where 140 characters is not enough.  But there is limited time.

But I think something more fundamental is at work.  What is the state of blogging, and, in particular, the operations research blogging world?  It doesn’t appear all that healthy to me, but perhaps I am not seeing things correctly.

I think the blogging world was badly hurt by the cancellation of Google Reader.  At least for me, Google Reader was a fast and convenient way to follow the blogging world.  And, beyond me, I had created a public list of OR Blogs, and a public feed of OR blog entries.  It seemed to be well used, but those ended with the end of Reader. It is harder to get word out about OR blogging.

I have tried to continue aspects of these listings on this page with a feed of OR blogs (also in sidebar) but it is much less convenient.

I also think the relentless onslaught of comment spam discouraged (and perhaps still discourages) people from trying out blogging.

Today I went through my list of OR blogs to see who has posted in the past year, and was distressed to see how few have done so.  Even with a pretty broad view of what an OR Blog is, it came to only about 40 people, with many of those (including myself!) barely meeting the “posted in the last year” requirement.

Those that remain are a fantastic source of information.  I think particularly of Laura McLay’s Punk Rock Operations Research and Anna Nagurney’s RENeW as must-read entries.  But there now seem to be few active bloggers.

Am I missing a ton of entries in the OR Blogging world  (let me know if I am missing some from my list)?  Has the world of twitter taken over from the long-form journalism that blogging provides?

In any case, I will make an effort to write more and welcome thoughts about how to strengthen the OR blogging community.

Using Analytics for Emergency Response

I just attended a great talk by Laura McLay at the German OR Society meeting in Aachen.  In her semi-plenary, Laura talked about all the work she has done in Emergency Medical Response.  Planning the location and operation of ambulances, fire trucks, emergency medical technicians, and so on is a difficult problem, and Laura has made very good progress in putting operations research to use in making systems work better.  She has been recognized for this work not only in our field (through things like outstanding paper awards and an NSF CAREER award) but also by those directly involved in emergency response planning, as evidenced by an award from the National Association of Counties.

Laura covered a lot of ground in her talk (she has a dozen papers or more in the area), but I found one result in particular very striking.  Many ambulance systems have a goal of responding to 80% of their calls in 9 minutes (or some such numbers).  One of the key drivers of those values is the survivability from heart attacks:  even minutes matter in such cases.  The graph attached (not from Laura, available in lots of places on the internet) shows a sharp dropoff as the minutes tick away.

But why 9 minutes?  It is clear from the data that if the goal is to provide response within 9 minutes, there are an awful lot of 8 minute 30 second response times.  Systems respond to what is measured.  Wouldn’t it be better, then, to require 5 minute response times?  Clearly more people would be saved since more people would be reached within the critical first minutes.  This looks like a clear win for evidence-based medicine and the use of analytics in decision making.

But Laura and her coauthors have a deeper insight than that.  In the area they are looking at, which is a mix of suburban and rural areas, with a 9 minute response time, the optimal placement of ambulances is a mixture of suburban and rural locations.  With a 5 minute response time, it does no good to place an ambulance in a rural location: they can’t get to enough people in time.  All the ambulances would be placed in the higher-density suburban location.  If a call comes in from a rural location, eventually an ambulance would wend its way to the rural location, but after 20 or 30 minutes, many cases become moot.

To figure out the optimal response time, you need to figure out both survivability and the number of cases the system can reach.  For the area Laura and her team looked at, the optimal response time turned out to be 8 to 9 minutes.

Of course, this analysis is not relevant if the number of ambulances is increased with the decreased response time requirement.  But the enthusiasm for spending more on emergency response is not terrifically high, so it is more likely that the time will be changed without a corresponding increase in budget.  And that can have the effect of making the entire system worse (though things are better for the few the ambulance can reach in time).

This was a great example of the conflict between individual outcome and social outcomes in emergency response.  And a good example of how careful you need to be when using analytics in health care.

I highly recommend reading her Interfaces article “Hanover County Improves its Response to Emergency Medical 911 Patients” (no free version).  I even more highly recommend her blog Punk Rock Operations Research and her twitter stream at @lauramclay.

Taking Optimization With You After Graduation

In the Tepper MBA program, we use versions of Excel’s Solver (actually a souped up version from Frontline Systems) for most of our basic optimization courses.  Students like this since they feel comfortable with the Excel interface and they know that they can use something like this in their summer internships and first jobs, although they are likely to be using the more crippled version that comes standard with Excel.  For those who are particularly keen, we point them to an open source optimization system that can allow them to stay within the Excel structure.

In our most advanced course, we use AIMMS with Gurobi as the underlying solver. Students generally love the power of the system, but worry that they will not be able to translate what they learn into their jobs.  This wouldn’t be an issue if companies had analytics and optimization as a core strength, and routinely had some of the commercial software, but that is not the case.  So the issue of transfer comes up often.

I am really happy to see that Gurobi has a deal in place to allow students to continue using their software, even after they graduate.  This gives new graduates some time to wow their new employers with their skills, and to make the argument for further investment in operations research capabilities.

Here is an excerpt from an email I received from Gurobi:

Academic Site License

Our FREE academic site license allows students, faculty, and staff at a degree-granting academic institution to use Gurobi from any machine on a university local-area network. This program makes it easy for everyone at the university to have access to the latest version of Gurobi, without having to obtain their own license. You can learn more on our Academic Licensing page, and your network administrator can request a license by emailing support@gurobi.com.

Take Gurobi With You Program Update

This program allows qualified recent graduates to obtain a FREE one-year license of Gurobi for use at their new employer.

Qualified recent graduates can complete a short approval process and then receive the license including maintenance and support at no cost to themselves or their employers. This reflects our continuing support of students, even after they graduate. You can learn more on our Take Gurobi With You page.

I think this sort of program can have a great effect on the use of optimization in practice.  And we need to rethink what we teach in the classrooms now that we know the “can’t take it with you” effect is lessened.

The Baa-readth of Operations Research

At the recent International Federation of Operational Research Societies (IFORS) meeting in Barcelona (a fabulous conference, by the way), I had the honor of being nominated as President of that “society of societies”.  If elected, my term will start January 1, 2016, so I get a bit of a head start in planning.

I was looking through one of the IFORS publications, International Abstracts in Operations Research.  I am sure I will write about this more, since I think this is a very nice publication looking for its purpose in the age of Google.  This journal publishes the abstracts of any paper in operations research, including papers published in non-OR journals.  In doing so, it can be more useful than Google, since there is no need to either limit keywords (“Sports AND Operations Research”) or sift through tons of irrelevant links.

I was scanning through the subject categories of the recent issue of IAOR to find papers published in “sports”.  I saw something really quite impressive.  Can you see what caught my eye?

 


Optimization, Operations Research and the Edelman Prize

This year, I have the distinct honor of chairing the committee to award the Franz Edelman Award, given out by INFORMS for the best work that “attests to the contributions of operations research and analytics in both the profit and non-profit sectors”.  This competition has been incredibly inspiring to me throughout my career.  Just this year, as a judge, I got to see extremely high-quality presentations on eradicating polio throughout the world, bringing high-speed internet to all of Australia, facilitating long kidney exchange chains, and more.  I have seen dozens of presentations over my years as an Edelman enthusiast and judge, and I always leave with the same feeling: “Wow, I wish I had done that!”.

There is nothing that makes me more enthusiastic about the current state and future prospects of operations research than the Edelman awards.  And, as a judge, I get to see all the work that doesn’t even make the finals, much of which is similarly inspiring.  Operations Research is having a tremendous effect on the world, and the Edelman Prize papers are just the (very high quality) tip of the iceberg.

I was very pleased when the editors of Optima, the newsletter of the Mathematical Optimization Society, asked me to write about the relationship between optimization and the Edelman Prize.  The result is in their current issue.  In this issue, the editors published work by the 2013 winner of the Edelman, work on optimizing dike heights in the Netherlands, a fantastic piece of work that has saved the Netherlands billions in unneeded spending.  My article appears on page 6.  Here is one extract on why the Edelman is good for the world of optimization:

There are many reasons why those in optimization should be interested in, and should support, the Edelman award.

The first, and perhaps most important, is the visibility the Edelman competition gets within an organization. A traditional part of an Edelman presentation is a video of a company CEO extolling the benefits of the project. While, in many cases, the CEO has already known about the project, this provides a great opportunity to solidify his or her understanding of the role of optimization in the success of the company. With improved understanding comes willingness to further support optimization within the firm, which leads to more investment in the field, which is good for optimization. As a side note, I find it a personal treat to watch CEOs speak of optimization with enthusiasm: they may not truly understand what they mean when they say “lagrangian based constrained optimization” but they can make a very convincing case for it.

Despite the humorous tone, I do believe this is very important:  our field needs to be known at the highest levels, and the Edelman assures this happens, at least for the finalists.  And, as I make clear in the article: it is not just optimization.  This is all of operations research.

There are dozens of great OR projects done each year that end up submitted to the Edelman Award.  I suspect there are hundreds or thousands of equally great projects done each year that don’t choose to submit (it is only four pages!).  I am hoping for a bumper crop of them to show up in the submissions this year.  Due date is not until October, but putting together the first nomination would make a great summer project.

Blogging and the Changing World of Education

As a blogger, I have been a failure in the last six months.  I barely have enough time to tweet, let alone sit down for these extensively researched, tightly edited, and deeply insightful missives that characterize my blog.  I tell you, 1005 words on finding love through optimization doesn’t just happen!

I have my excuses, of course.  As the fabulous PHD Comics points out, most of us academics seem somewhat overbooked, despite the freedom to set much of our schedule.  I am not alone in being congenitally unable to turn down “opportunities” when they come by.  “Help hire a Norwegian professor?” Sounds fun! “Be the external examiner for a French habilitation degree?” I am sure I’ll learn a lot!  “Referee another paper?” How long can that take?  “Fly to Australia for a few days to do a research center review?”  Count me in!  And that was just four weeks in February.

All this is in addition to my day job that includes a more-than-healthy dose of academic administration.  Between doing my part to run a top business school and to move along in research, not to mention family time, including picking up the leavings of a hundred pound Bernese Mountain Dog (the “Mountain” in the name comes from said leavings) and entertaining a truly remarkable nine-year-old son, my time is pretty well booked up.

And then something new comes along.  For me, this newness is something I had a hand in putting together: the Tepper School’s new FlexMBA program.  This program offers our flagship MBA program in a hybrid online/onsite structure.  Every seven weeks or so, students in the program gather at one of CMU’s campuses (we have them in Pittsburgh, Silicon Valley, and New York; we have not yet used our Qatar campus) and spend a couple of days intensively starting their new courses.  This is followed by six weeks of mixed synchronous and asynchronous course material.  Asynchronous material is stuff the students can do in their own time: videos, readings, assignments, and so on.  The synchronous lesson is a bit more than an hour in a group, meeting via a group video conference, going over any issues in the material and working on case studies, sample problems, and so on.  The course ends with exams or other evaluations back on campus before starting the next courses.

Our commitment is to offer the same program as our full-time residential MBA and our part-time in-Pittsburgh MBA.  This means the same courses, faculty, learning objectives, and evaluations that our local students get.

We started this program last September with 29 students, and so far it has gone great.  The students are highly motivated, smart, hard-working, and engaged.  And the faculty have been amazing: they have put in tons of work to adapt their courses to this new structure.  Fortunately, we have some top-notch staff to keep things working.  Unlike some other MBA programs, we have not partnered with any outside firm on this.  If we are going to offer our degree, we want it to be our degree.

I have just finished my own course in this program.  I teach our “Statistical Decision Making” course.  This is a core course all MBA students take and revolves around multiple regression and simulation (the interesting relationships between these topics can wait for another day).  This is not the most natural course for me:  my research and background are more on the optimization side, but I very much enjoy the course.  And teaching this course has made clear to me the real promise of the hot phrase “business analytics”:  the best of business analytics will combine the predictive analytics of statistics and machine learning with the prescriptive analytics of optimization, again a topic for another day.

My initial meeting with the students concentrated on an overview of the course and an introduction to the software through some inspiring cases.  We then moved on to the six-week distance phase.  Each of the six modules that make up a course is composed of four to eight topics.  For instance, one of my modules on multiple regression includes the topic “Identifying and Handling Multicollinearity”.  (Briefly: multicollinearity occurs when you do regression with two or more variables that can substitute for each other; imagine predicting height using both left-foot-length and right-foot-length as data).  That section of the module consists of

  • A reading from their textbook on the subject
  • One 8 minute video from me on “identifying multicollinearity”
  • One 6 minute video from me on “handling multicollinearity”
  • A three minute video of me using our statistical software to show how it occurs in the software (I separate this out so we can change software without redoing the entire course)
  • A question or two on the weekly assignment.

It would be better if I also had a quiz to check understanding of the topic, along with further pointers to additional readings.
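As an aside on the multicollinearity topic itself, the left-foot/right-foot example can be made concrete with a few lines of Python. The data below are synthetic and invented purely for illustration (they are not course data), and the variance inflation factor (VIF) is one common diagnostic, not necessarily the one used in the course. With exactly two predictors, the VIF reduces to 1 / (1 - r^2), where r is the correlation between them.

```python
import math
import random


def pearson(x, y):
    """Sample correlation coefficient between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def vif_two_predictors(x1, x2):
    """Variance inflation factor when a regression has exactly two predictors."""
    r = pearson(x1, x2)
    return 1.0 / (1.0 - r * r)


# Synthetic foot lengths in cm: the right foot is the left foot plus a
# little noise, so either variable can substitute for the other.
rng = random.Random(42)
left = [rng.gauss(26.0, 1.5) for _ in range(200)]
right = [l + rng.gauss(0.0, 0.1) for l in left]

print("correlation:", pearson(left, right))
print("VIF:", vif_two_predictors(left, right))
```

A common rule of thumb treats a VIF above 10 or so as a warning sign; near-duplicate predictors like these land far above that.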

So my course, which I previously thought of as 12 lectures, is now 35 or so topics, each with readings, videos, and software demonstrations.  While there are some relationships between the topics, much is independent, so it would be possible, for instance, to pull out the simulation portion and replace it with other topics if desired.  Or we can now repackage the material as some supplementary material for executive education courses.  The possibilities are endless.

Putting all this together was a blast, and I now understand the structure of the course, how things fit together, and how to improve the course.  For instance, there are topics that clearly don’t fit in this course, and would be better elsewhere in the curriculum.  We can simply move those topics to other courses.  And there are linkages between topics that I did not see before I broke down the course this finely.

I look forward to doing this for our more “operations research” type courses (as some of my colleagues have already done).  Operations Research seems an ideal topic for this sort of structure.  Due to its mathematical underpinnings and need for organized thinking, students sometimes find this subject difficult.  By forcing the faculty to think about it in digestible pieces, I think we will end up doing a better job of educating students.

Creating this course was tremendously time consuming.  I had not taken my own advice to get most of the course prepared before the start of the semester, so I was constantly struggling to stay ahead of the students.  But next year should go easier:  I can substitute out some of the videos, extend the current structure with some additional quizzes and the like, adapt to any new technologies we add to the program, and generally engage in the continuous improvement we want in all our courses.

But perhaps next year, I won’t have to take a hiatus from blogging to get my teaching done!

 

Russia really owned this podium

Back in 2010, Canada’s goal was to “own the podium” at the Winter Olympics.  What “owning the podium” meant was open to interpretation.  Some argued for “most gold medals”; others opted for “most overall medals”; still others had point values for the different types of medals.  Some argued for normalizing by population (which was won, for London 2012, by Grenada with one medal and a population of 110,821, trailed by Jamaica, Trinidad and Tobago, New Zealand, Bahamas, and Slovenia) (*). Others think the whole issue is silly: people win medals, not countries.  But still, each Olympics, the question remains: Who won the podium?

I suggested dividing the podium by the fraction of “reasonable” medal weightings that lead to a win by each country.  A “reasonable” weighting is one that treats gold as at least as valuable as silver; silver as at least as valuable as bronze; gives no medal a negative weight; and has total weighting of 1.  By that measure, in Vancouver 2010, the US won with 54.75% of the podium compared to Canada’s 45.25%.  In London 2012, the US owned the entire podium.

The Sochi Olympics have just finished and the result is…. Russia in a rout.  Here are the medal standings:

 

[Table: Sochi 2014 medal standings]

Since Russia has more Gold medals than anyone else plus more “Gold+Silver” plus more overall, there are no reasonable weightings for gold, silver, and bronze that result in anyone but the Russian Federation winning.
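The reason is a simple identity: with weights g ≥ s ≥ b ≥ 0, a country's score g·G + s·S + b·B equals (g−s)·G + (s−b)·(G+S) + b·(G+S+B), and all three coefficients are nonnegative. A country that leads in golds, in golds plus silvers, and in total medals therefore leads under every reasonable weighting. The Python sketch below estimates podium shares by sampling weightings. The medal tables in it are hypothetical, made up for illustration, and drawing weightings uniformly is my assumption: the shares it produces need not match the percentages quoted above, which may have been computed with a different measure.

```python
import random


def random_weighting(rng):
    """Random (gold, silver, bronze) weights: nonnegative, summing to 1,
    with gold >= silver >= bronze."""
    raw = [rng.expovariate(1.0) for _ in range(3)]
    total = sum(raw)
    # Normalized exponentials are uniform on the simplex; sorting keeps
    # the draw uniform over the ordered ("reasonable") region.
    return sorted((x / total for x in raw), reverse=True)


def podium_shares(medals, samples=100_000, seed=0):
    """Fraction of sampled weightings under which each country scores highest.

    medals maps country -> (gold, silver, bronze) counts.
    """
    rng = random.Random(seed)
    wins = {country: 0 for country in medals}
    for _ in range(samples):
        g, s, b = random_weighting(rng)
        winner = max(
            medals,
            key=lambda c: g * medals[c][0] + s * medals[c][1] + b * medals[c][2],
        )
        wins[winner] += 1
    return {country: w / samples for country, w in wins.items()}


# Hypothetical medal tables (gold, silver, bronze), not real Olympic data.
contested = {"A": (10, 4, 3), "B": (7, 9, 8)}
runaway = {"Leader": (12, 10, 8), "Chaser": (11, 6, 12), "Third": (9, 12, 8)}

print(podium_shares(contested))
print(podium_shares(runaway))
```

In the second table, "Leader" is ahead on golds (12), golds plus silvers (22), and total medals (30), so it wins under every sampled weighting.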

Nonetheless, I think Canada will take golds in Men’s and Women’s hockey along with Men’s and Women’s curling (among others) and declare this a successful Olympics.

———————————————————————————

(*)  I note that some sports limit the number of entries by each country, giving a disadvantage to larger countries for population based rankings (there is only one US hockey team, for instance, but Lithuania also gets just one).

Own a Ton of Operations Research History

Or perhaps own two tons of Operations Research History (I am not sure how much 70 bankers boxes weigh)!  And not just any history:  this is the mathematics library of George B. Dantzig, available by “private treaty” (i.e.: there is a price;  if you pay it, you get the whole library) from PBA Galleries.  I suspect everyone who reads this blog knows who Dantzig was, but just in case: he is the Father of Operations Research.  His fundamental work on the simplex algorithm for linear programming and other work should have won the Economics Nobel Prize. He had a very long (spanning the 1940s practically to the end of his life in 2005), and very influential, career.  You can read more about him in this article by Cottle, Johnson, and Wets.

At the auction site, there are also some reminiscences from his daughter Jessica Dantzig Klass.   She talks about some of the books in the library:

I found two copies of Beitraege zur Theorie der linearen Ungleichungen, Theodore S. Motzkin’s dissertation, translated “Contributions to the Theory of Linear Inequalities.” This work anticipated the development of linear programming by fourteen years and is probably the reason Motzkin is known as the “grandfather of linear programming”. A close family friend, Ted, as he was known, was a gentle, mild mannered man, with intense eyes, and a sweet smile, and he “lived” mathematics, even keeping small pieces of paper by his bed, so that when he had an idea at night he would be able to write it down. His dissertation is interesting from an historic perspective; bridging the gap between Fourier and my father’s work. Ted, a student at the University of Basel in Switzerland, was awarded his Ph.D. in 1933, but it was not published until 1936 in Jerusalem. One can trace the mathematical lineage of Motzkin’s advisor, Alexander Ostrowski, back to Gauss. And until his untimely death in 1970, Motzkin was my husband’s Ph.D. advisor at UCLA.

I don’t know how expensive the collection is (and I certainly don’t have room for 70 bankers boxes of material), but it would be great if an organization (INFORMS, are you listening?) or a historically minded researcher picked this up.  I suspect in the future, there will be far fewer libraries from great researchers.  I know that my own “library” is really nothing more than the hard drive on whatever computer I am using.

Scheduling Major League Baseball

ESPN has a new “30 for 30” short video on the scheduling of Major League Baseball.  In the video, they outline the story of Henry and Holly Stephenson who provided Major League Baseball with its schedule for twenty-five years.  They were eventually supplanted by some people with a computer program.  Those people are Doug Bureman, George Nemhauser, Kelly Easton, and me, doing business as “Sports Scheduling Group”.

It was fascinating to hear the story of the Stephensons, and a little heart-breaking to hear them finally losing a job they obviously loved.  I have never met Henry or Holly, and they have no reason to think good thoughts about me.  But I think an awful lot of them.

I began working on baseball scheduling in 1994, and it took ten years of hard work (first Doug and me, then the four of us) before MLB selected our schedule for play.

Why were we successful in 2004 and not in 1994? At the core, technology changed. The computers we used in 2004 were 1000 times faster than the 1994 computers. And the underlying optimization software was at least 1000 times faster. So technology made us at least one million times faster. And that made all the difference. Since then, computers and algorithms have made us 1000 times faster still.  And, in addition, we learned quite a bit about how to best do complicated sports scheduling problems.

Another way to see this is that in 1994, despite my doctorate and my experience and my techniques, I was 1 millionth of the scheduler that the Stephensons were. Henry and Holly Stephenson are truly scheduling savants, able to see patterns that no other human can see. But eventually technological advances overtook them.

More recently, those advances allowed us to provide the 2013 schedule with interleague play in every time slot (due to the odd number of teams in each league), something not attempted before. I am confident that we are now uniquely placed to provide such intricate schedules. But that does not take away from my admiration of the Stephensons: I am in awe of what they could do.