## Help Santa Plan His Tour (Twice!)

Just in time for the holidays, there is a very nice competition at Kaggle: The Traveling Santa Problem. The problem was devised by the ever-creative Bob Bosch, about whom I have written before. The problem is an interesting variant on the Traveling Salesman Problem. Given a set of points and distances between them, the TS(alesman)P is to find the shortest cycle through all the points.  The TS(alesman)P would not be a great competition problem:  you could save the effort and just send the prize to my soon-to-be-neighbor Bill Cook with his Concorde code.

The TS(anta)P asks for two disjoint cycles, with a goal of minimizing the longer of the two. Of course, there is a clear explanation of why Santa wants this:

Santa likes to see new terrain every year–don’t ask, it’s a reindeer thing–and doesn’t want his route to be predictable.

OK, maybe it not so clear. But what Santa wants, Santa gets.

There is just one instance to work with, and it is a monster with 150,000 points. It turns out that the points are not randomly scattered throughout the plane, nor do they seem to correspond to real cities. I won’t spoil it here, but there is a hint at the kaggle discussion boards.

There are prizes for the best solution by the end of the competition (January 18, 2013) and, most interestingly, at a random date to be chosen between December 23 and January 17.  The \$3000 in prizes would certainly make a nice Christmas bonus (that is  7.5 Lego Death Stars!). You can check out the full rules, leaderboard, discussion, and more at the competition site.

I personally find this competition much more interesting than the data-mining type competitions like the General Electric sponsored Flight Quest (which is certainly not uninteresting). In Flight Quest, the goal is to predict landing times for planes. This is all fine and good, but as an operations researcher, I want to make some decisions to change those times to make the system work better.  Helping Santa avoid ennui might not be particularly realistic but it is much better than simply predicting when my Christmas toys arrive.

If we can get a good turnout for the Santa problem, perhaps we can see more optimization competitions, or better yet, competitions that combine data mining with optimization.

## Registries To Avoid Publication Bias

I have been thinking about the issue of how a field knows what they know.  In a previous post, I wrote about how the field of social psychology is working through the implications of fraudulent research, and is closely examining the cozy interactions between journals, reviewers, and famous researchers.   And any empirical field based on statistical analysis has got to live with the fact that if there 1000 results in the field, some number (50 perhaps, if p=.05 is a normal cutoff and lots of results are just under that value) are going to be wrong just because the statistical test created a false positive.  Of course, replication can help determine what is real and what is not, but how often do you see a paper “Confirming Prof. X’s result”?  Definitely not a smooth path to tenure.

This is worse if malevolent forces are at work.  Suppose a pharmaceutical company has bet the firm on drug X, and they want to show that drug X works.  And suppose drug X doesn’t work.  No problem!  Simply find 20 researchers, sign them to a non-disclosure, and ask them to see if drug X works.  Chances are one or more researchers will come back with a statistically significant result (in fact, there is about a 65% chance that one or more will, given a p=.05).  Publish the result, and voila!  The company is saved!  Hurray for statistics and capitalism!

Fortunately, I am not the first to see this issue:  way back in 1997, the US Congress passed a bill requiring the registration of clinical trials, before the trials get underway.

The first U.S. Federal law to require trial registration was the Food and Drug Administration Modernization Act of 1997 (FDAMA) (PDF).

Section 113 of FDAMA required that the National Institutes of Health (NIH) create a public information resource on certain clinical trials regulated by the Food and Drug Administration (FDA). Specifically, FDAMA 113 required that the registry include information about federally or privately funded clinical trials conducted under investigational new drug applications (INDs) to test the effectiveness of experimental drugs for patients with serious or life-threatening diseases or conditions.

This led to the creation of clinicaltrials.gov (where I am getting this history and the quotes) in 2000.  This was followed by major journals requiring registration before papers could be considered for publication:

In 2005 the International Committee of Medical Journal Editors (ICMJE) began to require trial registration as a condition of publication.

The site now lists more than 130,000 trials from around the world.  It seems this is a great way to avoid some (but by no means all!) fraud and errors.

I think it would be useful to have such systems in operations research.  When I ran a DIMACS Challenge twenty years ago, I had hoped to keep up with results on graph coloring so we had a better idea of “what we know”:  then and now there are graph coloring values in the published literature that cannot be correct (since, for instance, they contradict published clique values:  something must be wrong!).  I even wrote about a system more than two years ago but I have been unable to find enough time to take the system seriously.  I do continue to track results in sports scheduling, but we as a field need more such systems.

## Optimizing Angry Birds

Most operations research competitions look at problems that are, frankly, dull.  At least to the non-OR-connoisseur.   Whether it is optimizing computer usage, nurse scheduling,  financial portfolio creation, or rescheduling airplanes, there is a certain utilitarianism about operations research competitions.  There are some competitions that have some level of charm (predicting murders in Philadelphia perhaps).  But, to the outsider, most of competitions sound as exciting as eating one’s vegetables.   Good to do, but rarely filled with anticipation.

I note that our colleagues in artificial intelligence and planning have addressed this issue straight on by developing the Angry Birds Challenge.  If you do not know the Angry Birds game, get yourself down to the airport, sit next to any waiting business person with a tablet, and watch what they are doing:  chances are they are playing Angry Birds. Or ask any eight year old.

With more than a billion downloads, Angry Birds is one of the world’s leading wasters of computer cycles.  The game is popular enough that my wife and I went as game characters last Halloween, as you can see.

To summarize, your goal in Angry Birds is to slingshot birds towards various structures in an attempt to kill pigs lurking within (really!).  Your birds have different capabilities (some might drop bombs, others can split into multiple birds), and it is necessary to use those capabilities to get to all the pigs.

The game is not as mindless as it might appear.  The underlying physics engine is very good, so the collapse of the structures when hit by the birds generally feels right.  Killing the pigs takes foresight, planning, creativity (of a very specialized kind), and sometimes a little luck.  So, a natural question is: can you create a computer system to play angry birds?

For some human time-wasters, computer systems are very easy to create.   It is straightforward to create a sudoku-solver far better than any human.  For Angry Birds, it is not so clear.  There is no obvious brute-force search that has a hope of leading to a solution.  Even the problem representation seems to be a difficult issue.  This looks to be quite a challenging problem!

I don’t see how to use traditional operations research methods for this problem.  But it would be great if we could show that operations research is awesome enough to solve the big problem of our time:  how to get three stars on level 4-14!

## The Treasure has been Found!

I wrote previously about an “Analytics Treasure Hunt” organized by John Toczek.  I have just heard from John that the treasure has been found:

Just wanted to give you an update on the 2012 Analytics Treasure Hunt.  The treasure was found by two member team, Madineh Sarvestani and Shepard Wallace who successfully retrieved the treasure on Saturday, 8/18/2012 at about 10am EST.  The correct coordinates were 40.07284, -75.21907.  I’ve posted their pictures on www.puzzlor.com with a short note indicating that it has been found and that the contest is over for this year.

Here are the photos of the winning pair (Shepard looks quite pleased with the cash!):

## I may be depressed but I could win some money through operations research

I got an email recently from John Toczek, an OR professional with Aramark who writes the PuzzlOR column for OR/MS Today):

You ran an article on a contest I created back in 2010 called the Analytics X Competition [http://mat.tepper.cmu.edu/blog/?p=1024] where contestants were trying to predict where homicides would occur in Philadelphia zip codes.  That contest ended some time ago but I am working on two new projects that I thought might interest you.

The first is a medical diagnostic site (www.whatsmydiagnosis.com) that uses the correlations between symptoms and diseases to predict diagnoses.  The site ranks the likelihood of all diseases in descending order based on the symptoms you select.  It’s in the proof of concept phase right now but I wanted to get the word out in order to get some feedback from the community so I can improve it.

The second is an analytics treasure hunt that is due to run in Analytics magazine on July 9th.  (www.puzzlor.com then click on “The 2012 Analytics Treasure Hunt” link.)  It’s basically a series of 5 puzzles that when completed form a latitude and longitude where I have hidden \$100.  If you’ve ever heard of GeoCaching, it is similar to this.  My hope with this project was to get the word out about O.R. and hopefully attract new people to the field.

I checked out the medical diagnosis site.  After putting in my sex (Male), height (5’10” and above) and weight (over 180 lb:  seriously?  in a land of morbese obesity, over 180 is the maximum grouping?) and found …. a 42.5% likelihood of depression.  That’s a bit of a downer.  In fact, I’m feeling kinda depressed about the whole thing.  Hey! It works!

I wouldn’t take the site too seriously (as the About page there says: should not be used as professional medical advice) but perhaps as more data gets entered, the predictions will get better.

As for the treasure hunt, I am looking forward to the challenges (available July 9).  I hope the hunt is difficult, though.  I am currently in Vilnius, Lithuania and will be for most of the week.  Either I luck out and John has decided that Lithuania is the ideal place to hide his \$100 or it will take a while for me to get to where the money is hidden.  That’s assuming I can solve his challenges, of course.

## Benchmarks: Coloring, Sports and Umpires

I have always felt strongly that operations research needs more libraries of instances for various problem classes, along with listings of current best solutions.  By tracking how well we solve problems over time, we can show how we advance as a field.  It also makes it easier to evaluate new work, making both authors and referees work easier.

I began this direction almost two decades ago when I spent a year at DIMACS (a fantastic research institute on discrete mathematics and computer science based at Rutgers) when I (together with David Johnson) ran their Computational Challenge, with an emphasis on solving graph coloring, clique, and satisfiability instances.  From that, I put together a page on graph coloring (which has to be one of the oldest pages on the internets!)   David, Anuj Mehrotra and I followed that up in 2003 with an updated challenge just on graph coloring.   It was great to see people re-use the same instances, so we could understand the advances in the field.  It is hard to tell exactly how many papers have used the various benchmark repositories, but it is clearly the basis for hundreds of papers, based on google scholar hits on the DIMACS paper referencing the instances.

I had this experience in mind ten years ago when Kelly Easton, George Nemhauser and I wanted to publish about work we had done with Major League Baseball in their scheduling.  It made no sense to use MLB as a benchmark, since there is only one instance per year and much of the information about the needs of a schedule is confidential.  So we created the Traveling Tournament Problem that abstracts two key issues in MLB scheduling: travel distance, and “flow” (the need to mix home and away games).  We created a set of instances, solving a few smaller ones, and let it loose on the world.  The result was fantastic:  dozens of groups started working on the problem, and we could clearly see which techniques worked and which didn’t.

I had made a terrible mistake when creating benchmarks for graph coloring.  I didn’t keep track of best results.  This led to a fair amount of chaos in the field, with contradictory results appearing (claimed coloring values better than claimed lower bounds), and no clear picture of where things are going.  I had thought at one time that I would try to clean things up with a “Repository of Optimization Instances and Solutions”, but too many other things have intruded for me to spend the time necessary on that.  Fortunately, Stefano Gualandi and Marco Chiarandini have put together a site for graph coloring solutions, and I hope they will be successful in their efforts to put a little structure in the field.

I learned from that mistake and was much more diligent about keeping track of solutions for the Traveling Tournament Problem.  The TTP site is always up to date (OK, almost always), so people can reasonably trust the results there.  I have recently extended the site to include instances for non-round-robin scheduling and for the Relaxed TTP (where there is an opportunity for off-days).

One relatively new problem I am excited about is scheduling umpires in sports.  Former doctoral students Hakan Yildiz (now at Michigan State) and Tallys Yunes (Miami) and I developed a problem called the Traveling Umpire Problem which again tried to abstract out key issues in Major League Baseball scheduling.  In this case, the umpires want to travel relatively short distances (unlike the players, the umpires have no “home city”, so they are always traveling) but should not see the same teams too often.  This problem feels easier than the Traveling Tournament Problem, but we still cannot solve instances with 14 or more umpires to optimality.  This work received a fair amount of interest when the university PR people caught hold of our Interfaces paper.  Since that paper, Hakan and I have put together a couple of other papers, exploring optimization-based genetic algorithms and benders-based local search approaches for this problem (to appear in Naval Research Logistics).  Both papers illustrate nice ways of using optimization together with heuristic approaches.  The website for the problem gives more information on the problem, along with instances and our best solutions.

I don’t think my repositories of benchmarks will be as influential as, say MIPLIB, which focuses on mixed integer programs.  But I do like to think that they make the operations research world run a bit smoother.

I am a big fan of “challenges” in operations research.  Almost twenty years ago, I ran a DIMACS Challenge on finding cliques, coloring graphs and solving satisfiability problems.  That challenge gave a clear picture of where we were then in those areas and showed the variety of approaches possible for those three problems.  I also  met a large number of people and took pride in creating something useful to many.

ROADEF (the French OR Society) has run a series of Challenges over the years.  These challenges have been inspired by real-world problems and are generally very rich in detail (details on past Challenges, including test instances, are available at the Challenge site).  The recently announced 2012 Challenge is no different.  The problem to be solved comes from Google, and involves assigning jobs to machines.

The aim of this challenge is to improve the usage of a set of machines. A machine has several resources as for example RAM and CPU, and runs processes which consume these resources. Initially each process is assigned to a machine. In order to improve machine usage, processes can be moved from one machine to another. Possible moves are limited by hard constraints, as for example resource
capacity constraints, and have a cost. A solution to this problem is a new process-machine assignment which satisfies all hard constraints and minimizes a given objective cost.

The problem description then goes on for 10 pages, including issues such as locations, services, conflicts, moving costs and much, much more.  This certainly looks like a real problem and one that a firm like Google would need to solve routinely and quickly.  I can also see a number of different approaches that might work.  I look forward to seeing what methods are successful for this.

This Challenge is sponsored by ROADEF, EURO, and Google.  Google, for all its success, has only recently started to use and support operations research (though the line between computer science and operations research is very fuzzy in this area), so I am delighted to see their support of this Challenge.  Now, how about some million dollar prizes to go with it … (though 20,000 euros is not to be sniffed at).

## Call for Challenging MIP Problems

MIPLIB is a collection of instances of Mixed Integer Programs.  Versions of MIPLIB have been around since 1992, and these have been invaluable as we see how we have advanced in solving difficult MIP instances. Some instances that were essentially unsolvable twenty years ago can now be solved in a fraction of a second.  We know we are getting better!  Of course, there is a downside:  maybe we are just getting better at solving MIPLIB instances.  For instance, here is an extract from an excellent optimization code to attack MIPLIB:

```if (problem == "team10") then {
sol_val = 924;
}
else if (problem == "a1c1s1") then { ...
```

Now, I don’t claim that any real code cheats this blatantly, but it is inevitable that as codes use benchmarks like MIPLIB, they will become tuned to the issues that come from those instances.

This makes it important that MIPLIB reflect a broad range of issues faced “in the real world”.  So it is very good that the MIPLIB people (a group based at ZIB in Berlin, but including people from all over) are updating the library to create MIPLIB 2010.  If you have hard or real-life MIP instances, particularly if they are taking 15 minutes to 2 hours with commercial codes, you are welcome to upload those instances for consideration for the new library.  The more varied the library, the better our field will be.

There is one other biasing aspect that an update to MIPLIB 2010 does not fix and that is the emphasis on talking about integer programs as MPS files.  Once an instance is in an MPS file, it is often extremely difficult to back out the underlying structure.  Our constraint programming brethren are happy to talk about formulations with global constraints like alldifferent and to have codes that explicitly take advantage of those structures. I do wonder if we would be better at solving integer programs if we talked about them as higher-level structures (like tour(x1,x2,x3) and matching(x,y) and other common substructures).  By limiting ourselves to MPS format, the structures we exploit tend to be those easiest to find, like knapsack and setcovering).

But that is a topic for the future:  right now, we need to be sure MIPLIB 2010 is as varied and interesting as it can be!

## World Cup Forecast Pool, with a Twist

The Brazilian Society of Operations Research is organizing a competition for predicting the results of the group stage at the upcoming World Cup.  If you have to ask for which sport, you probably aren’t the target audience:  it is for football (aka soccer).  Many sites have such pools for many sports:  for US college basketball, the OR blogs are practically given over to the topic every March.

This competition is a bit different though:  you aren’t allowed to simply guess the winners of each game.  First, you need to give probabilities of win, loss or tie, with scoring based on squared errors.  Second, you need to use some sort of model to generate the predictions, and be willing to describe that model.  And no fair modifying your results to better fit your own ideas!  You need to stick to the model’s predictions.    There are three subcompetitions, with track A allowing fewer types of information than track B, and track C limited to members of the Brazilian OR society.

There is quite a bit of past data on the players and teams, so perhaps it is possible to create useful models.  I look forward to seeing the results.  Deadline for entries is June 7, with games starting June 11.

## Nurse, Nurse! Where’s my Nurse?

I am a sucker for competitions.  The people at PATAT (a conference series Practice and Theory of Automated Timetabling; I was co-chair for their 2004 conference, and a member of their steering committee for a few years) do a really good job at timetabling competitions.  I really liked their competition for the 2008 conference on course and exam timetabling.  I had some thoughts of using logical Benders’ to solve one of the problems but didn’t quite get my act together for it.  But I liked the problems and I liked the organization.

The next competition has just begin, though it is on an accelerated schedule.  The topic is nurse rostering and ends June 1, 2010.  This is an important practical problem:

Building good rosters for nurses in hospitals has received ample attention in recent years and is recognised as a difficult combinatorial optimisation problem with practical relevance. In hospitals much effort is spent producing solutions which are workable and of a high quality.

It is a nicely designed competition, but be aware that you can’t use commercial codes.  So those of us who use CPLEX/XPRESS/Gurobi as a linear or integer programming base are probably out of luck:

### Rule 5

Participants have to implement an algorithm to tackle the problem on a single processor machine; they can use any programming language. The use of third-party software is allowed under the following restrictions:

• it is free software
• it’s behaviour is (reasonably) documented
• it runs under a commonly-used operating system (Linux or Windows)

Working with COIN-OR is probably legal, under a broad definition of “reasonably documented”. But it is interesting that Mac operating systems are not deemed “commonly-used”.

Despite these caveats, I do recommend the competition, and perhaps will see you in Belfast!