Complete Enumeration Arguments Deemed Harmful…

… or “The Traveling Salesman Problem is Not That Hard”.

When I talk to people about what I do for a living, I often face blank stares (or, worse, rapidly retreating backs) when I describe problems like the Traveling Salesman Problem.

Me: “Well, suppose the dots on this napkin represent cities, and you want to find the shortest route that visits them. How could you do it?”

Them: Quick scribble through the dots. “There! So you spend the day drawing lines through dots?”

Me: “No, no! Suppose there are 1000 cities. Then…. well, there are 1000*999*998*…*2*1 ways through all the dots, and that is more than the number of freckles on a dalmatian’s back, so I look for better ways to draw lines through dots.”

And I fall into the same old trap.  I try to convince people the Traveling Salesman problem is hard by saying there are a lot of solutions to look through.  And that is an extremely misleading argument that I am now convinced does more harm than good.

Imagine if I had said one of the following:

  1. “Sorting a bunch of numbers!  Hoo-wee, that’s a hard one.  If I had 1000 numbers, I would have to check all orderings of the numbers to find the sorted order.  That is 1000*999*998*…*2*1 orders and that is more than the hairs on a Hobbit’s toes.”
  2. “Maximize a linear function on n variables over m linear constraints?  Well, I happen to know some theory here and know all I need to do is look at subsets of m variables and take a matrix inverse and do some checking.  Unfortunately, with 1000 variables and 700 constraints, that is 1000 choose 700, which is more than the number of atoms in a brick.”
  3. “Find a shortest path from here to the Walmart?  That’s a tough one.  There are a lot of roads from here to there and I need to check all the paths to find the shortest one.  That would take longer than the wait in a Comcast line!”

Of course, all of these are nonsense:  having lots of possibilities may be necessary for a problem to be hard, but it is certainly not sufficient.  Sorting is a problem that anyone can find a reasonable solution for.  Shortest path is a bit harder, and linear programming (problem 2) is harder still.  But all can be solved in time much, much less than the approaches above suggest.

So why do we use this argument for the Traveling Salesman problem?  We know from complexity theory that the problem is NP-Hard, so no polynomial time algorithm is currently known (there are still some who believe they have such an algorithm, but they are in the very, very tiny minority), and many of us believe that no such algorithm exists.

But that does not mean complete enumeration is necessary:  it is easy to come up with approaches that go through less than the full n! solutions to the Traveling Salesman problem (see the postscript for one example).  This is not surprising.  Complete enumeration for Satisfiability is 2^n but there are methods known that take time proportional to something like 1.4^n [Edit: as per a comment, I meant 3-Sat].  Still exponential but not complete enumeration.  And there is a big difference between 2^n and 1.00001^n (or 2^n/10^10) even if complexity theory likes to call them all “exponential”.
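
For a sense of the scale involved, here is a quick back-of-the-envelope comparison (purely illustrative) of those growth rates at n = 60; a few lines of Python is all it takes:

from math import factorial

n = 60
print(f"n!        = {factorial(n):.2e}")     # complete enumeration
print(f"2^n       = {2.0 ** n:.2e}")
print(f"1.4^n     = {1.4 ** n:.2e}")         # like the improved 3-SAT bounds
print(f"1.00001^n = {1.00001 ** n:.2e}")     # still “exponential”, but tiny here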

But the “complete enumeration” statement is even more harmful since it glosses over a much more important issue: many “hard” problems (in the complexity theory meaning) are not particularly hard (in the English sense), if you limit yourself to instances of practical interest.  Due to the tremendous amount of work that has been done on the Traveling Salesman Problem, the TSP is one such problem.  If you have an instance of the TSP of practical interest (i.e. it makes a difference to your life if you solve it or not, and it is really a TSP, not the result of some weird set of reductions from some other problem), then I bet you the Concorde program of Bill Cook and others will get you a provably optimal solution in a reasonable amount of time.  In fact, I would be really interested in knowing the smallest “real” instance that Concorde cannot solve in, say, one day, on a normal laptop computer.

Being hard-to-solve in the complexity theory sense does not mean being hard-to-solve in the common language sense.  And neither have much to do with the number of possible solutions that complete enumeration might go through.

As an example of this confusion, here is a paragraph from Discover on work done by Randy Olson:

Planning a vacation is a daunting task, so why not let big data take the reins?

That’s exactly what data scientist Randy Olson did. Using specialized algorithms and Google Maps, Olson computed an optimized road trip that minimizes backtracking while hitting 45 of Business Insider’s “50 Places in Europe You Need to Visit in Your Lifetime.” (Five locations were omitted simply because you can’t reach them by car.)

Mapping software has come a long way, but for this kind of challenge it’s woefully inadequate. Google Maps’ software can optimize a trip of up to 10 waypoints, and the best free route optimization software can help with 20. But when you’re looking to hit 45 or 50 landmarks, things get complicated.

According to Olson, a computer would need to evaluate 3 x 10^64 possible routes to find the best one. With current computing power, it would take 9.64 x 10^52 years to find the optimal route to hit all your desired locations — Earth would have long been consumed by the Sun before you left the house. So Olson used a clever workaround, based on the assumption that we don’t need to find the absolute best route, but one that’s pretty darn good.

Now, all this is based on an article Olson wrote months ago, and this has all been hashed over on twitter and in blogs (see here and here for example), but let me add (or repeat) a few points:

  1. 45 points, particularly geographically based, is a tiny problem that can be solved to optimality in seconds.  The number of possible routes is a red herring, as per the above.
  2. The “based on the assumption that we don’t need to find the absolute best route, but one that’s pretty good” is not a new idea.  Techniques known as “heuristics” have been around for millennia, and are an important part of the operations research toolkit.
  3. “Minimize backtracking” is not exactly what the problem is.
  4. The computer does not need to evaluate all possible routes.
  5. With current computing power, it takes seconds (at most) to find the optimal route.
  6. I don’t expect the sun to consume the earth in the next minute.
  7. Olson’s “approach” is fine, but there are a zillion heuristics for the TSP, and it would take a very poor heuristic (maybe 2-opt by itself, sketched just after this list) not to do as well as Olson’s on this particular instance.
  8. Seriously, bringing in “big data”?  That term really doesn’t mean anything, does it?
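
And since I brought up 2-opt in point 7, here is roughly what I mean (a minimal sketch on made-up random cities, not Olson’s code or data): start from a random tour and keep reversing segments as long as that shortens the tour.  Even this bare-bones version typically lands within a few percent of optimal on a 45-point Euclidean instance.

import math, random

def tour_length(points, tour):
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

def two_opt(points):
    # Keep reversing the segment tour[i..j] whenever that shortens the tour.
    tour = list(range(len(points)))
    random.shuffle(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 1):
            for j in range(i + 1, len(tour)):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(points, candidate) < tour_length(points, tour) - 1e-9:
                    tour, improved = candidate, True
    return tour

random.seed(0)
cities = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(45)]
best = two_opt(cities)
print(round(tour_length(cities, best), 1))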

In Olson’s defense, let me make a few other points:

  1. Few of the rest of us researchers in this area are covered in Discover.
  2. Journalists, even science journalists, are not particularly attuned to nuance on technical issues, and Olson is not the author here.
  3. The instances he created are provocative and interesting.
  4. Anything that brings interest and attention to our field is great! Even misleading articles.

But the bottom line is that real instances of the TSP can generally be solved to optimality.  If, for whatever reason, “close to optimal” is desired, there are zillions of heuristics that can do that.  The world is not eagerly waiting for a new TSP heuristic. Overall, the TSP is not a particularly good problem to be looking at (unless you are Bill Cook or a few other specialists):  there is just too much known out there.  If you do look at it, don’t look at 50 point instances: the problem is too easy to be a challenge at that size.  And if you want to state that what you learn is relevant to the TSP, please read this (or at least this, for a more accessible narrative) first.  There are other problems for which small instances remain challenging: can I suggest the Traveling Tournament Problem?

Finally, let’s drop the “number of solutions” argument:  it makes 50 point TSPs look hard, and that is just misleading.


 

 

Postscript: an example of not needing to do complete enumeration for the Traveling Salesman Problem

For instance, take four cities 1, 2, 3, and 4, and denote the distance matrix by D (assumed to be symmetric, but similar arguments work for the asymmetric case).  Then one of D(1,2)+D(3,4), D(1,3)+D(2,4), or D(1,4)+D(2,3) is maximum (if there are ties choose one of the maximums), say D(1,2)+D(3,4).  It is easy to show that some optimal tour does not have both the edges (1,2) and (3,4), since otherwise a simple exchange would get a tour no worse.  If you want to be convinced, consider the following diagram.  If a tour used (1,2) and (3,4), then an alternative tour that uses either the dashed edges (1,3) and (2,4) or the dotted edges (1,4) and (2,3) would still be a tour and would be no longer than the (1,2) (3,4) tour.  As drawn, the dashed edges give a shorter tour.

[Figure: the four-city example, with the tour edges (1,2) and (3,4) drawn solid, the alternative edges (1,3) and (2,4) dashed, and (1,4) and (2,3) dotted.]

 

If you are convinced of that, then here is an algorithm that doesn’t enumerate all the tours: enumerate your tours building up city by city, but start 1, 2, 3, 4.  At this point you can stop (and not move on to 1,2,3,4,5) and move on to 1,2,3,5 and so on.  In doing so, we avoided enumerating (n-4)! tours.  Ta da!  Of course, you can do this for all foursomes of points, and don’t need to only have the foursome in the initial slots.  All this can be checked quickly at each point of the enumeration tree.  How many tours are we left with to go through?  I don’t know:  it is still exponential, but it is a heck of a lot less than n! (and is guaranteed to be less than that for any instance).  Maybe even less than the time it takes for the sun to consume the earth.
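
For the curious, here is a rough Python sketch of that pruned enumeration (my own illustration on a made-up distance matrix D, given as a list of lists; this is nothing like what Concorde actually does).  It skips any partial tour that already uses both edges of a maximum pairing.  As noted above, ties deserve more care than this sketch gives them.

from itertools import combinations

def dominated_pairs(D):
    # For each foursome {a,b,c,d}, record the pairing of maximum total length;
    # by the exchange argument above, we never need a tour that uses both of
    # those edges (ties are broken arbitrarily here, as in the text).
    n = len(D)
    bad = set()
    for a, b, c, d in combinations(range(n), 4):
        pairings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
        worst = max(pairings,
                    key=lambda p: D[p[0][0]][p[0][1]] + D[p[1][0]][p[1][1]])
        bad.add(frozenset(frozenset(e) for e in worst))
    return bad

def pruned_enumeration(D):
    n = len(D)
    bad = dominated_pairs(D)
    best_len, best_tour = float("inf"), None

    def extend(partial):
        nonlocal best_len, best_tour
        edges = [frozenset((partial[i], partial[i + 1]))
                 for i in range(len(partial) - 1)]
        # Prune: this partial tour already uses both edges of a maximum pairing.
        for pair in combinations(edges, 2):
            if frozenset(pair) in bad:
                return
        if len(partial) == n:
            length = sum(D[partial[i]][partial[(i + 1) % n]] for i in range(n))
            if length < best_len:
                best_len, best_tour = length, partial[:]
            return
        for city in range(n):
            if city not in partial:
                extend(partial + [city])

    extend([0])   # fix the starting city to avoid enumerating rotations of the same tour
    return best_len, best_tour

# Tiny example: D[i][j] is the distance between cities i and j.
# D = [[0, 2, 9, 10], [2, 0, 6, 4], [9, 6, 0, 3], [10, 4, 3, 0]]
# pruned_enumeration(D) returns (18, [0, 1, 3, 2]); here D(0,3)+D(1,2) is the
# maximum pairing, so any tour using both of those edges is pruned before completion.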

 

 

State of Operations Research Blogging

It has been almost a year since I had a blog entry here.  Not that I don’t have a lot to say!  I am using twitter more, and I do have ideas for blog entries in cases where 140 characters is not enough.  But there is limited time.

But I think something more fundamental is at work.  What is the state of blogging, and, in particular, the operations research blogging world?  It doesn’t appear all that healthy to me, but perhaps I am not seeing things correctly.

I think the blogging world was badly hurt by the cancellation of Google Reader.  At least for me, Google Reader was a fast and convenient way to follow the blogging world.  And, beyond me, I had created a public list of OR Blogs, and a public feed of OR blog entries.  They seemed to be well used, but both ended with the end of Reader. It is now harder to get the word out about OR blogging.

I have tried to continue aspects of these listings on this page with a feed of OR blogs (also in sidebar) but it is much less convenient.

I also think the relentless onslaught of comment spam discouraged (and perhaps still discourages) people from trying out blogging.

Today I went through my list of OR blogs to see who has posted in the past year, and was distressed to see how few have done so.  Even with a pretty broad view of what an OR Blog is, it came to only about 40 people, with many of those (including myself!) barely meeting the “posted in the last year” requirement.

Those that remain are a fantastic source of information.  I think particularly of Laura McLay’s Punk Rock Operations Research and Anna Nagurney’s RENeW as must-read entries.  But there now seem to be few active bloggers.

Am I missing a ton of entries in the OR blogging world (let me know if I have missed some from my list)?  Has the world of twitter taken over from the long-form journalism that blogging provides?

In any case, I will make an effort to write more and welcome thoughts about how to strengthen the OR blogging community.

Blogging and the Changing World of Education

As a blogger, I have been a failure in the last six months.  I barely have enough time to tweet, let alone sit down for these extensively researched, tightly edited, and deeply insightful missives that characterize my blog.  I tell you, 1005 words on finding love through optimization doesn’t just happen!

I have my excuses, of course.  As the fabulous PHD Comics points out, most of us academics seem somewhat overbooked, despite the freedom to set much of our schedule.  I am not alone in being congenitally unable to turn down “opportunities” when they come by.  “Help hire a Norwegian professor?” Sounds fun! “Be the external examiner for a French habilitation degree?” I am sure I’ll learn a lot!  “Referee another paper?” How long can that take?  “Fly to Australia for a few days to do a research center review?”  Count me in!  And that was just four weeks in February.

All this is in addition to my day job that includes a more-than-healthy dose of academic administration.  Between doing my part to run a top business school and to move along in research, not to mention family time, including picking up the leavings of a hundred pound Bernese Mountain Dog (the “Mountain” in the name comes from said leavings) and entertaining a truly remarkable nine-year-old son, my time is pretty well booked up.

And then something new comes along.  For me, this newness is something I had a hand in putting together: the Tepper School’s new FlexMBA program.  This program offers our flagship MBA program in a hybrid online/onsite structure.  Every seven weeks or so, students in the program gather at one of CMU’s campuses (we have them in Pittsburgh, Silicon Valley, and New York; we have not yet used our Qatar campus) and spend a couple of days intensively starting their new courses.  This is followed by six weeks of mixed synchronous and asynchronous course material.  Asynchronous material is stuff the students can do in their own time: videos, readings, assignments, and so on.  The synchronous lesson is a bit more than an hour in a group, meeting via group video conference, going over any issues in the material and working on case studies, sample problems, and so on.  The course ends with exams or other evaluations back on campus before starting the next courses.

Our commitment is to offer the same program as our full-time residential MBA and our part-time in-Pittsburgh MBA.  This means the same courses, faculty, learning objectives, and evaluations that our local students take.

We started this program last September with 29 students, and so far it has gone great.  The students are highly motivated, smart, hard-working, and engaged.  And the faculty have been amazing: they have put in tons of work to adapt their courses to this new structure.  Fortunately, we have some top-notch staff to keep things working.  Unlike some other MBA programs, we have not partnered with any outside firm on this.  If we are going to offer our degree, we want it to be our degree.

I have just finished my own course in this program.  I teach our “Statistical Decision Making” course.  This is a core course all MBA students take and revolves around multiple regression and simulation (the interesting relationships between these topics can wait for another day).  This is not the most natural course for me:  my research and background are more on the optimization side, but I very much enjoy the course.  And teaching this course has made clear to me the real promise of the hot phrase “business analytics”:  the best of business analytics will combine the predictive analytics of statistics and machine learning with the prescriptive analytics of optimization, again a topic for another day.

My initial meeting with the students concentrated on an overview of the course and an introduction to the software through some inspiring cases.  We then moved on to the six-week distance phase.  Each of the six modules that make up a course is composed of four to eight topics.  For instance, one of my modules on multiple regression includes the topic “Identifying and Handling Multicollinearity”.  (Briefly: multicollinearity occurs when you do regression with two or more variables that can substitute for each other; imagine predicting height using both left-foot-length and right-foot-length as data).  That section of the module consists of

  • A reading from their textbook on the subject
  • One 8 minute video from me on “identifying multicollinearity”
  • One 6 minute video from me on “handling multicollinearity”
  • A three minute video of me using our statistical software to show how it occurs in the software (I separate this out so we can change software without redoing the entire course)
  • A question or two on the weekly assignment.

It would be better if I also had a quiz to check understanding of the topic, along with further pointers to additional readings.
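
To make that foot-length example concrete, here is a tiny numeric illustration (my own made-up numbers, not course material): when two predictors move almost in lockstep, their individual coefficients are poorly determined even though the combined effect, and the predictions, are fine.

import numpy as np

rng = np.random.default_rng(0)
n = 200
left = rng.normal(25.0, 2.0, n)            # left-foot length (cm)
right = left + rng.normal(0.0, 0.1, n)     # right foot: nearly identical
height = 100.0 + 2.8 * left + rng.normal(0.0, 4.0, n)

X = np.column_stack([np.ones(n), left, right])
coef, *_ = np.linalg.lstsq(X, height, rcond=None)
print(np.round(coef, 2))    # intercept, left coefficient, right coefficient
# The left/right coefficients can come out as large offsetting values and
# swing around if you change the seed, but their sum stays near 2.8: only
# the combined effect of the two collinear predictors is pinned down.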

So my course, which I previously thought of as 12 lectures, is now 35 or so topics, each with readings, videos, and software demonstrations.  While there are some relationships between the topics, much is independent, so it would be possible, for instance, to pull out the simulation portion and replace it with other topics if desired.  Or we can now repackage the material as some supplementary material for executive education courses.  The possibilities are endless.

Putting all this together was a blast, and I now understand the structure of the course, how things fit together, and how to improve the course.  For instance, there are topics that clearly don’t fit in this course, and would be better elsewhere in the curriculum.  We can simply move those topics to other courses.  And there are linkages between topics that I did not see before I broke down the course this finely.

I look forward to doing this for our more “operations research” type courses (as some of my colleagues have already done).  Operations Research seems an ideal topic for this sort of structure.  Due to its mathematical underpinnings and need for organized thinking, students sometimes find this subject difficult.  By forcing the faculty to think about it in digestible pieces, I think we will end up doing a better job of educating students.

Creating this course was tremendously time consuming.  I had not taken my own advice to get most of the course prepared before the start of the semester, so I was constantly struggling to stay ahead of the students.  But next year should go easier:  I can substitute out some of the videos, extend the current structure with some additional quizzes and the like, adapt to any new technologies we add to the program, and generally engage in the continuous improvement we want in all our courses.

But perhaps next year, I won’t have to take a hiatus from blogging to get my teaching done!

 

COIN-OR needs a new web site

A long, long time ago (1995 to be exact), INFORMS asked for volunteers to put together its website.  While I was hesitant (I was an untenured assistant professor), I decided to apply, and I became the editor of INFORMS Online.  There are few decisions that had such wide-ranging, and unforeseen, effects.  I met people (like Brian Borchers, Matt Saltzman, and many others) who have been good friends and colleagues to this day.  I worked with an amazing staff, many of whom (like my good friend Mary Magrogan) are still with INFORMS.  And I learned a lot about how to distribute information to a large, distributed organization (getting significant information from the members would have to wait for later generations of web masters).  Eventually I became President of INFORMS (for the year 2002) and Vice President for IFORS.  And, it turned out, I eventually even got tenure.  There are few decisions I have made that have changed my life so much.

One of my favorite organizations, COIN-OR,  is looking for someone to help with their website issues.   COIN-OR is a major force for open source resources in operations research. While I won’t guarantee it will change your life, I highly recommend getting involved in organizations like COIN-OR:  you meet people, you learn a lot, and somehow, life seems to work out better when you are involved in things.  Here is the announcement:

General overview:

The COIN-OR foundation is a non-profit foundation that hosts 50+ open source software projects. Currently, the Web site is hand-crafted HTML (www.coin-or.org). Pages are hosted in subversion and checked out from there. Pages describing individual projects are rendered from XML (see, e.g., https://projects.coin-or.org/SYMPHONY/browser/conf/projDesc.xml and http://www.coin-or.org/projects/SYMPHONY.xml). Project source code is hosted in subversion with TRAC providing an integrated wiki and bug tracking (see, e.g., https://projects.coin-or.org/SYMPHONY). A Mailman mailing list is used for support, user feedback, etc. (see http://list.coin-or.org/mailman/listinfo/). Individual projects get Web space in the form of pages checked out from subversion (see, e.g., https://projects.coin-or.org/SYMPHONY/browser/html and http://www.coin-or.org/SYMPHONY/index.htm).

The general idea is to give the site a complete makeover to give it a more modern look and feel, social media integration, Web *.* capabilities. We implemented a test site in WordPress to play with ideas.

http://wptest.coin-or.org/

Specific design goals:

1. Move site to a CMS that will allow easier maintenance, ability to grant edit permission to individual pages, ability to edit in the browser, etc. Currently, WordPress seems to do all we need and we are familiar with WordPress. We’re open to other suggestions, however.
2. With the move to CMS, upgrade look and feel of the site and add capabilities, as follows:
— Implement forums for support and user feedback. Should include the ability to have general high-level forums, as well as individual forums for each project. Users should be able to create accounts on the site and post to the forums. Forums should be moderated (or at least should have the ability to moderate).
— Upgrade TRAC to the latest version, integrate support for git, and integrate the look and feel into the overall Web site.
— Implement single sign-on for the Web site (forums), the TRAC site, subversion, and git (so far, the best solution for this seems to be OpenID). Support ability to require e-mail address for valid registration and to capture basic demographic information (again, OpenID seems the easiest option).
— Implement download form that asks downloaders to fill out a form with basic demographic information (possibly requiring some sort of account creation).
— Support the ability to auto-generate project information pages from XML templates checked out from the subversion repos of individual projects, as with current site.
— Support blog(s) for posting news items. Each project should have its own blog, but these individual blogs could be hosted on the sites of individual projects (see 3 below).
— Support an events calendar.
3. Support the creation of individual sites for each project using the same CMS (WordPress allows this).
4. Ideally, create a separate site for the foundation itself.


Dr. Ted Ralphs
Associate Professor, Lehigh University
(610) 628-1280
ted ‘at’ lehigh ‘dot’ edu
coral.ie.lehigh.edu/~ted

If you are interested, contact Ted:  the COIN-OR group is a great group to work with!

Google Reader and Operations Research

It has been some months now since Google announced the end of Google Reader.  I have gone through many of the stages of grief (getting stuck at “Anger” for a while) and am now entering acceptance.

Personally, I have no problem with Feedly or one of the other readers.  But there is another aspect of Google Reader that seems harder to replace.  On my blog, I have two sidebar entries:  a “From the OR blogs”, giving recent posts from the OR Blogroll, and a listing of the OR Blogroll itself.  I think both are pretty useful since they give automatic visibility to new (and existing!) blogs in OR.  I don’t think they get a ton of use, but they are nice to have.

Google Reader provided scripts for both the recent posts and the blogroll.  I need  a replacement that

  1. Allows easy addition/deletion of blogs
  2. Shows recent posts from all the blogs
  3. Can also act as my personal reader

The last requirement precludes most of the wordpress scripts (I use wordpress on a local machine to handle this blog).  I do not want to keep, say, a feedly list of blogs along with a separate “ORBlogRoll” within wordpress.

Anyone found a Google Reader replacement that can do this?

The cutting plane method for matching is polynomial

Michael Mitzenmacher is a computer scientist at Harvard with a blog, My Biased Coin.  As you might expect from the title, Michael works in the area of randomized algorithms, and even has a book on the subject.  His blog is an extremely useful guide to what is happening in algorithms in CS (and what is happening in CS at Harvard, which is also quite interesting).  He often provides a summary of talks given at the big theory conferences (FOCS/STOC/etc.).  He just posted on this year’s FOCS (here and here).

There was one talk that caught my eye, summarized by a doctoral student:

[Editor: Fourth-year grad student Justin Thaler of Harvard contributes a summary of two unrelated talks.]

Paper Title: The Cutting Plane Method is Polynomial for Perfect Matchings.
Harvard’s own Karthekeyan Chandrasekaran talked about joint work with Laszlo A. Vegh and Santosh S. Vempala on cutting plane algorithms for matching problems. The cutting plane method is a popular algorithm for solving integer programs (IPs), used in commercial solvers. It works by starting with an LP relaxation of the given IP to obtain basic optimal solution x_0, and then iteratively adding constraints that are valid for integer solutions but violated by the basic optimum. It continues until the basic optimum is integral. The goal of this paper is to take a step toward explaining the practical efficiency of cutting plane methods, by giving an efficient cutting-plane algorithm for min-cost perfect matching (MWPM) –MWPM is known to be in P, but it was open (apparently for 30 years) whether there was a polynomial-time cutting-plane algorithm for this problem.
A brief summary of how they achieve this is as follows. They start with a natural, well-known LP relaxation of the MWPM problem, called the bipartite relaxation. This relaxation has the nice property that all basic optima x are half-integral, and the support of x is a disjoint union of edges and odd cycles. This makes it easy to find cuts (the cuts correspond to what are called blossom inequalities, see the paper for details). A major challenge, though, is that naively adding cuts will not preserve the half-integrality of intermediate LPs, so at each iteration they throw away some of the old cuts that were added earlier in the execution. They need to take considerable care in choosing which cuts to keep in order to guarantee half-integrality of intermediate LPs (and to ensure that their algorithm makes progress at a sufficiently high rate).
 This is pretty amazing.  First, it is wonderful that they were able to prove polynomiality.  It had bothered me that it seemed you might need an exponential number of cuts, even for something like matching.  I had looked at this 25 years ago when doing my doctorate, but didn’t have any particularly insightful ideas.
But the really amazing thing is that they were able to arrange their algorithm so they never had to work with anything worse than half-integral solutions.  This is astounding!  A bane of cutting plane approaches is the weird fractions that keep popping up, leading to numerical stability problems.  Here, they were able to keep things well under control.  And, by keeping to half-integral, maybe some old ideas I had about using generalized networks (networks with multipliers) might come back into play.   The approach certainly avoids the need for Gomory-Hu cut tree approaches to finding violated inequalities:  violated inequalities come straight out of  connected components.  This also harkens back to my dissertation where I had treated matching as a generalized network with side constraints.
So I checked out the paper that underlies the talk at arXiv, thinking I might try to implement some things this week and see how it works (CS theory people rarely implement:  you could make an entire career out of simply implementing what CS people suggest).  On the plus side, they reference my dissertation, so at least I am in the right ballpark.  On the down side:  it is looking a bit complicated!  Looks like I will have to reel in one of the bright doctoral students around to plow through this with me.
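
In that spirit, here is what a first, naive pass at the loop might look like (my own toy sketch using scipy’s linprog, emphatically not the algorithm in the paper): solve the degree-constraint relaxation, find odd connected components in the fractional support, add the corresponding blossom inequalities, and repeat.  On this little two-triangle example a single round of cuts already forces an integral matching; the whole point of the paper is showing how to manage the cuts so that a loop like this provably stays half-integral and runs in polynomial time.

import numpy as np
from scipy.optimize import linprog

# Two triangles joined by one expensive edge: the only perfect matching must
# use that edge, but the degree-constraint LP prefers two half-integral cycles.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
cost = np.array([1.0, 1, 1, 1, 1, 1, 10])
n = 6

A_eq = np.zeros((n, len(edges)))    # each vertex covered exactly once
for j, (u, v) in enumerate(edges):
    A_eq[u, j] = A_eq[v, j] = 1.0
b_eq = np.ones(n)

cut_rows, cut_rhs = [], []          # blossom cuts, stored in <= form
for _ in range(20):
    res = linprog(cost,
                  A_ub=np.array(cut_rows) if cut_rows else None,
                  b_ub=np.array(cut_rhs) if cut_rhs else None,
                  A_eq=A_eq, b_eq=b_eq, bounds=(0, 1), method="highs")
    x = res.x
    if np.all(np.minimum(x, 1 - x) < 1e-6):
        print("integral matching of cost", res.fun)
        break

    # Connected components of the support graph; odd ones give candidate cuts.
    adj = {v: [] for v in range(n)}
    for j, (u, v) in enumerate(edges):
        if x[j] > 1e-6:
            adj[u].append(v)
            adj[v].append(u)
    seen, added = set(), False
    for s in range(n):
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        if len(comp) % 2 == 1:
            delta = np.array([1.0 if ((u in comp) != (v in comp)) else 0.0
                              for (u, v) in edges])
            if delta @ x < 1 - 1e-6:     # violated blossom cut: x(delta(S)) >= 1
                cut_rows.append(-delta)  # rewritten as -x(delta(S)) <= -1
                cut_rhs.append(-1.0)
                added = True
    if not added:
        print("no violated blossom cut found; stopping at a fractional point")
        break
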
And I wonder what else this might be used for?

The Treasure has been Found!

I wrote previously about an “Analytics Treasure Hunt” organized by John Toczek.  I have just heard from John that the treasure has been found:

Just wanted to give you an update on the 2012 Analytics Treasure Hunt.  The treasure was found by a two member team, Madineh Sarvestani and Shepard Wallace, who successfully retrieved the treasure on Saturday, 8/18/2012 at about 10am EST.  The correct coordinates were 40.07284, -75.21907.  I’ve posted their pictures on www.puzzlor.com with a short note indicating that it has been found and that the contest is over for this year.

Here are the photos of the winning pair (Shepard looks quite pleased with the cash!):

I may be depressed but I could win some money through operations research

I got an email recently from John Toczek, an OR professional with Aramark who writes the PuzzlOR column for OR/MS Today:

You ran an article on a contest I created back in 2010 called the Analytics X Competition [http://mat.tepper.cmu.edu/blog/?p=1024] where contestants were trying to predict where homicides would occur in Philadelphia zip codes.  That contest ended some time ago but I am working on two new projects that I thought might interest you.

The first is a medical diagnostic site (www.whatsmydiagnosis.com) that uses the correlations between symptoms and diseases to predict diagnoses.  The site ranks the likelihood of all diseases in descending order based on the symptoms you select.  It’s in the proof of concept phase right now but I wanted to get the word out in order to get some feedback from the community so I can improve it.

The second is an analytics treasure hunt that is due to run in Analytics magazine on July 9th.  (www.puzzlor.com then click on “The 2012 Analytics Treasure Hunt” link.)  It’s basically a series of 5 puzzles that when completed form a latitude and longitude where I have hidden $100.  If you’ve ever heard of GeoCaching, it is similar to this.  My hope with this project was to get the word out about O.R. and hopefully attract new people to the field.

I checked out the medical diagnosis site.  After putting in my sex (Male), height (5’10” and above) and weight (over 180 lb:  seriously?  in a land of morbid obesity, over 180 is the maximum grouping?), I found …. a 42.5% likelihood of depression.  That’s a bit of a downer.  In fact, I’m feeling kinda depressed about the whole thing.  Hey! It works!

I wouldn’t take the site too seriously (as the About page there says, it should not be used as professional medical advice) but perhaps as more data gets entered, the predictions will get better.

As for the treasure hunt, I am looking forward to the challenges (available July 9).  I hope the hunt is difficult, though.  I am currently in Vilnius, Lithuania and will be for most of the week.  Either I luck out and John has decided that Lithuania is the ideal place to hide his $100 or it will take a while for me to get to where the money is hidden.  That’s assuming I can solve his challenges, of course.

 

All Hail the Mighty Rose (and Mary and David and the rest of the gang at INFORMS)

A bit over a year ago, INFORMS took over sponsorship of OR-Exchange, a question and answer site for operations research and analytics.  And when I say “sponsorship”, I mean they agreed to host the site and provide all the infrastructure for the system.  It was a generous offer to a community that had been struggling to find a reliable home.

Since then, OR-Exchange has, in some ways, thrived.  There are more than 600 questions, more than two thousand answers, and scores of participants.  Almost every question gets some sort of answer, and often three or four useful responses.  The site has avoided (much) spam through the diligence of the administrators (users who receive enough karma through their engagement with the site).

But it has not all been “rosy” (a bad pun, for reasons you will see!).  The system response has been, charitably, atrocious, with countless errors, time-outs, and just plain slow days.  The INFORMS people tried, but nothing seemed to help much and the open-source community that created the underlying software (OSQA) couldn’t help enough to get things working well.

So those of us who believed in OR-Exchange put up with the slowness because the system was useful.  And fun.  But we did hope for a day when the system worked better, hoping that would encourage more to join us.

This week, that day has come.  Through the work of new INFORMS IT head-honcho Rose Futchko along with INFORMS people such as David Wirth, Mary Leszczynski, and (in earlier efforts) Herman Strom and undoubtedly others (let me know so I can add to the Hall of Heroes), the problems seem to have been fixed.  The system is noticeably faster and more stable.  For proof, I offer the following chart giving the load times of the front page of OR-Exchange every hour for the past seven days (lower is better: every horizontal line marks 10 seconds).  See if you can figure out when the new system went in.

Of course, it might go all pear-shaped (in a wonderful expression I learned in New Zealand) over the next days, but things are looking awfully good (“Don’t jinx it Trick, ye eejit you!”).

If you haven’t yet discovered the joys of OR-Exchange, now would be a pretty good time.  You are far less likely to be greeted with a 500 error!

Super Exciting News on Super Polynomiality of LP Formulations of the TSP Polytope

Years ago, I spent a very pleasant couple of weeks in a group debunking a claimed linear programming formulation of the Traveling Salesman Problem.  I wrote on this before, and bewailed the fact that I was not smart enough to figure out there was a general theorem there:  Yannakakis showed that no symmetric linear programming formulation of polynomial size can formulate the TSP.  The “symmetric” part of the theorem was always bothersome since it seemed an unnecessary addition.  You can make a symmetric formulation “unsymmetric” by doing something goofy like adding a useless variable or similar, but the fundamental result still holds.   Can asymmetry work on a fundamental level? I worked on trying to remove the symmetric assumption, but had no success.  That’s not surprising since Yannakakis clearly would have not included the requirement if it was easy to remove.  Yannakakis is smarter than Trick.  Therefore Trick cannot remove the requirement.  Q.E.D.  But I still tried for a while.

Fortunately, research moves on, and people learn more and more, and finally enough gets proved that smart people can figure out how to move forward.   It appears that the symmetric requirement can be removed.  Sebastian Pokutta and his coauthors Samuel Fiorini, Serge Massar, Hans Raj Tiwary, and Ronald de Wolf have a paper that proves that any linear programming formulation (symmetric or not) of the TSP must have super-polynomial size.  Sebastian’s blog post has a very nice description of some recent history (including the astonishing result that sometimes the symmetry requirement does matter) and gives pointers to both the paper and to some slides.  I have not had time to go through things thoroughly (or at all, really) but the paper seems a trove of interesting results.  I feel like starting a Ph.D. seminar next week to study it!

Between Bill Cook’s soon-to-be-released TSP book and this news, 2012 is shaping up to be the year of the Traveling Salesman Problem!