Referees considered harmful

When doing empirical work, researchers often mess up either in the design of the experiment or in the analysis of data. In operations research, much of our “empirical work” is in computational testing of algorithms. Is algorithm A faster than algorithm B? “It depends” is generally the only honest answer. It depends on the instance selection, it depends on the computing environment, it depends on the settings, and so on. If we are careful enough, we can say things that are (within the limits of the disclaimers) true. But even a careful experiment can fall prey to issues. For instance, throwing away “easy” instances can bias the results against whatever algorithm is used to determine easiness. And don’t get me started on empirical approaches that test dozens of possibilities and miraculously find something “statistically significant”, to be duly marked with an asterisk in the table of results. It is very difficult to do truly trustworthy empirical work. And it is even harder to do such work when researchers cheat or reviewers don’t do their job.
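A quick back-of-the-envelope calculation shows why that kind of fishing expedition is so dangerous: run twenty independent tests at the usual 5% significance level on data with no real effects at all, and the chance of at least one spuriously “significant” result is 1 - 0.95^20, or roughly 64%.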

For some fields, these issues are even more critical.  Operations research generally has some theoretical grounding:  we know about polytopes and complexity, and so on, and can prove theorems that help guide our empirical work.  In fields like Social Psychology (the study of people in their interactions with others), practically all that is known is due to the results of experiments.   The fundamental structure in this field is a mental state, something that can only be imprecisely observed.

Social psychology is in a bit of a crisis. In a very real sense, the field no longer knows what is true. Some of that crisis is due to academic malfeasance, particularly that of the influential researcher Diederik Stapel, who has been found to have invented data for dozens of papers, as described in a “Science Insider” column.

Due to data fraud by Stapel and others, the field has to reexamine much of what it thought was true. Are meat eaters more selfish than vegetarians? We thought so for a while, but now we don’t know. A Dutch report goes into great detail on this affair.

But overt fraud is not the only issue, as outlined in the report.  I was particularly struck by the role paper reviewers played in this deceit:

It is almost inconceivable that co-authors who analysed the data intensively, or reviewers of the international “leading journals”, who are deemed to be experts in their field, could have failed to see that a reported experiment would have been almost infeasible in practice, did not notice the reporting of impossible statistical results, … and did not spot values identical to many decimal places in entire series of means in the published tables. Virtually nothing of all the impossibilities, peculiarities and sloppiness mentioned in this report was observed by all these local, national and international members of the field, and no suspicion of fraud whatsoever arose.

And the role of reviewers goes beyond that of negligence:

Reviewers have also requested that not all executed analyses be reported, for example by simply leaving unmentioned any conditions for which no effects had been found, although effects were originally expected. Sometimes reviewers insisted on retrospective pilot studies, which were then reported as having been performed in advance. In this way the experiments and choices of items are justified with the benefit of hindsight.

Not infrequently reviews were strongly in favour of telling an interesting, elegant, concise and compelling story, possibly at the expense of the necessary scientific diligence.

I think it is safe to say that these issues are not unique to social psychology.  I think that I too have, as a reviewer, pushed toward telling an interesting story, although I hope not at the expense of scientific diligence.   And perhaps I could have worked harder to replicate some results during the reviewing process.

I don’t think we in operations research are in crisis over empirical issues. I am pretty confident that CPLEX 12.4 is faster than CPLEX 4.0 for practically any instance you can throw at it. And some journals, like Mathematical Programming Computation, have attempted to seriously address these issues. But I am also pretty sure that some things I think are true are not true, either due to fraud by the author or negligence by reviewers.

One important role of a reviewer is to be on the lookout for malfeasance or bias and to avoid allowing (or, worse, forcing) authors to present data in an untruthful way.  And I think many of us are not doing a great job in this regard.  I would hate to have to rethink the foundations of my field due to these issues.

Operations Research and a Baseball Job

Analytics is getting to be more and more important in sports, and sports teams and leagues are looking to people with analytical skills to fill key roles in their organizations. The MIT Sports Analytics conference is a big deal, attracting more than 2000 attendees, with an active job placement service. The MBAs at my own school (the Tepper School) now have a sports analytics club, with a speaker series, case competition and more (including fun things like fantasy sports competitions), and many of these exceptionally bright and ambitious students are eager for jobs in the sports industry. While some of this may be due to the success of Moneyball, much more of it is due to the fact that computers and decision making have gotten much, much better in recent years, making analytics a key competitive advantage. And when you get past dashboards and basic data analysis and visualization, you move into using data to make better decisions. In other words, you move into operations research.

It is clear that many clubs in Major League Baseball get it.  I see it when talking to people with my local team, the Pittsburgh Pirates (a team that I am sure will break .500 any year now!), and I just got a job announcement that shows that the next closest team to me, the Cleveland Indians, get it too.  They are looking for a VP-Technology, but it is clear that they see this as a job involving decision making, not just infrastructure.  From the ad, the primary purpose is:

The Vice President of Technology is responsible for developing, implementing, measuring and maintaining plans that advance the organization’s achievement of its guiding commitments through enhanced Baseball Operations and business decision-making tools, increased effectiveness of systems, hardware, technology infrastructure and improved fan experience through fan-centric technology implementations.

I love the “decision-making tools” in that description.  Sounds just right for an operations research person who also understands technology.


The cutting plane method for matching is polynomial

Michael Mitzenmacher is a computer scientist at Harvard with a blog, My Biased Coin. As you might expect from the title, Michael works in the area of randomized algorithms, and even has a book on the subject. His blog is an extremely useful guide to what is happening in algorithms in CS (and what is happening in CS at Harvard, which is also quite interesting). He often provides a summary of talks given at the big theory conferences (FOCS/STOC/etc.). He just posted on this year’s FOCS (here and here).

There was one talk that caught my eye, summarized by a doctoral student:

[Editor: Fourth-year grad student Justin Thaler of Harvard contributes a summary of two unrelated talks.]

Paper Title: The Cutting Plane Method is Polynomial for Perfect Matchings.
Harvard’s own Karthekeyan Chandrasekaran talked about joint work with Laszlo A. Vegh and Santosh S. Vempala on cutting plane algorithms for matching problems. The cutting plane method is a popular algorithm for solving integer programs (IPs), used in commercial solvers. It works by starting with an LP relaxation of the given IP to obtain basic optimal solution x_0, and then iteratively adding constraints that are valid for integer solutions but violated by the basic optimum. It continues until the basic optimum is integral. The goal of this paper is to take a step toward explaining the practical efficiency of cutting plane methods, by giving an efficient cutting-plane algorithm for min-cost perfect matching (MWPM) –MWPM is known to be in P, but it was open (apparently for 30 years) whether there was a polynomial-time cutting-plane algorithm for this problem.
A brief summary of how they achieve this is as follows. They start with a natural, well-known LP relaxation of the MWPM problem, called the bipartite relaxation. This relaxation has the nice property that all basic optima x are half-integral, and the support of x is a disjoint union of edges and odd cycles. This makes it easy to find cuts (the cuts correspond to what are called blossom inequalities, see the paper for details). A major challenge, though, is that naively adding cuts will not preserve the half-integrality of intermediate LPs, so at each iteration they throw away some of the old cuts that were added earlier in the execution. They need to take considerable care in choosing which cuts to keep in order to guarantee half-integrality of intermediate LPs (and to ensure that their algorithm makes progress at a sufficiently high rate).
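For reference, the relaxation and the cuts being discussed are the standard ones for perfect matching. Written out in my own notation (not the paper's), the bipartite relaxation of min-cost perfect matching on a graph G = (V, E) with costs c, and the blossom inequalities added as cutting planes, are:

```latex
% Bipartite (degree) relaxation of min-cost perfect matching on G = (V, E):
\min \sum_{e \in E} c_e x_e
\quad \text{s.t.} \quad \sum_{e \in \delta(v)} x_e = 1 \;\; \forall v \in V, \qquad x_e \ge 0.
% Blossom inequalities, added as cutting planes:
\sum_{e \in \delta(S)} x_e \ge 1 \qquad \text{for all } S \subseteq V \text{ with } |S| \text{ odd and } |S| \ge 3.
```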
This is pretty amazing. First, it is wonderful that they were able to prove polynomiality. It had bothered me that it seemed you might need an exponential number of cuts, even for something like matching. I had looked at this 25 years ago when doing my doctorate, but didn’t have any particularly insightful ideas.
But the really amazing thing is that they were able to arrange their algorithm so they never had to work with anything worse than half-integral solutions.  This is astounding!  A bane of cutting plane approaches is the weird fractions that keep popping up, leading to numerical stability problems.  Here, they were able to keep things well under control.  And, by keeping to half-integral, maybe some old ideas I had about using generalized networks (networks with multipliers) might come back into play.   The approach certainly avoids the need for Gomory-Hu cut tree approaches to finding violated inequalities:  violated inequalities come straight out of  connected components.  This also harkens back to my dissertation where I had treated matching as a generalized network with side constraints.
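To make the “cuts come straight out of connected components” remark concrete, here is a minimal sketch of what that separation step might look like for a half-integral solution. This is my own illustration (using networkx; the function name and tolerance are arbitrary), not code from the paper:

```python
import networkx as nx

def violated_blossom_cuts(nodes, x):
    """Given a half-integral LP solution x (a dict mapping edges (u, v) to values),
    return odd vertex sets S whose blossom inequality sum_{e in delta(S)} x_e >= 1
    is violated.  For a basic half-integral optimum the support is a disjoint union
    of edges and odd cycles, so connected components are all we need."""
    support = nx.Graph()
    support.add_nodes_from(nodes)
    support.add_edges_from(e for e, val in x.items() if val > 1e-9)

    cuts = []
    for comp in nx.connected_components(support):
        if len(comp) % 2 == 0 or len(comp) < 3:
            continue  # blossom inequalities only apply to odd sets of size >= 3
        # value of x across the cut delta(comp)
        cut_value = sum(val for (u, v), val in x.items() if (u in comp) != (v in comp))
        if cut_value < 1 - 1e-9:
            cuts.append(frozenset(comp))
    return cuts
```

An odd-cycle component with all edges at 1/2 already satisfies every degree constraint internally, so nothing crosses its cut and its blossom inequality is violated by a full unit; no Gomory-Hu machinery is needed for that.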
So I checked out the paper that underlies the talk at arXiv, thinking I might try to implement some things this week and see how it works (CS theory people rarely implement:  you could make an entire career out of simply implementing what CS people suggest).  On the plus side, they reference my dissertation, so at least I am in the right ballpark.  On the down side:  it is looking a bit complicated!  Looks like I will have to reel in one of the bright doctoral students around to plow through this with me.
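In the meantime, here is the kind of bare-bones cutting-plane driver I would probably start from, just to get a feel for the behavior. To be clear, this is a generic loop under my own assumptions (scipy for the LPs, a separation routine like the one sketched above); it does not implement the paper's cut-retention rules, which are exactly the part that preserves half-integrality and yields the polynomial bound:

```python
import numpy as np
from scipy.optimize import linprog

def cutting_plane_matching(edges, costs, n_nodes, separate, max_rounds=100):
    """edges: list of (u, v) pairs on nodes 0..n_nodes-1; costs: list of edge costs;
    separate(nodes, x) returns odd vertex sets whose blossom inequality is violated
    (for example, the violated_blossom_cuts sketch above)."""
    m = len(edges)
    # Degree constraints: the x values on edges incident to each vertex sum to 1.
    A_eq = np.zeros((n_nodes, m))
    for j, (u, v) in enumerate(edges):
        A_eq[u, j] = A_eq[v, j] = 1.0
    b_eq = np.ones(n_nodes)

    cut_rows, cut_rhs = [], []  # accumulated blossom cuts
    for _ in range(max_rounds):
        # Blossom cuts are ">= 1" constraints; linprog wants "<=", so negate them.
        A_ub = -np.array(cut_rows) if cut_rows else None
        b_ub = -np.array(cut_rhs) if cut_rhs else None
        res = linprog(costs, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                      bounds=[(0, 1)] * m, method="highs")
        if res.status != 0:
            return None  # LP infeasible (no perfect matching) or solver trouble
        x = {e: res.x[j] for j, e in enumerate(edges)}
        new_cuts = separate(range(n_nodes), x)
        if not new_cuts:
            return x  # no violated blossom inequality found
        for S in new_cuts:
            cut_rows.append([1.0 if (u in S) != (v in S) else 0.0 for (u, v) in edges])
            cut_rhs.append(1.0)
    return None  # gave up after max_rounds
```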
And I wonder what else this might be used for?

Optimizing Angry Birds

Most operations research competitions look at problems that are, frankly, dull. At least to the non-OR-connoisseur. Whether it is optimizing computer usage, nurse scheduling, financial portfolio creation, or rescheduling airplanes, there is a certain utilitarianism about operations research competitions. There are some competitions that have some level of charm (predicting murders in Philadelphia, perhaps). But, to the outsider, most of these competitions sound as exciting as eating one’s vegetables. Good to do, but rarely filled with anticipation.

I note that our colleagues in artificial intelligence and planning have addressed this issue head on by developing the Angry Birds Challenge. If you do not know the Angry Birds game, get yourself down to the airport, sit next to any waiting business person with a tablet, and watch what they are doing: chances are they are playing Angry Birds. Or ask any eight-year-old.

With more than a billion downloads, Angry Birds is one of the world’s leading wasters of computer cycles.  The game is popular enough that my wife and I went as game characters last Halloween, as you can see.

To summarize, your goal in Angry Birds is to slingshot birds towards various structures in an attempt to kill pigs lurking within (really!).  Your birds have different capabilities (some might drop bombs, others can split into multiple birds), and it is necessary to use those capabilities to get to all the pigs.

The game is not as mindless as it might appear. The underlying physics engine is very good, so the collapse of the structures when hit by the birds generally feels right. Killing the pigs takes foresight, planning, creativity (of a very specialized kind), and sometimes a little luck. So, a natural question is: can you create a computer system to play Angry Birds?

For some human time-wasters, computer systems are very easy to create. It is straightforward to write a sudoku solver far better than any human. For Angry Birds, it is not so clear. There is no obvious brute-force search that has a hope of leading to a solution. Even the problem representation seems to be a difficult issue. This looks to be quite a challenging problem!
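To give a sense of what “straightforward” means here: a complete sudoku solver is a couple of dozen lines of brute-force backtracking (my own throwaway sketch below, nothing clever), while nothing remotely this simple has a chance against Angry Birds.

```python
def solve(grid):
    """Solve a sudoku in place; grid is a 9x9 list of lists with 0 for empty cells."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                for d in range(1, 10):
                    if valid(grid, r, c, d):
                        grid[r][c] = d
                        if solve(grid):
                            return True
                        grid[r][c] = 0  # undo and try the next digit
                return False  # no digit fits here: backtrack
    return True  # no empty cells left: solved

def valid(grid, r, c, d):
    """Can digit d go in row r, column c without violating the sudoku rules?"""
    if any(grid[r][j] == d for j in range(9)):  # row
        return False
    if any(grid[i][c] == d for i in range(9)):  # column
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)  # top-left corner of the 3x3 box
    return all(grid[br + i][bc + j] != d for i in range(3) for j in range(3))
```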

I don’t see how to use traditional operations research methods for this problem.  But it would be great if we could show that operations research is awesome enough to solve the big problem of our time:  how to get three stars on level 4-14!