Get Money from Google for Operations Research, in Europe anyway

Google seems to be getting the idea that operations research has a lot to add to the company.  This is a long way from the time when I got the following response from a Google software engineer at an information systems conference: “Integer programming?  I tried it:  it doesn’t work”.  Oh, really?

This change is great for our field:  the line between operations research and computer science is extremely fuzzy and is getting fuzzier as “business analytics” unites traditionally CS areas such as data mining with traditionally OR areas such as optimization.  Operations Research needs to be seen as a key reason for the continuing success of companies like Google.

If you are in Europe, you can even get some money for a post-doc or other research expenses from Google, as per this announcement from Laurent Perron:

Object: Call for Proposal for Google Focused Grant Program on Mathematical Optimization and Combinatorial Optimization in Europe.
Deadline: November 25th (strict).
Contact: proposal should be sent to both Laurent Perron ( and Michel Benard (
Format: A proposal is a 3-page document following the format described in

The purpose of this program is to facilitate more interaction between Google and academia and also nurture stronger relations and partnerships with universities. The intent of this focused awards program is to support academic research aimed at improving the theory and applications of mathematical and combinatorial optimization (Operations Research, Constraint Programming, Meta-Heuristics). Google funds Research Awards unrestricted and retains no intellectual property from the research. We prefer if the results from the research are open sourced and widely published. Awards through this program are for one year in the range of $10K-$150K.
A typical grant will cover 1 Post-Doctoral student for 1 year. This program is restricted to Europe.

Areas that are of particular interest include (but are not limited to):

* Inference and relaxation methods: constraint propagation, cutting planes, global constraints, graph algorithms, dynamic programming, Lagrangean and convex relaxations, counting based inferences and heuristics, constraint relaxation based heuristics.

* Search methods: branch and bound, intelligent backtracking, incomplete search, randomized search, column generation and other decomposition methods, local search, meta-heuristics, large scale parallelism.

* Integration methods: static/dynamic problem decomposition, solver communication, transformations between models and solvers, collaboration between concurrent methods, models, and solvers.

* Modeling methods: comparison of models, symmetry breaking, uncertainty, dominance relationships, model compilation into different technologies (CP, LP, etc.), learning (CP, LP) models from examples.

* Innovative applications derived from OR techniques.

As a point of comparison, last year, we funded 9 grants in the following fields: Explanations in Constraint Programming, SAT techniques in Scheduling, Just-In-Time scheduling, Complex Bin-Packing, Parallel Resources in Scheduling, Convex Integer Programming, Ride sharing, Large Scale Mixed Integer Programming, and Territory Design.

Google values well written proposals that fit into the 3 page format + references. These proposals should include the following sections:
– A clear description of the problem the authors are trying to solve, and the potential impact they could have if the research is successful.
– An overview of the state of the art in this field and an explanation of why the proposed research is innovative.
– A precise understanding of how the authors are going to solve this problem.
– A convincing experimental method that will prove the authors are solving the real problem on realistic data instances, and that will measure the actual gains.
– The biography and a link to recent publications from the P.I. related to the proposal.

Finally, receiving a Google Research Award does not grant access to Google data. In some instances Google researchers may however be interested in collaborating on the problem formulation and the algorithmic solution.

Operations Research: The Sort of Decisions That Will Get You Fired

I just saw an ad for “Moneyball”, a new movie based on the book by Michael Lewis. A baseball general manager (Billy Beane of the Oakland Athletics) used analytics (“sabermetrics” in the baseball world) to choose players who were undervalued by the rest of the baseball world.  Beane had a constrained optimization problem:  he had to get as many wins as possible under a tightly binding budget constraint.  His solution to that problem was to concentrate on statistics that seemed to be undervalued in the market, notably “on-base percentage” (if you don’t know baseball, this gets a bit opaque, but getting on base is not as “sexy” as hitting home runs:  home-run hitters are expensive; players who just get on base were cheap at the time).
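
Beane's problem can be sketched as a tiny 0/1 knapsack: maximize total on-base contribution subject to a payroll cap. The player names, salaries, and OBP figures below are invented for illustration (they are not real Moneyball data), and the dynamic program is just one simple way to solve such a model:

```python
# Toy sketch of the roster problem as a 0/1 knapsack: maximize total
# expected on-base contribution subject to a payroll budget.
# All player names, salaries, and OBP figures are invented.

def pick_roster(players, budget):
    """players: list of (name, salary, obp); budget: payroll cap.
    Returns (best total OBP, chosen names) via dynamic programming."""
    # dp maps payroll spent -> (best total OBP, tuple of chosen names)
    dp = {0: (0.0, ())}
    for name, salary, obp in players:
        # iterate over a snapshot so each player is used at most once
        for spent, (total, chosen) in list(dp.items()):
            cost = spent + salary
            if cost <= budget:
                cand = (total + obp, chosen + (name,))
                if cand[0] > dp.get(cost, (-1.0,))[0]:
                    dp[cost] = cand
    return max(dp.values())

players = [               # (name, salary in $M, on-base percentage)
    ("slugger", 10, 0.360),   # expensive home-run hitter
    ("walker",   2, 0.380),   # cheap, just gets on base
    ("grinder",  3, 0.370),
    ("veteran",  6, 0.350),
]
best, roster = pick_roster(players, budget=11)
# The budget shuts out the expensive slugger: three cheap on-base
# players deliver more total OBP than he does.
```

Under these made-up numbers, the optimizer skips the $10M slugger entirely, which is exactly the kind of counterintuitive answer the post is about.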

There is a great line in the ad.  A colleague (the “stats guy”) of Beane says:

This is the type of decision that will get you fired!

Brad Pitt, playing Beane, looks worried, but perseveres.  See the official ad at about 18 seconds.

[Unofficial ad deleted.]

I love that line, since it really does sum up what operations research (and make no mistake: “Moneyball” is an operations research film) is all about. When you do operations research, you create models of reality. You do not create models of decisions. The decisions come from the models. And sometimes, the decisions don’t look at all like what you expected. And that is when it gets interesting.

Sometimes these unexpected decisions are due to modeling failures: you have forgotten a constraint, or a key modeling assumption turns out to not only be incorrect (assumptions almost always have some level of incorrectness) but critically incorrect. Optimization is really good at putting solutions right where the models are the weakest. And so you modify the model, not in order to change the decision, but in order to better represent reality. And you get new decisions. And you iterate between modeling and decisions until you reach a model that you believe represents reality. At that point, the decisions are of two types. They might tell you to do what you are doing, but do it better. And that is comforting and probably improves the decision making in the organization.

Or they tell you to do something completely different. And that is when you get to “Decisions that might get you fired.” That is when you need to decide whether you believe in your model and believe in the decisions it has generated. It would certainly be easy to change the model, not to reflect reality, but to force the decisions you believe are right. But if you really believe the model, then you need to avoid that easy path. You really need to decide whether you believe in the model and the resulting decisions.

I worked with a company a few years ago on their supply chain design. The results of the model came back over and over again saying two things: there were too many distribution centers, a result everyone believed, and it was far better for each distribution center to specialize in particular products, rather than have every center handle every product. The latter decision went deeply against the grain of the organization, and objection after objection was raised against the model. It would have been easy to put in a constraint “Every distribution center has to handle every product”. But there was no justification for this constraint except the ingrained belief of the client. In fact, we could show that adding the constraint was costing the organization a significant amount of money. Eventually, at least some of the organization bought into the decisions and began devising specialized distribution centers, but it was gut-wrenching, and perhaps career threatening. After all the discussion and fighting against the decisions, I am convinced those were the right choices: the organization had to change, not just improve.
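
The trade-off behind that specialization result can be seen in a toy brute-force model (all centers, products, and costs below are invented, and the real engagement involved a far richer network model): stocking a product at a center incurs a fixed handling cost, and each region is served from the cheapest center that stocks the product. When handling costs are large relative to shipping, specialization wins:

```python
# Toy comparison: specialize distribution centers by product, or force
# every center to handle every product? All numbers are invented.
from itertools import product as cartesian

DCS = ("east", "west")
PRODUCTS = ("A", "B")
FIXED = 100                     # handling cost to stock a product at a DC
SHIP = {("east", "r1"): 1, ("east", "r2"): 5,   # per-unit shipping cost
        ("west", "r1"): 5, ("west", "r2"): 1}
DEMAND = {("A", "r1"): 10, ("A", "r2"): 10,     # units by product, region
          ("B", "r1"): 10, ("B", "r2"): 10}

def total_cost(stocking):
    """stocking[p] = set of DCs that carry product p."""
    cost = FIXED * sum(len(dcs) for dcs in stocking.values())
    for (p, r), qty in DEMAND.items():
        cost += qty * min(SHIP[dc, r] for dc in stocking[p])
    return cost

# Enumerate every way to assign each product a nonempty set of DCs.
choices = ({"east"}, {"west"}, {"east", "west"})
best_combo = min(cartesian(choices, repeat=len(PRODUCTS)),
                 key=lambda combo: total_cost(dict(zip(PRODUCTS, combo))))
best_cost = total_cost(dict(zip(PRODUCTS, best_combo)))

# The client's ingrained rule: every DC handles every product.
everywhere = total_cost({p: set(DCS) for p in PRODUCTS})
```

In this sketch the "every center handles everything" rule is strictly more expensive than the unconstrained optimum, mirroring the argument that the extra constraint was costing the organization real money.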

“Operations Research: The Sort of Decisions That Will Get You Fired” doesn’t have the ring of “The Science of Better”. But the insights OR can get you may lead to radically different solutions than the incremental changes the SoB campaign implied. And those are the changes that can fundamentally change firms and organizations. And careers.

Another Operations Research Challenge: ROADEF, EURO, and Google

I am a big fan of “challenges” in operations research.  Almost twenty years ago, I ran a DIMACS Challenge on finding cliques, coloring graphs and solving satisfiability problems.  That challenge gave a clear picture of where we were then in those areas and showed the variety of approaches possible for those three problems.  I also  met a large number of people and took pride in creating something useful to many.

ROADEF (the French OR Society) has run a series of Challenges over the years.  These challenges have been inspired by real-world problems and are generally very rich in detail (details on past Challenges, including test instances, are available at the Challenge site).  The recently announced 2012 Challenge is no different.  The problem to be solved comes from Google, and involves assigning jobs to machines.

The aim of this challenge is to improve the usage of a set of machines. A machine has several resources, for example RAM and CPU, and runs processes which consume these resources. Initially each process is assigned to a machine. In order to improve machine usage, processes can be moved from one machine to another. Possible moves are limited by hard constraints, for example resource capacity constraints, and have a cost. A solution to this problem is a new process-machine assignment which satisfies all hard constraints and minimizes a given objective cost.
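
The core of that description can be sketched in a few lines: check resource-capacity feasibility of an assignment and score a move cost. The instance data and the single cost term below are invented simplifications; the actual 2012 Challenge adds services, locations, conflicts, and several more cost components:

```python
# Minimal sketch of the machine-reassignment core: capacity
# feasibility plus a move-cost objective. Toy data, not a real instance.

def is_feasible(assign, demand, capacity):
    """assign[p] = machine of process p; demand[p][r] = need of process
    p for resource r; capacity[m][r] = capacity of machine m for r."""
    n_res = len(next(iter(demand.values())))
    used = {m: [0] * n_res for m in capacity}
    for p, m in assign.items():
        for r, d in enumerate(demand[p]):
            used[m][r] += d
    return all(used[m][r] <= cap
               for m, caps in capacity.items()
               for r, cap in enumerate(caps))

def move_cost(assign, initial, cost):
    """Sum of per-process costs for processes moved off their
    initial machine."""
    return sum(cost[p] for p in assign if assign[p] != initial[p])

capacity = {"m1": [4, 8], "m2": [4, 8]}          # resources: CPU, RAM
demand   = {"p1": [2, 6], "p2": [2, 6], "p3": [1, 1]}
initial  = {"p1": "m1", "p2": "m1", "p3": "m2"}  # m1 RAM overloaded: 12 > 8
cost     = {"p1": 3, "p2": 1, "p3": 2}

# Repair the overload by moving the cheaper of the two big processes.
moved = dict(initial, p2="m2")
```

A real solver would search over many such moves (local search, MIP, CP, …) rather than hand-pick one, but the feasibility check and objective are the scaffolding any approach needs.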

The problem description then goes on for 10 pages, including issues such as locations, services, conflicts, moving costs and much, much more.  This certainly looks like a real problem and one that a firm like Google would need to solve routinely and quickly.  I can also see a number of different approaches that might work.  I look forward to seeing what methods are successful for this.

This Challenge is sponsored by ROADEF, EURO, and Google.  Google, for all its success, has only recently started to use and support operations research (though the line between computer science and operations research is very fuzzy in this area), so I am delighted to see their support of this Challenge.  Now, how about some million dollar prizes to go with it … (though 20,000 euros is not to be sniffed at).

New Optimization Software Version: Gurobi

The INFORMS Practice Meeting has become a good place for optimization software firms to announce their new versions. Gurobi is first off the mark with an announcement of version 3.0. New aspects include better use of multiple cores in the barrier solver and what looks to be significant improvements to the mixed integer programming solver. Quadratic programming will wait for version 4.0, with an expected release in November, 2010.

Future plans include second order cone programming (SOCP), including a mixed integer version. I really think I should know more about SOCP, but it doesn’t seem to fit with me: I see mixed integer linear programs everywhere I look, but never say “Wow, now there is a SOCP”. Perhaps I’ll start seeing them in time for the Gurobi release a couple of versions down the road.

Free IBM/ILOG Software…

… if you are an academic.

As part of the “blue-washing” of ILOG, the academic licensing system for IBM’s OPL Studio, constraint programming system, and CPLEX has changed to become part of IBM’s Academic Initiative Program.  Here is the full announcement:

Effective February 15, 2010, IBM is offering no-charge full-version ILOG Optimization products via IBM’s Academic Initiative program ( This move builds on the availability of limited Teaching Editions available since August 2009, and now provides registered members with renewable one-year access to IBM ILOG OPL-CPLEX, CPLEX, CP Optimizer and CP for both teaching and research use. Registered members can access ILOG Optimization software at:, where they can search for ILOG Optimization or individual solvers by name. Depending on the search criteria, electronic assemblies will be retrieved for IBM ILOG OPL Development Studio, CPLEX, CP Optimizer or CP on all commercially available platforms. To run the software, members will also need to download a key from:, but are no longer required to install ILM. Note that as part of Academic Initiative, IBM also makes its professionally-developed training materials available for use in classrooms and labs at:

I signed up for the program yesterday, and it was painless. Once I showed my inability to read simple instructions by missing the “how to download a key” section (an issue quickly and cheerfully handled by IBM), I was up and going with CPLEX. There are also some educational materials on both the CP and CPLEX side which look very well done.

Open Source = Geek?

The parent of has decided to become Geeknet, Inc (how much did they have to pay for, I wonder). I have mixed feelings on this.  On one hand, I am trying to learn from Wil Wheaton and embrace my inner geek.  I have done this so well that my colleagues tell me my inner geek is leaking out quite a bit to my outer geek.  So celebrating geekdom is good, right?

Well, maybe not right.  I am involved in an operations research open source endeavor, COIN-OR.  Part of COIN-OR is convincing companies that open source is not too nerdy or experimental.  There are lots of good reasons to use open source software. Embracing your inner geek is probably not one of the more persuasive ones.  One vision of COIN-OR was to be “Sourceforge for operations research”.  Now should we become “Like operations research, only geekier?”.

The Sourceforge Blog has an amusing post on the top ten rejected names, including my favorite:

7. FLOSSdaily

I think the entry summarizes the issue correctly:

the consensus seems to be that, as long as we don’t change the name of, it doesn’t really matter what we call the corporation, so I think we’re good.

But I would be worried about the next “branding” idea from the good people at Geeknet, Inc.  Perhaps their inner geek should stay a little more … inside.

Solver Foundation Version 2 Announced

Just in time for the INFORMS Meeting in San Diego, the Microsoft Solver Foundation folks have announced Solver Foundation Version 2. I am particularly excited about this since I have promised (and have been negligent about) a new course this year entitled “Operations Research Implementations” aimed at our MBA students. The idea is to go beyond the 20 variable, 20 constraint models into some “real” models. We (I am doing this with Willem van Hoeve) are looking at a number of optimization systems on which to base this course. Our students are extremely comfortable with Excel, and the screen shot from the post is quite provocative. The students are generally not programmers, but maybe this is something they can navigate.

Lots of new additions to Solver Foundation 2, including automatic handling of uncertainty, better MIP performance through Gurobi 2.0 as the default solver, constraint programming improvements and much more. I wonder if our MBAs are ready for Second Order Conic Programming?

Gurobi 2.0 Released

Version 2.0 of the Gurobi Optimizer has been released. Since it was only about a year ago that I first blogged about Gurobi, it has moved along very quickly. The “horse race” between Gurobi, IBM, and FICO doesn’t have a simple answer, but it is clear that Gurobi is competitive with the other two solvers, and is sometimes much faster.

I’ll leave it to Hans and others to report on the speed of the new version. Let me point out some interesting aspects on other issues:

  • There is an interesting academic license. You can get a free, no limits, one-week license. How useful is a license for one week? Well, the key point is that as long as you are hooked to an academic domain, you can renew the license as often as you like. And the signup and renewal does not require any human intervention. This is a great idea! Students (all of them!) can play around with (and do serious work with) the full version as much as they like. When they graduate, their ability to renew their license goes away. This approach to licensing from a commercial firm gives a lot of flexibility in the academic environment.
  • Gurobi now offers cloud computing. You can use Amazon Elastic Computing Cloud (EC2) to do your work. Here the pricing is by the hour. So if you need to do lots of computing in a short amount of time, you don’t need to buy a lot of licenses: you can let EC2 do the work for you (at what seems to be a pretty low per hour cost).
  • Gurobi now does “irreducible infeasible subset” (something that CPLEX has had for a while). This finds a small set of constraints and variables that cause infeasibility in a model. I find this extremely useful for finding bugs in my models. This also forms the basis for work I am doing on Benders’ approaches, so it is great to see it in Gurobi. [An email from Gurobi reminds me that the previous version of Gurobi had this for LP; the 2.0 version is for both LP and MIP.]
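
One standard way to compute such a subset is the deletion filter: drop each constraint in turn, and keep it only if dropping it restores feasibility. Below is a minimal sketch on a toy system where each "constraint" is just an interval for a single variable, so feasibility means the intervals overlap; real solvers apply the same idea to full LP/MIP constraint systems:

```python
# Deletion-filter sketch for an irreducible infeasible subset (IIS).
# Toy setting: each constraint is (name, lo, hi), meaning lo <= x <= hi.

def feasible(constraints):
    """A set of intervals is feasible iff they have a common point."""
    if not constraints:
        return True
    lo = max(l for _, l, _ in constraints)
    hi = min(h for _, _, h in constraints)
    return lo <= hi

def deletion_filter(constraints):
    """Drop each constraint in turn; keep it only if removing it makes
    the system feasible. What remains is irreducibly infeasible."""
    assert not feasible(constraints)
    iis = list(constraints)
    for c in list(iis):
        trial = [other for other in iis if other is not c]
        if not feasible(trial):
            iis = trial          # c is not needed for infeasibility
    return iis

cons = [("a", 0, 10), ("b", 5, 20), ("c", 12, 30), ("d", 0, 100)]
# Infeasible: "a" forces x <= 10 while "c" forces x >= 12.
iis = deletion_filter(cons)      # isolates just {"a", "c"}
```

The result is infeasible as a whole but becomes feasible if any single member is dropped, which is exactly what makes an IIS so useful for pinpointing modeling bugs.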

It is great to see the activity going on: advances by one solver will spur more efforts from the rest of the solver companies. And I really like Gurobi’s creativity when it comes to licenses and distribution.

Mittelmann’s Benchmarks: CPLEX versus Gurobi

Hans Mittelmann has some new benchmarks comparing CPLEX 12.1 with GUROBI 1.1.2 on various mixed integer linear programming instances (I last wrote on these benchmarks last January with earlier versions of both codes:  be sure to check out the comments from that post since many of those comments apply to this also).  He covers both feasibility and optimization problems.

Checking the computation times for one set of optimization instances:

  problem     CPLEX1    GUROBI1     CPLEX4    GUROBI4
bc1            101         94         81         51
bienst2        165         58         87         27
dano3_5        148        618        276        362
mark._4_0      225         45         86         36
mark._5_0        f       6909          f       4368
mkc1             5        113          5
qap10           51         76         48         73
seymour1       178        285        116        161
swath2          13         10          5          6
swath3          26        563         49         53
30_05_100        7         11          4         11

The CPLEX1 and GUROBI1 columns use one processor; the …4 columns use four.  The “f” means optimality was not proved in 7200 seconds (all times in seconds).   The above table is not representative, so be sure to check out Hans’ page.

A couple conclusions that I come up with (your mileage may vary):

  • Gurobi is a very impressive code:  whether CPLEX or Gurobi will be faster on a particular instance is not predictable, which is an impressive feat.
  • Parallel speedup is quite unpredictable.  Often there is no, or minimal, speedup;  sometimes there is huge speedup;  sometimes there is a slowdown.  None of this is surprising:  the initial linear relaxation is hard to parallelize, and the nondeterministic nature of tree search can lead to getting “lucky” or “unlucky” in the parallel search.

If we pull a few lines from the feasibility benchmarks:

problem        CPLEX    FP-ab     FP-bfl    SCIP     GUROBI
atlanta-ip      848        23        21       391       189
core4872-1529     1       413        52       374         1
ds                1         -      5350        38         3
germanrr        183        10         6       200        19
momentum1         1        23        17      6257         2
momentum2      6181        47        71      1504      4118
momentum3         1       350       627         -         4
msc98-ip        185        19        25      1754        33
neos16           47       236       241        43        72
neos-506428    1338      2969      4531      1739      2694

(FP are Feasibility Pump approaches; SCIP is a fast non-commercial code).   Here the results are all over the map:  practically anything can win and anything can lose, and there can be three orders of magnitude difference between running times.

Now speed is not the only thing to look for in an optimization code (support, interfaces, price are a few other things to look at), but these benchmarks are very useful in trying to compare codes.  Thanks Hans!

Bluewashing Complete!

Two things happened to me last week with regards to IBM/ILOG (a topic I have written about quite a bit):

  1. I was on a conference call with an ILOG person and I said, jokingly, “Is that XXX from ILOG, an IBM Company?”.  His reply:  “Nope, just XXX from IBM”.  The “ILOG, an IBM Company” phrasing ended earlier this summer.
  2. I got an email from the PR people at ILOG (IBM, damn it, IBM!) saying, in essence, “Bluewashing for ILOG is complete”.   “Bluewashing”?  What the heck is that?  A web search suggests something about agreeing with the United Nations to not use child labor, but somehow I can’t see poor third world children toiling over ever-more-complicated polyhedral facets.

So last Thursday, I had a conference call with a number of IBM/ILOG people.  I don’t do many of these types of calls:  as is obvious from the blog, I am not a journalist, and I tend to get so caught up in the conversation that I don’t take notes or act as anything other than an interested academic.  For instance, last spring, I had a wonderful private lunch with Pierre Haren (the CEO of ILOG turned Vice President at IBM).  Great conversation resulted in notes like “Optimization, constraint programming, integration”…  sigh…

Back to my call on Thursday.  “Bluewashing” turns out to be the phrase that IBM uses to describe the process of turning a new acquisition into part of IBM.  While the term sounds somewhat ominous, it seems much milder (anyone with an inside view to the contrary is welcome to comment!).  Once IBM owns code, for instance, it must be bluewashed to meet IBM standards.  If there is open source code included in acquired code, then it must be examined to determine if it meets IBM’s standards for using the code (is the license compatible with IBM usage and so on).  If not, then “bluewashed” code is substituted in.    IBM also has standard rules for things like academic licenses, so those had to be changed (made better, at least as far as support goes:  now academic licenses include full support).  People need to be “bluewashed” also, which I understood to mainly mean things like standardized benefits and so on (and not rubber hoses while the history of IBM is recited).

Coinciding with this is a change in corporate name.  ILOG is no longer “ILOG an IBM Company”;  ILOG is a brand name for a series of products that are part of IBM WebSphere (you can see this on the ILOG home page, where it used to say “ILOG an IBM Company”).  ILOG people are now simply IBM people.  Organizationally, things haven’t changed very much:  the former “ILOG people” are in an organization under Pierre Haren and have not been scattered throughout the IBM system.

A couple of interesting aspects of my call.  Most of the conversation was with Lorenzo Delavegas, who is lead of integration for the ILOG acquisition.   In a previous life, I ran the Carnegie Bosch Institute for Applied Studies in International Management, where one of the big research issues was how to successfully handle a merger or acquisition.  Most mergers and acquisitions do not meet the rather rosy expectations that went into them, and many of them destroy shareholder value (see Daimler-Chrysler for an example).  IBM has specialists like Lorenzo oversee the integration of new companies to try to increase the chance of success.  Since he is not an operations research person, it was interesting to get his perspective on how ILOG fits within IBM.

The one phrase that was repeated multiple times was “stars lining up”.  It was obvious that the driver for IBM’s acquisition of ILOG was rules.  No one at IBM or ILOG has insulted our intelligence by claiming anything else.  But optimization, in the form of CPLEX and the constraint programming software and in the people involved in creating and deploying that software, is now seen as a tremendous asset in IBM’s strategy, as shown in IBM’s Smarter Planet and IBM Business Analytics and Optimization.  These look to be great opportunities for operations research within IBM.  And from the call, I certainly feel pretty confident about CPLEX and the constraint programming codes as ILOG products from IBM.  Both the IBMers and ILOGers talked about how much easier it is to reach different communities through IBM.

I was also interested in how having CPLEX would affect IBM’s very strong support for open source, particularly through COIN-OR.  While I certainly did not get the feeling that the CPLEX team will be dropping everything to concentrate on open source efforts, I also ended up feeling that the acquisition is at worst neutral for open source optimization, and might well become an asset in time.

The ostensible reason for the PR call was the announcement of CPLEX 12 and other versions of ILOG’s software.  Most of this was announced at the INFORMS meeting in the spring (this is one reason to go to conferences:  you really do get a bit ahead of the crowd by being there).  The Excel interface has been changed since it was presented in the spring, I hear, so I look forward to checking that out.

It is not surprising that people on a PR call would be pretty upbeat about how things are going, but it was contagious.  IBM has always been strong in operations research, but I get the feeling that more of the company is seeing that strength.   And that can only be good for operations research.

My final comment for Lorenzo and the IBM group:  I would love to see the phrase “operations research” used in the Smarter Planet commercials!