Operations Research in the Caymans?

Through the magic of “Google Alerts“, I get email every time “operations research” (and a bunch of other terms, some of which are none of your business) appears in the press. I was delighted to see an article in the Cayman Net News about operations research. While we love New Zealand, I will say our first choice was a sabbatical year in the Caymans. But that sounded a little too “recreational”. This was clearly just a failure of imagination. So, Tracy Hargrave, thanks for the OR plug, and if you need a lecturer to explain OR down there, just drop me a note!

Algorithms and the Economist

Economist logoThe Economist, in its September 13th issue, has a nice article on how businesses use algorithms to work better. From the introduction:

ALGORITHMS sound scary, of interest only to dome-headed mathematicians. In fact they have become the instruction manuals for a host of routine consumer transactions. Browse for a book on Amazon.com and algorithms generate recommendations for other titles to buy. Buy a copy and they help a logistics firm to decide on the best delivery route. Ring to check your order’s progress and more algorithms spring into action to determine the quickest connection to and through a call-centre. From analysing credit-card transactions to deciding how to stack supermarket shelves, algorithms now underpin a large amount of everyday life.

They have a number of very nice examples, including UPS:

For its fleet of aircraft in America, the company uses an algorithm called VOLCANO (which stands for Volume, Location and Aircraft Network Optimiser). Developed jointly with the Massachusetts Institute of Technology (MIT), it is used by three different planning groups within UPS—one to plan schedules for the following four to six months, one to work out what kind of facilities and aircraft might be needed over the next two to ten years, and one to plan for the peak season between Thanksgiving and Christmas. Getting the scheduling wrong imposes a heavy cost: flying half-empty planes or leasing extra aircraft is an expensive business. UPSVOLCANO has saved the company tens of millions of dollars since its introduction in 2000.

This UPS example is one of my favorite classroom examples of how algorithmic advances can be a key driver in success for a firm.

Of course, the phrase “operations research” never appears. I guess if people find algorithms scary, operations research will drive them away screaming. The magazine unwittingly shows why operations research never seems to get its due. In a sidebar, “ant algorithms” are given as an example of a powerful algorithm “with great potential”. The great strength of “ant algorithms” is a great name and a wonderful analogy. As for potential, perhaps… I would love to see an ant algorithm that can come close to the integer programming techniques that underly VOLCANO and other UPS systems.

Thanks to the four or five people who wrote me about this article. Even without mentioning operations research, articles like this are good for our field: they teach people the importance of our sorts of approaches to problem solving.

Math, Poker, and Peter Winkler

Peter Winkler was kind enough to send me a copy of his new book Mathematical Mind-Benders. It is full of wonderful puzzles, at a level similar to the level aimed at by Martin Gardner (I have written about Peter’s work before): go and buy it! Here is a question (for which I will provide the answer in a day or so):

In poker, what is the best full house?

Comment: You may assume you are playing straight five-card stud poker (everyone gets five cards, all closed, no exchange] with (say) five companions, using a single normal deck. As a result of God owing you a favor, you are entitled to a full house, and you get to choose the full house you will get. Which should you choose?

This one I got (I get about 1 in 10 of the puzzles, mainly because either I have seen them or I know the relevant professional literature). I wonder how many professional poker players get it? Answer in a day or two. Comments with guesses welcome.

Pecha-Kucha: the Solution to a Conference Problem

I am involved in a number of small (100 attendees or so) conference series: CPAI-OR, PATAT, MISTA, and a number of others. One problem all of these smaller conferences have is getting enough people to attend. Typically, they are competitive, so only accept a limited (30 or so) number of papers. This means we get 30 or so attendees naturally, and then have to struggle to attract more “non-speakers”. Many universities will not fund travel when no talk is given, so getting an audience can be a struggle.

Large conferences like INFORMS have a related problem: there are just not enough rooms for everyone. At the last INFORMS conference in Pittsburgh, we ran 55 parallel sessions and packed 4 talks into practically every 90 minute session. It looked like we might not have enough space. How could we let everyone in who wants to talk (INFORMS conferences are decidedly not competitive: the organization believes that everyone who wants a chance to talk should be able to talk)?

One common way of handling these constraints is poster sessions, where speakers print out a precis of their work, stick it on posterboards, and then stand around hoping someone will come talk to them. For some fields, this works well, but it rarely works in OR. Often the poster sessions are poorly attended or are of somewhat less status.

Wired magazine has an article on a solution to all this: Pecha Kucha. Pecha Kucha was “invented” in Tokyo for architects to show their stuff off. It has now taken off as a form of performance art.

Let us now bullet-point our praise for Mark Dytham and Astrid Klein, two Tokyo-based architects who have turned PowerPoint, that fixture of cubicle life, into both art form and competitive sport. Their innovation, dubbed pecha-kucha (Japanese for “chatter”), applies a simple set of rules to presentations: exactly 20 slides displayed for 20 seconds each. That’s it. Say what you need to say in six minutes and 40 seconds of exquisitely matched words and images and then sit the hell down. The result, in the hands of masters of the form, combines business meeting and poetry slam to transform corporate clich into surprisingly compelling beat-the-clock performance art.

Perfect! Let’s run a conference where every presentation abides by these rules. No more half-hour detailed run-through of proofs (which no-one in the audience is following), but 6 minutes 40 seconds of operations research performance art. Put six of them in an hour-long session, have 20 minutes afterwards for mingling and one-on-one informal questions, ring a bell and move on to the next session. Seven such sessions a day allows time for lunch and a big-shot giving a plenary. Small conferences have room for 120 presentations over 3 days. INFORMS could run with only 20 parallel sessions.

I think it would be a bit exhausting, but I have no doubt attendees would get more out of the conference. And the presentations would then be perfect for YouTube and other uses.

I see from the site that there is a Pecha Kucha event in Auckland: I will have to get over to it (and think about how my research best fits into twenty 20-second slides).

Scheduling people in the Netherlands

John Poppelaars has started a blog entitled “OR at Work” about OR applications in practice. One of his posts is about employee scheduling, where a Dutch directive includes the following requirements:

In normal English the rule states that when an employee performs one or more resident on call duties, each period of 7 times 24 hours must contain at least 90 hours of compensatory rest. So far this is simple, we can easily add up all the rest periods an see if it matches the requested 90 hours. There are additional restrictions however on how the compensatory rest period is divided up over the 168 hours period. These restrictions state that there should be at least one period of 24 hours compensatory rest, four of at least 11 hours, one of at least 10 hours and finally one of at least 8 hours. 7 rest periods in total.

As John points out, such rules are really good for OR people: it makes things so complicated that only OR models have a chance of doing well. Of course, whether our own approaches work (let’s see, let x[10] be one if it is a ten hour rest, but not eleven, then x[10]= … hmmm… maybe a branch-and-price approach instead…) is a little unclear. But bring on complexity: the more people get frustrated, the more they need us!

Operations Research and Tattoos

Being in New Zealand, I am looking for the one big memento of the year. And naturally tattoos come to mind. Here, tattoos are not (necessarily) a sign of a an evening with too much drink, but are a an integral part of the Maori culture, and have been adopted by pakeha (those of European descent: New Zealand almost uniquely takes an indigenous term as a non-pejorative term for the later arriving group) as a sign of national pride. Of course, not just any tattoo would work. It would have to reflect my core values, my aspirations, the real me. So it would have to be operations research oriented. Now if I were a queueing person, I could just get Little’s Law (L= l w) (the little l being lambda), but I am not. I suppose I could get a two dimensional polytope with some objective lines put in. If I was more active in COIN-OR, their logo would look pretty nice, and I think my friend Robin would probably approve (she probably has one already!). Or with a skilled enough artist, perhaps one of Bob Bosch’s wonderful TSP arts: perhaps George Dantzig as TSP covering my back. But none of this feels right.

There is inspiration (some of it quite scary) from the rest of science in this wonderful post from the Loom. But there is also a horror story or two about.

INFORMS and YouTube videos

INFORMS has some videos on YouTube from the recent Edelman Awards. This is great for the field. The three videos are

  1. General OR, and the Edelman Award
  2. The 2007 Award (on more efficient ways to attack prostate cancer)
  3. A detailed video on the 2006 award (Warner Robbins on repairing planes)

So far the videos have been accessed a couple of dozen times each. Pass along the pointers, and let’s get that count up!

Netflix Prize Update

I previously wrote about the Netflix prize: come up with a better system to recommend movies based on a large amount of data, and win $1 million. Tim Spey has a wonderful article on the dataset and the competition (though I can’t see a couple of the graphs he talks about). It is clear the dataset is pretty strange:

  • Customer 2170930 has rated 1963 titles and given each and every one a rating of one (very bad). You would think they would have cancelled their subscription by now.
  • Five customers have rated over 10,000 of the 17,770 titles selected – and presumably they also have rated some of the others among the 60,000 or so titles Netflix had available when they released the ratings. Are these real people?
  • Customer 305344 had rated 17654 titles. Even though Netflix make it easy to rate titles that you have not rented from them (so they can get a handle on your preferences) can this be real?
  • Customer 1664010 rated 5446 titles in a single day (October 12, 2005).

The main point of the entry, however, is that it is unclear that these sort of recommender systems can be useful in predicting consumer preferences. One striking point made is that a naive algorithm (predict the average value so far) is not that much worse than the Netflix system:

A simple algorithm that uses the average rating for each title as the prediction – “let’s see, the average rating for the 104,000 customers who rated Mean Girls was 3.514, so I predict you will give it a rating of 3.514” – gets an RMSE [Root Mean Squared Error] of 1.0540. Netflx’s Cinematch algorithm has an RMSE of 0.9525. Netflix set the prize target at a 10% improvement over that, which is an RMSE 0.8563. So the range that recommendation systems can realistically cover – from naively simple to cutting-edge research – seems to be [a] narrow band

To put that in perspective, here is the effect that sort of decrease in error has:

Anyway, if the errors followed a normal distribution (which they don’t, but we’re talking back-of-envelope here) then if a customer actually rated a title as 2 (poor), an algorithm with an RMSE of 1.0 would predict somewhere between 1 and 3 about 70% of the time. Not bad, but not startling. If the algorithm gave ten recommended movies, then it would get on average seven out of ten within one unit of the customer’s actual rating. Meanwhile, the RMSE=0.8563 algorithm would get 7.6 out of ten. While this is an improvement, and while it may be a remarkable technical accomplishment, it does not seem to be exactly a revolutionary leap compared to the really simple algorithms as far as customers go.

In short, would a customer even notice the difference? He concludes:

I’m no futurist, but I see little evidence from the first 300 days of the Netflix Prize that recommender systems are the magic ingredient that will reveal the wisdom of crowds.

This is an excellent blog entry that really goes to the heart of the value (or lack thereof) in these sorts of models.