American Express and Data Mining

I teach a data mining course here to our MBA students.  It is a popular course with about 70% of the students taking it at some point during their two years with us.  Since I am an operations research guy, I concentrate on the algorithms, but we spend a lot of time talk on the use of data mining, and possible pitfalls.  The New York Times today has a wonderful story illustrating the pitfalls.  American Express has been using shopping patterns to reduce customer’s credit limits.  This, in itself, is not surprising, but a letter the company sent out implies that it was basing the evaluation on the stores and companies the customer used, rather than a more direct measure of consumer ability to repay a debt.

“Other customers who have used their card at establishments where you recently shopped,” one of those letters said, “have a poor repayment history with American Express.”

Wow!  Shop at the Dollar Mart, you are not a careful shopper reacting to an uncertain financial world, but rather a poor credit risk who should be jettisoned before defaulting (I don’t know if Dollar Mart is one of the “bad” establishments: American Express has not released a list of companies that are signs of imminent financial doom). That is, of course, what data mining results come down to, but it is rare for a company to admit it.  Not surprisingly, customer’s who received such a letter became a little irate.  Check out newcreditrules.com for one person’s story.

American Express says it is no longer using store shopping information, but it will continue to use the results of data mining in its credit decisions.

In one presentation to analysts, it noted that people with multiple residences and multiple mortgages used to be a good bet. Now, the reverse is true.

In a good economy, lots of data mining was used to “help” customers by identifying new products or offers that might appeal to them.  Now, it seems that more data mining uses the customer’s data against their own interests.  I suspect we will see more stories of this type.

Gurobi versus CPLEX benchmarks

Hans Mittelmann has released some benchmark results comparing CPLEX 11.2 with the first version of Gurobi‘s code (1.02) in both sequential and parallel (4 processor) mode.  Mosek’s sequential code is also included in the test.  Let me highlight some of the lines:

=================================================================
s problem     CPLEX1    GUROBI1     MOSEK1     CPLEX4    GUROBI4
-----------------------------------------------------------------
    air04          9         18         49          7         13
   mzzv11         89        116        434         80        116
      lrn         42        104          f         36        996
ns1671066       1828          1       2474         26          1
ns1688347       1531        315          f        716        258
ns1692855       1674        322          f        688        234
     acc5         25        247       4621         30         33

These are not chosen to be typical, so be sure to look at the full list of benchmark results.  Concentrating on the first two columns, Gurobi’s code seems to be holding its own against CPLEX, and sometimes has a surprising result.  The ns1671066 result certainly looks like Gurobi is doing something interesting relative to CPLEX.

It is also striking how often the speedup is more than a factor of 4.  That same ns1671066 instances that CPLEX had such trouble on sequentially certainly looks much more solvable with 4 threads.  But sometimes there is a significant slowdown (see Gurobi on lrn).  Perhaps someone more skilled in this area can explain how we should interpret these benchmarks.

The official release for Gurobi is still two months away:  it will be interesting to see how the released version works.

Over- versus Under-Planning

My wife and I have been arguing recently about my family’s tendancy to over-plan.  The Trick way does tend to have plans with lots of contingencies, which perhaps a reason operations research appeals to me.  The Weyers (my wife Ilona’s surname) approach is a little more … take it as things come.

The Think-OR blog has a wonderful OR joke that pretty well summarizes how us over-planners work.  I think it is instructive that both Ilona and I think that the story vindicates our chosen position.

End of INFORMS Resources page? No really!

Last year, we had a brief discussion on the value of the INFORMS Resources page.  Since then, things have simply got worse.  Spammers overwhelm the system, keeping things updated is horrendously difficult, and it is unclear if more than a handful of people are interested in it.   As I write on the page:

As announced last year, it is unclear whether this resource page will continue. On one hand I started this back in 1994, so it is sad to see it go after 15 years. On the other hand, the internet has changed a lot since then. There was no Google back then, so simply finding stuff was hard to do. Now, it seems that the age of “hand edited” links is at an end (if it wasn’t so five years ago). Keeping these pages up to date is ferociously difficult. And the spammers are unrelenting (and I don’t have the heart to change software again to combat them). So, there is every possibility that these pages will go away on April 15, 2009. The only thing that can stop this is finding someone to take over the administration of this area with energy and enthusiasm to do something new. If you are interest contact me (Michael Trick) at trick@cmu.edu. And, if you believe in the site, it would be useful to be sure your site is here and is accurate. Even if it is only in place for a few months, it would be useful to have! Further discussion of this on my blog.

If you have submitted a site recently and it is has neither appeared nor have you received an explanation, please resubmit: we had some problems with the system that have now been corrected. Our apologies for the inconvenience.

Anyone interested in taking this on?  Or think they can convince me this is a great use of my time?

Larry Wein on Post Traumatic Stress

I missed Larry Wein’s op-ed in the New York Times on post-traumatic stress disorder (PTSD), so thanks to Güzin Bayraksan for pointing it out in her blog.  Entitled “Counting the Walking Wounded”, the piece argues that the number of soldiers expected to get PTSD is quite a bit higher than previous estimates (which were in the range of 15%):

We found that about 35 percent of soldiers and marines who deploy to Iraq will ultimately suffer from P.T.S.D. — about 300,000 people, with 20,000 new sufferers for each year the war last.

If you check out the blog for my son Alexander, you will see that we are in the midst of our own “traumatic stress”:  nothing like a soldier in Iraq, of course, but between the passing of my mother and the destruction of our kitchen by a broken pipe, not to mention starting up teaching again and a few other things in my life, I feel mini-PTSD coming on!  Fortunately, playing catch with Alexander is pretty good therapy.

A Sheriff Goes to Jail for Not Using Operations Research

An Alabama sheriff spent time in jail for not feeding his prisoners enough.  From the CNN Report:

A federal judge ordered a north Alabama sheriff jailed this week, saying the lawman intentionally served jail inmates “woefully insufficient” meals in order to pocket more than $200,000.

At issue is an Alabama law that attorneys for the inmates claim provides sheriffs with an incentive to skimp on feeding inmates. Under the law, sheriffs are permitted to keep — as personal income — money left over after purchasing food for inmates.

The state provides sheriffs with $1.75 per day per inmate for food, according to the Alabama Attorney General’s Office. However, in a March 2008 opinion, the office affirmed that sheriffs may legally keep what is left over.

A typical dinner was two hot dogs or meat patties; a slice of bread; and mixed vegetables or baked beans, the judge wrote.

Of course, if Sheriff Bartlett had studied operations research, he would have immediately recognized his issue as the well-known “diet problem” in linear programming.  A quick google search would have led him to the NEOS page on the subject.  Some clicks, and the magic of linear programming gives an optimal diet of cost just 96 cents:

The Optimized Menu

The cost of this optimal diet is $0.96 per day.

The Solution and Cost Breakdown by Food

Food Servings Cost ($)
Carrots,Raw 0.24 0.02
Peanut Butter 3.60 0.25
Popcorn,Air-Popped 4.82 0.19
Potatoes, Baked 3.54 0.21
Skim Milk 2.17 0.28

2000 calories, and it meets all the nutritional requirements. I bet the Sheriff wasn’t making 79 cents off each prisoner per day with his plan! I would love to see the lawyers get a hold of this: “But judge: potatoes with peanut butter are the perfect food! Carrots on the side, and milk to wash it down. We even give them popcorn for movie night! What’s not to like?”

ACM Fellows and Operations Research

The ACM (the Association for Computing Machinery) has announced 44 new Fellows.  A number of them are well-known in the operations research community (some are just plain well-known:  can it be that Stephen Cook of NP-completeness fame was not a Fellow before now?).  These include:

  • Tuomas Sandholm, Carnegie Mellon.  Tuomas does an amazing number of things very well.  He is an ideal faculty member raising money, training students, and so on.  He also runs a company:  CombineNet.  Much, but not all, of his work is in combinatorial auctions, and he has really revolutionized that area.  He also windsurfs.  He is someone I am jealous of.
  • Michel Goemans, MIT.  Has done amazing work in approximation algorithms.  I am not a huge fan of approximation algorithms:  I like solving real problems and it is not clear that a factor of 2 approximation is of much use, much less a log x/log log x approximation.  And while many approximation papers give lip-service to the applications of the models they work with, very rarely are approximation algorithms implemented.  That said, Michel does what I admire more than anything:  he finds truly new and creative approaches to problems.  When Michel finds a new approach (which he seems to do every two years or so), it is worth learning about because his approaches generally lead to a very rich a productive research vein.  I am jealous of Michel too.
  • Sanjeev Arora, Princeton.  Unlike Tuomas and Michel, I do not know Sanjeev personally, so I cannot be jealous of him.  But I certainly know his work on randomized approaches to hard problems.

So congrats to all the new Fellows!

OR Forum paper on Personal Decisions

There is a new paper and discussion at the OR Forum.  Raph Keeney published  a neat paper entitled “Personal Decisions are the Leading Cause of Death” in Operations Research, where he argues that the choices people make (eating, drinking, etc.) cause more deaths than anything else.  There are some very insightful commentaries about this, and I hope the paper and commentaries lead to an interesting discussion.  Check it out!

This paper was the subject of a Newsweek article, and I suspect it will show up more in the media than most OR papers.

IBM Completes Purchase of ILOG

IBM has now completed its purchase of ILOG.  From the “ILOG, an IBM Company” press release:

ARMONK,NY and PARIS – January 6, 2009. IBM (NYSE: IBM) today announced the completion of its approximately $340 million USD (EUR 215 million) tender offer for the shares of ILOG (NASDAQ: ILOG) (PARIS: ILO) (ISIN: FR0004042364). The tender offer, announced July 28, 2008, was finalized after IBM acquired all outstanding stock in ILOG and satisfied the other conditions of the offer.

The justification for the purchase is quite interesting (they work on the wording with each news release):

ILOG technology will add significant capability across IBM’s software platform and the addition of BRMS [business rules management system] will provide another dimension of leadership for the IBM BPM portfolio. This includes improved rules and business optimization for WebSphere and Information Management offerings, better visualization for WebSphere, Lotus and Tivoli products and solutions, as well as enhanced optimization and efficient supply chain management assets for planning and scheduling within a service oriented architecture (SOA).

The ILOG BRMS products help businesses increase the agility of their decision making by letting them adapt and respond dynamically. Based on applied mathematics and computer science, the ILOG optimization products allow enterprises to cope with many operating constraints. The software enables them to transform business objectives, resources and operational constraints into best possible action plans and schedules to enhance service, revenue and profits.

It is very nice to see the prominent place optimization now holds in the press releases. ILOG will be part of IBM’s Websphere, the group most interested in the business rules aspect of the acquisition.

The “Service Oriented Architecture”  emphasis is interesting.  SOA has been part of the previous releases, but this release really emphasizes that aspect.   The final line of the acquisition news release goes so far as to say:

For more information, visit http://www.ibm.com/soa. [the SOA page of IBM.]

though I am not sure where that “more information is”, since the ILOG acquisition does not appear on that page, nor even on the “news release” area of that page (but maybe that is in process).  I had whined before when news releases pointed to Websphere, but I guess this is an improvement.  Certainly some of the leaders in our field believe SOA is a huge opportunity for operations research.

Welcome to “ILOG, an IBM Company”.  Great to see that you will keep your identity, and may you continue to broadly advance the use of operations research in practice.

Added Jan 6, 8PM: The IBM SOA page now has a link to some more information about the acquisition with the link now under “Business Process Management” with ILOG described in the first line as “ILOG, a recognized industry leader in Business Rule Management Systems (BRMS), strengthens IBM value for customers and partners.”  Sigh….

On “On Explaining Operations Research to Others”

Jim Orlin, professor of OR at MIT, and coauthor of one of my favorite books (Network Flows: Theory, Algorithms, and Applications with Ravi Ahuja and Tom Magnanti), has just started a blog.  It looks like it will be a little more wide-ranging than this one, including education and politics along with operations research. His first OR post is on a great topic:  what do you say when someone asks “So what is Operations Research?”  Most of us go “well, ummm…, its kinda something with computers” and we lose a great opportunity to educate.  So what should we say?

Jim’s preferred definition is “The science of decision making” (I go with something similar:  “The science of better decision making”, showing a certain loyalty to the INFORMS “Science of Better” campaign).  He follows that up, in the best OR tradition, with an algorithm:

Algorithm for describing operations research to a friend or colleague.

Step 1. Find out a system about which the other person is both interested and knowledgeable. (e.g, sports, entertainment, communication, travel, or anything relating to a person’s job.

Step 2. Develop a plausible scenario based on the system in Step 1; e.g., scheduling sports teams, designing wireless phone systems to provide for the best possible reception, or designing queuing systems at Disneyworld. (I have found that it is very useful to give an example that addresses a problem at the other person’s work that he or she just told you was important.)

Step 3. Explain how operations research can be used to find an excellent solution for the scenario in Step 2 or provide very useful information for the scenario in Step 2.

I like his algorithm, and think it is probably more effective than my version, where Step 1 is “Find a system about which I am knowledgeable and interested…”.  My approach tends to lead to a lot of “Oh, look at the time, I must be going!” followed by frantic rushing out of the room.   But Jim’s approach does require a bit more thinking on one’s feet:  “Oh… so you are interested in the novels of Jane Austen.  Well, operations research is, ummm….”

Welcome to the Blog-OR-sphere, Jim!