Open Source = Geek?

The parent of SourceForge.net has decided to become Geeknet, Inc (how much did they have to pay for geek.net, I wonder). I have mixed feelings on this.  On one hand, I am trying to learn from Wil Wheaton and embrace my inner geek.  I have done this so well that my colleagues tell me my inner geek is leaking out quite a bit to my outer geek.  So celebrating geekdom is good, right?

Well, maybe not right.  I am involved in an operations research open source endeavor, COIN-OR.  Part of COIN-OR’s mission is convincing companies that open source is not too nerdy or experimental.  There are lots of good reasons to use open source software; embracing your inner geek is probably not one of the more persuasive ones.  One vision of COIN-OR was to be “Sourceforge for operations research”.  Should we now become “Like operations research, only geekier”?

The Sourceforge Blog has an amusing post on the top ten rejected names, including my favorite:

7. FLOSSdaily

I think the entry summarizes the issue correctly:

the consensus seems to be that, as long as we don’t change the name of sourceforge.net, it doesn’t really matter what we call the corporation, so I think we’re good.

But I would be worried about the next “branding” idea from the good people at Geeknet, Inc.  Perhaps their inner geek should stay a little more … inside.

Congratulations to Tom Magnanti

Tom Magnanti has been appointed President of the Singapore University of Technology and Design, Singapore’s so-called “fourth university”.  Tom, of course, is one of the preeminent researchers and administrators in operations research.  He has written lots of influential papers and books.  His book with Bradley and Hax on Applied Mathematical Programming had a huge effect on me (take that, Russ Ackoff) and is the sort of book the field badly needs now.

Tom was President of INFORMS while I was on the Board (he was President two before me, so we had some pretty in-depth interactions) and he was President of IFORS when I joined that board.  I know him well and he is a heck of a nice (and effective!) guy.

The new university is an exciting operation:  what would you do if you were given free rein to create a top-notch engineering and design school from scratch?  Tom, with his experience as Dean of Engineering at MIT, is a fantastic choice to lead this endeavor.  It is great to have an operations researcher in charge, and Tom is another example of operations research as a path to administrative success.

Russell Ackoff passed away

Russell Lincoln Ackoff passed away October 29, 2009.  Ackoff was one of the most controversial researchers in operations research.  A prolific writer, he did early work that was pure operations research as it was understood in the early 1960s.  Ackoff was an early (1956-57) president of the Operations Research Society of America (ORSA, a forerunner of INFORMS) and co-wrote an influential textbook on operations research with Churchman and Arnoff in 1957.

By the late 1970s, Ackoff was disillusioned with operations research, penning articles with titles such as “The Future of Operational Research is Past”.  Kirby has a nice article outlining his changing views.

I have a number of Ackoff’s books and enjoy his writing very much.  I particularly enjoy “The Art of Problem Solving (accompanied by Ackoff’s Fables)”, which contains a number of stories that I use in my classes.

I was a teenager when Ackoff split from operations research, and the issues Ackoff brings up do not resonate with me.  There are lots of ways to solve problems.  Not every business problem is a linear program.  But not every business problem can be solved by bringing together a dozen people to draw circles and arrows.  Operations Research has its place, as does the direction Ackoff went in.

Criticism from a source as well respected as Ackoff has been salutary for the field, I believe.  Understanding the limits of what we do is as important as understanding our successes.  While perhaps Ackoff could have headed off in new directions with a little less rancor, our field is richer for having had him both within it, and outside of it.

But I think the best of operations research is still to come.

Algorithmic Voting Theory, Venice, and a Talk on Old/New Papers

I just got back from Venice, where I attended a conference on Algorithmic Decision Theory.  This is a new conference series (optimistically numbered the first conference, implying at least a second) revolving around uncertainty in decision making, preference elicitation, learning, and related issues.  From the conference web page:

A new unique event aiming to put together researchers and practitioners coming from different fields such as Decision Theory, Discrete Mathematics, Theoretical Computer Science and Artificial Intelligence in order to improve decision support in the presence of massive data bases, combinatorial structures, partial and/or uncertain information and distributed, possibly inter-operating decision makers. Such problems arise in several real-world decision making problems such as humanitarian logistics, epidemiology, risk assessment and management, e-government, electronic commerce, and the implementation of recommender systems.

This area has been very active, particularly in computer science, where there is a host of applications.

I was asked to give one of the invited talks, and spoke on “An Operations Research Look at Voting”.  I was a little worried about giving this talk, since my main qualifications come from papers published twenty years ago.  When I was a doctoral student at Georgia Tech, I worked with John Bartholdi and Craig Tovey on computational issues in voting.  Being the first to look at those issues, we got to prove some of the basic results in the area.  These include

  1. For some natural voting rules, it is NP-hard to determine who the winner is.
  2. For some natural voting rules, it is NP-hard to determine how to manipulate the rule (where manipulation means misrepresenting your preferences so as to get a preferred outcome).
  3. For some natural voting rules, optimally using the powers of a chair to form subcommittees or otherwise manipulate the voting process is NP-hard.

We published this work in Social Choice and Welfare (after getting it rejected from more mainstream economics journals) where … it was soundly ignored for 15 years.  No one referred to the papers; no one followed up on the papers;  no one cared about the papers at all!

This work was my job talk in 1987/88, and it got me a wonderful job here at the Tepper School (then GSIA), Carnegie Mellon.  And, without Google Scholar, it was not obvious that the papers were being ignored, so they added to my vita, and made it a bit easier to pass through the various steps.

But I did like the work a lot, and regretted (and still regret) that my economist colleagues did not take computational limits more seriously in their models.

But then something amazing happened about 5 years ago:  people started referring to the papers!  The references were mainly in computer science, but at least the papers were being recognized.  The citation counts for these papers in the Web of Science (formerly the Science Citation Index) are particularly striking.  In the associated graph, the x-axis is years since publication;  the y-axis is the number of references in the Web of Science in that year (Google Scholar numbers are higher of course, but there is a bias towards more recent papers there).  In my talk, I compare that graph to my “normal” papers, which reach a peak after 4 or 5 years and then decrease.  It is really gratifying to see the interest in these papers along with all the really cool new results people are getting.

I closed off the talk with some work I have done recently on creating voting trees.  Suppose there are three candidates, “a”, “b”, and “c”, and you really like candidate “a”.  Many voting systems proceed as a series of comparisons between two alternatives (think of standard parliamentary procedure).  If you are the chairperson, you might try to bring the candidates forward so as to increase the chances of “a” winning.  In fact, if you set the agenda to be “b” against “c” and the winner against “a”, then “a” will win as long as “a” beats someone (and no one beats everyone).  In this problem, the goal is to do the manipulation without knowing other voters’ preferences.
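
To make the three-candidate claim concrete, here is a small Python check (my own sketch, not part of the talk or the original papers) that enumerates all eight pairwise-majority patterns on “a”, “b”, “c” and confirms that the agenda “b against c, winner against a” elects “a” whenever “a” beats someone and no other candidate beats everyone.

    # Hypothetical illustration: brute-force check of the three-candidate agenda claim.
    from itertools import product

    candidates = ["a", "b", "c"]
    pairs = [("a", "b"), ("a", "c"), ("b", "c")]

    def run_agenda(beats):
        """Sincere pairwise voting: b meets c, then the survivor meets a."""
        first = "b" if beats[("b", "c")] else "c"
        return "a" if beats[("a", first)] else first

    for outcome in product([True, False], repeat=len(pairs)):
        beats = dict(zip(pairs, outcome))                  # beats[(x, y)]: x beats y
        beats.update({(y, x): not v for (x, y), v in beats.items()})
        a_beats_someone = beats[("a", "b")] or beats[("a", "c")]
        rival_beats_all = any(beats[(x, "a")] and beats[(x, y)]
                              for x in ["b", "c"]
                              for y in candidates if y not in ("a", x))
        if a_beats_someone and not rival_beats_all:
            assert run_agenda(beats) == "a"                # a always survives the agenda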

Can you do this for four candidates?  If you want “a” to win, “a” must be in the top cycle:  the set of candidates (perhaps the entire set) each of whom beats every candidate outside the set.  A “Condorcet winner” is the minimal top cycle:  if some candidate beats all the other candidates one-on-one, then that candidate must win any voting tree it occurs in.  So, assuming “a” is in the top cycle, can you create a voting tree so that “a” wins with four candidates?  The answer is yes, but it is a bit complicated:  first “a” goes against “c” with the winner against “d”, then the winner against “b”, who plays the winner of (“a” goes against “b” with the winner against “d” …) ….  Actually, the minimum tree has 14 leaves!  I am biased, but I think the tree is beautiful, and it goes to show how hard it is to manipulate agendas without knowledge of others’ preferences.  I am in the process of generating the trees on 4 candidates:  there is a very natural rule (“Copeland winner with Copeland loser tie break”:  see the presentation for definitions) that requires more than 32 leaves (if an implementation tree for it exists).
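
For readers who want to experiment with larger agendas, here is a rough sketch (again my own illustration, and deliberately not the 14-leaf tree from the talk, which is in the slides) of how any binary voting tree can be evaluated once the pairwise majority results are known: a leaf is a candidate, and each internal node pits the winners of its two subtrees against each other.

    # Hypothetical illustration: sincere evaluation of a binary voting tree.
    def tree_winner(tree, beats):
        if isinstance(tree, str):                  # leaf: a single candidate
            return tree
        left, right = tree
        w_left = tree_winner(left, beats)
        w_right = tree_winner(right, beats)
        if w_left == w_right:
            return w_left
        return w_left if beats[(w_left, w_right)] else w_right

    # Toy example: the three-candidate agenda above ("b vs c, winner vs a")
    # in the cycle where a beats b, b beats c, and c beats a.
    beats = {("a", "b"): True,  ("b", "a"): False,
             ("b", "c"): True,  ("c", "b"): False,
             ("c", "a"): True,  ("a", "c"): False}
    print(tree_winner((("b", "c"), "a"), beats))   # prints: a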

Sanjay Srivastava and I made a conjecture almost 15 years ago that would imply that this sort of manipulation is possible no matter how many candidates there are.  Little progress has been made, but I think it is still a great problem (economists know this as implementation by backward induction, and characterizing the rules implementable on trees is an important problem in social choice/mechanism design).

If you want more details on all this, here are my slides.  The references are:

  • M.A. Trick, “Small Binary Voting Trees”, First International Workshop on Computational Social Choice, Amsterdam, Netherlands, 500-511 (2006).
  • S. Srivastava and M.A. Trick, “Sophisticated voting rules: the two tournament case”, Social Choice and Welfare, 13: 275-289 (1996).
  • J.J. Bartholdi, C.A. Tovey, and M.A. Trick, “How hard is it to control an election?”; a slightly revised version appeared in Mathematical and Computer Modelling (Special Issue on Formal Theories of Politics), 16(8/9): 27-40 (1992).
  • J.J. Bartholdi, C.A. Tovey, and M.A. Trick, “The computational difficulty of manipulating an election”, Social Choice and Welfare, 6: 227-241 (1989).
  • J.J. Bartholdi, C.A. Tovey, and M.A. Trick, “Voting schemes for which it can be difficult to tell who won the election”, Social Choice and Welfare, 6: 157-165 (1989).

Oh, and Venice is a beautiful place to visit.  But you might have heard that elsewhere.

Hey Buddy, Can I Sell You an NP Hard Problem?

In keeping with issues of economics and computational power, there is a very neat paper out of Princeton by Arora, Barak, Brunnermeier, and Ge entitled “Computational Complexity and Information Asymmetry in Financial Products“.  Can you embed an NP-hard problem into the pricing problem for a financial instrument?  As the authors point out, the answer is clearly “yes” and the issue is really how naturally you can do it.  For instance, I could have an instrument that pays $1000 if the answer to a particular, large traveling salesman problem is less than 1 million miles, and $0 otherwise.  Pricing such an instrument involves solving an NP-complete problem, but no one would argue that this implies anything about real financial instruments.  The authors give another, similarly artificial, example:

Consider for example a derivative whose contract contains a 1000 digit integer n and has a nonzero payoff if the unemployment rate next January, when rounded to the nearest integer, is the last digit of a factor of n. A relatively unsophisticated seller can generate such a derivative together with a fairly accurate estimate of its yield (to the extent that unemployment rate is predictable), yet even Goldman Sachs would have no idea what to pay for it. This example shows both the difficulty of pricing arbitrary derivatives and the possible increase in asymmetry of information via derivatives.

Nobody is actually interested in embedding an NP-hard problem in a financial instrument in that way.  If the pricing is not clear, and there is obvious information asymmetry, buyers will simply assume they are getting ripped off and walk away (see adverse selection, below).

This paper does something much more interesting.  It develops a system where the firm offering financial assets seems to divide multiple assets up fairly but actually does so in a biased way.  In order to distinguish this tampering from non-tampering, an outside analyst has to solve a densest subgraph problem (an NP-hard problem).

In finance and economics, there is an issue called adverse selection.  Why am I hesitant to buy a used car?  People will only sell their lemons.  Why should we be cautious in hiring faculty who work at other schools?  The other school knows more about them, and will choose not to compete if they want to get rid of them.  Why should I be cautious when a bank chooses 10 mortgages to resell out of their portfolio of 10,000?  They know where the lemons are and will choose to dump them given the chance.

What the authors are saying is that even when a company sells all of their assets in groups in an apparently random way, it is possible to hide manipulation.  So adverse selection can occur in a way that is hard to determine.  Maybe the buyers will assume no adverse selection and we get to dump our lemons!  You can read the paper (or this blog posting from Andrew Appel) for some more information.

I have somewhat mixed feelings about this result (though I have only lived with it for a few hours:  this may take more time to settle my views).  On one hand, I really think that people in economics and finance need to be more aware of computational limits in their models.  On the other hand, it is not clear that NP-hardness is a particularly good defense here.  The authors have a somewhat breathless gloss on what NP-hardness means:

Computational complexity studies intractable problems, those that require more computational resources than can be provided by the fastest computers on earth put together.

Ummm… OK.  Sanjeev Arora is one of the world’s experts in complexity.  He is a Gödel Prize winner and a Fellow of the ACM.  He even, most tellingly, has a Wikipedia entry.  I still don’t think this is the world’s best definition of intractable in the NP-hardness sense.  In particular, if a group put out 1000 groupings of financial instruments, and I needed to solve the densest subgraph problem on the resulting instance, I would work very hard at getting an integer program, constraint program, dynamic program, or other program to actually solve the instance (particularly if someone is willing to pay me millions to do so).  If the group then responded with 10,000 groupings, I would simply declare that they are tampering and invoke whatever level of adverse selection correction you like (including just refusing to have anything to do with them).  Intractable does not mean unsolvable, and not every size instance needs more computing than “the fastest computers on earth put together”.
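
Just to make that concrete, here is a minimal sketch (my own illustration, not anything from the Arora et al. paper) of the kind of integer program one could hand to a solver for the densest-k-subgraph variant, written with the open-source PuLP modeler; the graph and the value of k below are toy placeholders.

    # Hypothetical illustration: densest-k-subgraph as a small integer program.
    import pulp

    nodes = range(6)                                       # toy graph
    edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5)]
    k = 3                                                  # size of the subgraph sought

    prob = pulp.LpProblem("densest_k_subgraph", pulp.LpMaximize)
    x = pulp.LpVariable.dicts("x", nodes, cat="Binary")    # x[v] = 1 if vertex v is chosen
    y = pulp.LpVariable.dicts("y", edges, cat="Binary")    # y[e] = 1 if edge e is induced

    prob += pulp.lpSum(y[e] for e in edges)                # maximize induced edges
    prob += pulp.lpSum(x[v] for v in nodes) == k           # choose exactly k vertices
    for (u, v) in edges:
        prob += y[(u, v)] <= x[u]                          # an edge counts only if
        prob += y[(u, v)] <= x[v]                          # both endpoints are chosen

    prob.solve()
    chosen = [v for v in nodes if x[v].value() > 0.5]
    print(chosen, pulp.value(prob.objective))

An off-the-shelf solver makes short work of toy instances like this; the interesting question is how large and how structured the real instances get.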

NP-hardness talks about worst case behavior as problem size grows.  Those of us in operations research spend a lot of time solving NP-hard problems like routing, timetabling, and many other problems because we really want solutions to instances of a particular size and with particular structure.  Bill Cook will solve practically any instance of the NP-hard Traveling Salesman Problem that you like (particularly if the financial incentives are right) as long as you keep it to no more than 100,000 cities or so.  I’d be happy to help a financial firm solve densest subgraph instances if it really mattered to them, NP-hardness be damned!

Of course, if this paper takes off, there might be real demand for operations researchers to look at NP-hard problems for the financial industry.  And that would be great for our field!

Thanks to my finance colleague Bryan Routledge for pointing out the paper, and for providing the excellent (IMHO) title to this entry.

INFORMS Fellows Luncheon

From the INFORMS Blog:

I just got back from the Fellows Luncheon.  The INFORMS Fellows are recognized for having made significant contributions to the field of operations research and the management sciences (be it in research, practice, service, administration, or education).  It is an extremely impressive group, and I very much enjoy the lunch, since conversation around the table is generally both insightful and entertaining.

I was President of INFORMS the year the Fellows program began, so I got to welcome the inaugural group.  To get the Fellows program started, certain categories of people were automatically made Fellows.  So, for instance, all past winners of the John von Neumann Theory Prize were automatically made Fellows.  When it came to past Presidents of the organization, the rule was pretty explicit:

[Fellows would be] all past Presidents of TIMS, ORSA, and INFORMS up to but not including Michael Trick (*)

Ummmm… OK.  I think I was the only person explicitly declared not to be a Fellow!  It made sense at the time (“Don’t want to vote for yourself, you know!”), and they did make me a Fellow a few years later.

Now, no one gets in automatically:  every new Fellow is selected by the Selection Committee.  This year’s class is a very impressive group:  Aharon Ben-Tal, Srinivas Bollapragada, Margaret Brandeau, Awi Federgruen, Nimrod Megiddo, David B. Montgomery, Michael Pinedo, Kathryn E. Stecke, John Tomlin, Garrett van Ryzin, and C.F. Jeff Wu.  The fact that eleven were made Fellows is not arbitrary:  the number is limited by a certain fraction of the size of the membership.  When I checked the list of Fellows, I was struck by some of the amazing people who are not yet Fellows:  we still have years and years of amazing classes to induct.

You get to be a Fellow by getting nominated, and then getting elected by the selection committee (which is voted on by the current Fellows).  If you know someone who should be a Fellow (or think you should be!), the next round of nominations will be due next summer.

A few points that struck me during the lunch:

  1. The more members we have, the more Fellows we can elect;  this process would be easier if we had more members
  2. It would be nice for Fellows to do something more than have a nice lunch and beget more Fellows:  the group is a great, underutilized resource
  3. It was fantastic to see a number of the older Fellows who came in specially for the lunch.   Our field has a great history (and future!) and it was good to be reminded of that history with the extremely impressive people in the room.

(*) Not the exact wording, but it was pretty close to that!

Modeling as a Teachable Skill

New post on the INFORMS Blog on a panel discussion I attended on how to teach modeling:

I just attended a nice “panel discussion” on Teaching the Art of Modeling, put together by Jim Orlin (MIT), Stephen Powell, and Rob Shumsky (both from Dartmouth).  This was not your normal INFORMS session!  The panelists decided to do this as an “active learning” session, so audience members had to work throughout the session.  The first exercise was to think about how to model a hypothetical, but real-sounding, problem:  “Suppose the Red Cross was considering paying people for their blood donations.  How would you provide them with a model that could help them understand the tradeoffs?”  That (paraphrased) was all the information we got.  We were then given 10 minutes or so to work individually on addressing this.  The idea was that this would be the first 10 minutes of whatever multiple-hour process we would go through to get a “real” model.  Where would you start?

For many, the starting point was brainstorming:  putting down a whole set of issues to be considered and items that might go into the model.  For others, it was graphing some of the anticipated relationships between key issues.  Still others used techniques such as influence diagrams to help organize their thoughts.  Being a hard-core mathematical programmer, I thought in terms of objectives, variables, and constraints, and was pretty far along with my nonlinear, nonconvex mixed integer program when time was called.

Stephen Powell then asked some audience members what they did, eliciting the strategies given above.  He has experimented with this problem on students (presumably novices at modeling) and has learned a number of things about what they do.  First, even for students who have taken modeling courses, it is surprising how little of what we teach gets used in this context.  Students, when faced with a fuzzy modeling problem, often do some combination of the following:

  1. They grab on to data like a lifeboat, promptly multiplying or dividing every number in sight in the hope of getting the “right answer” the professor is looking for.  The Red Cross example has no numbers, so they might make some up just to get going.
  2. They dispense with modeling and go straight to the answer: “This is a bad idea because …”
  3. They adopt inefficient methods and are unable to step back and recognize how inefficient they have become.
  4. They overuse brainstorming relative to any aspect of structured problem solving that they might have been taught.

If there is a column of numbers, you can bet that many students will immediately run a regression!

After discussing these results (there are a couple of papers in the Journal of the Operational Research Society on “How Novices Formulate Models” that cover this), Jim and Rob were given a problem new to them (on a model for deciding on the best mortgage to get) and they showed how an influence diagram approach would be used to begin understanding and modeling.

Powell and his co-author Robert Batt have a book entitled Modeling for Insight (Wiley, one of the exhibitors here).

It was great to see a session that required the audience to do some work!  While I was not completely convinced by the modeling approach presented (give me my objective, variables, and constraints!), I was convinced about active learning as a way to make 90 minutes go by much faster and in a much more effective way.

Moving on to San Diego (both my blog and I)

I’ll be guest blogging at the INFORMS Conference in San Diego, so I’ll be posting over there for the next few days. There are 12 guest bloggers, so the conference should get some pretty good coverage.  I’ve got a news feed on my main page sidebar trying to track the blog, twitter feed, hash tags, and so on.

I have put in my first post on the extra preconference steps needed on the social network side:

When I started attending INFORMS (actually ORSA/TIMS) meetings in the ’80s, the preconference steps were clear:  pack, find the (paper) airline tickets, print out my talk onto transparencies, go to the conference, wander around wondering what was going on.  There are a few extra steps now:

  1. Follow @informs09 on twitter
  2. Check out the #informs09 tag on twitter, and remember to use it on my own posts.
  3. Sign up for the daily e-news to stay up to date on conference news and activities.
  4. Subscribe to the INFORMS 09 blog (or remember to check back a lot!) to follow the official news and the twelve (!) guest bloggers.
  5. Join the Conference Linkedin Group (122 members and counting).
  6. Pack
  7. Find memory stick with talk
  8. Print out boarding passes
  9. Go to sunny San Diego

The extra steps are worth it.  No more wandering around wondering what is happening!  See you soon!

Solver Foundation Version 2 Announced

Just in time for the INFORMS Meeting in San Diego, the Microsoft Solver Foundation folks have announced Solver Foundation Version 2. I am particularly excited about this since I have promised (and have been negligent about) a new course this year entitled “Operations Research Implementations” aimed at our MBA students. The idea is to go beyond the 20 variable, 20 constraint models into some “real” models. We (I am doing this with Willem van Hoeve) are looking at a number of optimization systems on which to base this course. Our students are extremely comfortable with Excel, and the screen shot from the post is quite provocative. The students are generally not programmers, but maybe this is something they can navigate.

Lots of new additions to Solver Foundation 2, including automatic handling of uncertainty, better MIP performance through Gurobi 2.0 as the default solver, constraint programming improvements and much more. I wonder if our MBAs are ready for Second Order Conic Programming?

P=NP in the New York Times

The New York Times has a nice article on the recent Communications of the ACM article on P=NP (article at author Lance Fortnow’s site). I had read Fortnow’s article last month, and really liked it. In fact, I am surprised I didn’t blog it: I guess I was too busy worrying about false proofs of P=NP (or not).

The Times article does a good job of summarizing the problem in a non-technical way:

The challenge, in its simplest, but not most understandable phrasing, is to determine whether or not P equals NP. P stands for the class of problems that can be solved in polynomial time, or reasonably quickly. NP stands for the class of problems that can be verified in polynomial time — quickly. If it can be shown that P=NP, then it is possible that the world will be a very different place.

The main thrust of the New York Times article is simply how popular the paper is:

The editors of the journal said they were tickled to have a hit article on their hands.

“I don’t think we’ve ever had an article that started off with this kind of a bang,” said Moshe Y. Vardi, a Rice University computer scientist who is editor in chief of the journal. “Our e-mail blast went out and thousands of people felt they had to click.”

The author of the article, Lance Fortnow, a computer scientist at Northwestern University’s McCormick School of Engineering, initially told Dr. Vardi that the article would be a short one. “Still open,” he writes, was his first reaction to writing about the state of the work on solving the problem.

But the article does point to one possible new direction for attacking the problem:

There remains a glimmer of hope, he noted. An esoteric branch of mathematics known as algebraic geometry may offer an avenue to prove or disprove the problem, which was first outlined by Stephen A. Cook, a University of Toronto computer scientist, in 1971.

That prospect feels a bit intimidating. As Dr. Vardi said, “It’s a bit scary because we have to start learning a very difficult mathematical field.”

The Times did a good job on this one. Readable and accurate: good work John Markoff! And thanks Ted for the pointer.

Note added: As a reddit commenter noted, “algebraic geometry” is hardly “an esoteric branch of mathematics”, so perhaps there is room for improvement in the article.