Careful with Wolfram|Alpha

Wolfram|Alpha is an interesting service. It is not a search engine per se. If you ask it “What is Operations Research”, it draws a blank (*) (mimicking most of the world), and if you ask it “Who is Michael Trick”, it returns information on two movies, “Michael” and “Trick” (*). But if you give it a date (say, April 15, 1960), it will return all sorts of information about the date:

Time difference from today (Friday, July 31, 2009):
49 years 3 months 15 days ago
2572 weeks ago
18 004 days ago
49.29 years ago

106th day
15th week

Observances for April 15, 1960 (United States):
Good Friday (religious day)
Orthodox Good Friday (religious day)

Notable events for April 15, 1960:
Birth of Dodi al-Fayed (businessperson) (1955): 5th anniversary
Birth of Josiane Balasko (actor) (1950): 10th anniversary
Birth of Charles Fried (government) (1935): 25th anniversary

Daylight information for April 15, 1960 in Pittsburgh, Pennsylvania:
sunrise | 5:41 am EST
sunset | 6:59 pm EST
duration of daylight | 13 hours 18 minutes

Phase of the Moon:
waning gibbous moon (*)

(Somehow it missed me in the famous birthdays: I guess their database is still incomplete)
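The date arithmetic in that output is easy to check independently. Here is a minimal sketch using Python's standard `datetime` module; the variable names are mine, and treating Wolfram|Alpha's "15th week" as an ISO week number is an assumption on my part, not something the service documents here.

```python
from datetime import date

# The two dates from the Wolfram|Alpha output above.
start = date(1960, 4, 15)
end = date(2009, 7, 31)

# Total elapsed days, then split into whole weeks and leftover days.
days = (end - start).days
weeks, extra_days = divmod(days, 7)

# Position of the start date within its year.
day_of_year = start.timetuple().tm_yday

# ASSUMPTION: "15th week" is read as the ISO 8601 week number.
iso_week = start.isocalendar()[1]

# Decimal years, using an average year length of 365.25 days.
years = round(days / 365.25, 2)

print(days, weeks, extra_days, day_of_year, iso_week, years)
```

The leftover-days value comes out as zero, so the two dates fall on the same weekday, which fits the "Good Friday" observance and the Friday in the header.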

It even does simple optimization:

min {5 x^2+3 x+12}  =  231/20   at   x = -3/10 (*)
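That result can be confirmed by hand: for a quadratic a x^2 + b x + c with a > 0, the minimum is at x = -b/(2a) and has value c - b^2/(4a). A small sketch with exact fractions follows; the function name `quadratic_minimum` is my own, purely for illustration.

```python
from fractions import Fraction


def quadratic_minimum(a, b, c):
    """Return (x_min, f_min) for f(x) = a*x^2 + b*x + c, assuming a > 0."""
    if a <= 0:
        raise ValueError("a must be positive for a minimum to exist")
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    x_min = -b / (2 * a)          # where the derivative 2*a*x + b vanishes
    f_min = c - b * b / (4 * a)   # value of f at that point
    return x_min, f_min


x_min, f_min = quadratic_minimum(5, 3, 12)
print(x_min, f_min)  # -3/10 231/20
```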

And, in discrete mathematics, it does wonderful things like generate numbers (permutations, combinations, and much more) and even put out a few graphs.

This is all great stuff.

And it is all owned by Wolfram, which defines how you can use it. As Groklaw points out, the Wolfram Terms of Service are pretty clear:

If you make results from Wolfram|Alpha available to anyone else, or incorporate those results into your own documents or presentations, you must include attribution indicating that the results and/or the presentation of the results came from Wolfram|Alpha. Some Wolfram|Alpha results include copyright statements or attributions linking the results to us or to third-party data providers, and you may not remove or obscure those attributions or copyright statements. Whenever possible, such attribution should take the form of a link to Wolfram|Alpha, either to the front page of the website or, better yet, to the specific query that generated the results you used. (This is also the most useful form of attribution for your readers, and they will appreciate your using links whenever possible.)

And if you are not academic or not-for-profit, don’t think of using Wolfram|Alpha as a calculator to check your addition (“Hmmm… is 23+47 really equal to 70? Let me check with Wolfram|Alpha before I put this in my report”), at least not without some extra paperwork:

If you want to use copyrighted results returned by Wolfram|Alpha in a commercial or for-profit publication we will usually be happy to grant you a low- or no-cost license to do so.

“Why yes it is. I better get filling out that license request!  No wait, maybe addition isn’t a ‘copyrighted result’.  Maybe I better run this by legal.”

Groklaw has an interesting comparison to Google:

Google, in contrast, has no Terms of Use on its main page. You have to dig to find it at all, but here it is, and basically it says you agree you won’t violate any laws. You don’t have to credit Google for your search results. Again, this isn’t a criticism of Wolfram|Alpha, as they have every right to do whatever they wish. I’m highlighting it, though, because I just wouldn’t have expected to have to provide attribution, being so used to Google. And I’m highlighting it, because you probably don’t all read Terms of Use.

So if you use Wolfram|Alpha, be prepared to pepper your work with citations (I have done so, though the link on the Wolfram page says that the suggested citation style is “coming soon”: I hope I did it right and they do not get all lawyered up) and perhaps be prepared to fill out some licensing forms.  And it might be a good idea to read some of those “Terms of Service”.

(*) Results Computed by Wolfram Mathematica.

Michel Balinski IFORS Distinguished Lecture

The IFORS Distinguished Lecturer for the INFORMS meeting was Michel Balinski of Ecole Polytechnique and CNRS, Paris. Michel spoke on “One-Vote, One-Value: The Majority Judgement”, a topic close to my heart. In the talk, Michel began by discussing the pitfalls of standard voting (manipulation, “unfair” winners, and so on). He then spent most of his talk on a method he proposes for generating rankings and winners. For an election with many candidates (or a ranking of many gymnasts, or an evaluation of many wines: the applications are endless), have the electors (judges, etc.) rate each candidate on a scale using terms that are commonly understood. So the scale for a presidential candidate might be “Excellent, Very Good, Good, Acceptable, Reject”. Then, the evaluation of a candidate is simply the median evaluation of the electors. The use of the median is critical: it limits the amount of manipulation a voter can do. If I like a candidate, greatly overstating my liking has limited effect: it cannot change the overall evaluation unless my honest evaluation is below that of the median voter.
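To make the median idea concrete, here is a minimal sketch of the evaluation step only. It is my own illustration, not Balinski's code: the function name `median_grade` and the sample ballots are hypothetical, the tie-breaking rules from the talk are not implemented, and taking the lower of the two middle grades when there is an even number of voters is an assumption.

```python
# Grade scale from the talk, ordered from worst to best.
GRADES = ["Reject", "Acceptable", "Good", "Very Good", "Excellent"]


def median_grade(ballots):
    """Return the median grade of one candidate's ballots.

    ASSUMPTION: with an even number of ballots, the lower of the two
    middle grades is used. No tie-breaking between candidates is done.
    """
    if not ballots:
        raise ValueError("at least one ballot is required")
    ranks = sorted(GRADES.index(grade) for grade in ballots)
    return GRADES[ranks[(len(ranks) - 1) // 2]]


# Hypothetical ballots from five voters for a single candidate.
honest = ["Good", "Very Good", "Acceptable", "Good", "Excellent"]

# A voter already above the median exaggerates: "Very Good" -> "Excellent".
exaggerated_above = ["Good", "Excellent", "Acceptable", "Good", "Excellent"]

# A voter below the median exaggerates: "Acceptable" -> "Excellent".
exaggerated_below = ["Good", "Very Good", "Excellent", "Good", "Excellent"]

print(median_grade(honest))             # Good
print(median_grade(exaggerated_above))  # Good
print(median_grade(exaggerated_below))  # Very Good
```

The second case shows the limited manipulation: a supporter who already rates above the median gains nothing by exaggerating. Only a voter whose rating sits below the median can move the result by inflating it, as the third case shows.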

Michel then went on to discuss some tie-breaking rules (to handle the case where two or more candidates are, say, “Very Good” and none is “Excellent”). I found the tie-breaking rules less immediately appealing, but I need to think about these more.

Michel had done an experiment on this by asking INFORMS participants to evaluate possible US Presidential candidates (not just Obama and McCain, but also Clinton, Powell, and a number of others). The result (on a small 129-voter sample) put Obama well ahead, but I do suspect some selection bias at work.

This work will be the basis of a book to be published at the end of the year, and there is a patent pending on the voting system (which I found a little strange: what would it mean to use a patented voting system?).

I didn’t get the URLs at the end of the talk.  If anyone got them, can you email me with them?  A quick web search only confused me more.

Thanks Ashutosh for this pointer.

Added Oct 20. Michel Balinski kindly wrote and provided the following references:

Michel Balinski and Rida Laraki, “Le jugement majoritaire : l’expérience d’Orsay,” Commentaire no. 118, été 2007, pp. 413-419.

One-Value, One-Vote: Measuring, Electing, and Ranking (tentative title), to appear 2009.

Michel Balinski and Rida Laraki, “A theory of measuring, electing and ranking,” Proceedings of the National Academy of Sciences USA, May 22, 2007, vol. 104, no. 21, pp. 8720-8725.

Michel Balinski and Rida Laraki, “Election by Majority Judgement: Experimental Evidence,” Cahier du Laboratoire d’Econométrie de l’Ecole Polytechnique, December 2007, n° 2007-28.

Open Access at Springer

Despite some philosophical/moral issues, I do a fair amount of work with commercial publishers. I was recently co-Program Chair for the CPAIOR conference, and was delighted to publish the conference volume in Springer’s Lecture Notes in Computer Science series (despite the fact that LNCS has been dropped from the ISI indexing). I am considering starting a journal with Elsevier. Overall, while I prefer journals offered by professional societies (like INFORMS), I recognize the role commercial publishers play in supporting a field.

Still, I was taken aback by Springer’s “attempt” at open access when I recently had a paper accepted in Annals of Operations Research. I had the option to make the paper open access. Here is what I was offered:

Upon publication, your article will be available to all subscribers of this journal. If that is what you want, click the button ‘No Open Access’ below. However, if you want your article to be available to everyone, wherever they are, whether they subscribe or not, then you should publish with Open Access. Springer operates a program called Open Choice that offers authors the option of having their articles published with Open Access in exchange for an article processing fee. The standard fee is US$3000. If you want to order Open Access, please click the button ‘Yes, I order Open Access’ below.

Fascinating, and somewhat appalling (though I might feel differently if I were paying in euros). I wonder if they ever get anyone to take that choice. I actually now feel worse about publishing with Springer: I would rather they simply take the position that publishing with them involves a transfer of copyright than offer such terms. It looks to me like this is just an attempt to provide an “open access” veneer rather than a serious attempt to face the intellectual property issues of commercial publishing.

Operations Research in the Air

My colleague Steve Spear has a posting on the “Against Monopoly” blog (not against the board game, but commentary on intellectual property issues) regarding a New Yorker article entitled “In the Air” by Malcolm Gladwell. Gladwell makes the point that many advances have been simultaneously made by multiple groups. From the discovery of dinosaur bones to the telephone to cancer treatments, there seems to be something in the air that gives the right time for a discovery. The New Yorker article gives a number of examples:

This phenomenon of simultaneous discovery—what science historians call “multiples”—turns out to be extremely common. One of the first comprehensive lists of multiples was put together by William Ogburn and Dorothy Thomas, in 1922, and they found a hundred and forty-eight major scientific discoveries that fit the multiple pattern. Newton and Leibniz both discovered calculus. Charles Darwin and Alfred Russel Wallace both discovered evolution. Three mathematicians “invented” decimal fractions. Oxygen was discovered by Joseph Priestley, in Wiltshire, in 1774, and by Carl Wilhelm Scheele, in Uppsala, a year earlier. Color photography was invented at the same time by Charles Cros and by Louis Ducos du Hauron, in France. Logarithms were invented by John Napier and Henry Briggs in Britain, and by Joost Bürgi in Switzerland.

“There were four independent discoveries of sunspots, all in 1611; namely, by Galileo in Italy, Scheiner in Germany, Fabricius in Holland and Harriott in England,” Ogburn and Thomas note, and they continue:

The law of the conservation of energy, so significant in science and philosophy, was formulated four times independently in 1847, by Joule, Thomson, Colding and Helmholz. They had been anticipated by Robert Mayer in 1842. There seem to have been at least six different inventors of the thermometer and no less than nine claimants of the invention of the telescope. Typewriting machines were invented simultaneously in England and in America by several individuals in these countries. The steamboat is claimed as the “exclusive” discovery of Fulton, Jouffroy, Rumsey, Stevens and Symmington.

We see this in our own field when multiple groups seem to solve longstanding open problems almost simultaneously, often years after the problem was formulated. Such happenings inevitably lead to accusations and recriminations, with cries of plagiarism and other nefarious goings-on. Of course, ideas are stolen and the stress of publication can lead to short-cuts, and these are rightly decried, but I think there is something to this “in the air” phenomenon. Many simultaneous discoveries may be just that: simultaneous discoveries.

The Against Monopoly blog points out that these simultaneous “inventions” are often the outcome of a tremendous amount of public buildup. The “invention” is then just a small step, with a corresponding willingness to fight a patent battle. As the blog says:

I certainly came away from the article believing even more strongly in the Boldrin-Levine [see here] contention that intellectual property rights just aren’t necessary when you have the shoulders of giants to stand on.

In operations research, we saw this with our most famous patent issue: AT&T’s patent for “Karmarkar’s Algorithm” for linear programming. I remember the INFORMS (ORSA/TIMS at the time) conference when this came out. Researchers were canceling their talks on the simplex algorithm, since it no longer seemed relevant. Doctoral students were ruing their choices and giving up hope for a successful academic career, since they had bet on the wrong horse. Of course, the algorithm had no such effect. Research on effective implementations of the simplex algorithm was spurred by the competition, and research on interior point algorithms moved quickly to respond. The field has been greatly enhanced by having competing techniques. The patent didn’t help this advance and, I believe, was not financially successful for AT&T, but Karmarkar’s algorithm was a beginning, not an end.

But it wasn’t even a beginning. “Karmarkar’s” algorithm was actually a well-known nonlinear programming algorithm in disguise, with its roots dating to the 1960s (Karmarkar announced “his” algorithm in 1984). This equivalence to known work doesn’t take away from what Karmarkar did. Believing that an approach more suitable for nonlinear programming would be efficient and effective for linear programming was a big step. And it made a huge difference to the field. But, in keeping with Stigler’s Law (no invention is truly named after its inventor), Karmarkar’s algorithm could really take on many other names.