Wired has a nice article on the teams competing to win the Netflix Prize (thanks for the pointer, Matt!). I think the most interesting aspect is how the “competition” turned into a cooperation:
Teams Bellkor (AT&T Research), Big Chaos and Pragmatic Theory combined to form Bellkor’s Pragmatic Chaos, the first team to qualify for the prize on June 26 with a 10.05 percent improvement over Netflix’s existing algorithm. This triggered a 30-day window in which other teams were allowed to try to catch up.
As if drawn together by unseen forces, over 30 competitors — including heavy hitters Grand Prize Team, Opera Solutions and Vandelay Industries, as well as competitors lower on the totem pole — banded together to form a new team called, fittingly, The Ensemble.
In fact, with a bit more time, all those groups might have come together:
As much as these teams collapsed into each other during the contest’s closing stages, they might have mated yet again to ensure that everyone on both qualifying teams would see some of the $1 million prize. Greg McAlpin of The Ensemble told Wired.com his team approached Bellkor’s Pragmatic Chaos and asked to join forces (he later clarified that he was still part of Vandelay Industries at this point), but was spooked by AT&T’s lawyers. “We invited their whole team to join us. The first person to suggest it was my 11-year-old son,” said McAlpin. “I thought it sounded like a really good idea, so I e-mailed the guys from Bellkor’s Pragmatic Chaos… [but] they said AT&T’s lawyers would require contracts with everyone to make agreements about who owns intellectual property… and it would take too long.”
Data mining methods can easily be combined. A simple way is to have the algorithms vote on the outcome, which often gives much better answers than any individual technique. The Ensemble clearly did something like that:
To combine competitors’ algorithms, The Ensemble didn’t have to cut and paste much code together. Instead, they simply ran hundreds of algorithms from their 30-plus members (updated) and combined their results into a single set, using a variation of weighted averaging that favored the more accurate algorithms.
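To make the idea concrete, here is a minimal Python sketch of that kind of blending. It is not The Ensemble's actual method, which the article does not describe in detail. The ratings, the two models, and the inverse-squared-error weighting are all made up for illustration. A real blend would also fit its weights on held-out ratings rather than on the ratings it is scored against.

```python
import math

def rmse(predictions, actual):
    """Root mean squared error, the accuracy measure used by the Netflix Prize."""
    squared = [(p - a) ** 2 for p, a in zip(predictions, actual)]
    return math.sqrt(sum(squared) / len(squared))

def blend_weights(model_predictions, actual):
    """One weight per model, proportional to 1 / MSE, normalized to sum to 1.

    Inverse-error weighting is an assumption chosen for this sketch; it is
    just one simple way to favor the more accurate algorithms.
    """
    inverse_mse = [1.0 / max(rmse(p, actual) ** 2, 1e-12) for p in model_predictions]
    total = sum(inverse_mse)
    return [w / total for w in inverse_mse]

def blend(model_predictions, weights):
    """Weighted average of the models' predictions, rating by rating."""
    n_ratings = len(model_predictions[0])
    return [
        sum(w * preds[i] for w, preds in zip(weights, model_predictions))
        for i in range(n_ratings)
    ]

# Hypothetical data: four true ratings and two imperfect models.
actual = [4.0, 3.0, 4.0, 2.0]
model_a = [4.5, 3.5, 3.5, 2.5]
model_b = [3.4, 2.6, 4.6, 1.8]

models = [model_a, model_b]
weights = blend_weights(models, actual)
blended = blend(models, weights)

print("RMSE of model A:", rmse(model_a, actual))
print("RMSE of model B:", rmse(model_b, actual))
print("RMSE of blend:  ", rmse(blended, actual))
```

The blend helps here because the two models tend to err in opposite directions, so their mistakes partly cancel. Models that make the same mistakes gain little from being averaged, which is one reason a large and diverse pool of algorithms is valuable.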
This is a great example of the intellectual advances that result from long-term “competitions”. After a while, it is no longer competitors against competitors but rather all the competitors against the problem. I don’t know if the Netflix Prize was specifically designed to result in this, but it has turned out wonderfully. It is much less likely that the groups would have gotten together if the contest had simply been “Who has the best algorithm on August 1, 2009?” The mix of initial competition (to weed out the bad ideas) and late cooperation (to get the final improvements) was extremely powerful here.
The winner announcement is expected in September.