Data Mining, Operations Research, and Predicting Murders

John Toczek, who writes the PuzzlOR column for OR/MS Today (example), has put together a new operations research/data mining challenge in the spirit of the Netflix Prize, though without the million-dollar reward. The Analytics X Prize is a fascinating problem:

Current Contest – 2010 – Predicting Homicides in Philadelphia

Philadelphia is a city with 5.8 million people spread out over 47 zip codes and, like any major city, it has its share of crime.  The goal of the Analytics X Prize is to use statistical techniques and any data sets you can find to predict where crime, specifically homicides, will occur in the city.  The ability to accurately predict where crime is likely to occur allows us to deploy our limited city resources more effectively.

What I really like about this challenge is how open-ended it is. Unlike the Netflix Prize, there is no data set to be analyzed. It is up to you to determine what might be an interesting/useful/important data set. Should you analyze past murder rates? Newspaper articles? Economic indicators? Success in this might require a team that mixes those who understand societal issues with data miners and operations researchers. This, to me, makes it much more of an operations research challenge than a data mining challenge.

I also like how the Prize handles evaluation: you are predicting the future, so murders are counted after your submission. Unless you have invented time travel, there is no way to know the evaluation test set, nor can you game it like you could in the Netflix Prize (at the risk of overfitting).
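To make the "predict first, count later" setup concrete, here is a minimal sketch. It assumes a naive forecast (each zip code's average past count) and a root-mean-square error score. Neither comes from the contest rules, which are not described here; the zip code labels and all counts are made up for illustration.

```python
from math import sqrt

def naive_forecast(history):
    """Predict each zip code's next-period count as the mean of its past counts."""
    return {zip_code: sum(counts) / len(counts) for zip_code, counts in history.items()}

def rmse(predicted, observed):
    """Root-mean-square error over the zip codes in the forecast (assumed metric)."""
    errors = [(predicted[z] - observed.get(z, 0)) ** 2 for z in predicted]
    return sqrt(sum(errors) / len(errors))

# Made-up monthly homicide counts for three placeholder zip codes.
history = {"zip_a": [2, 4, 3], "zip_b": [0, 1, 2], "zip_c": [5, 5, 5]}

# The forecast is fixed at submission time, before the evaluation period starts.
forecast = naive_forecast(history)

# Counts that only become known after submission (also made up).
observed = {"zip_a": 3, "zip_b": 3, "zip_c": 5}
score = rmse(forecast, observed)
print(forecast, round(score, 3))
```

Because the observed counts do not exist when the forecast is submitted, there is nothing to tune against. Any richer model (economic indicators, news text) would slot in where `naive_forecast` is, and would be scored the same way.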

I asked John why he started this prize, and he replied:

I started this project about a year ago when trying to think of ways to
attract students and people from other professions into the OR field. I
write an article in ORMS Today called the PuzzlOR which I originally
started in hopes of attracting more students to our field. OR can be a bit
overwhelming when you first get into it so I wanted a way to make it easier
for the newcomers. The puzzles I wanted to run were getting a bit out of
hand in their complexity so I needed some other place to house them.

Plus, I thought it would be good advertising for the OR field in general
and would have positive impact on the city where I live.

He’s already gotten good local press for the project. The Philadelphia City Paper ran a nice article that mentions operations research prominently:

Operations research may not sound sexy; it focuses on analytics and statistics — determining which data in a gigantic data haystack is most relevant — in order to solve big problems.

There is a monetary prize involved: $20 each month plus $100 at the end of the year. It is probably a good thing that this is not a million dollar prize. Since entries are judged based on how well they do after submission, too high a prize might lead to certain … incentives … to ensure the accuracy of your murder predictions.

4 thoughts on “Data Mining, Operations Research, and Predicting Murders”

  1. This is a great idea. The actual setup is not the best, though. The pages go down frequently, and some sort of sample answer might be handy, as many people are asking how to format the answers.

    Still, it's a nice idea. Would robberies or some more common event have a greater signal-to-noise ratio, though?
