Comment Spam

Brian Hayes, who writes American Scientist’s computing science column and the bit-player blog, has a very nice article on issues with spam in the comments of his blog.  My blog is nowhere near as popular as Brian’s, but I too attract a reasonable amount of spam.  Some spam is easy:  autogenerated, lots of links, easy to identify by things like Akismet.  This is the stuff I never see, which is good since it runs about 100/day.  The software simply handles things.

But other spam takes a bit more effort.  As Brian points out, there is a market for people to browse the web, putting in comments with an included link back to whoever is doing the paying.  Some of this is obvious:  “Good post.  I like your blog.” with a return link to a hairdresser is just not credible.  Some other posts are harder to be sure about:  they look on topic, but are a bit off (see the “Blue Fire” comment for an example).  Perhaps it is a language barrier?  Perhaps I am not smart enough to see the connection?  But it is fascinating to see.  As Brian wonders:

I’m both fascinated and appalled to learn that the Internet economy can support this activity. What’s the going rate for writing comment spam? Is it worth a penny to get your link briefly exposed to the vast daily readership of bit-player.org? How about a tenth of a penny?

I get about one of these a day on my site.  This is less than I used to get since I have closed down comments on older posts.  Depending on how bad the comment is (and the atrociousness of the linked site), I have four levels of response to these comments (I moderate all comments from “unknown” people):

  1. SPAM!  I mark it spam, which I hope goes into Akismet’s algorithm so that similar stuff is more likely to be marked spam (ideally on more than just my machine).
  2. I delete it.  Nice try, but this one’s not getting by me.  But you are welcome to try again.
  3. I edit it to remove the link and let the comment through (like I did with “Blue Fire”).  I suspect this is most frustrating to the commenter, who I presume does not get paid for his/her efforts.
  4. I let it through, link and all.  The link needs to have some relevance to operations research in this case.  And maybe someone gets paid a penny or two.  This doesn’t happen very often!

It is nice to see that reaching the readership of Michael Trick’s Operations Research Blog has some value to someone.  But you are going to have to read up on the world of operations research if you want to get past my filters!

Without Operations Research, Gridlock!

In many applications, it can be difficult to measure the effect of an operations research project.  For instance, my colleagues and I provide schedules for Major League Baseball.  What is the value added by the operations research we do?  MLB doesn’t have the time, energy or money to handle multiple schedulers in parallel:  they decided five or so years ago that they liked our schedules, and they have been using us since.  We have gotten better over time.  What is the value now?  Who knows?  The non-operations research alternatives no longer provide schedules, and the process, in any case, was not “here’s a schedule, evaluate it!”:  it is much more interactive.

Once in a while, circumstances come together to show the value of operations research.  If you have been trying to drive anywhere in Montgomery County, Maryland (northwest of Washington, D.C.), you have had a chance to see how traffic systems work without the coordinating effect of operations research systems.  A computer failure messed up the synchronization of traffic signals.  From a Washington Post article:

A computer meltdown disrupted the choreography of 750 traffic lights, turning the morning and evening commutes into endless seas of red brake lights, causing thousands of drivers to arrive at work grumpy and late, and getting them home more frustrated and even later.

The traffic signals didn’t stop working.  They continued, but they no longer changed the time spent “green” in each direction based on time, and they no longer coordinated their “green” cycles along the main corridors:

The system, which she described as “unique” in the Washington region, is based on a Jimmy Carter-era computer that sends signals to traffic lights all over the county. On weekday mornings, it tells them to stay green longer for people headed to work. And in the evenings, it tells them to stay green longer for people headed home.

It also makes them all work together — green-green-green — to promote the flow of traffic. That happens automatically, and then the engineers use data from hundreds of traffic cameras and a county airplane to tweak the system. When there is an accident, breakdown or water main break, they use the computer to adjust signal times further and ease the congestion around the problem.

It’s great when it works, a disaster when it fails.

Of course, without operations research, which determines the correct times and coordinates them across the network, it would be a disaster all the time.  Here’s hoping they get back to the “optimized world” soon (as seems to be the case).
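For readers curious what “coordinating green cycles along a corridor” means in the simplest case, here is a toy sketch.  It is not how Montgomery County’s system works (the article gives no such detail); it assumes a one-way corridor, a single shared cycle length, and a constant travel speed, and the function name and all the numbers are made up for illustration.  The idea is just that each signal starts its green late enough that a car leaving the first signal at the start of green arrives as the next one turns green.

```python
def green_wave_offsets(distances_m, speed_mps, cycle_s):
    """Toy green-wave calculation (hypothetical helper, illustrative only).

    distances_m: distance of each signal from the first one, in meters
    speed_mps:   assumed constant travel speed, in meters per second
    cycle_s:     common signal cycle length, in seconds

    Returns, for each signal, how many seconds after the first signal's
    green begins its own green should begin (wrapped to one cycle).
    """
    if speed_mps <= 0 or cycle_s <= 0:
        raise ValueError("speed and cycle length must be positive")
    # Travel time from the first signal, wrapped into a single cycle.
    return [(d / speed_mps) % cycle_s for d in distances_m]


# Made-up corridor: five signals, 15 m/s (about 54 km/h), 90-second cycle.
offsets = green_wave_offsets([0, 300, 750, 1200, 1500], speed_mps=15.0, cycle_s=90.0)
print(offsets)
```

Even this toy version shows why losing the central computer hurts:  the offsets only help if every signal agrees on when the cycle starts.  Real systems also have to handle two-way traffic, cross streets, and time-of-day patterns, which is where the optimization gets interesting.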

Open Source = Geek?

The parent of SourceForge.net has decided to become Geeknet, Inc (how much did they have to pay for geek.net, I wonder). I have mixed feelings on this.  On one hand, I am trying to learn from Wil Wheaton and embrace my inner geek.  I have done this so well that my colleagues tell me my inner geek is leaking out quite a bit to my outer geek.  So celebrating geekdom is good, right?

Well, maybe not right.  I am involved in an operations research open source endeavor, COIN-OR.  Part of COIN-OR is convincing companies that open source is not too nerdy or experimental.  There are lots of good reasons to use open source software. Embracing your inner geek is probably not one of the more persuasive.  One vision of COIN-OR was to be “Sourceforge for operations research”.  Now should we become “Like operations research, only geekier”?

The Sourceforge Blog has an amusing post on the top ten rejected names, including my favorite:

7. FLOSSdaily

I think the entry summarizes the issue correctly:

the consensus seems to be that, as long as we don’t change the name of sourceforge.net, it doesn’t really matter what we call the corporation, so I think we’re good.

But I would be worried about the next “branding” idea from the good people at Geeknet, Inc.  Perhaps their inner geek should stay a little more … inside.