
Citations in Management Science and Operations Research

The Tepper School, in its promotion and tenure cases, has had more conversation about (if not emphasis on) citation counts for papers. This is partially a “Google Scholar” effect: the easier it is to find some data, the more people will rely on that data. Those of us who bring notebook computers to the tenure cases can immediately add new “data” to the discussion. I fear I have been a big part of this trend here, using Google Scholar as a quick measure of “effect”. I have even written about this on the blog, in an effort to find highly cited OR papers. The software “Publish or Perish” by Harzing.com has been invaluable in this regard: it generates results from Google Scholar, collates them, combines likely double entries, and sorts them in a number of ways. Through it, I can learn immediately that my h index is 19 (not quite: it doesn’t combine some papers, so my h index is closer to 16), that a paper Anuj Mehrotra and I wrote on graph coloring is my most highly cited “regular” paper, and that that paper is the third most cited paper ever in INFORMS Journal on Computing. I can even search the citations of my nemeses (“Hah! I knew that paper was never going to lead to anything!”). What a great way to spend an afternoon!

But does any of this mean anything? In the current (March-April 2008) issue of Interfaces, Malcolm Wright and J. Scott Armstrong take a close look at citations in an article entitled “Verification of Citations: Fawlty Towers of Knowledge?” (Interfaces, 38(2): 125-139). They talk about three types of errors (and I recognize the risks that this blog summary may end up committing some of these errors, including the count of errors! [In fact, the initial post of this entry misspelled the title.]):

  1. Failure to include relevant studies
  2. Incorrect references
  3. Quotation errors

Much of the article involves a paper written by Armstrong and Overton in 1977 on overcoming non-response bias in surveys. The overlap in authors means that the authors probably understand what the paper meant to say, but it also means they may have a certain lack of objectivity on the subject. Despite the objectivity issue, the article makes for stunning reading.

The most persuasive of the arguments regards “Quotation Errors”. While it is not new to note that many authors don’t read all of the papers in their references, it is amusing to see how many people can’t even get the basic ideas right:

A&O is ideal for assessing the accuracy of how the findings were used because it provides clear operational advice on how to constructively employ the findings. We examined 50 papers that cited A&O, selecting a mix of highly cited and recently published papers. …

Of the articles in our sample, 46 mentioned differences between early and late respondents. This indicates some familiarity with the consequences of the interest hypothesis. However, only one mentioned expert judgment, only six mentioned extrapolation, and none mentioned consensus between techniques. In short, although there were over 100 authors and more than 100 reviewers, all the papers failed to adhere to the A&O procedures for estimating nonresponse bias. Only 12 percent of the papers mentioned extrapolation, which is the key element of A&O’s method for correcting nonresponse bias. Of these, only one specified extrapolating to a third wave to adjust for nonresponse bias.

The paper was also not referred to correctly in many cases:

We examined errors in the references of papers that cite A&O. To do this, we used the ISI Citation Index (in August 2006). We expected this index to underrepresent the actual error rate because the ISI data-entry operators may correct many minor errors. In addition, articles not recognized as being from ISI-cited journals do not have full bibliographic information recorded; therefore, they will also omit errors in the omitted information. Despite this, we found 36 variations of the A&O reference. Beyond the 963 correct citations, we found 80 additional references that collectively employed 35 incorrect references to A&O. Thus, the overall error rate was 7.7 percent.
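As a quick sanity check on the quoted figure, the 7.7 percent error rate follows directly from the numbers given: 80 erroneous references out of a total of 963 correct plus 80 incorrect ones. A minimal calculation, using only the counts stated in the excerpt:

```python
# Counts quoted from Wright & Armstrong's ISI analysis of A&O citations.
correct = 963    # references matching the canonical A&O entry
incorrect = 80   # references using one of the 35 incorrect variants
total = correct + incorrect

error_rate = incorrect / total
print(f"{error_rate:.1%}")  # -> 7.7%
```

This matches the paper's reported rate, and (as the authors note) is likely an undercount, since ISI data-entry operators may silently correct many minor errors.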

Their discussion of “missing references” was not convincing to me (though it is unclear how to do this in an objective way). The authors did some Google searches and checked how often some of their key ideas were missing. Since they found about a million results for “(mail OR postal) and survey” AND (results OR findings), and only 24,000 of those mention (error OR bias), of which only 348 mention Armstrong OR Overton, they conclude that their work on bias is not well represented in real surveys. It doesn’t take much experience with Google search to believe that the baseline of a million pages does not correspond to one million surveys (and the vast majority of internet surveys have zero interest in accuracy to the level of A&O). Their work with Google Scholar had similar results, and I have similar concerns over the relevance of the baseline search. But I certainly believe this qualitatively: there are many papers that should provide more relevant references (particularly to my papers!).

The authors have a solution to the “quotation error” issue that is both simple and radical:

The problem of quotation errors has a simple solution: When an author uses prior research that is relevant to a finding, that author should make an attempt to contact the original authors to ensure that the citation is properly used. In addition, authors can seek information about relevant papers that they might have overlooked. Such a procedure might also lead researchers to read the papers that they cite. Editors could ask authors to verify that they have read the original papers and, where applicable, attempted to contact the authors. Authors should be required to confirm this prior to acceptance of their paper. This requires some cost, obviously; however, if scientists expect people to accept their findings, they should verify the information that they used. The key is that reasonable verification attempts have been made.
Despite the fact that compliance is a simple matter, usually requiring only minutes for the cited author to respond, Armstrong, who has used this procedure for many years, has found that some researchers refuse to respond when asked if their research was being properly cited; a few have even written back to say that they did not plan to respond. In general, however, most responded with useful suggestions and were grateful that care was taken to ensure proper citation.

Interesting proposal, and one most suitable for things like survey articles that attempt to cover a field. I am not sure how I would react to requests of this type: I suspect such requests might fall through the cracks amongst all the other things I am doing. But if the norms of the field were to change…

The article has a number of commentaries. I particularly liked the beginning of the Don Dillman article:

In 1978, I authored a book, Mail and Telephone Surveys: The Total Design Method (Dillman 1978). According to the ISI Citation Indexes, it has now been cited in the scientific literature approximately 4,000 times. When reviewing a summary of its citations, I discovered citations showing publication dates in 24 different years, including 1907 (once) and 1908 (three times). Citations erroneously listed it as having been published in all but three of the years between 1971 and 1995; there were 102 such citations. In addition, 10 citations showed it as having been published in 1999 or 2000. I attribute the latter two years to authors who intended to cite the second edition— although I had changed the title to Mail and Internet Surveys: The Tailored Design Method (Dillman 2000).

I discovered 29 different titles for the book, including mail descriptors such as main, mall, mial, mailed, and mailback. The telephone descriptor also had creative spellings; they included telephon, teleophone, telephones, telephone, and elephone. Not surprisingly, my name was also frequently misspelled as Dillon, Dilman, Dill, and probably others that I was unable to find. I also discovered that I had been given many new middle initials. A similar pattern of inaccuracies has also emerged with the second edition of the book.

I do believe that technology can help with some of these issues, particularly with incorrect references. But including relevant references and correctly summarizing or using cited references is an important part of the system, and it is clear that the current refereeing system is not handling this well.

Papers and commentary like this are one reason I wish Interfaces were available to the wider public (I found an earlier version of the base article here, but no link to the commentaries), even if just for a short period. The article and commentaries are worth searching around for.

Comments

  1. Anne-Wil Harzing | April 28, 2008 at 8:30 pm

    Hi Michael,

    Thanks for referring to the Publish or Perish program. I hope it will be useful to many of your readers.

    On the topic of your post, this is highly amusing reading, but:

    a. this does not mean citation analysis is completely invalid.

    b. the identification of this problem is by no means new.

    * Any user of Thomson ISI or Google Scholar knows there are loads of incorrect references, especially of course to highly cited works (there are for instance hundreds and hundreds of miscitations to Hofstede’s Culture’s Consequences in ISI, GS fares a bit better in combining obvious misspellings)

    * There are a substantial number of articles in Management (and other disciplines) assessing the problems with accurate referencing. Interestingly, the authors cite many science papers, but none of the previous papers in management.

    I even have two papers on this topic myself 🙂

    Harzing, A.W.K. – 1995 – The persistent myth of high expatriate failure rates, International Journal of Human Resource Management, vol. 6, May, pp. 457-475.

    Harzing, A.W.K. – 2002 – Are our referencing errors undermining our scholarship and credibility? The case of expatriate failure rates, Journal of Organizational Behavior, vol. 23/1, pp. 127-148.