## Which Average Do You Want?

Now that I am spending a sentence as an academic administrator in a business school (the Tepper School at Carnegie Mellon University), I get first-hand knowledge of the amazing number of surveys, questionnaires, inquiries, and other information-gathering methods organizations use to rank, rate, or otherwise evaluate our school. Some of these are “official”, involving accreditation (like AACSB for the business school and Middle States for the university). Others come from organizations that provide information to students. Biggest of these, for us, is *Business Week*, where I am happy to see that our MBA program went up four positions, from 15th to 11th, in the recent ranking. We administrators worry about this so faculty don’t have to.

Responding to all these requests takes a huge amount of time and effort. We have a full-time person whose job is to coordinate these surveys and analyze their results. Larger schools might have three or four people doing this job. And some surveys turn out to be so time-intensive to answer that we decline to be part of them. Beyond Grey Pinstripes was an interesting ranking based on sustainability, but it was a pain to fill out, which seems to be one reason for its recent demise.

As we go through the surveys, I am continually struck by the vagueness in the questions, even for questions that seem to be asking for basic, quantitative information. Take the following commonly asked question: “What is the average class size in a required course?” Pretty easy, right? No ambiguity, right?

Let’s take a school with 4 courses per semester and two semesters of required courses. Seven courses are “normal” classes run in 65-student sections, while one course is divided into 2 half-semester courses, each run in 20-student seminars (this is not the Tepper School but illustrates the issue). Here are some ways to calculate the average size:

A) A student takes 9 courses: 7 at 65 and 2 at 20 for an average of 55.

B) If you weight over time, it is really 8 semester-courses: 7 at 65 and 1 at 20 for an average of 59.4.

C) There are about 200 students, so the school offers 21 sections of 65-student classes and 20 sections of size 20 for an average of about 43.
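To make the arithmetic behind A, B, and C easy to check, here is a minimal Python sketch. The constant and function names are my own labels for this illustration, and the section counts (21 and 20) are taken directly from calculation C rather than derived from the enrollment.

```python
# Hypothetical school from the example above (not the Tepper School).
FULL_COURSES, FULL_SIZE = 7, 65   # full-semester courses, 65-student sections
HALF_COURSES, HALF_SIZE = 2, 20   # half-semester courses, 20-student seminars
FULL_SECTIONS = 21                # sections offered, as stated in calculation C
HALF_SECTIONS = 20


def transcript_average():
    """A) Every course on a student's transcript counts once."""
    total = FULL_COURSES * FULL_SIZE + HALF_COURSES * HALF_SIZE
    return total / (FULL_COURSES + HALF_COURSES)


def time_weighted_average():
    """B) Each half-semester course counts as half a semester-course."""
    weight = FULL_COURSES * 1.0 + HALF_COURSES * 0.5
    total = FULL_COURSES * 1.0 * FULL_SIZE + HALF_COURSES * 0.5 * HALF_SIZE
    return total / weight


def section_list_average():
    """C) Every section on the school's course list counts once."""
    total = FULL_SECTIONS * FULL_SIZE + HALF_SECTIONS * HALF_SIZE
    return total / (FULL_SECTIONS + HALF_SECTIONS)


if __name__ == "__main__":
    print(f"A) transcript view:   {transcript_average():.1f}")
    print(f"B) time-weighted:     {time_weighted_average():.1f}")
    print(f"C) section-list view: {section_list_average():.1f}")
```

The class sizes are identical in all three functions; only the thing being counted (transcript entries, semester-courses, or sections) changes, and that alone moves the answer from about 59 down to about 43.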

Which is the right one? It depends on what you are going to use the answer for. If you want to know the average student experience, then perhaps calculation B is the right one. An administrator might be much more concerned about calculation C, and that is what you get if you look at the course lists of the school and take the average over that list. If you look at a student’s transcript and just run down the size for each course, you get A.

We know enough about other schools to say pretty clearly that different schools will answer this in different ways; I have seen all three calculations used on the same survey by different schools. But the surveying organization will then happily collect the information and put it in a nice table, and students will sort and make decisions based on these numbers, even though the definition of “average” will vary from school to school.

This is reminiscent of a standard result in queueing theory that says that the system view of a queue need not equal a customer’s view. To take an extreme example, consider a store that is open for 8 hours. For seven of those hours, not a single customer appears. But a bus comes by and drops off 96 people who promptly stand in line for service. Suppose it takes 1 hour to clear the line. On average, the queue length was 48 during that hour. So, from a system point of view, the average (over time) queue length was (0 × 7 + 48 × 1)/8 = 6. Not too bad! But if you ask the customers “How many people were in line when you arrived?”, the average is 48 (or 47 if they don’t count themselves). Quite a difference! What is the average queue length? Are you the store or a customer?
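The store example comes down to one set of observations averaged with two different weights: hours for the store, arrivals for the customers. The sketch below uses the simplified numbers from the paragraph above (a queue of 48 during the busy hour, seen by all 96 arrivals). The `periods` layout and the function names are assumptions made for illustration.

```python
# Each period: (duration in hours, average queue length, customers arriving).
# Numbers follow the simplified store example above.
periods = [
    (7, 0, 0),    # seven empty hours: no queue, no arrivals
    (1, 48, 96),  # the busy hour: the bus drops off 96 people
]


def system_view(periods):
    """Average queue length weighted by time (what the store sees)."""
    total_time = sum(hours for hours, _, _ in periods)
    return sum(hours * queue for hours, queue, _ in periods) / total_time


def customer_view(periods):
    """Average queue length weighted by arrivals (what customers report)."""
    total_arrivals = sum(arrivals for _, _, arrivals in periods)
    return sum(arrivals * queue for _, queue, arrivals in periods) / total_arrivals


if __name__ == "__main__":
    print("system view:  ", system_view(periods))
    print("customer view:", customer_view(periods))
```

The empty hours carry seven-eighths of the weight in the system view and none at all in the customer view, which is why the two averages come out at 6 and 48.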

Not surprisingly, if we can get tripped up on a simple question like “What’s your average class size?”, filling out the questionnaires can get extremely time-consuming as we figure out all the different possible interpretations of the questions. And, given the importance of these rankings, it is frustrating that the results are not as comparable as they might seem.