I recently blogged about a Tim Harford column. In that column, he spoke about the controversy surrounding the Ease of Doing Business rankings, and ruminated about why the controversy was, in a sense, inevitable.
Alex Selby-Boothroyd, the head of data journalism at The Economist magazine, has a section in one of their newsletters titled “How to compile an index”:
In any ranking of our Daily charts, it is no small irony that some of the most viewed articles will be those that use indices to rank countries or cities. The cost-of-living index from the EIU that we published last week is a case in point. It was the most popular article on our website for much of the week. Readers came to find out not just which city was the world’s most expensive, but also where their own cities were placed. The popularity of such lists is unsurprising: most people take pride in where they live and want to see how it compares with other places, and there’s also a desire to “locate yourself within the data”. But how are these rankings created?
Source: Off The Charts Newsletter From The Economist
Alex makes the same point that Tim did in his column – rankings just tend to be more viral. What that says about us as a society is a genuinely interesting question, but we won’t go down that path today. We will learn instead the concepts behind the creation of an index.
As the newsletter mentions, there are two different kinds of indices to think about. One is, relatively speaking, simpler to work with, because it is quantitative. If you are just beginning your journey into the dark arts of stats and math, you might struggle to wrap your head around the idea that making something quantitative makes it simpler. Trust me, I know the feeling. I’ll get to why qualitative data is actually harder in just a couple of paragraphs.
But for the moment, let’s focus on the cost-of-living index that the excerpt was referring to:
EIU correspondents visit shops in 173 cities around the world and collect multiple prices for each item in a globally comparable basket of goods and services. These prices are averaged and weighted and then converted from local currency into US dollars at the prevailing exchange rate. The overall value is then indexed relative to New York City’s basket, the cost of which is set at 100.
Source: Off The Charts Newsletter From The Economist
Here are some questions you should be thinking about after reading that paragraph:
- Why these 173 cities and no other? Has the list changed over time? Whether yes or no, why?
- How does one decide upon a “globally comparable basket of goods and services”? No such list can ever be perfect, so how does one decide when it is “good enough”?
- How are these prices averaged and weighted? Weighted by what?
- Why does The Economist magazine not use the purchasing power parity (PPP) adjusted exchange rate?
- Why New York City’s basket? Why no other city?
I do not for a minute mean to suggest that these should be your only questions – see if you can come up with more, and try and bug your friends and stats professor with these questions. Even better, see if you can do this as an in-class exercise!
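To make the mechanics in the quoted paragraph concrete, here is a minimal sketch of the steps it describes: average the collected prices, weight them, convert to US dollars, and index against New York at 100. Every city, item, price, weight and exchange rate below is made up for illustration, and the weighting scheme is an assumption, since the newsletter does not say what the prices are weighted by.

```python
# All numbers here are hypothetical; none of this is EIU data.

# Several local-currency price quotes per item, per city.
prices = {
    "New York": {"bread": [4.0, 4.4], "bus_ticket": [2.9, 2.9]},
    "Pune": {"bread": [45.0, 55.0], "bus_ticket": [20.0, 30.0]},
    "Zurich": {"bread": [3.8, 4.2], "bus_ticket": [4.4, 4.6]},
}

# Units of local currency per one US dollar (made-up rates).
local_per_usd = {"New York": 1.0, "Pune": 80.0, "Zurich": 0.9}

# Basket weights: an assumption for this sketch.
weights = {"bread": 0.6, "bus_ticket": 0.4}


def basket_cost_usd(city):
    """Average each item's quotes, convert to USD, and take the weighted sum."""
    total = 0.0
    for item, quotes in prices[city].items():
        avg_local = sum(quotes) / len(quotes)
        total += weights[item] * avg_local / local_per_usd[city]
    return total


def cost_of_living_index(base_city="New York"):
    """Express every city's basket cost relative to the base city (= 100)."""
    base = basket_cost_usd(base_city)
    return {city: 100 * basket_cost_usd(city) / base for city in prices}


index = cost_of_living_index()
for city, value in sorted(index.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{city}: {value:.1f}")
```

Try changing `weights`, `local_per_usd` or `base_city` and see how the ranking responds. Each of the questions above corresponds to one of these choices.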
Not all indices are so straightforward. Sometimes they are used to measure something more subjective. The EIU has another index that ranks cities by the quality of life they provide. For this, in-country experts assess more than 30 indicators such as the prevalence of petty crime, the quality of public transport or the discomfort of the climate for travellers. Each indicator is assigned a qualitative score: acceptable, tolerable, uncomfortable, undesirable or intolerable. These words are assigned a numerical value and a ranking begins to emerge. The scoring system is fine-tuned by giving different weightings to each category (the EIU weights the “stability” indicators slightly higher than the “infrastructure” questions, for example). Further tweaking of the weights might be required, such as when the availability of health care becomes more important during a pandemic.
Source: Off The Charts Newsletter From The Economist
You see why qualitative data is more problematic? Just who, exactly, are in-country experts? Experts on what basis? As decided by whom?
I should be clear – this is in no way a criticism of the methodology used by The Economist. In fact, in the very next paragraph, the newsletter explains the problems with a qualitative index. And in much the same vein, I am simply trying to explain to you why a qualitative index is so problematic, regardless of who tries to build one.
But the problem is a real one! Expertise in matters such as these is all but impossible to assess accurately, and the inherent biases of these experts are also going to get baked into these assessments. And not just their biases: their moods and state of mind are also going to be baked into these assessments. Again, this is not a criticism; it is inevitable.
And the biggest problem of them all: the subjectivity not of the experts, but of the scale itself!
Qualitative rankings are built on subjective measures. Perhaps “tolerable” means almost the same to someone as “uncomfortable”—whereas “intolerable” might feel twice as bad as “undesirable”? On ordinal scales the distance between these measures is subjective—and yet they have to be assigned a numerical score for the ranking to work.
Source: Off The Charts Newsletter From The Economist
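Here is a small sketch of why the choice of numbers matters. The two cities, their ratings, the indicator weights and both numerical mappings are invented for illustration, and treating the score as "discomfort" (lower is better) is an assumption of this example, not the EIU's actual scheme. The only thing that changes between the two runs is how far apart the words are placed on the number line.

```python
# Hypothetical expert ratings for two made-up cities.
ratings = {
    "City A": {
        "petty_crime": "uncomfortable",
        "public_transport": "uncomfortable",
        "climate": "uncomfortable",
    },
    "City B": {
        "petty_crime": "tolerable",
        "public_transport": "tolerable",
        "climate": "undesirable",
    },
}

# Hypothetical indicator weights (they sum to 1).
weights = {"petty_crime": 0.4, "public_transport": 0.3, "climate": 0.3}

# Mapping 1: every step on the scale is the same size.
equal_steps = {
    "acceptable": 0, "tolerable": 1, "uncomfortable": 2,
    "undesirable": 3, "intolerable": 4,
}

# Mapping 2: "tolerable" is almost as bad as "uncomfortable", and
# "intolerable" is twice as bad as "undesirable".
uneven_steps = {
    "acceptable": 0, "tolerable": 1.8, "uncomfortable": 2,
    "undesirable": 3, "intolerable": 6,
}


def discomfort_score(city, mapping):
    """Weighted sum of the numerical values assigned to each rating."""
    return sum(weights[ind] * mapping[label]
               for ind, label in ratings[city].items())


def rank(mapping):
    """Cities ordered from least to most discomfort under a given mapping."""
    return sorted(ratings, key=lambda city: discomfort_score(city, mapping))


print("Equal steps: ", rank(equal_steps))
print("Uneven steps:", rank(uneven_steps))
```

The experts' words are identical in both runs, yet the city that comes out on top changes. The ordering of the labels is all the data really tells you; the distances between them are something the index builder has to supply.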
Statistical analysis of qualitative data is problematic, and I cannot begin to tell you how often statistical tools are misapplied in this regard. If you are learning statistics for the first time, take it from me: spend hours understanding the nature of the data you are working with. It will save you hours of rework later.
And finally, have fun exploring some of The Economist’s own indices (if these happen to be behind a paywall, my apologies!):