What is common to Butyrylcholinesterase and Vitamin D, or why English is an underrated skill in statistics

Today’s blog post title is in the running for the longest title that I have come up with, but let’s ignore this particular bit of potential trivia and get on with it.

Today’s story really begins with the tragic tale of Sally Clark. It is a very lengthy extract, from a piece I wrote along with a friend some months ago. Lengthy, but fascinating:

In November of the year 1999, an English Solicitor named Sally Clark was convicted on two charges of murder, and sentenced to life imprisonment. This tragic case is notable for many reasons — one of those reasons was the fact that her alleged victims were her own sons. Another was the fact that both were toddlers when they died.
The cause of death in both cases was initially attributed to sudden infant death syndrome (SIDS), also known as cot death in the United Kingdom. We did not know then, and do not know until this day, about the specific causes of SIDS. But suspicion grew on account of the fact that two children from the same family had died due to unspecified causes, and shortly after the death of her second child, Sally Clark was arrested, tried and convicted.
One of the clinching pieces of evidence was expert testimony provided by the pediatrician Professor Sir Roy Meadow. He put the odds of two children from the same family dying of SIDS at 1 in 73 million — in other words, an all but impossible eventuality. On the back of this testimony, and others, Sally Clark was convicted of the crime of murdering her own sons, and sent to prison for life.
One cannot help but ask the question: how did Sir Roy Meadow arrive at this number of 1 in 73 million? Succinctly put, here is the theory: for the level of affluence that Sally Clark’s family possessed, the chance of one infant dying of SIDS was 1 in 8543. This was simply an empirical observation. What then, were the chances that two children from the same family would die of SIDS?
The answer to this question, statisticians tell us, depends on whether the two deaths are independent of each other. If one assumes that they are, then the probability of two deaths in the same family is simply the multiplicative product of the two probabilities. That is, 1 in 8543 multiplied by itself, which is 1 in 73 million and that would be enough to convince any “reasonable man” that the deaths were deliberate and could not have been just coincidence.
But on the other hand, if the two events are not independent of each other — say, for example, that there are underlying genetic or environmental reasons that we simply are not aware of just yet — then it is entirely possible that multiple children from the same family may die of SIDS. In fact, given a SIDS death in a family, research shows that the likelihood of a second SIDS death goes up.
Sally Clark’s convictions were overturned on her second appeal, and she was released from prison. She died four years later due to alcohol poisoning.

https://www.scconline.com/blog/post/2021/12/14/data-analysis-an-essential-skill-for-the-legal-community/
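
To see the arithmetic that produced the headline number, here is a minimal sketch in Python. The 1-in-8543 figure comes from the case itself; the dependence multiplier at the end is purely illustrative, not an estimate from any study:

```python
# A minimal sketch of the arithmetic behind the "1 in 73 million" figure.
# Squaring the 1-in-8543 rate is valid ONLY if the two deaths are independent.

p_one_sids_death = 1 / 8543

# Meadow's calculation: assume the two deaths are independent events.
p_two_deaths_if_independent = p_one_sids_death ** 2
print(f"Assuming independence: 1 in {1 / p_two_deaths_if_independent:,.0f}")
# -> roughly 1 in 73 million

# If the events are NOT independent (shared genetic or environmental factors),
# what matters is the conditional probability of a second death given a first.
# The multiplier below is purely illustrative, not a real estimate.
illustrative_multiplier = 10  # hypothetical: second death 10x more likely
p_second_given_first = p_one_sids_death * illustrative_multiplier
p_two_deaths_if_dependent = p_one_sids_death * p_second_given_first
print(f"With dependence (illustrative): 1 in {1 / p_two_deaths_if_dependent:,.0f}")
# -> roughly 1 in 7.3 million, an order of magnitude more likely
```

The point of the second half is only that the independence assumption does all the work: relax it even a little, and the "all but impossible" number shrinks dramatically.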

We’ll get back to this truly tragic tale, but let’s go off on a tangent for a second.


Today’s a day for extracts from my own earlier work, it would seem, for I have another one for your consideration:

Us teaching type folks love to say that correlation isn’t causation. As with most things in life, the trouble starts when you try to decipher what this means, exactly. Wikipedia has an entire article devoted to the phrase, and it has occupied space in some of the most brilliant minds that have ever been around.
Simply put, here’s a way to think about it: not everything that is correlated is necessarily going to imply causation.
For example, this one chart from this magnificent website (and please, do take a look at all the charts):

https://econforeverybody.com/2021/05/19/correlation-causation-and-thinking-things-through/
https://www.tylervigen.com/spurious-correlations

Hold on to this line of thinking, and let’s get back to the tragic Sally Clark story, but with a twist towards the rather more optimistic side of things.


Great news, right? We’ve found what causes SIDS!

Well, that’s where it gets tricky, and we go off on yet another tangent.


Do Vitamin D supplements help? We know that sunlight gives us Vitamin D, and that’s A Good Thing. So if we don’t get enough sunlight, hey, let’s get Vitamin D injections or supplements:

In interpreting vitamin D-related study results, correlation should not be understood as causation. Diets composed of vitamin D–rich foods such as dairy products and salmon also contain high levels of other healthy nutrients. Those who have a high vitamin D level are likely to participate in active outdoor activities and exercises, to be interested in health issues, and to have a healthy lifestyle. Without considering these confounders, misleading results can be obtained. In the study by Kim et al.,4) a univariate analysis revealed a correlation between a low vitamin D level and a low quality of life score; however, its significance was lost when age, sex, income, education level, and disease state were considered.
Sometimes, correlations shown in cross-sectional studies are used as evidence for requiring vitamin D supplements. A recent increasing trend of taking vitamin D supplements may be due to these effects.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4961851/

What if Vitamin D is just a marker? That is, what if sunlight causes a lot of good things in our bodies, and it also causes Vitamin D levels in our body to go up? In that case, the chain isn’t sunlight causing Vitamin D to go up, and Vitamin D in turn causing an increase in our wellbeing. Maybe sunlight causes both the uptick in our wellbeing and the increase in Vitamin D levels. They (health and Vitamin D levels) may just be correlated, without one causing the other.

What I’m about to say is important: I’m not a doctor. All I’m saying is, I’ve been confused often enough about correlation and causation to wonder about whether vitamin D causes good health. It is correlated, there’s no arguing with that. But causation? Ah, that’s another (very tricky) thing altogether.


And now that we have the mise en place of this blogpost done, let’s get the dish together.

Low levels of Butyrylcholinesterase don’t necessarily cause SIDS in infants. Infants who die of SIDS stop breathing (for reasons that are still not clearly understood), and these infants have low levels of Butyrylcholinesterase. But low Butyrylcholinesterase may not even be what causes breathing to stop. It is just a marker – there is correlation there, but we don’t know if there is causation.

In fact, the paper’s title itself says as much:

“Butyrylcholinesterase is a potential biomarker for Sudden Infant Death Syndrome”

But the tweet above speaks about how we’ve found the cause, and that’s not quite right.

Again, please don’t misunderstand me – the fact that this has been discovered is awesome, it is fantastic, and the joy, the relief and the euphoria should absolutely be there.

But:

Sally Clark lost her life at least in part to a fundamental misunderstanding of statistical theory, and we still don’t know what causes SIDS. We understand it better, but there is a ways to go.


The most underrated skill in statistics is the English language.

Words matter, and we all (myself included!) need to be more careful about what exactly we mean when we speak about statistics.

And thank god we’re closer to figuring out how to deal with the horrible, horrible thing that SIDS is.

But if you’re teaching or learning statistics, tread very, very carefully.

Reflections on Whole Numbers and Half Truths

Single narratives have never been able to explain all of India.

S, Rukmini. Whole Numbers and Half Truths: What Data Can and Cannot Tell Us About Modern India (p. 220). Kindle Edition.

There is this line that is often quoted when big picture discussions about India take place, and it is only a matter of time before it comes up: whatever you say about India, the opposite is also true. The quote is attributed to Joan Robinson, and I can’t help but wonder if I will end up creating a paradox of sorts by agreeing wholeheartedly with it.

But I do agree with the spirit of the quote, which is why that one line extract from Rukmini S’s book, Whole Numbers and Half Truths, resonated so much with me. All countries are complex and complicated, but India takes the game to giddying heights.

Take a look at this map, a version of which is present in Rukmini’s book:

https://en.wikipedia.org/wiki/List_of_states_and_union_territories_of_India_by_fertility_rate

What is India’s TFR? First, for those uninitiated in the art and science of demography, what is TFR? It stands for Total Fertility Rate, or as Hans Rosling used to put it, babies per woman. Well, it’s 2.0, which is good, because roughly speaking, two parents giving birth to two children means we’re at the replacement rate (note that this is a very basic way of thinking about it, but useful as a rough approximation).

But as any student of statistics ought to tell you, that’s only half the story (or half the truth). Uttar Pradesh, Bihar and Jharkhand are well above the so-called replacement rate, and that will have implications for labor mobility, taxation, political representation and so, so much more in the years to come.

Data then, is only half the story. How is the data collected? If it is a sampling exercise rather than a census, how was the sampling done? Has the sampling method changed over time? If so, are earlier data collection exercises comparable with current ones?

How should one think about the data that has been collected? What does it mean, and how much does context matter? For example:

‘That’s data about marriage, madam,’ he said—not about love. ‘I think if your data asked people if they have ever fallen in love with someone from another caste or religion, many will say yes. I see that all around me among my friends. But when it comes to getting married, most of us are not yet ready to leave our families. That’s why your data looks like that,’ he said. As for the rest? ‘There is a lot we will not admit to someone doing a survey. But things are changing. At least for some of us,’ he said.

S, Rukmini. Whole Numbers and Half Truths: What Data Can and Cannot Tell Us About Modern India (pp. 127-128). Kindle Edition.

Rukmini’s excellent book is, in one sense, a deep reflection on the data that we have, have had, and would like to have where India is concerned. It speaks about how data has been collected, which agencies and institutions are involved, how these have changed (and been changed) over time, and with what consequences.

But it also is a reflection on a truism that many economists and statisticians underrate: data can only take you so far. As the subtitle of her book puts it, it is an analysis of what data can and cannot tell you about modern India.

And what data leaves out is often as fascinating as what it includes:

Yet, most people know little about the NCRB’s processes and methodology. For instance, the NCRB follows a system known as the ‘principal offence rule’. Instead of all the Indian Penal Code (IPC) sections involved in an alleged crime making it to the statistics, the NCRB only picks the ‘most heinous’ crime from each FIR for their statistics. I stumbled upon this then unknown fact in an off-the-record conversation with an NCRB statistician in the months after the deadly sexual assault of a physiotherapy student in Delhi in September 2012. In the course of that conversation, I learnt that the crime that shook the country would have only made it to the NCRB statistics as a murder, and not as a sexual assault, because murder carries the maximum penalty. This, I was told, was to prevent the crime statistics from being ‘artificially inflated’: ‘If the FIR is for theft, there will be a[n IPC] section for assault also, causing hurt also. If you include all the sections, people will think these are separate crimes and the numbers will seem too huge,’ he told me. After I reported this,2 the NCRB for the first time began to include the ‘principal offence rule’ in its disclaimer.3

S, Rukmini. Whole Numbers and Half Truths: What Data Can and Cannot Tell Us About Modern India (p. 13). Kindle Edition.

The paragraph that follows this one is equally instructive in this context, but the entire book is full of such Today-I-Learnt (TIL) moments. Even for those of us involved in academia, there is much to learn in terms of nuance and context by reading this book. If you are not in academia, but are interested in learning more about this country, recommending this book to you is even easier!

Rukmini’s book spans ten chapters on ten different (but obviously related) aspects of India. The first three chapters cover how Indians tangle (or quite often choose not to!) with the cops and the courts, how we perceive the world around us, and why Indians vote the way they do. The next three are about how (and with whom) we live our lives, and how we earn and spend our money. The next trio is about how and where we work, how we grow and age, and where Indians live. The final chapter is about India’s healthcare system.

Each chapter makes us familiar with the data associated with each of these topics, but each chapter is also a reflection on the fact that data can only take us so far. When you throw into the mix the fact that the data will always (and sometimes necessarily) be imperfect, we’re left with only one conclusion – analyze the data carefully, but always bear in mind that the reality will always be more complex. Data is, at the end of the day, an abstraction, and it will never be perfect.


One reason I liked the book so much is its brevity. Each of these chapters could (and should) be a separate book, and condensing them into chapters can’t have been an easy task. But not only has she managed it, she has managed to do so in a way that is lucid, thought-provoking and informative. Two out of these three is a good achievement; all three, across ten chapters, is rare indeed.

If I’m allowed to be greedy, I would have liked a chapter on the world of data that the RBI collects, and to its credit does share with us via its website. But it does so in a way that is best described as unintuitive. In fact, a book on how data sharing practices with the citizenry need to improve out of sight where government portals across all verticals and at all levels are concerned would be a great sequel (hint, hint!).


I’d strongly recommend this book to you, and I hope you enjoy reading it as much as I did.

We will be hosting Rukmini on the Gokhale Institute campus this coming Friday, the 29th of April. The event will be from 5.30 pm to 7.00 pm at the Kale Hall. She and I will speak about the book for about an hour, followed by a Q&A session with the audience.

If you are in Pune, please do try and make it!

Should students of law be taught statistics?

I teach statistics (and economics) for a living, so I suppose asking me this question is akin to asking a barber if you need a haircut.

But my personal incentives in this matter aside, I would argue that everybody alive today needs to learn statistics. Data about us is collected, stored, retrieved, combined with other data sources and then analyzed to reach conclusions about us, and at a pace that is now incomprehensible to most of us.

This is done by governments, and private businesses, and it is unlikely that we’re going to revert to a world where this is no longer the case. You and I may have different opinions about whether this is intrusive or not, desirable or not, good or not – but I would argue that this ship has sailed for the foreseeable future. We (and that’s all of us) are going to be analyzed, like it or not.

And conclusions are going to be made about us on the basis of that analysis, like it or not. This could be, for example, a computer in a company analyzing us as a high value customer and according us better service treatment when we call their call center. Or it could be a computer owned by a government that decides that we were at a particular place at a particular time on the basis of the footage from a security camera.

In both of these cases (and there are millions of other examples besides), there is no human being who makes these decisions about us. Machines do. This much is obvious, because it is now beyond the capacity of our species to deal manually with the amount of data that we generate on a daily basis. And so the machines have taken over. Again, you and I may differ on whether this is a good thing or a bad thing, but the fact is that it is a trend that is unlikely to be reversed in the foreseeable future.

Are the conclusions that these machines reach infallible in nature? Much like the humans that these machines have replaced, no. They are not infallible. They process information much faster than we humans can, so they are definitely better at handling much more data, but machines can make errors in classification, just like we can. Here, have fun understanding what this means in practice.

Say this website asks you to draw a sea turtle. And so you start to draw one. The machine “looks” at what you’ve drawn, and starts to “compare” it with its rather massive data bank of objects. It identifies, very quickly, those objects that seem somewhat similar in shape to those that you are drawing, and builds a probabilistic model in the process. And when it is “confident” enough that it is giving the right answer, it throws up a result. And as you will have discovered for yourself, it really is rather good at this game.

But is it infallible? That is, is it perfect every single time? Much like you (the artist) are not, so also with the machine. It is also not perfect. Errors will be made, but so long as they are not made very often, and so long as they aren’t major bloopers, we can live with the trade-off. That is, we give up control over decision making, and we gain the ability to analyze and reach conclusions about volumes of data that we cannot handle.

But what, exactly, does “very often” mean in the previous paragraph? One error in ten? One in a million? One in an impossibly-long-word-that-ends-in-illion? Who gets to decide, and on what basis?

What does the phrase “major blooper” mean in that same paragraph? What if a machine places you on the scene of a crime on the basis of security camera footage when you were in fact not there? What if that fact is used to convict you of a crime? If this major blooper occurs once in every impossibly-long-word-that-ends-in-illion times, is that ok? Is that an acceptable trade-off? Who gets to decide, and on what basis?
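
To see why “very often” and “major blooper” are not merely rhetorical quibbles, here is a back-of-the-envelope sketch in Python. Every number in it is invented for illustration; none of them describe any real system:

```python
# A hedged, back-of-the-envelope sketch of why "how often is too often?" matters.
# All numbers below are made up for illustration only.

false_match_rate = 1 / 1_000_000   # hypothetical: one wrong match per million comparisons
comparisons_per_day = 10_000_000   # hypothetical: faces checked against a watchlist daily

expected_false_matches_per_day = false_match_rate * comparisons_per_day
print(f"Expected wrongful matches per day: {expected_false_matches_per_day:.0f}")
# -> 10 per day, even though the per-comparison error rate sounds vanishingly small.

# The same "one in a million" rate looks very different depending on the volume
# of decisions being made -- which is why "who gets to decide, and on what
# basis?" is not a rhetorical question.
```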


If you are a lawyer with a client who finds themselves in such a situation, how do you argue this case? If you are a judge listening to the arguments being made by this lawyer, how do you judge the merits of this case? If you are a legislator framing the laws that will help the judge arrive at a decision, how do you decide on the acceptable level of probabilities?

It needn’t be something as dramatic as a crime, of course. It could be a company deciding to downgrade your credit score, or a company that decides to shut off access to your own email, or a bank that decides that you are not qualified to get a loan, or any other situation that you could come up with yourself. Each of these decisions, and so many more besides, are being made by machines today, on the basis of probabilities.

Should members of the legal fraternity know the nuts and bolts of these models, and should we expect them to be experts in neural networks and the like? No, obviously not.

But should members of the legal fraternity know the principles of statistics, and have an understanding of the processes by which a probabilistic assessment is being made? I would argue that this should very much be the case.

But at the moment, to the best of my knowledge, this is not happening. Lawyers are not trained in statistics. I do not mean to pick on any one college or university in particular, and I am not reaching a conclusion on the basis of just one data point. A look at other universities’ websites, and conversations with friends and family who are practicing lawyers or are currently studying law, yield the same result. (If you know of a law school that does teach statistics, please do let me know. I would be very grateful.)


But because of whatever little I know about the field of statistics, and for the reasons I have outlined above, I argue that statistics should be taught to the students of law. It should be a part of the syllabus of law schools in this country, and the sooner this happens, the better it will be for us as a society.

The Economist on How To Compile an Index

I had blogged recently about a Tim Harford column. In that column, he had spoken about the controversy surrounding the Ease of Doing Business rankings, and ruminated about why the controversy was, in a sense, inevitable.

Alex Selby-Boothroyd, the head of data journalism at The Economist magazine, has a section in one of their newsletters titled “How to compile an index”:

In any ranking of our Daily charts, it is no small irony that some of the most viewed articles will be those that use indices to rank countries or cities. The cost-of-living index from the EIU that we published last week is a case in point. It was the most popular article on our website for much of the week. Readers came to find out not just which city was the world’s most expensive, but also where their own cities were placed. The popularity of such lists is unsurprising: most people take pride in where they live and want to see how it compares with other places, and there’s also a desire to “locate yourself within the data”. But how are these rankings created?

Source: Off The Charts Newsletter From The Economist

Alex makes the same point that Tim did in his column – rankings just tend to be more viral. What that says about us as a society is a genuinely interesting question, but we won’t go down that path today. We will learn instead the concepts behind the creation of an index.

There are, as the newsletter mentions, two different kinds of indices you want to think about. One is, relatively speaking, simpler to work with, because it is quantitative. Now, if you are just beginning your journey into the dark arts of stats and math, you might struggle to wrap your head around the fact that making something quantitative makes it simpler. And trust me, I know the feeling. I’ll get to why qualitative data is actually harder in just a couple of paragraphs.

But for the moment, let’s focus on the cost-of-living index that the excerpt was referring to:

EIU correspondents visit shops in 173 cities around the world and collect multiple prices for each item in a globally comparable basket of goods and services. These prices are averaged and weighted and then converted from local currency into US dollars at the prevailing exchange rate. The overall value is then indexed relative to New York City’s basket, the cost of which is set at 100.

Source: Off The Charts Newsletter From The Economist

Here are some questions you should be thinking about for having read the paragraph:

  • Why these 173 cities and not others? Has the list changed over time? Whether yes or no, why?
  • How does one decide upon a “globally comparable basket of goods and services”? No such list can ever be perfect, so how does one decide when it is “good enough”?
  • How are these prices averaged and weighted? Weighted by what?
  • Why does The Economist magazine not use the purchasing power parity adjusted exchange rate?
  • Why New York City’s basket? Why not some other city?

I do not for a minute mean to suggest that these should be your only questions – see if you can come up with more, and try and bug your friends and stats professor with these questions. Even better, see if you can do this as an in-class exercise!
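
If it helps to make the mechanics concrete before (or while) you bug your professor, here is a minimal sketch in Python. Every price and weight below is invented; only the procedure – average, weight, convert to dollars, index to New York = 100 – follows the excerpt:

```python
# A minimal sketch of the index construction described in the excerpt, with
# invented prices and weights. The EIU's actual basket and methodology are far
# more elaborate; this only shows the mechanics of indexing to New York = 100.

# Hypothetical average prices (already converted to US dollars) for a tiny basket.
prices_usd = {
    "New York":  {"bread": 4.0, "transport_pass": 127.0, "rent_1br": 3500.0},
    "Mumbai":    {"bread": 0.6, "transport_pass": 12.0,  "rent_1br": 600.0},
    "Singapore": {"bread": 2.5, "transport_pass": 90.0,  "rent_1br": 2800.0},
}

# Hypothetical weights reflecting each item's share in a typical budget.
weights = {"bread": 0.2, "transport_pass": 0.3, "rent_1br": 0.5}

def basket_cost(city_prices: dict, weights: dict) -> float:
    """Weighted cost of the basket for one city."""
    return sum(city_prices[item] * w for item, w in weights.items())

base = basket_cost(prices_usd["New York"], weights)  # New York is indexed to 100

for city, city_prices in prices_usd.items():
    index = 100 * basket_cost(city_prices, weights) / base
    print(f"{city:10s} cost-of-living index: {index:6.1f}")
```

Change the weights or the basket and watch the numbers move – which is precisely why the questions above matter.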


Not all indices are so straightforward. Sometimes they are used to measure something more subjective. The EIU has another index that ranks cities by the quality of life they provide. For this, in-country experts assess more than 30 indicators such as the prevalence of petty crime, the quality of public transport or the discomfort of the climate for travellers. Each indicator is assigned a qualitative score: acceptable, tolerable, uncomfortable, undesirable or intolerable. These words are assigned a numerical value and a ranking begins to emerge. The scoring system is fine-tuned by giving different weightings to each category (the EIU weights the “stability” indicators slightly higher than the “infrastructure” questions, for example). Further tweaking of the weights might be required, such as when the availability of health care becomes more important during a pandemic.

Source: Off The Charts Newsletter From The Economist

You see why qualitative data is more problematic? Just who, exactly, are in-country experts? Experts on what basis? As decided by whom?

I should be clear – this is in no way a criticism of the methodology used by The Economist. In fact, in the very next paragraph, the newsletter explains the problems with a qualitative index. And in much the same vein, I am simply trying to explain to you why a qualitative index is so problematic, regardless of who tries to build one.

But the problem is a real one! Expertise in matters such as these is all but impossible to assess accurately, and the inherent biases of these experts are going to get baked into these assessments. And not just their biases – their moods and states of mind will creep in too. Again, this is not a criticism; it is inevitable.


And the biggest problem of them all: the subjectivity not of the experts, but of the scale itself!

Qualitative rankings are built on subjective measures. Perhaps “tolerable” means almost the same to someone as “uncomfortable”—whereas “intolerable” might feel twice as bad as “undesirable”? On ordinal scales the distance between these measures is subjective—and yet they have to be assigned a numerical score for the ranking to work.

Source: Off The Charts Newsletter From The Economist

Statistical analysis of qualitative data is problematic, and I cannot begin to tell you how often statistical tools are misapplied in this regard. If you are learning statistics for the first time, take it from me: spend hours understanding the nature of the data you are working with. It will save you hours of rework later.
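
Here is a toy illustration of that point, with invented ratings and two equally arbitrary scoring schemes. The same qualitative assessments produce a different ranking depending on the numbers you attach to the labels:

```python
# A toy illustration of the ordinal-scale problem, with made-up ratings.
# The same qualitative assessments, scored under two equally defensible
# numeric mappings, can rank two cities differently.

ratings = {
    # hypothetical expert ratings on three indicators
    "City A": ["acceptable", "uncomfortable", "uncomfortable"],
    "City B": ["tolerable",  "tolerable",     "tolerable"],
}

# Two plausible ways to turn the same labels into numbers (higher = better).
mapping_linear = {"acceptable": 5, "tolerable": 4, "uncomfortable": 3,
                  "undesirable": 2, "intolerable": 1}
mapping_steep = {"acceptable": 10, "tolerable": 6, "uncomfortable": 5,
                 "undesirable": 2, "intolerable": 0}

for name, mapping in [("linear", mapping_linear), ("steep", mapping_steep)]:
    scores = {city: sum(mapping[r] for r in labels)
              for city, labels in ratings.items()}
    print(name, scores)
# Under the "linear" scores City B comes out ahead; under the "steep" scores,
# City A does. Same assessments, different ranking.
```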

And finally, have fun exploring some of The Economist’s own indices (if these happen to be behind a paywall, my apologies!):

On the Etymology of Risk

I often like to begin classes on statistics by talking about the etymology of the word average, and it is such a lovely story:

Everybody associated with transporting goods by sea had to deal with the chance that only a part of the consignment would actually reach the intended destination. There was always the chance that a part of the consignment would go bad, or need to be jettisoned, or some such. Who bears the loss of this part of the total consignment? Should it be the sending merchant, the receiving merchant, or should it be the captain of the ship?

Thus, when for the safety of a ship in distress any destruction of property is incurred, either by cutting away the masts, throwing goods overboard, or in other ways, all persons who have goods on board or property in the ship (or the insurers) contribute to the loss according to their average, that is, according to the proportionate value of the goods of each on board. [Century Dictionary]

https://www.etymonline.com/word/average

The latter half of that excerpt above is nothing but “sigma x by n” – the total losses divided by the number of people involved – which is, of course, the formula for the average. The word itself comes from the Arabic word for loss – awargi, or awariya. Or, as I like to tell my students, you’re really speaking Arabic when you say “average”.


Psyche.co has a lovely essay on both the etymology of, and the emergence of the concept of, risk. Authored by Karla Mallette, it is a lovely little rumination on both the meaning of the word, and how it has evolved over time.

The first known usage of the Latin word resicum – cognate and distant ancestor of the English risk – occurs in a notary contract recorded in Genoa on 26 April 1156. The captain of a ship contracts with an investor to travel to Valencia with the sum invested. The contract allocates the ‘resicum’ to the investor.

https://psyche.co/ideas/how-12th-century-genoese-merchants-invented-the-idea-of-risk

This is entirely speculative on my part, because I know next to nothing about Latin, but a simple Google search for the meaning/etymology of resicum tells me that it means “that which cuts, rock, crag”. If one agrees with the notion that ship voyages at the time must have been fraught with risk, then the etymology of risk begins to make eminent sense – the entirety of the prospective profit from such a voyage can end up being cut down to zero. One could earn all of it, or one could get none of it – that, of course, is the risk involved in such a structure.


The essay remains of interest beyond just this point:

Before the innovation of the resicum, captain and crew took on the risks of the journey alone: only they would shoulder the burdens (and pocket the profits). But resicum shared out potential profit and loss among a broader community. It put a number on contingency, and in so doing it rationalised risk.

https://psyche.co/ideas/how-12th-century-genoese-merchants-invented-the-idea-of-risk

In this context, one needs to realize that the author is talking not about the original meaning of the word resicum. Rather, she is implying that resicum has a modern, institutional meaning now – the idea that resicum (or risk) is being diversified. The captain doesn’t bear the risk alone, although he does bear part of it (typically 25% in those times). Somewhat analogous to what we could call sweat equity these days, I suppose. The rest of the risk, or resicum, is parcelled out to investors who are willing to stump up the cost of the voyage. If the captain comes back empty handed, they lose their investment. If the captain comes back from the voyage, his ship laden with precious cargo, then the investors reap the benefits of having funded the voyage.

This arrangement was called resicum, and the word seems to have denoted something that had the ability (but not the guarantee) to provide sustenance.

Historians believe that resicum derived from an Arabic word, al-rizq. The Arabic rizq is Quranic. It refers to God’s provision for creation. This verse, for instance, uses the noun and a verb derived from the same lexical root, and refers to the sustenance that God provides for all of creation: ‘And how many a creature does not carry its own provision [rizq]! God provides for them and for you: he is the All-Hearing, the All-Knowing.’ During the Middle Ages, the word was used to name the daily subsistence pay given to soldiers. In the dialect of al-Andalus (Arab Spain), it referred to chance or good fortune. Rizq, it seems, bounced from port to port around the Mediterranean, until it landed on the worktable of a scribe in Genoa recording a strategy used to share out the risk of trans-Mediterranean trading ventures by betting against catastrophe.

https://psyche.co/ideas/how-12th-century-genoese-merchants-invented-the-idea-of-risk

So from providing for, to meaning good fortune, to our modern understanding of the word risk, the word has been on quite a journey, and is in fact a good way to understand all of what risk means.


A little postscript: I came across this article via The Browser. And second, if you haven’t read it, Against the Gods: The Remarkable Story of Risk by Peter Bernstein is a good introductory book to read about the topic.

Bibek Debroy on loopholes in the CPC

That’s the Civil Procedure Code.

The average person will not have heard of Dipali Biswas or Nirmalendu Mukherjee and may not be aware of the case decided by the Supreme Court on October 5, 2021. The case was decided by a division bench, consisting of Hemant Gupta and V Ramasubramanian and the judgment was authored by Justice V Ramasubramanian. Justice Ramasubramanian observed (not part of the judgment), “Not to be put off by repeated failures, the appellants herein, like the tireless Vikramaditya, who made repeated attempts to capture Betal, started the present round and hopefully the final round.” Other than smiling about a case that took 50 years to be resolved and making wisecracks about “tareekh pe tareekh”, shouldn’t we be concerned about rules and procedures (all in the name of natural justice) that permit a travesty of justice?

https://indianexpress.com/article/opinion/columns/civil-procedure-code-loopholes-justice-delay-7617291/

I know (alas) next to nothing about the law, but there were two excerpts in this article that I wanted to highlight as a student of statistics and economics. We’ll go with statistics first.

Whenever I start to teach a new course, I always tell my students that there are two kinds of errors I can make. I can either make sure that I complete the syllabus, irrespective of whether everybody has understood it or not. Or I can make sure that everybody has understood whatever I have taught, irrespective of whether the syllabus is completed or not. Speed versus thoroughness, if you will – and both cannot be optimized for at the same time. If you’re wondering, I prefer to err on the side of making sure everybody has understood, even if it comes at the cost of an incomplete syllabus.

This is, of course, closely related to formulating the null hypothesis and asking which type of error one would rather avoid. And the reason I bring it up is because of this excerpt:

Innumerable judgments have quoted the maxim, “justice hurried is justice buried”. By the same token, justice tarried is also justice buried and inordinate delays mean the legal system doesn’t provide adequate deterrence to mala fide action. In my view, for most civil cases, the moment issues are framed, one can predict the outcome within a range, with a reasonable degree of certainty. (Obviously, I don’t mean constitutional cases before the Supreme Court.) With no disrespect to the legal system, I think AI (artificial intelligence) is capable of delivering judgments in such cases, freeing court time for non-trivial cases.

https://indianexpress.com/article/opinion/columns/civil-procedure-code-loopholes-justice-delay-7617291/

“Justice hurried is justice buried” and “Justice tarried is justice buried” are both problems, and optimizing for one means not optimizing for the other. What Bibek Debroy is saying here is that we have ended up choosing to optimize for the former. We make sure that every case has the opportunity to be heard at great length, and with sufficient maneuvering room for both parties.

And that’s great, but the opportunity cost is the fact that sometimes judgments can take over fifty years (and counting!).

And what is Bibek Debroy’s solution? When he suggests that AI is capable of delivering judgments in such cases, he is not saying that the AI will give a perfect judgment every time. He is not even saying that one should use AI (I think the point is rhetorical, although of course I could be wrong). He is saying that the gains in efficiency are worth the occasional case being incorrectly judged. In other words, he is optimizing for “justice tarried is also justice buried” – he would rather avoid the error of taking up too much time over each case, and would (presumably) be fine paying the price of the occasional case being misjudged.

It is up to you to agree or disagree with him, or with me when it comes to how I conduct classes. But all of us should be cognizant of the opportunity costs when we decide which error we’d rather avoid!
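
For those who want to see the hypothesis-testing version of this trade-off in numbers, here is a minimal sketch in Python. The effect size and standard error are invented; the only point is that pushing one error down pushes the other up:

```python
# A minimal sketch of the Type I / Type II trade-off. All numbers are invented.
from statistics import NormalDist

# Suppose the truth is a small positive effect, and we test H0: effect = 0
# using a sample mean with standard error 1 (illustrative).
true_effect = 2.0
std_error = 1.0

for critical_value in (1.0, 1.645, 2.576):
    # Type I error: rejecting H0 when the effect really is zero.
    type_1 = 1 - NormalDist(0, std_error).cdf(critical_value)
    # Type II error: failing to reject H0 when the effect really is 2.0.
    type_2 = NormalDist(true_effect, std_error).cdf(critical_value)
    print(f"threshold {critical_value:.3f}: "
          f"Type I = {type_1:.3f}, Type II = {type_2:.3f}")
# Raising the threshold shrinks one error and inflates the other:
# you must choose which mistake you would rather avoid.
```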


And economics second:

Litigants and lawyers (at least on one side of a civil case) have no incentive to finish a case fast (Does the judiciary have it?).

https://indianexpress.com/article/opinion/columns/civil-procedure-code-loopholes-justice-delay-7617291/

This is more of a question (or rumination) on my part – what are the incentives of the judiciary? I can imagine scenarios in which those “on one side of a civil case” can use both official rules and underhanded stratagems to delay the eventual judgment. And since there is no incentivization in terms of speedier resolutions, are we just left with a system that is geared towards moving along ponderously forever more?

And if so, how might this be changed for the better? This is, and I’m not joking, (more than) a trillion dollar question.


And finally, as a bonus, culture:

My friend Murali Neelakantan makes the point here that this isn’t really about incentive design at all, and that the problem is more rooted in how we, the people of India, use and abuse the provisions of the CPC.

That takes me into even deeper and ever more unfamiliar waters, so I shall think more about this before trying to write about it!

The Data and The Narrative

This week is Back to College at the Gokhale Institute. A podcast that I started a couple of years ago has become a tradition of sorts at the start of each semester at the BSc programme.

For about a week, we have people come and speak to us. All of them answer a simple question in a variety of ways. And that question is this: what would you do differently if you got the chance to go back to college? It’s a simple question, and can be answered in myriad ways. Here are some of the past talks, if you’re interested.

There’s one theme that has come up in all of the talks so far, and often enough for me to want to emphasize it further. All of the speakers have spoken about the importance of doing the analysis, but also having the ability to build a story around it. Most folks are perhaps good at one or the other, but rarely both.

Almost all of the speakers have said that, as economists, we now have the ability to build models and run regressions. Building out a more sophisticated model, tweaking it, refining it, is either already possible, or can be learnt relatively easily. But where we lose out, as young economists entering the workforce, is in our ability to explain what we’ve done.

I often say in my classes on statistics that the most underrated skill that a statistician possesses is the English language. I usually get confused laughter by way of response, but I am, of course, getting at much the same point. Unless you have the ability to explain what your model implies for the business problem at hand, you haven’t really done your work. And when I say explain, I mean using the English language.

Each of our speakers for the week so far has made the same point in their own way. Technical ability is table stakes. The differentiator is the ability to expand on what you’ve done, in a way that resonates with the listener. And resonance means the ability to tell a story about how what you’ve done is A Good Thing For The Business.

There are many other lessons to have come out of this week’s talks, and more, I’m sure, to come. But this is worth internalizing and working upon for all of us (myself included): it’s about the analysis and the narrative.

JEP, p-values and tests of statistical significance

The Summer 2021 issue of the Journal of Economic Perspectives came out recently:

I have been the Managing Editor of the Journal of Economic Perspectives since the first issue in Summer 1987. The JEP is published by the American Economic Association, which decided about a decade ago–to my delight–that the journal would be freely available on-line, from the current issue all the way back to the first issue. You can download individual articles or the entire issue, and it is available in various e-reader formats, too. Here, I’ll start with the Table of Contents for the just-released Summer 2021 issue, which in the Taylor household is known as issue #137.

https://conversableeconomist.wpcomstaging.com/2021/07/29/summer-2021-journal-of-economic-perspectives-available-online/

(JEP is a great journal to read as a student. If you’re looking for a good place to start, may I recommend the Anomalies column?)

Of particular interest this time around is the section on statistical significance. This paper, in particular, was an enjoyable read.


And reading that paper reminded me of a really old blogpost written by an ex-colleague of mine:

The author starts off by emphasizing the importance of developing a statistical toolbox. Indeed statistics is a rich subject that can be enjoyed by thinking through a given problem and applying the right kind of tools to get a deeper understanding of the problem. One should approach statistics with a bike mechanic mindset. A bike mechanic is not addicted to one tool. He constantly keeps shuffling his tool box by adding new tools or cleaning up old tools or throwing away useless tools etc. Far from this mindset, the statistics education system imparts a formula oriented thinking amongst many students. Instead of developing a statistical or probabilistic thinking in a student, most of the courses focus on a few formulae and teach them null hypothesis testing.

https://radhakrishna.typepad.com/rks_musings/2015/09/mindless-statistics.html

If you are a student of statistics, and think that you “get” statistics, please read the post in its entirety. Don’t worry if you get confused – that is, in a way, the point of that post. It challenges you by asking a very simple question: do you really “get” statistics? And the answer is almost always in the negative (and that goes for me too!)


And my final recommendation du jour is this (extremely passionately written) tirade:

We want to persuade you of one claim: that William Sealy Gosset (1876-1937)—aka “Student” of “Student’s” t-test—was right, and that his difficult friend, Ronald A. Fisher (1890-1962), though a genius, was wrong. Fit is not the same thing as importance. Statistical significance is not the same thing as scientific importance or economic sense. But the mistaken equation is made, we find, in 8 or 9 of every 10 articles appearing in the leading journals of science, economics to medicine. The history of this “standard error” of science involves varied characters and plot twists, but especially R. A. Fisher’s canonical translation of “Student’s” t. William S. Gosset aka “Student,” who was for most of his life Head Experimental Brewer at Guinness, took an economic approach to the logic of uncertainty. Against Gosset’s wishes his friend Fisher erased the consciously economic element, Gosset’s “real error.” We want to bring it back.

https://www.deirdremccloskey.com/docs/jsm.pdf

Although it might help to read this review first:

However, thanks to an arbitrary threshold set by statistics pioneer R.A. Fisher, the term ‘significance’ is typically reserved for P values smaller than 0.05. Ziliak and McCloskey, both economists, promote a cost-benefit approach instead, arguing that decision thresholds should be set by considering the consequences of wrong decisions. A finding with a large P value might be worth acting upon if the effect would be genuinely clinically important and if the consequences of failing to act could be serious.

https://www.nature.com/articles/nm0209-135
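
To make the Gosset–Fisher point concrete, here is a minimal sketch in Python with invented numbers: an effect far too small to matter in practice becomes “statistically significant” once the sample is large enough.

```python
# A minimal illustration of the point: with enough data, a practically trivial
# effect can carry a tiny p-value. All numbers here are invented.
from math import sqrt
from statistics import NormalDist

def z_test_p_value(effect: float, sd: float, n: int) -> float:
    """Two-sided p-value for H0: true effect = 0, given an observed mean effect."""
    z = effect / (sd / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# An effect of 0.01 units -- assume this is far too small to matter in practice.
effect, sd = 0.01, 1.0

for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: p = {z_test_p_value(effect, sd, n):.4f}")
# With n large enough, p drops below 0.05 even though the effect is negligible:
# statistical significance is not the same thing as practical importance.
```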

Statistics is a surprisingly, delightfully conceptual subject, and I’m still peeling away at the layers. Every year I think I understand it a little bit more, and every year I discover that there is much more to learn. The symposium on statistical significance in this summer’s issue of the JEP, RK’s blogpost and Deirdre McCloskey’s paper are good places to get started on unlearning what you’ve been taught in stats.

On Confidence Intervals

As with practically every other Indian household, so with mine. Trudging back home after having written the math exam was never much fun.

It wasn’t fun because most of your answers wouldn’t tally with those of your friends. But it wasn’t fun most of all because you knew the conversation that waited for you at home. Damocles had it easy in comparison.

“How was the exam?”, would be the opening gambit from the other side.

And because Indian kids had very little choice but to become experts at this version of chess very early on in life, we all know what the safest response was.

“Not bad”.

Safe, you see. Non-committal, and just the right balance of being responsive without encouraging further questioning.

It never worked, of course, because there always were follow-up questions.

“So how much do you think you’ll get?”

There are, as any kid will tell you, two possible responses to this. One brings with it temporary relief, but payback can be hellish come the day of the results. This is the Blithely Confident™ method.

“Oh, it was awesome! I’ll easily get over 90!”

The other response involves a more difficult conversation at the present juncture, but as any experienced negotiator will tell you, expectations setting is key in the long run.

“Not sure, really.”

Inwardly, you’re praying for a phone call, a doorbell ring, the appearance of a lizard in the kitchen – anything, really, that will serve as a distraction. Alas, miracles occur all too rarely in real life.

“Well, ok”, the pater would say, “Give me a range, at least.”


We’ve all heard the joke where the kid goes “I’ll definitely get somewhere between 0 and 100!”.

Young readers, a word of advice: this never works in real life. Don’t try it, trust me.

But joke apart, there was a grain of truth in that statement. That was the range that I (and every other student) was most comfortable with.

Or, in the language of the statistician, the wider the confidence interval, the more confident you ought to be that the parameter will lie within it.1


What range should one go with? 0-100 is out unless you happen to like a stinging sensation on your cheek.

You’re reasonably confident that you’ll pass – it wasn’t that bad a paper. And if you’re lucky, and if your teacher is feeling benevolent, you might even inch up to 80. So, maybe 40-80?

“I’ll definitely pass, and if I’m lucky, could get around 60 or so”, you venture.

“Hmmm,” the pater goes, ever the contemplative thinker. “So around 60, you’re saying?”

“Well yeah, around that”, you say, hoping against hope that this conversation is approaching the home stretch now.


“Around could mean anything!”, is the response. “Between 50 and 70, or between 40 and 80?! Which is it?!”

And that, my friends, is the intuition behind confidence intervals. Your parents are optimizing for a precise estimate (a narrower range), and you want to tell them that sure, you can have a narrower range – but the price they must pay is lesser confidence on your part.

And if they say, well, no, we want you to be more confident about your answer, you want to tell them that sure, I can be more confident – but the price they must pay is lower precision (a broader range).

And sorry, you can’t have both.

(Weird how parents get to say that all the time, but children, never!)

But be careful! This little story helps you get the intuition only. The truth is a little more subtle, alas:

The confidence interval can be expressed in terms of samples (or repeated samples): “Were this procedure to be repeated on numerous samples, the fraction of calculated confidence intervals (which would differ for each sample) that encompass the true population parameter would tend toward 90%.”

https://en.wikipedia.org/wiki/Confidence_interval#Meaning_and_interpretation

Or, in the case of our little story, this is what an Indian kid could tell their parents:

Were I to give the math exam a hundred times over, I would score somewhere between 50 and 70 about ninety times. And I would score between 40 and 80 about 95 times.


Now, if you ask where we get those specific sets of numbers from (50 to 70 with 90% confidence, and 40 to 80 with 95% confidence), that takes us into the world of computation and calculation. Time to whip out the textbook and the calculator.
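
If you do want to whip out the calculator, here is a minimal sketch in Python, using a made-up sample of mock-test scores and the normal approximation (a proper treatment would use the t distribution for a sample this small):

```python
# A minimal sketch of the computation, with a hypothetical sample of mock-test
# scores and a normal approximation to keep things simple.
from math import sqrt
from statistics import NormalDist, mean, stdev

scores = [58, 63, 55, 71, 60, 66, 52, 68, 61, 57]  # hypothetical past mock tests
n, xbar, s = len(scores), mean(scores), stdev(scores)

for confidence in (0.90, 0.95):
    # z value that leaves (1 - confidence)/2 in each tail
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * s / sqrt(n)
    print(f"{confidence:.0%} CI: ({xbar - half_width:.1f}, {xbar + half_width:.1f})")
# The 95% interval comes out wider than the 90% one: more confidence, less precision.
```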

But if you are clear about why broader intervals imply higher confidence, and narrow intervals imply lower confidence, then you are now comfortable about the intuition.

And I hope you are clear, because that was my attempt in this blogpost.


Kids, trust me. Never try this at home.

But please, do read the Wikipedia article.

  1. Statisticians reading this, I know, I know. Let it slide for the moment. Please.[]

Probability, Expected Value…

… in No Country For Old Men