What is common to Butyrylcholinesterase and Vitamin D, or why English is an underrated skill in statistics

Today’s blog post title is in the running for the longest title that I have come up with, but let’s ignore this particular bit of potential trivia and get on with it.

Today’s story really begins with the tragic tale of Sally Clark. What follows is a lengthy extract from a piece I wrote with a friend some months ago. Lengthy, but fascinating:

In November 1999, an English solicitor named Sally Clark was convicted on two charges of murder and sentenced to life imprisonment. This tragic case is notable for many reasons. One was the fact that her alleged victims were her own sons. Another was the fact that both were infants, only weeks old, when they died.
The cause of death in both cases was initially attributed to sudden infant death syndrome (SIDS), also known as cot death in the United Kingdom. We did not know then, and do not know to this day, the specific causes of SIDS. But suspicion grew on account of the fact that two children from the same family had died due to unspecified causes, and shortly after the death of her second child, Sally Clark was arrested, tried and convicted.
One of the clinching pieces of evidence was expert testimony provided by the pediatrician Professor Sir Roy Meadow. He put the odds of two children from the same family dying of SIDS at 1 in 73 million — in other words, an all but impossible eventuality. On the back of this testimony, and others, Sally Clark was convicted of the crime of murdering her own sons, and sent to prison for life.
One cannot help but ask the question: how did Sir Roy Meadow arrive at this number of 1 in 73 million? Succinctly put, here is the theory: for the level of affluence that Sally Clark’s family possessed, the chance of one infant dying of SIDS was 1 in 8543. This was simply an empirical observation. What, then, were the chances that two children from the same family would die of SIDS?
The answer to this question, statisticians tell us, depends on whether the two deaths are independent of each other. If one assumes that they are, then the probability of two deaths in the same family is simply the product of the two probabilities. That is, 1 in 8543 multiplied by itself, which is roughly 1 in 73 million. That would be enough to convince any “reasonable man” that the deaths were deliberate and could not have been just coincidence.
But on the other hand, if the two events are not independent of each other — say, for example, that there are underlying genetic or environmental reasons that we simply are not aware of just yet — then it is entirely possible that multiple children from the same family may die of SIDS. In fact, given a SIDS death in a family, research shows that the likelihood of a second SIDS death goes up.
Sally Clark’s convictions were overturned on her second appeal, and she was released from prison. She died four years later due to alcohol poisoning.
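For readers who like to see the arithmetic, here is a minimal sketch of the calculation described in the extract. The 1 in 8543 figure comes from the text. The risk multiplier of 10 is entirely made up for illustration (the extract only says that the likelihood of a second death goes up, not by how much), and the function name is my own.

```python
from fractions import Fraction

# Figure quoted in the extract: chance of one SIDS death in a family like the Clarks'.
P_SINGLE = Fraction(1, 8543)


def p_two_deaths(p_first, p_second_given_first):
    """General rule: P(A and B) = P(A) * P(B given A)."""
    return p_first * p_second_given_first


# If the deaths are independent, P(B given A) is just P(B).
independent = p_two_deaths(P_SINGLE, P_SINGLE)

# HYPOTHETICAL: suppose a first SIDS death makes a second one 10 times
# more likely. The 10 is an illustrative assumption, not a research finding.
HYPOTHETICAL_MULTIPLIER = 10
dependent = p_two_deaths(P_SINGLE, P_SINGLE * HYPOTHETICAL_MULTIPLIER)

print(f"Assuming independence: 1 in {int(1 / independent):,}")
print(f"Assuming dependence:   1 in {float(1 / dependent):,.0f}")
```

The point is not the second number itself, which depends completely on the assumed multiplier. The point is that the headline figure of 1 in 73 million is only as good as the independence assumption hiding inside it.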


We’ll get back to this truly tragic tale, but let’s go off on a tangent for a second.

Today’s a day for extracts from my own earlier work, it would seem, for I have another one for your consideration:

Us teaching type folks love to say that correlation isn’t causation. As with most things in life, the trouble starts when you try to decipher what this means, exactly. Wikipedia has an entire article devoted to the phrase, and it has occupied space in some of the most brilliant minds that have ever been around.
Simply put, here’s a way to think about it: just because two things are correlated does not necessarily mean that one causes the other.
For example, this one chart from this magnificent website (and please, do take a look at all the charts):


Hold on to this line of thinking, and let’s get back to the tragic Sally Clark story, but with a twist towards the rather more optimistic side of things.

Great news, right? We’ve found what causes SIDS!

Well, that’s where it gets tricky, and we go off on yet another tangent.

Do Vitamin D supplements help? We know that sunlight gives us Vitamin D, and that’s A Good Thing. So if we don’t get enough sunlight, hey, let’s get Vitamin D injections or supplements:

In interpreting vitamin D-related study results, correlation should not be understood as causation. Diets composed of vitamin D–rich foods such as dairy products and salmon also contain high levels of other healthy nutrients. Those who have a high vitamin D level are likely to participate in active outdoor activities and exercises, to be interested in health issues, and to have a healthy lifestyle. Without considering these confounders, misleading results can be obtained. In the study by Kim et al.,4) a univariate analysis revealed a correlation between a low vitamin D level and a low quality of life score; however, its significance was lost when age, sex, income, education level, and disease state were considered.
Sometimes, correlations shown in cross-sectional studies are used as evidence for requiring vitamin D supplements. A recent increasing trend of taking vitamin D supplements may be due to these effects.


What if Vitamin D is just a marker? That is, what if sunlight causes a lot of good things in our bodies, and it also causes Vitamin D levels in our body to go up? So maybe it’s not that sunlight causes vitamin D to go up, and vitamin D in turn causes an increase in our wellbeing. Maybe it’s sunlight that causes an uptick in our wellbeing and, separately, an increase in Vitamin D levels in our body. They (health and vitamin D levels) may just be correlated, without there being any causation between them.

What I’m about to say is important: I’m not a doctor. All I’m saying is, I’ve been confused often enough about correlation and causation to wonder about whether vitamin D causes good health. It is correlated, there’s no arguing with that. But causation? Ah, that’s another (very tricky) thing altogether.
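If it helps to see the "just a marker" idea in action, here is a small simulation, offered as a sketch under loud assumptions rather than as a model of real biology. In this made-up world, sunlight drives both vitamin D and wellbeing, and vitamin D has no effect on wellbeing at all. All numbers, variable names and helper functions are hypothetical.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient, computed by hand."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# ASSUMPTION: sunlight is the only common cause in this toy world.
sunlight = [rng.gauss(0, 1) for _ in range(n)]
vitamin_d = [s + rng.gauss(0, 1) for s in sunlight]   # caused by sunlight
wellbeing = [s + rng.gauss(0, 1) for s in sunlight]   # also caused by sunlight

raw_r = pearson(vitamin_d, wellbeing)
adjusted_r = pearson(residuals(vitamin_d, sunlight),
                     residuals(wellbeing, sunlight))

print(f"vitamin D vs wellbeing, raw:                   {raw_r:.2f}")
print(f"vitamin D vs wellbeing, sunlight accounted for: {adjusted_r:.2f}")
```

In this toy setup the raw correlation comes out clearly positive, and it all but disappears once sunlight is accounted for, even though vitamin D never touches wellbeing in the code. Whether the real world looks anything like this is exactly the question that correlation alone cannot answer.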

And now that we have the mise en place of this blogpost done, let’s get the dish together.

Low levels of Butyrylcholinesterase don’t necessarily cause SIDS in infants. Infants who die of SIDS stop breathing (for reasons that are still not clearly understood), and these infants have low levels of Butyrylcholinesterase. Low Butyrylcholinesterase may not even be what causes breathing to stop in infants. It is just a marker: there is correlation there, but we don’t know if there is causation.

In fact, the paper’s title itself says as much:

“Butyrylcholinesterase is a potential biomarker for Sudden Infant Death Syndrome”

But the tweet above speaks about how we’ve found the cause, and that’s not quite right.

Again, please don’t misunderstand me – the fact that this has been discovered is awesome, it is fantastic, and the joy, the relief and the euphoria should absolutely be there.


Sally Clark lost her life at least in part to a fundamental misunderstanding of statistical theory, and we still don’t know what causes SIDS. We understand it better, but there is a ways to go.

The most underrated skill in statistics is the English language.

Words matter, and we all (myself included!) need to be more careful about what exactly we mean when we speak about statistics.

And thank god we’re closer to figuring out how to deal with the horrible, horrible thing that SIDS is.

But if you’re teaching or learning statistics, tread very, very carefully.