Oomph Vs Precision

Let’s assume that you get a call from a relative. Said relative has enjoyed packing away the carbs for years on end, and is built along, shall we say, fairly generous proportions. But also – and this is the good news – said relative would now like to shed some kilos.

You, trained in statistics and fitness, have been enlisted as an important team member of Team Let’s Lose Some Weight.

And so off you go to do your research online, and after barely half an hour of ferreting around on Google, you come back and announce that there are two pills that will get the job done. While both have some unfortunate and unavoidable side-effects, both are also guaranteed to help you lose weight. You don’t need to do anything else, you tell your relative. No exercise, no diet, no lemon water in the morning, none of that jazz. Pop the pills, and you’re guaranteed to lose weight.

But, you go on to say, there’s also bad news.

What’s the bad news, asks the relative.

Well, you say. The first pill – Oomph is its name – it will help you lose nine kgs on average in a month.

Nine kgs, goes the old relative. And you call that bad news?!

Hang on, you say. Not so fast. Yes, nine kgs, but focus on the next two words, no? “On average”.

And what does that mean, asks the old relative testily.

Well, you say, always glad to show off your knowledge of statistics, it means you could lose anywhere between 4.5 kgs and 13.5 kgs. On average you will lose about nine kgs. But could be more, could be less.

Almost never, you say, will it be less than 4.5 kgs. And almost never, you say, will it be more than 13.5 kgs. But somewhere within that range, you go on reassuringly, you will lose the kgs.

Ah, goes the old relative. Like that. OK, then. And what about the second pill?

Ah, the second pill, you say. Well, in the case of the second pill, you will lose only 2.25 kgs.

Pshaw, snorts the old relative. Peanuts compared to the first one, no?

Well, yes, true, you concede. But on the other hand, on average it will be somewhere between 2.05 kgs and 2.45 kgs.

Ah, says the old relative. So what you mean to say is that the first pill gives great results, but more uncertainty. And the second pill gives not so great results, but less uncertainty.

Couldn’t have put it better myself, you say. Well done.

Well, that leaves only one question then, doesn’t it?

As an expert in statistics, which one do you recommend?


How I wish I could take credit for this example, since it is such a great question to think about. It is, alas, not my own. It is taken from a lovely book, which happens to be the subject of both this post and quite a few others this week. The name of the book is “The Cult of Statistical Significance”, by Stephen T. Ziliak and Deirdre N. McCloskey.

And let’s leave aside for the moment the question of what the book is about – although we’ll get to it, don’t you worry. But for the moment, please do think about the answer to the question: which one do you recommend?

Remember, one helps you lose somewhere between 4.5 and 13.5 kgs, while the other helps you lose somewhere between 2.05 and 2.45 kgs. Do you pick the pill with the greater uncertainty (but more weight loss), or do you pick the pill with less uncertainty (but less weight loss)?

This is, of course, a topic that has been discussed before on this blog. We’re talking, all statisticians will tell you, about the signal to noise ratio:

A clear signal with insane amounts of noise ain’t a good thing, and an unclear signal with next to no noise is also not a good thing.

https://atomic-temporary-112243906.wpcomstaging.com/2021/05/18/on-signals-and-noise/

So the first pill – Oomph is its name – has a clear signal (9 kgs!), but also insane amounts of noise (whaddya mean, somewhere between 4.5 kgs and 13.5 kgs?!).

And the second pill – Precision is its name – has an unclear signal (only 2.25 kgs?!), but also next to no noise (wow, plus or minus 200 grams only!).

So we have a situation where we must choose between a not-so-good thing and another not-so-good thing. So what do we choose? What should we choose?


So here’s the thing.

If you’ve been trained in statistics (as I have), you should be saying, choose the pill with the higher signal to noise ratio.

Stop right here, if you’ve been trained in statistics, and tell me if you agree or disagree with me. If you disagree with me, please tell me why. Has to be the signal-to-noise ratio, no?

Right, so let’s go ahead and calculate the signal to noise ratio. Here’s the formula we will use:

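Signal-to-noise ratio = (observed effect − hypothesized null effect) / variation

Formula taken from here: https://www.press.umich.edu/186351/cult_of_statistical_significance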

The hypothesized null effect is, in both cases, zero. In the first case, the observed effect (on average) is 9, and the variation is 4.5. In the second, the observed effect (on average) is 2.25, and the variation is 0.2.

So: (9 − 0)/4.5 and (2.25 − 0)/0.2

Giving us, effectively, signal-to-noise ratios of 2 and 10, respectively. (Strictly, 2.25/0.2 is 11.25, but 10 is the round number we’ll work with.)
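If you would rather check the arithmetic in code, here is a minimal Python sketch of the calculation. The function name `signal_to_noise` and its parameter names are my own labels for the terms in the formula, not anything from the book. With the rounded variation of 0.2, the second ratio comes out a little above the round 10.

```python
def signal_to_noise(observed_effect, null_effect, variation):
    """(observed effect - hypothesized null effect) / variation."""
    if variation <= 0:
        raise ValueError("variation must be positive")
    return (observed_effect - null_effect) / variation


# Figures from the post: average loss in kgs, and the plus-or-minus around it.
oomph = signal_to_noise(observed_effect=9.0, null_effect=0.0, variation=4.5)
precision = signal_to_noise(observed_effect=2.25, null_effect=0.0, variation=0.2)

print(f"Oomph: {oomph:.2f}")          # 2.00
print(f"Precision: {precision:.2f}")  # 11.25
```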


Well, you confidently tell your old relative, I’ve run the tests and done the analysis. And my conclusion is that you should take the second pill.

Precision, you mean?

And why is that?

You sigh. Deeply. Explaining statistics to laypeople is such a chore, but someone has got to do it.

Because, you say in your best professorial tone, the signal to noise ratio is the highest in the case of the second pill. Not just higher, you say patiently – five times higher. 10 compared to 2! It’s not even close.


But this old relative of yours is nothing if not curmudgeonly and commonsensical.

So you mean to tell me, goes the o.r., that with Precision, the one that you’re recommending…

… my best case scenario is that I will lose 2.45 kgs.

But, in the case of Oomph, the one that you’re not recommending…

… my worst case scenario is that I will lose 4.5 kgs.

Have I got that right?
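The old relative’s question can be checked in a few lines. This is a minimal sketch, assuming we treat each pill’s result as the simple range from average minus variation to average plus variation, as the conversation does; `loss_range` is a hypothetical helper name of mine.

```python
def loss_range(average, variation):
    """Return the (worst case, best case) weight loss in kgs."""
    return (average - variation, average + variation)


oomph_worst, oomph_best = loss_range(9.0, 4.5)
precision_worst, precision_best = loss_range(2.25, 0.2)

# The relative's point: even Oomph's worst case beats Precision's best case.
oomph_always_wins = oomph_worst > precision_best

print(f"Oomph: {oomph_worst:.2f} to {oomph_best:.2f} kgs")
print(f"Precision: {precision_worst:.2f} to {precision_best:.2f} kgs")
print("Oomph's worst case beats Precision's best case:", oomph_always_wins)
```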


For an uncomfortably long period of time, there is a strained silence. And then in a small voice, you say that you will get back to the old relative, and off you go to learn more about where you went wrong while learning statistics for all these years.

So where did you go wrong?

There’s good news, and there’s bad news.

The good news is that you didn’t go wrong. You learnt correctly.

The bad news? Statistics itself took a wrong turn, and hasn’t corrected itself since.

How? We’ll find out soon enough, stay tuned.

On Signals and Noise

Have you ever walked out of a classroom as a student wondering what the hell went on there for the past hour? Or, if you are a working professional, have you ever walked out of a meeting wondering exactly the same thing?

No matter who you are, one of the two has happened to you at some point in your life. We’ve all had our share of monumentally useless meetings/classes. Somebody has droned on endlessly about something, and after an eternity of that droning, we’re still not sure what that person was on about. To the extent that we still don’t know what the precise point of the meeting/class was.

One of the great joys in my life as a person who tries to teach statistics to students comes when I say that if you have experienced this emotion, you know what statistics is about. Well, that’s a stretch, but allow me to explain where I’m coming from.


Image taken from here: https://en.wikipedia.org/wiki/Z-test

Don’t be scared by looking at that formula. We’ll get to it in a bit.


Take your mind back to the meeting/class. When you walked out of it, did you find yourself plaintively asking a fellow victim, “But what was the point?”

And if you are especially aggrieved, you might add that the fellow went on for an hour, but you’re still not sure what that was all about. What you’re really saying is that there was a lot of noise in that meeting/class, but not nearly enough signal.

You’re left unsure about the point of the whole thing, but you and your ringing ears can attest to the fact that a lot was said.


Or think about a phone call, or a WhatsApp call. If there is a lot of disturbance on the call, it is likely that the call won’t last for very long, and you may well be unclear about what the other person on the call was trying to say.

What you’re really saying is that there was a lot of noise on the call, but not nearly enough signal.


That is what the signal-to-noise ratio is all about. The clearer the signal, the better it is. The lower the noise, the better it is. And the ratio is simply both things put together.

A class that ends with you being very clear about what the professor said is a good class. A good class is “high” on the signal that the professor wanted to leave you with. And if it is a class in which the professor didn’t deviate from the topic, didn’t wander down side-alleys and didn’t spend too much time cracking unnecessary jokes, it is an even better class, because it was “low” on disturbance (or to use another word that means the same thing as disturbance: noise).


That, you see, is all that the formula up there is asking: how high is the signal (x less mu), relative to the noise (sigma, or s)? The higher the signal, and the lower the noise, the clearer the message from the data you are working with.
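Here is the same idea as a small Python sketch. It assumes the simplest form of the z statistic, (x − mu) / sigma, which matches the description above; the version on the Wikipedia page may carry extra terms, such as the sample size. The function name and the numbers are made up purely for illustration.

```python
def z_score(x, mu, sigma):
    """Signal (x minus mu) divided by noise (sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Same signal (a gap of 5), very different amounts of noise.
quiet = z_score(x=105.0, mu=100.0, sigma=2.0)   # low noise
noisy = z_score(x=105.0, mu=100.0, sigma=25.0)  # high noise

print(f"low noise:  z = {quiet:.1f}")   # 2.5
print(f"high noise: z = {noisy:.1f}")   # 0.2
```

The signal is identical in both cases; only the noise changes, and the ratio changes with it.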

And it has to be both! A clear signal with insane amounts of noise ain’t a good thing, and an unclear signal with next to no noise is also not a good thing.

And all of statistics can be thought of this way: what is the signal from the data that I am examining, relative to the noise that is there in this dataset. That is one way to understand the fact that the formula can look plenty scary, but this is all it is really saying.

Even this monster, for example:

Image taken from here: https://www.statsdirect.co.uk/help/parametric_methods/utt.htm

Looks scary, but in English, it is asking the same question: how high is the signal, relative to the noise. It’s just that the formula for calculating the noise is exuberantly, ebulliently expansive. Leave all that to us, the folks who think this is fun. All you need to understand is the fact that this is what we’re asking:
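To make that concrete, here is a minimal sketch of an unpaired t statistic in Python, using only the standard library. It assumes the pooled-variance (equal variances) version of the test, which may or may not be the exact form on the linked page. The two samples are made-up numbers, and the function name is mine.

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """Pooled-variance t statistic: difference in means over its standard error."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")

    # Signal: how far apart the two group averages are.
    signal = mean(sample_a) - mean(sample_b)

    # Noise: the standard error of that difference, from the pooled variance.
    pooled_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / (n_a + n_b - 2)
    noise = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if noise == 0:
        raise ValueError("no variation in either sample; t is undefined")

    return signal / noise


# Hypothetical data, for illustration only.
group_a = [5.1, 4.8, 6.0, 5.5, 5.2]
group_b = [4.2, 4.0, 4.9, 4.4, 4.6]
t = unpaired_t(group_a, group_b)
print(f"t = {t:.2f}")
```

All the expansive bits live in the `noise` lines; the last line is still just signal divided by noise.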


What is the signal, relative to the noise?


And finally, speaking of noise, that happens to be the title of Daniel Kahneman’s latest book. I have just downloaded it, and will get to it soon (hopefully). But before recommending that you read it, I wanted to explain to you what the title meant.

And if you’re wondering why I would recommend something that I haven’t read yet, well, let me put it this way: it’s Daniel Kahneman.

High signal, no noise.