Reflections on Whole Numbers and Half Truths

Single narratives have never been able to explain all of India.

S, Rukmini. Whole Numbers and Half Truths: What Data Can and Cannot Tell Us About Modern India (p. 220). Kindle Edition.

There is a line that is often quoted in big-picture discussions about India, and it is only a matter of time before it comes up: whatever you say about India, the opposite is also true. The quote is attributed to Joan Robinson, and I can’t help but wonder if I will end up creating a paradox of sorts by agreeing wholeheartedly with it.

But I do agree with the spirit of the quote, which is why that one-line extract from Rukmini S’s book, Whole Numbers and Half Truths, resonated so much with me. All countries are complex and complicated, but India takes the game to giddy heights.

Take a look at this map, a version of which is present in Rukmini’s book:

https://en.wikipedia.org/wiki/List_of_states_and_union_territories_of_India_by_fertility_rate

What is India’s TFR? First, for those uninitiated in the art and science of demography, what is TFR? It stands for Total Fertility Rate, or as Hans Rosling used to put it, babies per woman. Well, it’s 2.0, which is good, because roughly speaking, two parents giving birth to two children will mean we’re at the replacement rate (note that this is a very basic way of thinking about it, but useful as a rough approximation).

But as any student of statistics ought to tell you, that’s only half the story (or half the truth). Uttar Pradesh, Bihar and Jharkhand are well above the so-called replacement rate, and that will have implications for labor mobility, taxation, political representation and so, so much more in the years to come.

Data, then, is only half the story. How is the data collected? If it is a sampling exercise rather than a census, how was the sampling done? Has the sampling method changed over time? If so, are earlier data collection exercises comparable with current ones?

How should one think about the data that has been collected? What does it mean, and how much does context matter? For example:

‘That’s data about marriage, madam,’ he said—not about love. ‘I think if your data asked people if they have ever fallen in love with someone from another caste or religion, many will say yes. I see that all around me among my friends. But when it comes to getting married, most of us are not yet ready to leave our families. That’s why your data looks like that,’ he said. As for the rest? ‘There is a lot we will not admit to someone doing a survey. But things are changing. At least for some of us,’ he said.

S, Rukmini. Whole Numbers and Half Truths: What Data Can and Cannot Tell Us About Modern India (pp. 127-128). Kindle Edition.

Rukmini’s excellent book is, in one sense, a deep reflection on the data that we have, have had, and would like to have where India is concerned. It speaks about how data has been collected, which agencies and institutions are involved, how these have changed (and been changed) over time, and with what consequences.

But it also is a reflection on a truism that many economists and statisticians underrate: data can only take you so far. As the subtitle of her book puts it, it is an analysis of what data can and cannot tell you about modern India.

And what data leaves out is often as fascinating as what it includes:

Yet, most people know little about the NCRB’s processes and methodology. For instance, the NCRB follows a system known as the ‘principal offence rule’. Instead of all the Indian Penal Code (IPC) sections involved in an alleged crime making it to the statistics, the NCRB only picks the ‘most heinous’ crime from each FIR for their statistics. I stumbled upon this then unknown fact in an off-the-record conversation with an NCRB statistician in the months after the deadly sexual assault of a physiotherapy student in Delhi in December 2012. In the course of that conversation, I learnt that the crime that shook the country would have only made it to the NCRB statistics as a murder, and not as a sexual assault, because murder carries the maximum penalty. This, I was told, was to prevent the crime statistics from being ‘artificially inflated’: ‘If the FIR is for theft, there will be a[n IPC] section for assault also, causing hurt also. If you include all the sections, people will think these are separate crimes and the numbers will seem too huge,’ he told me. After I reported this, the NCRB for the first time began to include the ‘principal offence rule’ in its disclaimer.

S, Rukmini. Whole Numbers and Half Truths: What Data Can and Cannot Tell Us About Modern India (p. 13). Kindle Edition.

The paragraph that follows this one is equally instructive in this context, but the entire book is full of such Today-I-Learnt (TIL) moments. Even for those of us involved in academia, there is much to learn in terms of nuance and context by reading this book. If you are not in academia, but are interested in learning more about this country, recommending this book to you is even easier!

Rukmini’s book spans ten chapters on ten different (but obviously related) aspects of India. In the first three chapters, we get to learn how Indians tangle (or quite often choose not to!) with the cops and the courts, how we perceive the world around us, and why Indians vote the way they do. The next three are about how (and with whom) we live our lives, and how we earn and spend our money. The next trio is about how and where we work, how we grow and age, and where Indians live. The final chapter is about India’s healthcare system.

Each chapter makes us familiar with the data associated with each of these topics, but each chapter is also a reflection on the fact that data can only take us so far. When you throw into the mix the fact that the data will always (and sometimes necessarily) be imperfect, we’re left with only one conclusion – analyze the data carefully, but always bear in mind that the reality will always be more complex. Data is, at the end of the day, an abstraction, and it will never be perfect.


One reason I liked the book so much is its brevity. Each of these chapters can and should be a separate book, and condensing them into chapters can’t have been an easy task. But not only has she managed it, she has managed to do so in a way that is lucid, thought-provoking and informative. Two out of these three is a good achievement; to achieve all three, and that across ten chapters, is a rare ol’ achievement.

If I’m allowed to be greedy, I would have liked a chapter on the world of data that the RBI collects and, to its credit, does share with us via its website. But it does so in a way that is best described as unintuitive. In fact, a book on how data sharing practices with the citizenry need to improve out of sight, where government portals across all verticals and at all levels are concerned, would be a great sequel (hint, hint!).


I’d strongly recommend this book to you, and I hope you enjoy reading it as much as I did.

We will be hosting Rukmini on the Gokhale Institute campus this coming Friday, the 29th of April. The event will be from 5.30 pm to 7.00 pm at the Kale Hall. She and I will speak about the book for about an hour, followed by a Q&A session with the audience.

If you are in Pune, please do try and make it!