Should students of law be taught statistics?

I teach statistics (and economics) for a living, so I suppose asking me this question is akin to asking a barber if you need a haircut.

But my personal incentives in this matter aside, I would argue that everybody alive today needs to learn statistics. Data about us is collected, stored, retrieved, combined with other data sources and then analyzed to reach conclusions about us, and at a pace that is now incomprehensible to most of us.

This is done by governments and by private businesses, and it is unlikely that we’re going to revert to a world where this is no longer the case. You and I may have different opinions about whether this is intrusive or not, desirable or not, good or not – but I would argue that this ship has sailed for the foreseeable future. We (and that’s all of us) are going to be analyzed, like it or not.

And conclusions are going to be made about us on the basis of that analysis, like it or not. This could be, for example, a computer in a company analyzing us as a high value customer and according us better service treatment when we call their call center. Or it could be a computer owned by a government that decides that we were at a particular place at a particular time on the basis of the footage from a security camera.

In both of these cases (and there are millions of other examples besides), there is no human being who makes these decisions about us. Machines do. This much is obvious, because it is now beyond the capacity of our species to deal manually with the amount of data that we generate on a daily basis. And so the machines have taken over. Again, you and I may differ on whether this is a good thing or a bad thing, but the fact is that it is a trend that is unlikely to be reversed in the foreseeable future.

Are the conclusions that these machines reach infallible? Much like the humans that these machines have replaced, no. They process information much faster than we humans can, so they are definitely better at handling large volumes of data, but machines can make errors in classification, just like we can. Here, have fun understanding what this means in practice.

Say this website asks you to draw a sea turtle. And so you start to draw one. The machine “looks” at what you’ve drawn, and starts to “compare” it with its rather massive data bank of objects. It identifies, very quickly, those objects that seem somewhat similar in shape to what you are drawing, and builds a probabilistic model in the process. And when it is “confident” enough that it is giving the right answer, it throws up a result. And as you will have discovered for yourself, it really is rather good at this game.

But is it infallible? That is, is it perfect every single time? No. Just as you (the artist) are not perfect, neither is the machine. Errors will be made, but so long as they are not made very often, and so long as they aren’t major bloopers, we can live with the trade-off. That is, we give up control over decision making, and we gain the ability to analyze and reach conclusions about volumes of data that we cannot handle ourselves.

But what, exactly, does “very often” mean in the previous paragraph? One error in ten? One in a million? One in an impossibly-long-word-that-ends-in-illion? Who gets to decide, and on what basis?
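
To put rough numbers on that question, here is a minimal Python sketch. The volume of ten million automated decisions a day and the three error rates are figures I have made up purely for illustration, not measurements of any real system, and the sketch assumes that each decision goes wrong independently of the others.

```python
def expected_errors(decisions: int, one_in: int) -> float:
    """Average number of wrong decisions if one in `one_in` decisions is wrong."""
    return decisions / one_in


def chance_of_any_error(decisions: int, one_in: int) -> float:
    """Probability of at least one wrong decision, assuming independent errors."""
    return 1.0 - (1.0 - 1.0 / one_in) ** decisions


# Hypothetical volume: ten million automated decisions per day.
DAILY_DECISIONS = 10_000_000

for one_in in (10, 1_000_000, 1_000_000_000):
    avg = expected_errors(DAILY_DECISIONS, one_in)
    any_err = chance_of_any_error(DAILY_DECISIONS, one_in)
    print(f"1 error in {one_in:,}: about {avg:,.2f} wrong per day, "
          f"chance of at least one = {any_err:.4f}")
```

Under these assumptions, even a one-in-a-million error rate works out to about ten wrong decisions every single day. "Rare" for the machine is not the same thing as "rare" for the people on the receiving end.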

What does the phrase “major blooper” mean in that same paragraph? What if a machine places you on the scene of a crime on the basis of security camera footage when you were in fact not there? What if that fact is used to convict you of a crime? If this major blooper occurs once in every impossibly-long-word-that-ends-in-illion times, is that ok? Is that an acceptable trade-off? Who gets to decide, and on what basis?
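
One way to see why a rare error can still matter in a courtroom is to count people rather than percentages. The sketch below is illustrative only: the city of five million, the single person who really was at the scene, and the one-in-a-million false-match rate are all assumptions of mine, and it further assumes that the system always flags the person who really was there.

```python
def chance_flagged_person_was_there(population: int, false_match_one_in: int) -> float:
    """Rough share of flagged people who really were at the scene.

    Assumes exactly one person was there and is always flagged, and that each
    of the other people is wrongly flagged with probability
    1 / false_match_one_in. Uses the expected number of false matches, so the
    result is an approximation, not an exact posterior probability.
    """
    innocent_people = population - 1
    expected_false_matches = innocent_people / false_match_one_in
    true_matches = 1
    return true_matches / (true_matches + expected_false_matches)


# Hypothetical city of five million, false-match rate of one in a million.
print(round(chance_flagged_person_was_there(5_000_000, 1_000_000), 3))
```

With these made-up numbers, the machine is wrong only once in a million comparisons, and yet a person it flags has only about a one in six chance of actually having been there, because roughly five innocent people get flagged alongside the one who was. This is the kind of reasoning a lawyer or a judge would need to be able to follow.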

If you are a lawyer with a client who finds themselves in such a situation, how do you argue this case? If you are a judge listening to the arguments being made by this lawyer, how do you judge the merits of this case? If you are a legislator framing the laws that will help the judge arrive at a decision, how do you decide on the acceptable level of probabilities?

It needn’t be something as dramatic as a crime, of course. It could be a company deciding to downgrade your credit score, or a company that decides to shut off access to your own email, or a bank that decides that you are not qualified to get a loan, or any other situation that you could come up with yourself. Each of these decisions, and so many more besides, is being made by machines today, on the basis of probabilities.

Should members of the legal fraternity know the nuts and bolts of these models, and should we expect them to be experts in neural networks and the like? No, obviously not.

But should members of the legal fraternity know the principles of statistics, and have an understanding of the processes by which a probabilistic assessment is being made? I would argue that this should very much be the case.

But at the moment, to the best of my knowledge, this is not happening. Lawyers are not trained in statistics. I do not mean to pick on any one college or university in particular, and I am not reaching a conclusion on the basis of just one data point. A look at other universities’ websites, and conversations with friends and family who are practicing lawyers or are currently studying law, yield the same result. (If you know of a law school that does teach statistics, please do let me know. I would be very grateful.)

But because of whatever little I know about the field of statistics, and for the reasons I have outlined above, I argue that statistics should be taught to the students of law. It should be a part of the syllabus of law schools in this country, and the sooner this happens, the better it will be for us as a society.