Reproducibility and Replicability

A colleague and I conducted a small workshop on behavioral and experimental economics for our students at the Gokhale Institute. It was a very small, very basic workshop, but one of the things that came up was the reproducibility problem, or as Wikipedia puts it, the replication crisis.

The replication crisis (also called the replicability crisis and the reproducibility crisis) is an ongoing methodological crisis in which it has been found that many scientific studies are difficult or impossible to replicate or reproduce. The replication crisis most severely affects the social sciences and medicine. The phrase was coined in the early 2010s as part of a growing awareness of the problem. The replication crisis represents an important body of research in the field of metascience.

https://en.wikipedia.org/wiki/Replication_crisis

And further on in that same article:

A 2016 poll of 1,500 scientists reported that 70% of them had failed to reproduce at least one other scientist’s experiment (50% had failed to reproduce one of their own experiments).[9] In 2009, 2% of scientists admitted to falsifying studies at least once and 14% admitted to personally knowing someone who did. Misconducts were reported more frequently by medical researchers than others.

https://en.wikipedia.org/wiki/Replication_crisis

The basic idea behind replicability is very simple: you should be able to take the data and the code from the paper you are reading or reviewing, and replicate the results obtained. You don't have to agree with the choice of method, or with the results, or with anything else; you should simply be able to replicate the results, that's all.
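That "take the data and the code and rerun it" idea can be sketched in a few lines. Everything below is a hypothetical stand-in (the dataset, the headline statistic, and the published value are all invented for illustration), but the shape of the check is the point: recompute the paper's number from the released data and compare.

```python
# A minimal sketch of a replication check, using hypothetical data and a
# hypothetical published estimate: rerun the authors' computation on the
# authors' released data and verify you get their number back.

import statistics

def replicate(incomes, published_estimate, tol=1e-9):
    """Recompute the paper's headline statistic (here, a simple mean) and
    compare it to the published value; the tolerance absorbs harmless
    floating-point differences across machines."""
    recomputed = statistics.mean(incomes)
    return abs(recomputed - published_estimate) < tol, recomputed

# Stand-in for the dataset released alongside a (hypothetical) paper.
released_data = [48_000.0, 51_500.0, 55_250.0, 60_100.0]

ok, value = replicate(released_data, published_estimate=53_712.5)
```

Note that passing this check says nothing about whether the method was sensible or the data any good; it only establishes that the reported number follows from the reported inputs, which is exactly the minimal standard described above.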

One basic standard of economic research is surely that someone else should be able to reproduce what you have done. They don’t have to agree with what you’ve done. They may think your data is terrible and your methodology is worse. But as a minimal standard, they should be able to reproduce your result, so that the follow-up research can then be in a position to think about what might have been done differently or better. This standard may seem obvious, but during the last 30 years or so, the methods for reproducibility have been transformed.

https://conversableeconomist.blogspot.com/2021/01/the-reproducibility-challenge-with.html

Now (to me, at any rate) this is interesting enough in and of itself, but at the risk of getting a little meta, the rest of Tim Taylor's post is worth reading because it raises so many interesting issues.

The first is a link to a lovely overview of the problem by Lars Vilhuber, published in the Harvard Data Science Review. It is relatively easy to read, and is recommended reading. For example, Vilhuber draws a careful distinction between replicability and reproducibility, and the article is full of interesting nuggets of information. I'll list the major ones (major to me) here. Note that I have simply copy-pasted from the link:

  1. Publication of research articles specifically in economics can be traced back at least to the 1844 publication of the Zeitschrift für die Gesamte Staatswissenschaft (Stigler et al., 1995).
  2. As the first editor of Econometrica, Ragnar Frisch noted, “the original data will, as a rule, be published, unless their volume is excessive […] to stimulate criticism, control, and further studies” (Frisch, 1933)
  3. …only 17.4% of articles in Econometrica in 1989–1990 had empirical content (Stigler et al., 1995)
  4. As Dewald et al. (1986) note: “Many authors cited only general sources such as Survey of Current Business, Federal Reserve Bulletin, or International Financial Statistics, but did not identify the specific issues, tables, and pages from which the data had been extracted.”
  5. Among reproducibility supplements posted alongside articles in the AEA’s journals between 2010 and 2019, Stata is the most popular (72.96% of all supplements), followed by Matlab (22.45%; Vilhuber et al., 2020) (Note: Do check figure 2 at the link. Fascinating stuff.)
  6. It was concluded that “there is no tradition of replication in economics” (McCullough et al., 2006).
  7. The extent of the use of replication exercises in economics classes is anecdotally high, but I am not aware of any study or survey demonstrating this.
  8. The most famous example in economics is, of course, the exchange between Reinhart and Rogoff, and graduate student Thomas Herndon, together with professors Pollin and Ash (Herndon et al., 2014; Reinhart & Rogoff, 2010). (Note to students: this is a fascinating tale. Read up about it!)

There is much more at the link of course, but Tim Taylor’s post does a good job of extracting the key points. I’m noting them here in bullet point fashion, but you really should read the entire thing.

  1. Our understanding of the phrase "economic data" needs to change, because a lot of it is in fact not publicly available today.
  2. Vilhuber writes: “In 1960, 76% of empirical AER [American Economic Review] articles used public-use data. By 2010, 60% used administrative data, presumably none of which is public use …”
  3. Restricted Access Data Environments are a new thing that I discovered while writing this blogpost: “…where accredited researchers can get access to detailed data, but in ways that protect individual privacy. For example, there are now 30 Federal Statistical Data Research Centers around the country, mostly located close to big universities.” We could do with something like this in India. Actually, we would be a lot happier with just dbie working the way it was supposed to, but that’s for another day.
  4. Data that is released only as a sub-sample, data that is ephemeral (try researching Instagram stories, for example) and data that you need to pay for are all challenging, and relatively recent, developments.
  5. I worked for four years in the analytics industry, so believe me when I say this. Data cleaning is a huge issue.
  6. Tim Taylor writes five more paragraphs after this one, but this is a glorious para, worth quoting in full:
    “As a final thought, I’ll point out that academic researchers have mixed incentives when it comes to data. They always want access to new data, because new data is often a reliable pathway to published papers that can build a reputation and a paycheck. They often want access to the data used by rival researchers, to understand and to critique their results. But making access available to details of their own data doesn’t necessarily help them much.”

If there are those amongst you who are considering getting into academia, and are wondering which field to specialize in, reproducibility and replicability are areas worth investigating, precisely because they are relatively underrated today, and are only going to become more important tomorrow.

That’s a good investment to make, no?