The Economics of ReCAPTCHA

This has been doing the rounds on my WhatsApp groups recently, and maybe you’ve seen it too:

Mildly funny, but the story behind it is quite something.


Bots have been a problem for many, many years, since long before Elon Musk thought of buying Twitter. As long as sixteen years ago, folks were trying to stop bots from signing up for services. So how does a computer make sure that the entity trying to sign up for a service actually is a human?

Well, by showing images such as these, and asking the entity on the other side to make out what the word is:

We’ve all been subjected to a variant of this, haven’t we?

Now, one of the folks who came up with this system, called CAPTCHA (say it out loud and you can figure out the reason behind the name), ran the numbers:

And at some point I did a little back of the envelope calculation about how many of these were typed by people around the world, and it turns out the number I came up with was about 200 million.
So about 200 million times a day somebody would type one of these CAPTCHAs, and that’s when I started thinking, “I wonder if we can do something with this time.” Because the thing is each time you type one of these, not only are they annoying but also they waste about ten seconds of your time, and if you multiply ten seconds by 200 million, you get that humanity as a whole is wasting like 500,000 hours every day typing these annoying CAPTCHAs.

https://tim.blog/wp-content/uploads/2018/08/135-luis-von-ahn.pdf
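To check the arithmetic in that quote, here is a minimal sketch in Python. The two inputs (200 million CAPTCHAs a day, ten seconds each) come straight from the quote; the function and variable names are my own, purely for illustration.

```python
# Inputs taken from the quote above; names are illustrative only.
CAPTCHAS_PER_DAY = 200_000_000
SECONDS_PER_CAPTCHA = 10
SECONDS_PER_HOUR = 3600


def hours_spent_per_day(captchas_per_day: int, seconds_each: int) -> float:
    """Total human time spent typing CAPTCHAs in one day, in hours."""
    return captchas_per_day * seconds_each / SECONDS_PER_HOUR


hours = hours_spent_per_day(CAPTCHAS_PER_DAY, SECONDS_PER_CAPTCHA)
print(f"{hours:,.0f} hours per day")  # prints "555,556 hours per day"
```

So the "500,000 hours" in the quote is, if anything, a slight round-down of what those two inputs imply.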

Work that will gladden the heart of any economist. And so the guy who did these back-of-the-envelope calculations tried to figure out how these 500,000 hours might be put to better use. Thus was born reCAPTCHA. And the idea was a very, very good one.

When you digitize (that is, scan) books for the first time, some of them will be in old, outdated fonts, so there will be a fair few words that computers cannot decipher. This is true not just of books but also of newspaper archives.

So if we have scanned books and newspaper archives that are non-machine-readable, and we have humans spending 500,000 hours every day… what about connecting the two, and having humans read these words, one at a time?

Scanned text is subjected to analysis by two different OCRs. Any word that is deciphered differently by the two OCR programs or that is not in an English dictionary is marked as “suspicious” and converted into a CAPTCHA. The suspicious word is displayed, out of context, sometimes along with a control word already known. If the human types the control word correctly, then the response to the questionable word is accepted as probably valid. If enough users were to correctly type the control word, but incorrectly type the second word which OCR had failed to recognize, then the digital version of documents could end up containing the incorrect word. The identification performed by each OCR program is given a value of 0.5 points, and each interpretation by a human is given a full point. Once a given identification hits 2.5 points, the word is considered valid. Those words that are consistently given a single identity by human judges are later recycled as control words. If the first three guesses match each other but do not match either of the OCRs, they are considered a correct answer, and the word becomes a control word. When six users reject a word before any correct spelling is chosen, the word is discarded as unreadable.

https://en.wikipedia.org/wiki/ReCAPTCHA
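The points scheme in that excerpt is easier to follow with a small sketch. What follows is my own rough reading of the description, not the actual reCAPTCHA code: the point values (0.5 per OCR, 1 per human, accept at 2.5, discard after six rejections) are from the excerpt, while the function name, the use of None to mean "this user rejected the word as unreadable", and the assumption that every human answer passed in already got its control word right are all mine.

```python
from collections import Counter

# Point values as described in the excerpt above.
OCR_POINTS = 0.5
HUMAN_POINTS = 1.0
ACCEPT_THRESHOLD = 2.5
MAX_REJECTIONS = 6


def resolve_word(ocr_guesses, human_answers):
    """Tally votes for one suspicious word.

    ocr_guesses:   readings produced by the OCR programs.
    human_answers: readings typed by users who got the control word right,
                   in arrival order; None is a hypothetical marker meaning
                   the user rejected the word as unreadable.
    Returns ("accepted", word), ("discarded", None) or ("pending", None).
    """
    scores = Counter()
    for guess in ocr_guesses:
        scores[guess] += OCR_POINTS

    rejections = 0
    for answer in human_answers:
        if answer is None:
            rejections += 1
            if rejections >= MAX_REJECTIONS:
                return ("discarded", None)
            continue
        scores[answer] += HUMAN_POINTS
        if scores[answer] >= ACCEPT_THRESHOLD:
            return ("accepted", answer)

    return ("pending", None)


# One OCR agrees with the humans: 0.5 + 1 + 1 reaches the threshold.
print(resolve_word(["morning", "modning"], ["morning", "morning"]))
# Neither OCR agrees, so it takes three matching human answers.
print(resolve_word(["rnorning", "moming"], ["morning", "morning", "morning"]))
```

Under this reading, a word that one OCR got right needs only two agreeing humans, while a word both OCRs got wrong needs three, which matches the "first three guesses" rule in the excerpt.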

The system has evolved since then, and this version of reCAPTCHA (known as reCAPTCHA v1) is no longer around. We now have reCAPTCHA v2 and reCAPTCHA v3, and if you’re curious, you can learn more about them here.

But I really like the idea behind reCAPTCHA v1, even though it is no longer in use. It took a necessary but time-consuming activity and matched it with a necessary but money-and-effort-consuming one, to the benefit of all concerned.

Turns out the person who came up with the idea has been thinking about computers and human brains as being complementary to each other for a fairly long time, even writing a PhD thesis about it:

Von Ahn’s Ph.D. thesis, completed in 2005, was the first publication to use the term “human computation” that he had coined, referring to methods that combine human brainpower with computers to solve problems that neither could solve alone. Von Ahn’s Ph.D. thesis is also the first work on Games With A Purpose, or GWAPs, which are games played by humans that produce useful computation as a side effect. The most famous example is the ESP Game, an online game in which two randomly paired people are simultaneously shown the same picture, with no way to communicate. Each then lists a number of words or phrases that describe the picture within a time limit, and are rewarded with points for a match. This match turns out to be an accurate description of the picture, and can be successfully used in a database for more accurate image search technology. The ESP Game was licensed by Google in the form of the Google Image Labeler, and is used to improve the accuracy of the Google Image Search. Von Ahn’s games brought him further coverage in the mainstream media. His thesis won the Best Doctoral Dissertation Award from Carnegie Mellon University’s School of Computer Science.

https://en.wikipedia.org/wiki/Luis_von_Ahn

There’s an old talk by Luis von Ahn on the topic as well, if you’re interested.

And here’s the kicker: the same idea, human computation, is at work in another venture that Luis von Ahn has started. You may have heard of it; it has this cute little green owl as its mascot:

So the way this works is whenever you’re a just a beginner, we give you very simple sentences. There’s a lot of very simple sentences on the web. We give you very simple sentences along with what each word means. And as you translate them and as you see how other people translate them, you start learning the language. And as you get more advanced, we give you more complex sentences to translate. But at all times, you’re learning by doing.

https://www.ted.com/talks/luis_von_ahn_massive_scale_online_collaboration/transcript?language=en

Both reCAPTCHA v1 and Duolingo have different business models now, of course. But as students of economics, it’s worth appreciating the idea of complementarity between humans and computers, and the idea of turning a necessary but time-intensive activity into a socially useful one.

It may be a funny WhatsApp forward, sure, but as it turns out, there’s quite a story behind it. No?