
What is the Liar’s Dividend?

Here’s a definition:

The benefit received by those spreading fake information as a consequence of the environment in which there is a great deal of fake information and hence it is unclear what is real and what is fake.


The first and immediate problem with deep fakes, or pictures generated with AI, isn’t the fact that they exist. Just the idea that they could exist is enough.

Fake images are problematic in and of themselves. But they are also problematic because it is now all too easy to deny that real images are, well, real.

Amid highly emotional discussions about Gaza, many happening on social media platforms that have struggled to shield users against graphic and inaccurate content, trust continues to fray. And now, experts say that malicious agents are taking advantage of A.I.’s availability to dismiss authentic content as fake — a concept known as the liar’s dividend.

https://www.nytimes.com/2023/10/28/business/media/ai-muddies-israel-hamas-war-in-unexpected-way.html

That picture of a murdered (insert religion and nationality of choice here so as to not offend your sensibilities) child?

Real if it is convenient for our worldview, fake if it isn’t. And it is very, very easy to convince yourself of the truth value of either of these statements, because who can tell these days?

And so fake images being fake isn’t the only problem.

Real images can also be dismissed as being fake. They are being dismissed as being fake.


The greatest trick AI ever pulled, it turns out, was in convincing the world that it might exist.

Here’s the original definition of the liar’s dividend:

Hence what we call the liar’s dividend: this dividend flows, perversely, in proportion to success in educating the public about the dangers of deep fakes

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3213954

Realize the utterly delightful paradox: the better we get at convincing people of the problem of deep fakes, the easier it is to convince them that parts of reality itself are fake.

If you want to make your Monday even cheerier, do read the whole paper.

DALL-E 3

My favorite question to ask about AI: how long before we can ask AI to make a film about ourselves that we can sit and watch? “Make a movie about me, my wife and my daughter, based loosely on The Incredibles, with characters from Madagascar and Ice Age. Keep the story light and make sure the movie ends on a happy note.”

Me, personally, I think we’re about five years away. You?

Ethan Mollick et al on AI’s Jagged Frontiers

The public release of Large Language Models (LLMs) has sparked tremendous interest in how humans will use Artificial Intelligence (AI) to accomplish a variety of tasks. In our study conducted with Boston Consulting Group, a global management consulting firm, we examine the performance implications of AI on realistic, complex, and knowledge-intensive tasks. The pre-registered experiment involved 758 consultants comprising about 7% of the individual contributor-level consultants at the company. After establishing a performance baseline on a similar task, subjects were randomly assigned to one of three conditions: no AI access, GPT-4 AI access, or GPT-4 AI access with a prompt engineering overview. We suggest that the capabilities of AI create a “jagged technological frontier” where some tasks are easily done by AI, while others, though seemingly similar in difficulty level, are outside the current capability of AI. For each one of a set of 18 realistic consulting tasks within the frontier of AI capabilities, consultants using AI were significantly more productive (they completed 12.2% more tasks on average, and completed tasks 25.1% more quickly), and produced significantly higher quality results (more than 40% higher quality compared to a control group). Consultants across the skills distribution benefited significantly from having AI augmentation, with those below the average performance threshold increasing by 43% and those above increasing by 17% compared to their own scores. For a task selected to be outside the frontier, however, consultants using AI were 19 percentage points less likely to produce correct solutions compared to those without AI. Further, our analysis shows the emergence of two distinctive patterns of successful AI use by humans along a spectrum of human-AI integration. One set of consultants acted as “Centaurs,” like the mythical half-horse/half-human creature, dividing and delegating their solution-creation activities to the AI or to themselves. Another set of consultants acted more like “Cyborgs,” completely integrating their task flow with the AI and continually interacting with the technology.

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4573321

That’s the abstract of a paper written by a team of academicians based in the United States, of whom Prof. Ethan Mollick is one. The idea behind the paper is very simple: can we quantify just how much of an improvement in productivity AI makes possible?

And the TL;DR is that productivity is way up. From the abstract: “consultants using AI were significantly more productive (they completed 12.2% more tasks on average, and completed tasks 25.1% more quickly), and produced significantly higher quality results (more than 40% higher quality compared to a control group)”.

Some points of especial interest from my perspective:

  1. The advantages of AI are substantial, but unclear. We don’t know which tasks will be completed more efficiently (and better) by using AI, and which won’t. Worse, nobody knows for sure. It is very much a trial-and-error thing. (pg 3)
  2. This is a dynamic problem. How our interaction with AI changes, how the nature of our tasks changes, and how AI gets better – all of these will vary with time. This paper will be outdated within a matter of weeks, if not days – but that is a feature, not a bug. (pg 4)
  3. What was the task itself? Note that there were two different experiments, and within each experiment, there were two tasks. The first experiment was “within the frontier”, which means an experiment that was thought to be well within GPT-4’s capabilities. For each experiment, participants were “benchmarked” using an assessment task, and were then asked to work on an “experimental” task. I will always be referring to the “experimental” task:
    “In this experimental task, participants were tasked with conceptualizing a footwear idea for niche markets and delineating every step involved, from prototype description to market segmentation to entering the market. An executive from a leading global footwear company verified that the task design covered the entire process their company typically goes through, from ideation to product launch. Participants responded to a total of 18 tasks (or as many as they could within the given time frame). These tasks spanned various domains. Specifically, they can be categorized into four types: creativity (e.g., “Propose at least 10 ideas for a new shoe targeting an underserved market or sport.”), analytical thinking (e.g., “Segment the footwear industry market based on users.”), writing proficiency (e.g., “Draft a press release marketing copy for your product.”), and persuasiveness (e.g., “Pen an inspirational memo to employees detailing why your product would outshine competitors.”). This allowed us to collect comprehensive assessments of quality.” (pg 8 and pg 9)
  4. This was especially impressive:
    “Our results reveal significant effects, underscoring the prowess of AI even in tasks traditionally executed by highly skilled and well-compensated professionals. Not only did the use of AI lead to an increase in the number of subtasks completed by an average of 12.5%, but it also enhanced the quality of the responses by an average of more than 40%. These effects support the view that for tasks that are clearly within its frontier of capabilities, even those that historically demanded intensive human interaction, AI support provides huge performance benefits.” (pg 12)
  5. The “outside the frontier” task:
    “Participants used insights from interviews and financial data to provide recommendations for the CEO. Their recommendations were to pinpoint which brand held the most potential for growth. Additionally, participants were also expected to suggest actions to improve the chosen brand, regardless of the exact brand they had chosen” (pg 13)
  6. Even in the case of these tasks, there was improvement across the board, both in terms of less time spent and in terms of higher quality of output (pg 14 and 15)
  7. The authors found that there were two dominant approaches:
    “The first is Centaur behavior. Named after the mythical creature that is half-human and half-horse, this approach involves a similar strategic division of labor between humans and machines closely fused together. Users with this strategy switch between AI and human tasks, allocating responsibilities based on the strengths and capabilities of each entity. They discern which tasks are best suited for human intervention and which can be efficiently managed by AI.
    The second model we observed is Cyborg behavior. Named after hybrid human-machine beings as envisioned in science fiction literature, this approach is about intricate integration. Cyborg users don’t just delegate tasks; they intertwine their efforts with AI at the very frontier of capabilities. This strategy might manifest as alternating responsibilities at the subtask level, such as initiating a sentence for the AI to complete or working in tandem with the AI.” (pg 16)
  8. And finally, their concluding paragraph:
    “Finally, we note that our findings offer multiple avenues for interpretation when considering the future implications of human/AI collaboration. Firstly, our results lend support to the optimism about AI capabilities for important high-end knowledge work tasks such as fast idea generation, writing, persuasion, strategic analysis, and creative product innovation. In our study, since AI proved surprisingly capable, it was difficult to design a task in this experiment outside the AI’s frontier where humans with high human capital doing their job would consistently outperform AI. However, navigating AI’s jagged capabilities frontier remains challenging. Even for experienced professionals engaged in tasks akin to some of their daily responsibilities, this demarcation is not always evident. As the boundaries of AI capabilities continue to expand, often exponentially, it becomes incumbent upon human professionals to recalibrate their understanding of the frontier and for organizations to prepare for a new world of work combining humans and AI. Overall, AI seems poised to significantly impact human cognition and problem-solving ability. Similarly to how the internet and web browsers dramatically reduced the marginal cost of information sharing, AI may also be lowering the costs associated with human thinking and reasoning, with potentially broad and transformative effects”

This chart tells quite the story:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4573321 (pg 28)

The appendix (pg 44 onwards) details the tasks, if you would like to go through them.

Finally, a part of the abstract that I’m still thinking about:

“Consultants across the skills distribution benefited significantly from having AI augmentation, with those below the average performance threshold increasing by 43% and those above increasing by 17% compared to their own scores. For a task selected to be outside the frontier, however, consultants using AI were 19 percentage points less likely to produce correct solutions compared to those without AI”


A lovely, thought-provoking paper. Whatever your own opinions about the impact of AI upon productivity, employment and output, a carefully designed academic study such as this is worth reading, and critiquing.

And if you are currently in college (any college), learn how to get better at working with AI!


Hello Whale

I still find myself wishing that Wired had gone with the headline that I did, but that’s neither here nor there.

What is here and there is one of the most fascinating use cases of AI that I have read about. The title of the article is “How to Use AI to Talk to Whales—and Save Life on Earth”. The second half may be a little bit of hyperbole – we’ll find out, one way or another – but the article makes for fascinating reading.

Humans have always known how to listen to other species, of course. Fishers throughout history collaborated with whales and dolphins to mutual benefit: a fish for them, a fish for us. In 19th-century Australia, a pod of killer whales was known to herd baleen whales into a bay near a whalers’ settlement, then slap their tails to alert the humans to ready the harpoons. (In exchange for their help, the orcas got first dibs on their favorite cuts, the lips and tongue.) Meanwhile, in the icy waters of Beringia, Inupiat people listened and spoke to bowhead whales before their hunts. As the environmental historian Bathsheba Demuth writes in her book Floating Coast, the Inupiat thought of the whales as neighbors occupying “their own country” who chose at times to offer their lives to humans—if humans deserved it.

https://www.wired.com/story/use-ai-talk-to-whales-save-life-on-earth/

That led me down one of the weirder Wikipedia paths I’ve recently been on, and I got to learn about the Killer Whales of Eden, and about Old Tom. I also got to learn about the Earth Species project. The idea is quite something:

The motivating intuition for ESP was that modern machine learning can build powerful semantic representations of language which we can use to unlock communication with other species.

https://www.earthspecies.org/
https://www.earthspecies.org/what-we-do/roadmap

It’s early days yet, of course, but this paper made for interesting reading:

We introduce the Bioacoustic Cocktail Party Problem Network (BioCPPNet), a lightweight, modular, and robust U-Net-based machine learning architecture optimized for bioacoustic source separation across diverse biological taxa. Employing learnable or handcrafted encoders, BioCPPNet operates directly on the raw acoustic mixture waveform containing overlapping vocalizations and separates the input waveform into estimates corresponding to the sources in the mixture. Predictions are compared to the reference ground truth waveforms by searching over the space of (output, target) source order permutations, and we train using an objective function motivated by perceptual audio quality. We apply BioCPPNet to several species with unique vocal behavior, including macaques, bottlenose dolphins, and Egyptian fruit bats, and we evaluate reconstruction quality of separated waveforms using the scale-invariant signal-to-distortion ratio (SI-SDR) and downstream identity classification accuracy. We consider mixtures with two or three concurrent conspecific vocalizers, and we examine separation performance in open and closed speaker scenarios. To our knowledge, this paper redefines the state-of-the-art in end-to-end single-channel bioacoustic source separation in a permutation-invariant regime across a heterogeneous set of non-human species. This study serves as a major step toward the deployment of bioacoustic source separation systems for processing substantial volumes of previously unusable data containing overlapping bioacoustic signals.

https://www.nature.com/articles/s41598-021-02790-2

(If more than half of this flew above your head, as it did in my case, ask ChatGPT to do an ELI5.)
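
One piece of the jargon is easy to make concrete, though: SI-SDR, the score the authors use to measure how cleanly a vocalization was pulled out of the mixture. Here is a minimal numpy sketch of the metric itself (my own toy illustration, not code from the paper):

```python
import numpy as np

def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant signal-to-distortion ratio, in decibels.

    The target is rescaled to best fit the estimate first, so the
    score ignores overall loudness and only penalizes distortion.
    """
    estimate = np.asarray(estimate, dtype=float)
    target = np.asarray(target, dtype=float)
    # Least-squares scaling of the target onto the estimate
    alpha = np.dot(estimate, target) / (np.dot(target, target) + eps)
    signal = alpha * target        # the part of the estimate that matches
    noise = estimate - signal      # everything else counts as distortion
    return 10 * np.log10((signal @ signal) / ((noise @ noise) + eps))

# A pretend "ground truth" whale call, one second at 16 kHz:
call = np.sin(np.linspace(0, 400 * np.pi, 16000))
print(si_sdr(0.5 * call, call))                                 # huge: scale is ignored
print(si_sdr(0.5 * call + 0.1 * np.random.randn(16000), call))  # noise drags it down
```

Higher is better; a separated waveform that is just a quieter copy of the original still scores very well, which is exactly the point of the “scale-invariant” part.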


Part of the reason this is so fascinating is because of two separate problems. First, how does AI “learn” language? Second, how do animals communicate?

But these new machine learning methods bypassed semantics altogether. They treated languages as geometric shapes and found where the shapes overlapped. If a machine could translate any language into English without needing to understand it first, Raskin thought, could it do the same with a gelada monkey’s wobble, an elephant’s infrasound, a bee’s waggle dance? A year later, Raskin and Selvitelle formed Earth Species

https://www.wired.com/story/use-ai-talk-to-whales-save-life-on-earth/

Raskin, one of the co-founders of the Earth Species project, thinks that AI will allow us to understand animals in much the same way that improvements in optics helped seventeenth-century astronomers understand the night sky better.
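
The “geometric shapes” intuition has a concrete core: train word embeddings for each language separately, then search for the rotation that makes one point cloud line up with the other. Here is a toy numpy/scipy sketch of that alignment step (my illustration of the general technique, not Earth Species’ actual pipeline):

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)

# Pretend embeddings for "language A": 50 words in 5 dimensions.
lang_a = rng.normal(size=(50, 5))

# "Language B" shares the same relative geometry, just rotated (plus a
# little noise): the coordinates look nothing alike, but the shape of
# the point cloud is the same.
rotation = np.linalg.qr(rng.normal(size=(5, 5)))[0]
lang_b = lang_a @ rotation + 0.01 * rng.normal(size=(50, 5))

# Recover the rotation that best maps A's shape onto B's.
R, _ = orthogonal_procrustes(lang_a, lang_b)

# Any word in A can now be "translated" into B's space by rotation alone.
print(np.abs(lang_a @ R - lang_b).max())  # tiny: the shapes overlap
```

Whether animal communication systems have enough shared structure for this trick to carry over is, of course, the open question.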

This might be a good time to say “OK, cool, but so what, exactly?”

Here is one very concrete answer, from Google’s work with Canada’s Department of Fisheries and Oceans (DFO):

Teaming up with DFO and Rainforest Connection, we used deep neural networks to track, monitor and observe the orcas’ behavior in the Salish Sea, and send alerts to Canadian authorities. With this information, marine mammal managers can monitor and treat whales that are injured, sick or distressed. In case of an oil spill, the detection system can allow experts to locate the animals and use specialized equipment to alter the direction of travel of the orcas to prevent exposure.
To teach a machine learning model to recognize orca sounds, DFO provided 1,800 hours of underwater audio and 68,000 labels that identified the origin of the sound. The model is used to analyze live sounds that DFO monitors across 12 locations within the Southern Resident Killer Whales’ habitat. When the model hears a noise that indicates the presence of a killer whale, it’s displayed on the Rainforest Connection (a grantee of the Google AI Impact Challenge) web interface, and live alerts on their location are provided to DFO and key partners through an app that Rainforest Connection developed.

https://blog.google/intl/en-ca/company-news/technology/ai-protecting-our-endangered-orca/
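
Stripped of scale, the recipe described above is a familiar one: slice the hydrophone audio into spectrogram windows and train a classifier on the labeled examples. A minimal PyTorch sketch of that shape (purely illustrative; this is not Google’s or DFO’s published model):

```python
import torch
import torch.nn as nn

class OrcaDetector(nn.Module):
    """Tiny CNN that labels one spectrogram window as orca / not-orca."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classify = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1),
        )

    def forward(self, spectrogram):  # shape: (batch, 1, freq_bins, time_frames)
        logits = self.classify(self.features(spectrogram))
        return torch.sigmoid(logits)  # probability an orca is present

# One 128x128 spectrogram window of hydrophone audio (a random stand-in here;
# the real system was trained on DFO's 68,000 labeled examples).
window = torch.randn(1, 1, 128, 128)
print(OrcaDetector()(window))  # untrained, so ~0.5
```

The hard parts, predictably, are not in the architecture but in the 1,800 hours of labeled audio and in running the thing live across twelve locations.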

So not only can AI “learn” these languages, but we are already putting that ability to good use – and we’re just getting started.

But learning these languages is complicated, because animals may not just “speak” their language. And that’s where the answer to the second question comes in – how do animals communicate?


Ari Friedlaender has something that Earth Species needs: lots and lots of data. Friedlaender researches whale behavior at UC Santa Cruz. He got started as a tag guy: the person who balances at the edge of a boat as it chases a whale, holds out a long pole with a suction-cupped biologging tag attached to the end, and slaps the tag on a whale’s back as it rounds the surface. This is harder than it seems. Friedlaender proved himself adept—“I played sports in college,” he explains—and was soon traveling the seas on tagging expeditions.

Friedlaender’s multifaceted data is especially useful for Earth Species because, as any biologist will tell you, animal communication isn’t purely verbal. It involves gestures and movement just as often as vocalizations. Diverse data sets get Earth Species closer to developing algorithms that can work across the full spectrum of the animal kingdom. The organization’s most recent work focuses on foundation models, the same kind of computation that powers generative AI like ChatGPT. Earlier this year, Earth Species published the first foundation model for animal communication. The model can already accurately sort beluga whale calls, and Earth Species plans to apply it to species as disparate as orangutans (who bellow), elephants (who send seismic rumbles through the ground), and jumping spiders (who vibrate their legs). Katie Zacarian, Earth Species’ CEO, describes the model this way: “Everything’s a nail, and it’s a hammer.”

https://www.wired.com/story/use-ai-talk-to-whales-save-life-on-earth/

Opportunity costs are everywhere, right? So also with this very cool technology – the benefits are there all right, but so are the risks:

Rutz is careful to say that generating calls will be a decision made thoughtfully, when the time requires it. In a paper published in Science in July, he praised the extraordinary usefulness of machine learning. But he cautions that humans should think hard before intervening in animal lives. Just as AI’s potential remains unknown, it may carry risks that extend beyond what we can imagine. Rutz cites as an example the new songs composed each year by humpback whales that spread across the world like hit singles. Should these whales pick up on an AI-generated phrase and incorporate that into their routine, humans would be altering a million-year-old culture. “I think that is one of the systems that should be off-limits, at least for now,” he told me. “Who has the right to have a chat with a humpback whale?”

https://www.wired.com/story/use-ai-talk-to-whales-save-life-on-earth/

Read the whole article, as always, to get a full flavor of all of what is possible and what the possible risks and dangers are. As with pretty much everybody else on the planet, I have been keen to find out all of what AI can do. You may, given your outlook towards AI, want to replace the word “keen” with something that fits better with your worldview. But I think it safe to say that most of us are curious to find out what AI can do.

And this is certainly one of the cooler (coolest?) applications of AI that I have come across.

Let me end with this delightful, but also haunting paragraph:

If you could speak to a whale, what would you say? Would you ask White Gladis, the killer whale elevated to meme status this summer for sinking yachts off the Iberian coast, what motivated her rampage—fun, delusion, revenge? Would you tell Tahlequah, the mother orca grieving the death of her calf, that you, too, lost a child? Payne once said that if given the chance to speak to a whale, he’d like to hear its normal gossip: loves, feuds, infidelities. Also: “Sorry would be a good word to say.”

https://www.wired.com/story/use-ai-talk-to-whales-save-life-on-earth/

A chat with Pi about blended learning in Indian higher education

Arnold Kling:

I tried Personal Intelligence (Pi) from Inflection AI. As a chatbot companion, it charms you by offering encouraging reactions to what you tell it. After commenting on what you have to say, it always asks an interesting question. Think of it as a very skillful and probing interviewer. Yes, it’s only software playing a game with you, but it plays it well.
To get an idea of where a conversation with Pi can go, see part of my chat with Pi. The excerpt I posted starts with its message after I’d told it about my Marginal Revolution is Dead post. I predict that you’ll be impressed by it.

https://arnoldkling.substack.com/p/gptllm-links-dfa

I did go and see Arnold’s chat with Pi, and yup, I was impressed.

Impressed enough to have a conversation with Pi myself. What topic would you guess I chose? The one closest to my heart in a professional context, of course:

I am convinced that classroom education in higher education in India is inefficient, takes too much time, leads to sub-optimal learning and rote memorization for examinations. This is because of a lot of different factors, most of which are interlinked with each other in many different ways. But long story short, young people in India, even in the very best universities, do not learn as well as they could. And since best practices trickle down to other universities, we end up creating a culture of learning that is sub-optimal at all levels.

https://pi.ai/s/Z3QUR9V69aAmNgc2ek1rS

If you, like me, are convinced that classroom learning is overrated, please do go and read my conversation with Pi. If you, unlike me, are not convinced that classroom learning is overrated, definitely go and read my conversation with Pi, and please do tell me where I’m wrong.

Three points that I would like to highlight:

  1. Far too much of a student’s time is spent in passive listening (and that for hours on end). Reduce it dramatically, and even the bit that remains should be online. If your choice is between packing a hundred and fifty students into a classroom like sardines or allowing students to learn online, go online. Is online bad? Well, it’s not perfect, sure. But relative to what alternative? If the alternative is the sardines-in-a-can approach, then why not?
  2. To me, the job of a professor in higher education is to mentor, not to teach. This is not a binary variable, and the truth lies somewhere in the middle, but more mentoring than classroom teaching, that much is for sure. So if anything, the workload for a professor will go up in my proposal, not down. But more personalized teaching/mentoring. Leave the large-scale classroom to online education. What else is it for?
  3. AI in education is coming. You may not like it, you may resist it and you may say (as a professor) “but what are we here for then?”. But read the rest of Arnold’s post and ask yourself if the median professor in your university is better or worse than AI tutoring. Then ask yourself how many students this professor can mentor/tutor. Again, AI in education is coming. But the answer to the question “but what are we here for?” lies in learning to think of ourselves as complements to AI. Ask what AI can’t yet provide, and provide that. What the “that” will be changes based on a variety of factors, but in my specific case, I would think it is working on projects with students. And that is what I am focussing on this year.

I sent my conversation with Pi to two friends of mine, who gave me extremely thoughtful responses.

Samrudha Surana highlighted the fact that I should also be thinking about the question “what is college for?”. Different students want different things from the same course. Some may wish to become professors, while many more may wish to join the corporate world. This is as it should be: higher education is not a replication machine, whose sole job is to produce more professors over time. But we need both (future professors and future employees in the corporate world), and people trained in many other professions besides. Blended learning and AI’s introduction allow for more customization, and that is a good thing.

He also pointed out that we should carefully think through how the online courses will be chosen by the students, and to what end. What he means by this is how much of a say a student should have in choosing their course(s), and how much of it should be the decision of the professor. Not just the choice of the courses, but also the choice of the project/assignment/paper for which the course is being taken. He favors more autonomy for the student, and while I’m inclined to agree, the magnitude will be tricky to set as a rule. In general, a higher degree of autonomy in later semesters, I would think.

Much more discussion is needed in our classrooms. Much, much more. We professors need to be challenged in class, our assumptions and claims scrutinized, our premises questioned and our conclusions critiqued. Learning is best achieved through Socratic discourse (in my opinion). But our classrooms are more about proclamations by the professor than any of the above. Smaller class sizes will help, as will more seminars, discussion groups and workshops. That matters, and is rendered more probable under such an arrangement.

Undergraduate courses, finally, might involve much more classroom learning in the initial semesters. Although under the new four-year undergraduate programmes (which seem likely to more or less replace Masters programmes altogether), even here you would want to shift towards blended learning by the end of the degree.


We need to teach students in higher education better, of that I am convinced. What I have suggested here is worth further discussion, I’m fairly sure. Whether you agree with me or otherwise (and I hope it is otherwise), please tell me why 🙂

Steam Engines, AI and Diffusion

Steam-powered manufacturing had linked an entire production line to a single huge steam engine. As a result, factories were stacked on many floors around the central engine, with drive belts all running at the same speed. The flow of work around the factory was governed by the need to put certain machines close to the steam engine, rather than the logic of moving the product from one machine to the next. When electric dynamos were first introduced, the steam engine would be ripped out and the dynamo would replace it. Productivity barely improved.
Eventually, businesses figured out that factories could be completely redesigned on a single floor. Production lines were arranged to enable the smooth flow of materials around the factory. Most importantly, each worker could have his or her own little electric motor, starting it or stopping it at will. The improvements weren’t just architectural but social: Once the technology allowed workers to make more decisions, they needed more training and different contracts to encourage them to take responsibility.

https://slate.com/culture/2007/06/what-the-history-of-the-electric-dynamo-teaches-about-the-future-of-the-computer.html

This is the second time this quote is appearing in a post on EFE. By the way, do read that earlier post, especially if you are in academia, and please let me know how your university has adjusted to the post-pandemic world – have we just gone back to a fully offline world, or not?


But to come back to why I wanted to talk about this excerpt again – it is because The Economist asks an inevitable and obvious question regarding the deployment of AI in offices the world over:

Speculation about the consequences of AI—for jobs, productivity and quality of life—is at fever pitch. The tech is awe-inspiring. And yet AI’s economic impact will be muted unless millions of firms beyond Silicon Valley adopt it. That would mean far more than using the odd chatbot. Instead, it would involve the full-scale reorganisation of businesses and their in-house data. “The diffusion of technological improvements”, argues Nancy Stokey of the University of Chicago, “is arguably as critical as innovation for long-run growth.”

https://www.economist.com/finance-and-economics/2023/07/16/your-employer-is-probably-unprepared-for-artificial-intelligence

Having technology is not the same as using it. And in fact people can take a long time to adopt a new technology, for a variety of reasons. Some may be cultural, some may be about being comfortable with the “old” workflow, and some may be, well, irrational, plain and simple.

The article in The Economist gives the examples of Japan and France, and that section is well worth a read, but what is true for countries is true, of course, at the level of organizations and institutions too. Resistance to change is hard to overcome, and the diffusion of technology simply doesn’t happen as fast as some might hope. For example:

In 2017 a third of Japanese regional banks still used COBOL, a programming language invented a decade before man landed on the moon. Last year Britain imported more than £20m ($24m) worth of floppy disks, MiniDiscs and cassettes. A fifth of rich-world firms do not even have a website. Governments are often the worst offenders—insisting, for instance, on paper forms. We estimate that bureaucracies across the world spend $6bn a year on paper and printing, about as much in real terms as in the mid-1990s.

https://www.economist.com/finance-and-economics/2023/07/16/your-employer-is-probably-unprepared-for-artificial-intelligence

But other factors are at play, beyond my simple list of factors from above (cultural reasons, inertia and irrationality). There may simply be no incentive to move to a better technology, if you are a business that is doing well in a sector with no young upstarts for competition. Particularly in the western world, it may simply be a case of an aging population that prefers to not learn new tricks. Governments may be ham-handed in terms of regulating the deployment of new technologies, and society may wish not to adopt technologies that save on labor. Costs, data privacy concerns, legal compliance issues, inevitable mistakes that AI will make – all are hurdles to be overcome.

The study of how this will change in the years to come will fascinate economists, sociologists, psychologists and many other -ists.

It is impossible to say how it will play out, but it will be quite the topic of study, that much is certain.

Buckle up!


Further reading, if you are interested in an economic analysis of some of these issues.

Teaching Statistics in the Age of ChatGPT

One of my favorite websites to use while teaching statistics is Seeing Theory, by Brown University. It is a wonderful website, because it allows people to “see” statistics.

Visualize concepts in statistics, to use the technically correct term, but you see what I mean.

One of the many reasons I like this website is because it presents a fun, interactive way to “get” what statistics is all about. It is one thing to talk about flipping a coin, it is quite another to actually flip a coin 1000 times. Or roll a die, or understand what a probability distribution is, or to (finally!) “get” what the Central Limit Theorem is trying to get at (beware, though – every time you think you’ve “got” the CLT it has a way of revealing an additional layer of intrigue).
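
In the same spirit, the coin-flipping example is a few lines of Python away (a quick sketch of the idea, not Seeing Theory’s code): flip a fair coin 1,000 times, record the proportion of heads, repeat, and watch the bell curve assemble itself.

```python
import random
from statistics import mean

# 500 samples, each the mean of 1,000 fair coin flips.
sample_means = [mean(random.randint(0, 1) for _ in range(1000)) for _ in range(500)]

# A crude text histogram is enough to "see" the CLT at work:
# the sample means pile up around 0.5 in a bell shape.
for lo in (0.44, 0.46, 0.48, 0.50, 0.52, 0.54):
    count = sum(lo <= m < lo + 0.02 for m in sample_means)
    print(f"{lo:.2f}-{lo + 0.02:.2f} | {'#' * (count // 5)}")
```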

This past summer, as I’ve mentioned before, I was teaching school-going students courses in economics, statistics and public policy. I have made use of Seeing Theory in the past, but with the advent of ChatGPT (and especially ChatGPT4), I figured it might be a good time to not just show “cool” visualizations, but also actually try and build them.

And so we did! As we covered a topic, I would ask my students to “build” a working demo using ChatGPT (or Bard). I would nudge and prompt the students to, well, write better prompts, and if necessary, step in and write the prompts myself on occasion. But for the most part, the work was done by the students, and we were able to get simple working demos of some stats concepts out of the door.

The “whoa, this is so cool!” moments were worth it in and of themselves, but it is my ardent hope that the students understood the concepts a little bit better for having seen the visualizations.

A great example is the Monty Hall problem. Run a simple Google search for it, if you haven’t come across it before. In my experience, some students tend to not “get” the explanation the first time around. Until this summer, I would get around this problem by asking them “what if it was a million doors instead?”, or if all else failed, by actually “playing” the game using three cards from a deck of cards.
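
A quick simulation settles the argument just as well. Here is a minimal Python sketch of the game (the idea only, not the HTML demo we built, which comes next):

```python
import random

def monty_hall(switch, trials=100_000):
    """Play the Monty Hall game many times; return the win rate."""
    wins = 0
    for _ in range(trials):
        car = random.randrange(3)            # prize behind a random door
        choice = random.randrange(3)         # contestant's initial pick
        # Host opens a door that is neither the pick nor the car.
        opened = next(d for d in range(3) if d != choice and d != car)
        if switch:                           # move to the one remaining door
            choice = next(d for d in range(3) if d != choice and d != opened)
        wins += (choice == car)
    return wins / trials

print(f"stay:   {monty_hall(switch=False):.3f}")   # ~0.333
print(f"switch: {monty_hall(switch=True):.3f}")    # ~0.667
```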

But this time, we built a demo of the problem! So also for Chebyshev’s inequality, the expected value upon rolling a pair of dice and a simple way to visualize what regression does. The demos won’t satisfy professors of statistics or professional coders, for you could add so much more – but for young students who were trying to internalize the key concepts in statistics, it was pure magic.
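
Two of the other demos translate just as directly. A hedged Python equivalent (again a sketch of the concepts, not the actual HTML we produced):

```python
import random
from statistics import mean, pstdev

# Roll a pair of dice many times; theory says the expected sum is 7.
rolls = [random.randint(1, 6) + random.randint(1, 6) for _ in range(100_000)]
print(f"simulated mean: {mean(rolls):.3f}")

# Chebyshev's inequality: at most 1/k^2 of outcomes lie more than k
# standard deviations from the mean, whatever the distribution.
mu, sigma, k = mean(rolls), pstdev(rolls), 2
outside = sum(abs(r - mu) > k * sigma for r in rolls) / len(rolls)
print(f"share beyond {k} sigma: {outside:.3f}  (Chebyshev bound: {1 / k**2:.3f})")
```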

And the meta lesson, of course, was that they should try and do this for everything! Why stop at stats? Build working demos for concepts in math, in physics, in geography. And if you know even a little bit of coding, try and build even better demos – both my students and I were relatively unfamiliar with coding in general, so we stuck with simple HTML.

But with AI’s new coding capabilities, it is clear that teaching (and learning) can become much better than was the case thus far. If you wish to disagree with me about the word “better”, I look forward to the argument, and you may well end up having more than a couple of points. But the classes were certainly more interactive – and at least along that one dimension, they were certainly better.

I hope to do much more of this in the months and years to come, but for the moment, do try out some of these demos, and let me know how they could be made better.

Thank you!

Put Me Out of a Job – 2

Let’s begin with the second class today. Your outline mentions the topic “Time Management and Opportunity Cost”. Before we begin the class, outline a definition of both terms, and explain their importance to my life. When you focus on the importance of time management to my life, make sure that you remember I am an eighteen-year-old. Why should I bother with managing my time? I have my entire life ahead of me – time isn’t a constraint, surely? Why then do I need to manage it?
Once you explain your answer to these questions, proceed with the outline as discussed, as per the format I have asked for.

You might think this (time management and opportunity cost) to be a weird topic for a second class in a course called “Principles of Economics”. You would certainly think it to be unconventional. Not the latter half of the topic – opportunity costs – but the first one. What does time management have to do with economics? Well, think of it this way – if you are an Indian student who has learnt economics, you have almost certainly come across Lionel Robbins’ definition, and most likely memorized it at the time.

Here it is: the science which studies human behaviour as a relationship between ends and scarce means which have alternative uses.

What is more scarce than time? We all have a limited amount of time, and we all have ends to achieve. The ends we would like to achieve in our lives far exceed the time that is afforded to us, and so we must choose which of these ends to pursue, and which to sacrifice. That’s a pretty good, and if you ask me, useful way to understand opportunity costs. The weirdness, or the unconventional choice (to some) of the topic, is a direct consequence of my request to ChatGPT from yesterday’s post:

Optimize for helping me understand the principles and their applicability to various aspects of my life (professional, personal etc), and it is fine if the outline doesn’t match what is usually taught in introductory economics textbooks. Given this, would you like to change your outline in any way? If yes, please show me the new outline. If no, please begin with the first class.

With this course, I am not seeking to learn economics in order to write an examination at the end of the semester in a college or university. I am seeking, instead, to learn economics in order to apply it to various aspects of my life. Of course, as a student enrolled in a university, you may seek to optimize your learning geared towards doing well in an examination at the end of the semester, and that’s fine. Just let ChatGPT know accordingly, that’s all:

A minor tangent before we get back to our course. What if I asked you to change your course outline for these thirty classes to help me write an examination in an Indian undergraduate course? What would the new outline look like? In your answer, show me both outlines together, per lecture. That is, the first line in the description of lecture 1 (out of 30) should be your original lecture, and the second line should be the course tweaked for doing well in an examination, rather than applicability to various aspects of my life. Ditto for the second course, and so on and so forth for all thirty.

I won’t show you the whole output, but simply how the second lecture’s outline has changed:

Here’s a revised outline for the thirty lectures, with the original lectures designed for applicability to various aspects of life (Lecture A) and the modified lectures tailored to help you succeed in an Indian undergraduate economics course (Lecture B):
Lecture A: Time Management and Opportunity Cost
Lecture B: Opportunity Cost: Definition, Types, and Examples

ChatGPT-4’s Output (in part)

I know which one I prefer, and why. The good news, as a student, is that you can do both! Learn in order to score well in an examination, and also learn in order to figure out how to apply economics better in the case of your own life. Why should the two be different? Ah, some questions you should reflect upon, rather than ask ChatGPT.

Anyway, back to our lecture du jour. I asked ChatGPT to explain why it chose time management, and I do not think I would have asked that question as an eighteen-year-old. The older you get, the more aware you are of how limited your time is. And at least in my own case, the converse is also true. I count this as a mark in my favor – while a good prompt may get a student going, said student will still need help and advice on an ongoing basis.

So far, at any rate.

Further proof of the fact that I’m not out of a job, just yet, is below. The context is that I read the answer, and felt it to be incomplete. So I prodded it a bit, and then just a little bit more:

I’ve often read the phrase “all costs are opportunity costs”. Please explain what this means. Remember that I know no economics, and as usual, give me one example from the Mahabharata, and one from a real-life situation

My professor wondered why the word “TINSTAAFL” hasn’t come up in your answers yet. I didn’t even know this was a word! Is he joking, or is this word relevant to what you’re telling me right now?

To be clear, it wasn’t so much about the phrase TINSTAAFL as it was about the fact that I felt ChatGPT’s explanation to be incomplete. This prompted me (no pun intended) to ask it to be more thorough:

I have an ongoing request. I’m looking to make my professor’s job as easy as possible, or even make him, in some sense, redundant. Optimize your answer for thoroughness, and if you think you can’t fit all of what you want to say in a single output, end with a line that says, “I can tell you more, please let me know if you’re interested to know even more”. This will always be applicable in our conversations.
Now, back to the second lecture’s outline. Expand upon the three sub-points from the broad contours. I am particularly interested in the third one, so give more details, explanations and background in the case of “recognizing and evaluating opportunity costs in decision making”

This is an important lesson in and of itself. Feel free to tell ChatGPT to give more (or less) detail, or ask it to modify how it gives you the answer (more examples | simpler language | write like person X | show your output as a debate between person X and person Y). Get your “teacher” to be the kind of teacher that you like to learn from!

With regard to your explanation of “Recognizing and evaluating opportunity costs in decision-making”, I’m confused about how to think about short term and long term factors while making my choices, and the short term and long term consequences of my choices. How should I think about this, what framework should I use, and is there an underlying principle at work here that I should know about?

I count this as a pretty important miss on ChatGPT’s part. My personal opinion is that you haven’t fully explained opportunity costs without talking about the importance of how your evaluation of opportunity costs changes given different time horizons. Time matters! ChatGPT actually agrees with me (see below), but only after prodding. And this after making explicit the fact that I was interested in learning about time horizons! And so I asked it again:

Is it useful to think of time preference as a separate principle of economics? More broadly speaking, how should a student of economics think about time preferences? Give me answers from a theoretical perspective, but also from an application perspective.
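
Here, for what it’s worth, is the kind of worked example I was fishing for: the same choice can flip depending on the discount rate you apply, which is exactly why time horizons belong in any discussion of opportunity costs. (The numbers below are my own toy illustration, not ChatGPT’s output.)

```python
def present_value(cashflows, rate):
    """Discount a stream of yearly cashflows back to today."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))

# Option A: take a job now and earn 100 a year for five years.
job = [100] * 5
# Option B: study for two years (costing 20 a year), then earn 200 for three.
degree = [-20, -20, 200, 200, 200]

for rate in (0.02, 0.20):  # a patient chooser vs an impatient one
    a, b = present_value(job, rate), present_value(degree, rate)
    print(f"rate {rate:.0%}: job={a:.0f}, degree={b:.0f} "
          f"-> {'degree' if b > a else 'job'}")
# At 2% the degree wins; at 20% the job does. Same options, different horizon.
```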


I’m two days in, where I’m the “student” and ChatGPT my teacher. Today’s class wasn’t great. I don’t think ChatGPT’s output was good enough to stand on its own, and it needed additional prompts to deliver what I would consider to be a good introduction to the concept of opportunity costs, its many nuances and its many applications. It wasn’t bad, but it was far from being good, in my opinion.

Should I take this as a sign that I need to get better at writing prompts, or should I take this as a sign that AI isn’t good enough to replace me yet? How should I change my mental model about whether the average student in a typical college can learn better from AI?

If you are a regular reader of EFE, you know what’s coming next: the truth always lies somewhere in the middle. I need to get better at writing prompts, yes, but also AI isn’t good enough to replace me yet. Both of these things will change over time, of course, but for the moment, less than ten percent into the course, I am inclined to think that I am not out of a job, just yet.

And even better, the complements over substitutes argument just got stronger – I’ll be a much better teacher of a course such as this the next time I get to teach it. Tomorrow we tackle “Supply and Demand: Basics and Market Equilibrium”.

I’ll see you in class tomorrow!