May 2, 2022 – EconForEverybody

This is a true story, but I’ll (of course) anonymize the name of the educational institute and the student concerned:

One of the semester end examinations conducted during the pandemic at an educational institute had an error. Students asked about the error, and since the professor who had designed the paper was not available, another professor was asked what could be done. Said professor copied the text of the question and searched for it online, in the hope that the question (or a variant thereof) had been sourced online.

Alas, that didn’t work, but a related discovery was made. A student writing that same question paper had copied the question, and put it up for folks online to solve. It hadn’t been solved yet, but the fact that all of this could happen so quickly was mind-boggling.

The kicker? The student in question had not bothered to remain anonymous. Their name had been appended with the question.

Welcome to learning and examinations in the time of Covid-19.

I have often joked in my classes in this past decade that it is only a matter of time before professors outsource the design of the question paper to freelance websites online – and students outsource the writing of the submission online. And who knows, it may end up being the same freelancer doing both of these “projects”.

All of which is a very roundabout way to get to thinking about Elicit, videos about which I had put up yesterday.

But let’s begin at the beginning: what is Elicit?

Elicit is a GPT-3 powered research assistant. Elicit helps you classify datasets, brainstorm research questions, and search through publications.
https://www.google.com/search?q=what+is+elicit.org

Which of course begs a follow-up question: what is GPT-3? And if you haven’t discovered GPT-3 yet, well, buckle up for the ride:

GPT-3 belongs to a category of deep learning known as a large language model, a complex neural net that has been trained on a titanic data set of text: in GPT-3’s case, roughly 700 gigabytes of data drawn from across the web, including Wikipedia, supplemented with a large collection of text from digitized books. GPT-3 is the most celebrated of the large language models, and the most publicly available, but Google, Meta (formerly known as Facebook) and DeepMind have all developed their own L.L.M.s in recent years. Advances in computational power — and new mathematical techniques — have enabled L.L.M.s of GPT-3’s vintage to ingest far larger data sets than their predecessors, and employ much deeper layers of artificial neurons for their training.
Chances are you have already interacted with a large language model if you’ve ever used an application — like Gmail — that includes an autocomplete feature, gently prompting you with the word ‘‘attend’’ after you type the sentence ‘‘Sadly I won’t be able to….’’ But autocomplete is only the most rudimentary expression of what software like GPT-3 is capable of. It turns out that with enough training data and sufficiently deep neural nets, large language models can display remarkable skill if you ask them not just to fill in the missing word, but also to continue on writing whole paragraphs in the style of the initial prompt.
https://www.nytimes.com/2022/04/15/magazine/ai-language.html
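To make the autocomplete idea in that extract a little more concrete, here is a toy sketch in Python. To be clear, this is not how GPT-3 works: GPT-3 uses a very large neural net, while this just counts which word tends to follow which. The tiny corpus and the function names (`train_bigrams`, `suggest`) are made up purely for illustration.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def suggest(followers, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = followers.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# A made-up, four-sentence "training set" (an assumption for this sketch).
corpus = (
    "sadly i will not be able to attend . "
    "i will not be able to attend the meeting . "
    "i hope to attend next time . "
    "i want to learn economics ."
)

model = train_bigrams(corpus)
print(suggest(model, "to"))
```

In this toy corpus "attend" follows "to" more often than any other word, so that is what gets suggested. Scale the same basic idea (predict what comes next) up by many orders of magnitude, swap the counting for a deep neural net, and you are at least pointed in the direction of what the extract describes.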

It’s wild, there’s no other way to put it:

So, OK, cool tech. But cool tech without the ability to apply it is less than half of the story. So what might be some applications of GPT-3?

A few months after GPT-3 went online, the OpenAI team discovered that the neural net had developed surprisingly effective skills at writing computer software, even though the training data had not deliberately included examples of code. It turned out that the web is filled with countless pages that include examples of computer programming, accompanied by descriptions of what the code is designed to do; from those elemental clues, GPT-3 effectively taught itself how to program. (OpenAI refined those embryonic coding skills with more targeted training, and now offers an interface called Codex that generates structured code in a dozen programming languages in response to natural-language instructions.)
https://www.nytimes.com/2022/04/15/magazine/ai-language.html

For example:

(Before we proceed, assuming it is not behind a paywall, please read the entire article from the NYT.)

But about a week ago or so, I first heard about Elicit.org:

Define your *current* understanding of complements and substitutes #principlesofecon

(And do try this out! As @chris_bail points out, this is currently limited in terms of the ability to search across all journals, but wow, buckle up for the ride, fellow academicians) https://t.co/9vgblPJ2g9
— Ashish (@ashish2727) April 22, 2022

Watch the video, play around with the tool once you register (it’s free) and if you are at all involved with academia, reflect on how much has changed, and how much more is likely to change in the time to come.

But there are things to worry about, of course. An excellent place to begin is with this essay by Emily M. Bender, on Medium. It’s a great essay, and deserves to be read in full. Here’s one relevant extract:

There is a talk I’ve given a couple of times now (first at the University of Edinburgh in August 2021) titled “Meaning making with artificial interlocutors and risks of language technology”. I end that talk by reminding the audience to not be too impressed, and to remember:
Just because that text seems coherent doesn’t mean the model behind it has understood anything or is trustworthy
Just because that answer was correct doesn’t mean the next one will be
When a computer seems to “speak our language”, we’re actually the ones doing all of the work
https://medium.com/@emilymenonbender/on-nyt-magazine-on-ai-resist-the-urge-to-be-impressed-3d92fd9a0edd

I haven’t seen the talk at the University of Edinburgh referred to in the extract, but it’s on my to-watch list. Here is the link, if you’re interested.

And here’s a Twitter thread by Emily M. Bender about Elicit.org specifically:

1. No, LLMs can't do literature reviews.
2. Anyone who thinks a literature review can be automated doesn't understand what the purpose of a literature review is.

>> https://t.co/Egv0jf6cZU
— @emilymbender on Mastodon (@emilymbender) April 25, 2022

In response to this critique and other feedback, Elicit.org have come up with an explainer of sorts about how to use Elicit.org responsibly:

https://ought.org/updates/2022-04-25-responsibility

Before we proceed, I hope aficionados of statistics have noted the null hypothesis problem (which error would you rather avoid) in the last sentence of pt. 1 in that clipping above!

So all that being said, what do I think about GPT-3 in general and Elicit.org in particular?

I’m a sucker for trying out new things, especially from the world of tech. Innocent until proven guilty is a good maxim for approaching many things in life, and to me, that holds for new tech too. I’m gobsmacked by tools like GPT-3 and DALL·E 2, and their applications to new tasks are amazing to see.

But that being said, there is a lot to think about, be wary of and guard against. I’m happy to keep an open mind and try these amazing technologies out, while keeping a close eye on what thoughtful critics have to say.

Which is exactly what I plan to do!

And for a person with a plan such as mine, what a time to be alive, no?


AI/ML: Some Thoughts
