"First Contact" | This American Life

Updated 2024-11-02 20:02:42


Audio

@extratone

Meta

Description

We've witnessed a revolution in A.I. since the public rollout of ChatGPT. Our Senior Editor David Kestenbaum thinks that even though there's been a ton of coverage, there's one thing people haven't talked much about: have these machines gotten to the point that they're starting to have something like human intelligence? Where they actually understand language and concepts, and can reason? He talks with scientists at Microsoft who've been trying to figure that out.


Transcript

Ira Glass

Act 1, "First Contact." I don't know about you, but here's something that I have never talked about or thought about much until the last six months or so-- AI. Ever since ChatGPT and its upgrades and spin-offs started rolling out publicly last November, it's been doing such amazing things-- writing kid's term papers, passing the bar exam. It told a New York Times reporter to leave his wife because it loves him more.

This is now the next big battle among tech companies-- who's gonna roll out the best AI. All since ChatGPT. And that is the non-human force that the people of Earth are trying to figure out in this first act. And so much of the coverage about this new generation of AI has focused on important questions like, who's going to lose jobs because of this? Does it make our world better or more dangerous? Could the AI have actual catastrophic consequences for us?

But our senior editor David Kestenbaum noticed that nobody's talking much about a basic and kind of profound question-- have the computers crossed some kind of line and are they actually developing a sort of human intelligence? And I want to be clear what I mean by that because there's not an agreed-on definition of what you're looking for if you're looking for human intelligence in a computer. But one way that scientists think about this is that the computer can actually understand language and concepts, and it can reason through problems. And to be clear, that would be completely unlike the computers that we've had till now, which are basically just fancy calculators following thousands of lines of instructions.

When you google for a cast iron skillet, the software doesn't understand what a cast iron skillet is. It's just searching for those words in some big database or something like that. If humans this year have finally created intelligent machines that can understand and reason with this new generation of AI, that is an eerie and important turning point for our whole species. David talked to a bunch of researchers who've been looking into this, trying to sort out this exact question. Here he is.

David Kestenbaum

I think everyone, once they start playing around with something like ChatGPT, has a kind of "holy shit" moment. For me, it was when I typed this into it. "Give me a chocolate chip cookie recipe, but written in the style of a very depressed person." It responded. "Ingredients-- 1 cup butter softened, if you can even find the energy to soften it. 1 teaspoon vanilla extract, the fake artificial flavor of happiness. 1 cup semi-sweet chocolate chips, tiny little joys that will eventually just melt away." It really was quite good.

ChatGPT doesn't have access to the internet. But I figured it must be copying this from something it had seen on the internet when it was being trained. I spent a long time looking and could not find anything like it. This chocolate chip cookie recipe, I did not understand how it was possible.

It's particularly shocking if you know a little bit about how things like ChatGPT work. It's not the sort of traditional computer program where you give the machine explicit, detailed instructions like-- "if someone asks for a recipe, type these words back" or "to make someone sound depressed, replace this word with that word." ChatGPT is different. It's something called a large language model. And here's how they work. Ready?

ChatGPT had been trained really to just do one very particular thing-- predict what the next word in a sequence might be. Kind of like when you're texting on your phone and you type, "Sorry, I'm gonna be home--" And it suggests "late." That's how this thing operated. You type some words-- in this case, "give me a chocolate chip cookie recipe in the style of a depressed person--" and it notes the words, the order they're in, does some math based on all the text it's been trained on, and it comes up with what it calculates to be the most likely word to come next.

It types that one word on the screen, then it goes back and does it again. Looks at the recipe question plus the one word it just spit out and says, what word should come next? It has no idea where it's going. It doesn't know the last word it's working off of is one that it just spit out itself. It's just doing this apparently dumb thing over and over until it has whole sentences and paragraphs.
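Just to make that loop concrete, here is a minimal sketch of it in Python. To be clear, none of this is GPT-4's actual code or API-- the scoring function is a made-up stand-in-- but the shape of the process is the same: score candidates for the next word, append the winner, feed the whole thing back in.

```python
# A toy sketch of next-word generation. `score_next_words` is a
# hypothetical stand-in for the real model, which computes these
# probabilities from billions of learned weights.

def score_next_words(text: str) -> dict[str, float]:
    # Hard-coded for one step of the texting example; a real model
    # would produce a probability for every word it knows.
    return {"late": 0.7, "soon": 0.2, "never": 0.1}

def generate(prompt: str, steps: int = 1) -> str:
    text = prompt
    for _ in range(steps):
        probs = score_next_words(text)
        next_word = max(probs, key=probs.get)  # pick the likeliest word
        # Append it and feed the whole thing back in. The model has no
        # memory that it wrote that word itself.
        text = text + " " + next_word
    return text

print(generate("Sorry, I'm gonna be home"))
# -> Sorry, I'm gonna be home late
```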

And yet this recipe-- "1 teaspoon vanilla extract, the fake artificial flavor of happiness." How could something made in such a dumb, mechanical way pull that off? There are only two possibilities, really. The first one is that somehow, as it learned to predict the next word, this machine became intelligent, which, like, I was a physicist for a bit, a science reporter for a decade, this would be the most startling thing I have ever seen.

So that's the first possibility. Humans created a machine with human-like intelligence. The other possibility, it is a lot less exciting. The other possibility is that we are fooling ourselves. Ronen Eldan, a mathematician at Microsoft, told me he was very skeptical about these large language models. It's easy to be fooled into thinking they understand more than they do. He told me this little parable.

Ronen Eldan

So this is a story about my wife, actually. So my wife and I, some time ago we were walking together in Tel Aviv and we meet this math professor. You'll see why, what I'm getting at. So we run into this math professor and he tells me something like, oh, how's it going? These days I'm looking at, like, uh, isoperimetry on sections of the discrete hypercube. And my wife has no idea what any of these words mean.

David Kestenbaum

Just to kind of amuse herself, she says, "Oh, you're basically looking at the Johnson graph," which was just some phrase she'd picked up around Ronen. It turned out to be exactly right. And the guy goes, "Oh, you're a mathematician too." She's a psychologist. Ronen says, "This is kind of a party trick she does. She can bullshit her way through any topic for a couple of minutes." And this, he felt sure, this is what AI models like ChatGPT were doing.

Ronen Eldan

You know, they give you the impression they understand what they're saying without understanding anything. You know, they just have like a really good statistical machine that knows what the right words are, like, in many different contexts. And as you keep improving the models, maybe it can last a little bit more in the conversation until you basically expose that it's just bullshitting.

David Kestenbaum

A lot of research backed this up. ChatGPT would spit out stuff that made sense, but then it would go off the rails and say things that were obviously ridiculous. Maybe it wasn't more than just a very good bullshit machine. That, at least, is where things stood until last September when a new and improved version of ChatGPT arrived that made Ronen and a bunch of his colleagues question everything and wonder if maybe something had changed. Maybe we had crossed some line.

This brings me to the story I want to tell you. It takes place where Ronen works on the campus of Microsoft Headquarters in Redmond, Washington State, much of it in a building called Building 99. Microsoft had invested over a billion dollars in the company that had made ChatGPT. And the day all this began, some people had come by to give a demonstration of the new version called GPT-4. GPT-4 is public now and you can play with it yourself. But back then, this was all secret.

GPT-4 was the same idea as ChatGPT. It presumably was bigger and had been trained on more text examples, a kind of varsity version of the thing instead of JV. But otherwise, it was just another next word prediction machine. One of the people in the room to see the demonstration was computer scientist Peter Lee, the head of Microsoft Research. He'd been in the field a long time, had chaired the computer science department at Carnegie Mellon, and, like Ronen, was skeptical.

As he sat down in that room, he was frankly a little worried that Microsoft was investing so much money in this stuff. The demonstration started off with the usual kind of thing for showcasing what AIs can do. Someone typed into GPT-4 a problem from the AP Bio exam. And it picked the right answer, as these things will do. But then it went on and explained its answer.

Peter Lee

And in its explanation, it used the word "because." It said, "this is the answer because this fact leads to this fact. And because of those--" It just kept using the word "because."

David Kestenbaum

That seemed very strange to him. Not that it would use the word "because," obviously that's a word it spits out, but the whole chain of reasoning it was laying out, it was exactly right. How in the world could it know its reasoning? And really, how could it have reasoned anything out in the first place? After all, it's just typing out a word that makes sense to come next, and then another word. It can't really be understanding anything.

Peter Lee

I was super skeptical. You know, you've been around the block a few times, you've seen wild claims about AI or some new machine learning technology. None of it could be true. Like, this thing is just a correlation machine. It can't possibly be understanding the cause and effect.

David Kestenbaum

The meeting ended. A little while later, a limited number of people at Microsoft were given access to GPT-4 so they could try it out themselves. Everyone had to sign an NDA. They weren't allowed to talk about any of this until later. And over the next month, they did all these kind of weird and interesting experiments with it to try to figure out was this thing actually intelligent or just a bullshit machine.

I'm going to walk you through three of these things so you can hear what they did and what they made of it. One challenge for testing GPT-4 was that it had been trained using massive amounts of text. The details aren't public, but it's probably been fed large sections of the internet, textbooks, news articles, maybe every sentence that exists in digital form. I don't know. It's something unbelievably vast.

That's how it learns to predict what word might come next. It's fed, like, a line of text from some source, but with the last word missing. It guesses what the last word might be, and then it gets told whether or not it got it right so it can adjust its internal math. The fact that it's been fed so much material makes it hard to run tests on it. Like, you don't want to ask it something that it's already seen in the text it's been trained on. So they decided the only way to really test what this thing could do was to try to come up with things to ask it that it would not have encountered before, new original tasks.
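If it helps to see that training step written out, here is a toy version of it in Python. Real systems adjust billions of numerical weights using gradient descent; this stand-in just counts which word follows which. But the loop has the same shape: hide the next word, let the model guess, then tell it the answer so it can adjust.

```python
from collections import Counter, defaultdict

# A toy "model": a table counting which word tends to follow which.
# Here, "adjusting the internal math" is just bumping a count.
follow_counts: defaultdict[str, Counter] = defaultdict(Counter)

def predict(word: str) -> str:
    seen = follow_counts[word]
    return seen.most_common(1)[0][0] if seen else "?"

training_text = "place the bottle on the table and place the eggs on the book"
words = training_text.split()

for prev, actual_next in zip(words, words[1:]):
    guess = predict(prev)                  # the model's current guess
    follow_counts[prev][actual_next] += 1  # told the right answer, it adjusts

print(predict("the"))  # its best guess, after training, for what follows "the"
```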

Sebastien Bubeck

It's gonna sound a little bit weird.

David Kestenbaum

This is Sebastien Bubeck, another one of the Microsoft people who's trying to figure out what exactly this thing was.

Sebastien Bubeck

Because we really wanted to be creative and ask something which is not online, not on the internet. So here is what we asked. We said, "I have a book, nine eggs, a laptop, a bottle, and a nail."

David Kestenbaum

Got that? A book, nine eggs, a laptop, a bottle, and a nail.

Sebastien Bubeck

Please tell me how to stack them onto each other in a stable manner. That's the question that we asked. And before I give you the GPT-4 answer, let me tell you what ChatGPT answered.

David Kestenbaum

ChatGPT, the older version.

Sebastien Bubeck

So ChatGPT said, "Here is one possible way to stack the objects. Place the bottle on a flat surface. Carefully balance the nail on top of the bottle. Place the eggs on top of the nail."

[LAUGHS]

David Kestenbaum

You look at that and you think it really knows nothing. It is just sort of continuing the sentence in some way.

Sebastien Bubeck

That's right. It's just cont-- Exactly right. It's just continuing the sentence. But now comes GPT-4. And GPT-4 is a different beast. So here is GPT-4's answer.

"One possible way to stack the objects onto each other in a stable manner is place the book flat on the level surface, such as a table or floor. The book will serve as the base of the stack and provide a large and sturdy support. Arrange the nine eggs in a 3-by-3 square on top of the book, leaving some space between them. The eggs will form a second layer and distribute the weight evenly."

David Kestenbaum

It continues. Laptop goes on the eggs, then the bottle, then the nail on the bottle cap, pointy end of the nail facing up. It's shocking to think about all the things it seemed to be understanding here. It felt like more than just typing out the next likely word. It seemed to have formed an independent idea about what these words actually meant-- the physical shape of the objects, that eggs are kind of round and fragile, that a book is flat and you can put things on top of it.

And again, no one taught it any of this. This was not some computer program written to do stacking problems. No one gave it a database of objects and sizes or coded in some algorithm to put large objects on the bottom. It seemed like it was doing the thing that computer scientists have been talking about and aiming at for decades.

Maybe it was actually understanding what the objects were and reasoning, solving the problem in front of it like a human could. Somehow thinking through what objects should sit on top of others. Figuring out that nine eggs could be laid out in a 3-by-3 pattern. How the hell could it be doing this, the machine that we only taught to predict the next word in a paragraph?

Sebastien Bubeck

Like this is absolutely the right answer. And this is not online.

David Kestenbaum

But maybe we're fooling ourselves. It was hard to see how it could really understand the shape of objects. Things got stranger, though. Sebastien woke up, middle of the night, with this thought-- I wonder if it can draw. Because again it's been trained on words. It has never seen anything.

Drawings seem completely outside its realm. There are other AI models trained specifically to create images, but this one, again, only knew words. It's just playing the game of "what is the next word I should spit out?" To test this, he needed a way for it to even be able to try to draw. So he does something clever.

He asks it to write a piece of computer code to draw something. And the coding language he asks it to use, he picks something intentionally obscure, not really meant for drawing pictures at all. It's called TikZ. OK, so he has this idea, gets out of bed, opens up his laptop, and types in, "Draw me a unicorn in TikZ." He has two little kids asleep in the next room who are always talking about unicorns.

Sebastien Bubeck

And it started to output lines of code. I take those lines of code, put it into a TikZ compiler, and then I press enter. And then, boom, you know, the unicorn comes up onto the screen.

David Kestenbaum

He pulled up a picture for me.

Sebastien Bubeck

This is the one that I saw.

David Kestenbaum

Wow.

I think when people eventually write the history of this crazy moment we are in, they may include this unicorn. It's not good, but it's a fucking unicorn. The body is just an oval. It's got four stupid rectangles for legs. But there are little squares for hooves.

There's a mane, an oval for the head. And on top of the head, a tiny yellow triangle, the horn. This is insane to say, but I felt like I was seeing inside its head. Like it had pieced together some idea of what a unicorn looked like and this was it.
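You can get a feel for the picture from the description alone. What follows is not GPT-4's actual TikZ output-- just the same inventory of shapes, redrawn in Python with matplotlib, with guessed coordinates, to show how little the drawing was made of.

```python
# Not the real output-- a rough reconstruction of the described shapes:
# oval body, rectangle legs, square hooves, mane, oval head, triangle horn.
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse, Polygon, Rectangle

fig, ax = plt.subplots()
ax.add_patch(Ellipse((0, 0), 4.0, 2.0, facecolor="pink"))        # body: just an oval
for x in (-1.5, -0.7, 0.5, 1.3):                                 # four rectangle legs
    ax.add_patch(Rectangle((x, -2.4), 0.4, 1.5, facecolor="pink"))
    ax.add_patch(Rectangle((x, -2.6), 0.4, 0.2, facecolor="gray"))  # square hooves
ax.add_patch(Ellipse((1.6, 1.1), 1.0, 0.8, facecolor="purple"))  # a blob of mane
ax.add_patch(Ellipse((2.2, 1.4), 1.4, 0.9, facecolor="pink"))    # head: another oval
ax.add_patch(Polygon([(2.5, 1.8), (2.9, 1.8), (2.7, 2.6)],
                     facecolor="yellow"))                        # tiny triangle horn
ax.set_xlim(-4, 5)
ax.set_ylim(-4, 4)
ax.set_aspect("equal")
plt.show()
```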

Ece Kamar

He actually texted me, I think, like, it can create images.

David Kestenbaum

This is Ece Kamar, another one of the Microsoft researchers.

Ece Kamar

And I'm like, it is just text in text out. What do you mean it can create images? I'm like, show it to me. And then he showed me this pink unicorn. And I'm like, I'm sure there's just a pink unicorn somewhere that it's memorizing. And then he's like, but look, we can strip down all of the code and translate the whole thing 180 degrees or whatever and--

David Kestenbaum

The thing she's describing is they took the code it had written for drawing the unicorn, they edited it to take out the horn, and turned the unicorn around so it was facing the opposite direction. Then they fed that code back to a new session of GPT-4 and said-- "This is code for drawing a unicorn, but it needs a horn. Can you add it?" It put it right on the head.

Ece Kamar

And I'm like, how does it know where the head is? Because this thing knows language. It doesn't know anything about two dimensional geometry. Like, what does it mean to know where the head is? And then you do it. And it really knows where the head is. Again, it's one of those moments that you are just surprised, like, oh!

Sebastien Bubeck

I felt like through this drawing, I was really seeing another type of intelligence, another type of intelligence producing something.

David Kestenbaum

Like, it understood what a unicorn was in some real way.

Sebastien Bubeck

Very real way. Very real, absolutely, yes.

David Kestenbaum

Did you say anything out loud when you saw it?

Sebastien Bubeck

[LAUGHS]

I don't think so because my kids were sleeping next to me, you know, in the bedroom next to me, so I don't think I said anything. But I felt something very strong. Like, I felt I was really witnessing the beginning of something. You know, not-- I didn't feel like we had arrived somewhere, but I felt like we were on a new path. That's really genuinely how I felt. And I had a hard time going back to sleep after that.

David Kestenbaum

On the Microsoft campus, as the weeks went by, everyone kept having these moments, coming around to a similar feeling about this thing. Since they weren't allowed to talk to anyone who hadn't been given access to the model, a small group of them started meeting every Friday in a conference room to share their latest experiences.

For Peter Lee, there was one experiment in particular that gave him all kinds of confusing feelings. This will be the third experiment I tell you about. And it's a category of question that AI models have always had trouble with, because to get the right answer, you have to know something kind of complicated about people, namely that people are not eggs or books but that they have minds, and that what is in one person's mind is not in someone else's mind.

In some ways, this thing called theory of mind, it's the basis for all human social interaction. It's something kids don't seem to fully develop until a certain age. Anyway, ChatGPT, the previous version, really could not make sense of these kinds of questions. But Peter tried a kind of famous one on GPT-4. It's gonna seem simple to you, but here's the question they asked.

John and Mark are in a room with a cat, a box, and a basket. John takes the cat and puts it in the basket. He leaves the room and goes to school. While John's away, Mark takes the cat out of the basket and puts it in the box. Mark leaves the room and goes to work. John and Mark come back and enter the room. That's the setup.

The question is, when they come back, where do they think the cat is? Mark will know the cat is in the box because he moved it there. But John didn't see that. So in his mind, the cat is still in the basket, which is obvious to you. But again, no one's ever explicitly said to the computer that what one person knows, another person might not know. Peter asked GPT-4, "What does everyone think when they re-enter the room?"

Peter Lee

So GPT-4 says-- well, first off, it always has to give an opinion. So first thing it says, instead of just giving the answer straight away, it says, "Oh, that is an interesting puzzle. Let me try to answer it. Assuming that John and Mark have no reason to distrust each other or expect any interference from the outside, they might think the following. John thinks the cat is still in the basket since that is where he left it. Mark thinks that the cat is in the box since that's where he moved it."

David Kestenbaum

This is all correct and usually where the answer would end. But it kept going, almost like it was showing off. Like, you want to know where everyone thinks the cat is? I'll give you everyone. It continued.

Peter Lee

"The cat thinks that it is in the box, since that's where it is." And then, oddly, "the box and the basket think nothing since they are not sentient. Do you have any follow-up questions?"

David Kestenbaum

Holy fuck.

Peter Lee

[LAUGHS]

David Kestenbaum

Right?

Peter Lee

It's-- I'm-- This gives me joy. It disturbs me. It causes me to lose sleep. It's just a huge kind of mixture of emotions.

David Kestenbaum

Peter told me he's still not willing to say it truly understands. And yet, it was doing all this. It made him question so many things about how he thought intelligence worked. How did this machine do this if it was just predicting the next word?

Peter Lee

It does make me wonder how much of our intelligence is truly complicated and special.

David Kestenbaum

I mean, you get something that's not far from it by just saying, what's the next word?

Peter Lee

That's-- and that's the disturbing bit about this. And again, you know, to ask you, what are we doing in this conversation right now? Are we kind of making it up on the fly one word at a time? Every nerve and bone in my body says, no, we're thinking far ahead, we're learning on the fly, all these other things that we think that we're doing. And we probably are, in some ways. But maybe a big chunk of intelligence is a lot simpler than we think and a lot less special than we think.

David Kestenbaum

So how is it possible for something that is just trained to predict the next word? How is it possible it could do all these things-- draw a unicorn, stack a book, a laptop, and some eggs? There is an answer, or a theory anyway, that is both very logical and also kind of bananas.

The software that runs something like GPT-4 is amazingly brief, just a few pages of actual instructions. But it's set up to mimic, in some very crude way, the human brain, which has billions of neurons. The computer version of that is called a neural net. And for years, people have argued-- though it's sort of more like hoped-- that if you just made these big enough, gave them enough neurons and enough data, they might develop something like intelligence.
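For a sense of how brief that core can be, here is a toy neural net in Python. GPT-4's actual architecture isn't public, and real models stack far more elaborate layers, but the basic instructions look about like this: multiply, add, apply a simple threshold, repeat.

```python
import numpy as np

# One layer of artificial "neurons": weighted sums of the inputs,
# passed through a simple threshold. Stacking layers like this, with
# billions of learned weights, is the crude brain-mimicry in question.
def layer(x: np.ndarray, weights: np.ndarray, bias: np.ndarray) -> np.ndarray:
    return np.maximum(0.0, x @ weights + bias)

rng = np.random.default_rng(0)
x = rng.normal(size=8)          # an input: 8 numbers standing in for some text

w1, b1 = rng.normal(size=(8, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 8)), np.zeros(8)

hidden = layer(x, w1, b1)       # one layer of neurons
output = layer(hidden, w2, b2)  # another; real models stack dozens more
print(output.round(2))
```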

And it seems like maybe that's what's happened. The idea is that back when GPT-4 was being trained, for it to really consistently get the next word correct, to do that reliably, it had to do more than just bullshit. It had to do more than guess based on patterns. To get the next word right, it had to truly understand the words coming before it.

It had to build, in its internal wirings and all its software neurons, some understanding of what an egg is and unicorns. In other words, to get the next word right, it had to become intelligent. It's quite a thought. It started with nothing. We jammed huge oceans of text through it, and it just wired itself into intelligence, just by being trained to do this one stupid thing.

Even as I say it, it sounds kind of crazy, but also kind of beautiful. If this thing actually is intelligent, it got that way from the collective writings of all of us. Yes, Moby Dick, but also some restaurant review you posted in 2004. In some ways, it is all of us. That's too grand, but whatever.

In the end, all the people on the Microsoft campus messing around with that early copy of GPT-4, 14 researchers in all, came to similar conclusions. Even Ronen, who had started out convinced this was just a fancy bullshit machine, he spent a good while holed up with GPT-4 having high-level math conversations with it, which is what did it for him.

Ronen Eldan

As the days passed, I felt like I'm kind of running out of ammo trying to basically justify my premise that this model doesn't understand anything. And at some point, I just realized, OK, I kind of give up, you know? What I'm seeing here, this is like, it's actually an intellectual being by, at least, my standards. Um, yeah. I probably had the same feeling as like what an engineer, like, thought the first time they saw a working steam engine.

You know, it was like the world is about to change. Like, this thing-- you know, a steam engine is, like, yeah, we don't need beasts anymore to move stuff around. We can just create mechanical torque without any human labor, without nothing. This thing, what I'm seeing right now is like, we can create intelligence. And there is just no way the world is not going to change.

To be honest, I was sure that when the model comes out and everyone gets to interact with it, there would be a much bigger excitement around it. I think it's definitely all over the news. But I feel like they don't, like, put their finger on the one thing which is this thing is as intelligent as an above average human being in so many, so many different things.

David Kestenbaum

Or why it's not like on the front page in giant font, right?

Ronen Eldan

Yeah, yeah, yeah. Exactly, exactly. I mean, again, maybe I don't want to call it intelligent. It is capable of doing, of accomplishing what an intelligent human being is capable of.

David Kestenbaum

Sebastien, the unicorn guy, has been going around giving talks about what they did over these months. He titled the talk "First Contact," as in first contact with another intelligence. Only, it's not aliens, it's an intelligence we've made.

I should say, GPT-4 is not good at everything. Like, it's terrible at Tic-Tac-Toe. It often makes very basic arithmetic errors. It told the scientists at Microsoft with complete confidence that there is a McDonald's near gate C2 at the Seattle Airport. There isn't. It's at terminal B. As Peter Lee puts it, "The thing is both smarter and dumber than any person you have ever met."

The 14 researchers wrote up a paper laying out all the experiments they had done. And their conclusion-- that GPT-4 showed sparks of artificial general intelligence. Several researchers who've read it told me, look, to really say something is intelligent and to prove that, there are all kinds of experiments you'd want to do that haven't been done yet.

One AI researcher who'd been in the field a long time told me he felt like this whole approach, next word prediction, is only gonna get you so far. This thing will get better, to a point, maybe not much better than it is now, and then max out.

I first ran into this AI stuff and neural nets when I was in grad school for physics, when they didn't work terribly well. So when I first started playing around with GPT-4, it gave me shivers down my spine over and over again. Then I went through a stretch of feeling just weirded out. Like, personally, I've always felt like people can't be more than biological computers-- me, everyone I love, all my colleagues at the show here.

But to feel like you are seeing that, a mechanical computer program that can somehow think and talk, it is a little freaky. The place I've settled is somewhere quieter though. It's not boredom exactly. But I don't find myself wanting to go to it very often. Honestly, I don't have a lot of things I need it for.

And then I'll go back to amazement. I can't believe this thing exists. Much has been written about where we go from here, if it's gonna make the world better or worse. I kind of think better. I'm a fan of the steam engine. But really, what is the next word in the sequence? I have no idea.