Point of Discovery

I Know What You're Thinking

Episode Summary

I know what you’re thinking. No really. A new non-invasive brain decoder can translate your brain activity into a continuous string of words that’s similar to the story you’re hearing or imagining. It also gets the gist of videos you watch. It relies in part on the kind of AI model behind ChatGPT.

Episode Notes

On today’s show we talk with Alex Huth, assistant professor of neuroscience and computer science at The University of Texas at Austin, and Ph.D. student Jerry Tang about a new system that can read a person’s thoughts in real time and produce a stream of continuous text. The system they developed, called a semantic decoder, relies in part on the kind of AI model behind ChatGPT. It might one day help people who are mentally conscious yet unable to physically speak, such as those debilitated by strokes, to communicate intelligibly again. The scientists behind it are also wrestling with thorny issues this technology brings up, concerning privacy and the ethical use of AI.

Show Notes

If you liked this episode, check out our earlier episode featuring Alex Huth, in which he talks about a previous iteration of this research.

Through the Good Systems initiative, The University of Texas at Austin is bringing together researchers from a broad range of disciplines to explore ways to ensure that artificial intelligence develops in a way that is beneficial, not detrimental, to humanity. Learn more about Good Systems here.

Episode Credits

Our theme music was composed by Charlie Harper.

Other music for today’s show was produced by Podington Bear.

Episode Transcription

Marc Airhart: This is Point of Discovery. I’m Marc Airhart. Remember that talking dog from the movie Up?

Dug (the dog): Hi there. 

Carl (older man): Wah?!? Did that dog just say ‘hi there’? 

Dug: Oh yes.

Carl: Bwah?!? 

Dug: My name is Dug, I have just met you and I love you. My master made me this collar. He is a good and smart master and he made me this collar so that I may talk. Squirrel!!!

MA: Today, we’ve got an update on a podcast we did a few years ago that was about how words and ideas are stored and retrieved in the brain. But what does that have to do with a talking dog? Research on how the brain processes language is moving at a fast clip. On today’s show we’ll talk about a new system that can read a person’s thoughts in real time and produce a stream of continuous text. The new system is nowhere near as portable or easy to use as a dog collar, but the results are almost as impressive as brain decoding technologies you’ve seen in the movies. And the scientists behind it are giving a lot of careful thought to some issues this technology brings up, concerning privacy and the ethical use of AI. Stay tuned—you won’t want to miss this.

MA: When I interviewed Alex Huth [hooth] four years ago, he and his team were in the middle of an ambitious project to collect hundreds of hours of brain activity data from people lying in an fMRI scanner while listening to podcasts. 

AH: ... we want to understand how the brain does this thing that is language, which is, I think, one of the most remarkable things that human brains do … 

MA: Huth—an assistant professor of neuroscience and computer science at The University of Texas at Austin—has now collected 20 hours of data from himself listening to podcasts in an fMRI scanner—and dozens more from students in his lab. Lying in an fMRI scanner for a long time is hard—you’re crammed into a metal tube, it’s noisy, and you have to lie perfectly still—any little movement spoils the data.

MA: So now Huth and his team have two streams of data that are correlated with each other in time—on the one hand there’s a transcript of what someone is saying in a podcast …

This American Life: [ep. 395] I reached over and secretly undid my seatbelt. And when his foot hit the brake at the red light, I flung open the door and I ran. ...

MA: … and on the other hand, they have a 3D map that shows where blood flow is rising and falling over time throughout a person’s brain. Where brain cells are active—thinking about a word or idea—there’s a little rush of fresh, oxygenated blood to that area. It’s a signal that hey, these are the parts of the brain that perk up when you’re thinking about a certain word or idea. Huth and his team have used this giant dataset of brain activity to train a model that’s built on top of a so-called transformer model. Transformer models drive AI text generators that have been all over the news the past few months, like ChatGPT. These chatbots automatically generate text that’s eerily similar to what a human might write. The UT Austin researchers run their model with fresh data from brain activity—that blood flow map. And then the system actually guesses the text that the person was listening to. The scientists call this system a “semantic decoder.”
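For technically minded listeners following along with the transcript, here is a minimal, purely illustrative sketch of one way a decoder built on this recipe could be wired together: a language model proposes candidate words, a person-specific encoding model predicts the brain response each candidate phrase should evoke, and the decoder keeps whichever candidates best match the fMRI data that was actually recorded. This is not the team’s actual code; every function, vocabulary word, and number below is a hypothetical stand-in.

```python
# A minimal, purely illustrative sketch (NOT the researchers' actual code) of
# the idea behind a "semantic decoder": a language model proposes candidate
# words, an encoding model predicts the brain activity each candidate phrase
# should evoke, and the decoder keeps the candidates whose predictions best
# match the fMRI data actually recorded. Every name and number is hypothetical.

import zlib
import numpy as np

rng = np.random.default_rng(0)

N_VOXELS = 200  # toy number of brain "voxels" per fMRI time point
VOCAB = ["she", "is", "not", "ready", "to", "drive", "yet", "the", "car"]

def language_model_propose(prefix, k=3):
    """Stand-in for a transformer language model.

    A real model would score likely next words given the prefix; here we
    simply sample k words from a toy vocabulary."""
    return list(rng.choice(VOCAB, size=k, replace=False))

def encoding_model_predict(words):
    """Stand-in for a person-specific encoding model.

    A real encoding model is fit from many hours of one person's scanner
    data and maps a word sequence to the fMRI response it should evoke;
    here we just derive a deterministic pseudo-random response."""
    seed = zlib.crc32(" ".join(words).encode())
    return np.random.default_rng(seed).normal(size=N_VOXELS)

def decode(recorded_response, n_words=6, beam_width=3):
    """Beam-style search: grow candidate word sequences, keeping the ones
    whose predicted brain responses correlate best with the recording."""
    beams = [[]]  # start with a single empty candidate sequence
    for _ in range(n_words):
        scored = []
        for words in beams:
            for w in language_model_propose(words, k=beam_width):
                candidate = words + [w]
                predicted = encoding_model_predict(candidate)
                score = np.corrcoef(predicted, recorded_response)[0, 1]
                scored.append((score, candidate))
        # keep only the best few candidates for the next step
        scored.sort(key=lambda s: s[0], reverse=True)
        beams = [candidate for _, candidate in scored[:beam_width]]
    return " ".join(beams[0])

# Pretend this is one stretch of blood-flow data recorded in the scanner.
fake_fmri_response = rng.normal(size=N_VOXELS)
print(decode(fake_fmri_response))
```

In a real system, the two stand-in functions would be replaced by a trained transformer language model and an encoding model fit to many hours of one person’s scanner data, which is part of why, as discussed later in the episode, the decoder only works well for the individual it was trained on.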

Jerry Tang: So if we had subjects listen to a podcast, the decoder would take in the brain responses that were recorded, while the subject listens to the podcast. And it would generate a sequence of words … these sequences are usually intelligible, they are linguistically correct most of the time. And they capture the meaning of what the subject was hearing, they capture the gist of the story.

MA: That’s doctoral student Jerry Tang, who led the latest part of the research. Their decoder even works when the person in the fMRI scanner isn’t listening to a podcast, but is just thinking about telling a story. It’s like some sci-fi mind-reading technology. Here’s an example of the UT Austin system in action, with an episode of This American Life.

JT: So in one of our examples from the paper, the story that the subject is listening to says,

This American Life: [ep. 395] "I don't have my driver's license yet. And I just jumped out right when I needed to."

JT: And our decoder takes the brain responses that were recorded while the subject listened to that part of the story. And it predicts, "She is not ready. She has not even started to learn to drive yet." So it … captures what the speaker of the story is saying, that she doesn't have the driver's license.

MA: That’s really impressive. It’s not word for word what the listener heard, but it gets the gist. Alex Huth says, even then, it’s not 100 percent accurate.

AH: It's surprisingly accurate for what it is, for fMRI, which I think is also part of the story. But I'd say like, 50% of the time, it's pretty good. 50% of the time, it just like kind of goes off the rails and doesn't really know what it's talking about. But 50% of the time, it's getting, … a pretty good retelling of what the person is saying.

MA: Huth says this is especially impressive for a non-invasive technique. No one had to do brain surgery and implant electrodes in a person’s head. If this technology is ever going to be used widely outside of the lab, it’s going to have to be non-invasive.

AH: This is … a real leap forward compared to what's been done before, which is typically on like single words or short sentences, where you can get something about the gist of a sentence or something about, you know, what a word is related to, but we're getting … whole strings of text, … pretty complicated ideas ...

MA: Abilities like this could one day help people who aren’t physically able to speak—like those affected by strokes or neurodegenerative diseases—to communicate with the outside world using just their minds. But there is a big challenge: for this to work, a test subject has to spend lots of time in an fMRI machine to train the system.

AH: ... which means it's like a $3 million, … 20-ton machine. That is gigantic. And people have to come to a special place and lay inside this machine to use this. And so it's, in its current form, this … is not actually practical in this way. But we're really hopeful that because of the sort of algorithmic and … data developments that we have here, that these same ideas can be applied to other neuro-imaging technologies, that are maybe more portable, and could be usable more broadly …

MA: For example, there is another brain imaging technique called fNIRS, which uses near-infrared light to measure brain activity. And all you have to do is wear a stretchy cap with little lights embedded in it. Currently, this type of technology captures brain activity at a lower resolution than fMRI, making it a poor match for the decoder. But technology evolves. Huth and Tang’s semantic decoder isn’t ready for prime time yet, but what happens when it is? Will we, as a society, be ready for technology that can decode our thoughts? More on that in a minute.

MA: Okay, now, my most important question I need to ask you guys: Can I bring my cat in to find out what he's thinking?

AH: So I think we'd have to get your cat to listen to like, 15 hours of podcasts first and assume that the cat understands what's going on in those podcasts. So I think that that might be a stretch. But you know, I don't know, if we find some appropriate videos that a cat could watch. Maybe we could get there … 

MA: Maybe a cat podcast, and it's just all meows? 

AH: Sure.

MA: OK, maybe not that one, but this kind of technology does raise important questions. Humans have a long and troubling track record of abusing new technologies. Are we just one technology-enhanced-stretchy-cap-with-little-lights away from some pretty scary things? For example, what if an authoritarian regime wanted to use this on its political enemies… and started charging people for thought crimes?

JT: ... we take very seriously the concerns that it could be used for bad purposes. And we want to dedicate a lot of time moving forward to try to avoid that.

MA: Huth and Tang say there are at least two reasons this technology would be hard to misuse. First, it takes a lot of training in a lab setting with big, expensive equipment—it can’t be used on a person without their knowledge or cooperation.

AH: So this person needs to spend up to 15 hours laying in an MRI scanner, being perfectly still, paying good attention to stories that they're listening to, before this, like really works well on them. If at any time, you know, they're jiggling around or moving, or like not paying attention or falling asleep, the thing just gets a lot worse …

MA: Second, not only must the training be done on willing people, but what the model learns from them applies only to them. What happens if you use the model on someone it wasn’t trained on?

AH: It doesn't work at all. It's really crummy, like, I think, statistically detectable as like there's something there. But qualitatively, it's garbage.

MA: There is another scenario the team tested.

AH: The other thing that we tested was, whether the person who's in the MRI scanner, whether they can sort of disrupt what's being decoded or whether they can think thoughts, do things in their head that make it so that you can't decode what's going on …

MA: The system has been trained on these people—but now, instead of just passively listening to stories, the researchers instructed them to do what they call “resistance tasks,” like naming as many animals as they can think of or quietly imagining telling a story of their own.

AH: So we had them do these things while they're listening to a story and then tried to decode the words of that story. And it was just garbled, just mostly gone. So that also means that like, if people don't want to have something decoded from their brains, they can control that using just their cognition, they can like, think about other things, and then it all breaks down.

MA: So, bottom line—no one is going to be sneaking up and stealing your thoughts any time soon. But what if someday, the technology gets so good that even those defenses no longer work? Are there things we can do as a society to prepare? Again, Jerry Tang:

JT: I think, right now, while the technology is in such an early state, it's important to be proactive, and get a head start on like, for one, enacting policies that protect people's mental privacy, … giving people a right to their … thoughts and their brain data. … we want to make sure that … people only use these when they want to, and that it helps them.

MA: That’s our show. Point of Discovery is a production of The University of Texas at Austin’s College of Natural Sciences and is a part of the Texas Podcast Network. The opinions expressed in this podcast represent the views of the hosts and guests, and not of The University of Texas at Austin. Our website is at pointofdiscovery.org. There you can see photos from inside the lab and watch a short video explaining how the semantic decoder works.

MA: Our theme music was composed by Charlie Harper. If you like our show, be sure and tell your friends. We’re available wherever you get your podcasts, including Apple Podcasts, Google Podcasts and Spotify. Our senior producer is Christine Sinatra. Thanks for listening!