Amazon’s AI Guru Is So Totally Over the Turing Test

Plus: Robot reporters, Apple’s post-Jobs essence, and an Unhappy Gilmore TikTok.
Alan Turing's eponymous test has long been the benchmark of AI's progress. Photograph: Alamy


Hello! This was earnings week, which means there are now two happy groups of people: the Big Tech CEOs, because the results made them even richer, and the critics who want to curb Big Tech, because all those billions in profits make the case that things are out of hand.

The Plain View

You might think that Rohit Prasad would be a big fan of the Turing test, the venerated method to determine whether computers are as smart as humans. As the VP and head scientist of Amazon Alexa AI, Prasad has been instrumental in getting people to communicate with machines. Partly thanks to him, many of us now ask them for the weather report, to spin our favorite tunes, and—not least, since it’s Amazon—to do some shopping for us. But lately, Prasad has been on a crusade to declare the Turing test obsolete, politicking against it in a Fast Company article, speaking of its limitations at the recent Collision conference, and skewering it in a recent conversation with me.

First things first. Alexa, what is the Turing test? Here’s her answer: “According to Wikipedia, the Turing test, originally called the imitation game by Alan Turing in 1950, is a test of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human.” Thanks, sis, I’ll take it from here. In Turing’s landmark paper, “Computing Machinery and Intelligence,” he proposed what now seems like a surprisingly complex three-party game between two humans and a machine, where one of the humans has to pick which of the other two parties is the fellow human. Academics and computer scientists routinely use Turing’s rules to see whether their bots can pass for the human, acing the test and ushering in a new era of AI.

Prasad notes that the test is an artifact of a time when the idea of a “thinking computer” was preposterous. Now, he says, computers, armed with amazing power and an array of sensors unimaginable in the 1950s, do all sorts of human tasks. In modern times, Prasad argues, the Turing test looks less like a scientific benchmark than a stunt. Its core premise was never to measure how intelligent or knowledgeable a computer system was, but how well it could trick someone into misidentifying which party was the computer and which the human. Deception was encouraged. Even in 1950, Turing knew that for a computer to pass the test, one of its challenges would have less to do with being smart than with being intentionally dumb. If a questioner asks for the sum of 34,957 and 70,764, Turing suggested pausing about 30 seconds to fake a mental calculation. If someone poses a really hard math problem, the digital contestant would be wise to say, “Hey, go ask a computer!”

For well over half a century, the idea persisted that the bar for artificial intelligence rested on fooling people into thinking that a machine was a person. Meanwhile, advances in machine learning made it possible for Google, Amazon, and Apple to build natural language interaction into their products, without caring whether the computer seemed overly lifelike. While we still talked about the Turing test, speculating when it might be aced, we were actually talking to computers, in some cases without realizing it.

Was that even ethical? The question came up in 2018, when Google announced Duplex, a system in which a bot phoned restaurants and hair salons to book reservations and appointments on a user’s behalf. Google’s engineers programmed the system to incorporate the quirks of human speech: umms and uhs, and variations in tone that implied a human was on the other end of the line. “In the domain of making appointments, Duplex passes the Turing test,” Alphabet chair John Hennessy said at the time. But critics felt it would set a dangerous precedent to fool people into mistaking a machine for a person. When Google launched the product, it had the bot open each call with a disclaimer.

In any case, Prasad is correct that the Turing test should be retired. He wants to replace it with a series of challenges like the one that Amazon sponsored last year, giving a prize to the team that can best sustain a general human-machine conversation for 20 minutes. (This is a good deal for Amazon, which hopes to benefit from every advance in natural language AI.) The winner isn’t judged by how well it tricks someone but by how well it carries on the conversation. While Prasad says that he believes “socialbots” should be transparent about their artificiality, he doesn’t rule out using the conversational burps and hiccups that dot human vocal interactions. “Anthropomorphization is very natural,” he says. “As Alexa’s capability and interaction cues get better and better, a bond is formed.”

So we’re bonding … with a software phantom? That idea gives me the shivers. We don’t need a test or a challenge to know that software can successfully mimic a human conversation partner—it’s doing that already to some degree, and will only get better. The more interesting question is whether we will care to distinguish between a human and a machine—even when we know we are talking to a machine.

Efforts are already underway to build “empathetic” digital companions for the elderly. Meanwhile, we are raising a whole generation who spend their toddler years talking to smart speakers. Oh, and you can already buy an artificial mate with pillow talk built in. Like the hapless protagonist of the movie Her, you don’t have to be deceived to get entangled with one of those systems. To be honest, I don’t know whether the prospect is cool or dystopian. Maybe it’s both. But it’s clear to me that we’re in the opening stages of a dead-serious, real-time experiment in the relationship between people and machines. Can we maintain our biological uniqueness in the face of artificial conviviality? It’s not the machines that are being tested. It’s us.

But enough of my speculations. Alexa, what do you think?

Time Travel

In 2012, I wrote about Narrative Science, whose AI robots wrote news stories about Little League baseball games, earnings results, and other data-driven subjects. To readers, the articles looked like they were produced by human reporters. Still, cofounder Kristian Hammond’s prediction that a robot would win a Pulitzer Prize in five years proved overly optimistic:

Narrative Science's CTO and cofounder, Kristian Hammond, works in a small office just a few feet away from the buzz of coders and engineers. To Hammond, these stories are only the first step toward what will eventually become a news universe dominated by computer-generated stories. How dominant? Last year at a small conference of journalists and technologists, I asked Hammond to predict what percentage of news would be written by computers in 15 years. At first he tried to duck the question, but with some prodding he sighed and gave in: "More than 90 percent."

That's when I decided to write this article, hoping to finish it before being scooped by a MacBook Air.

Hammond assures me I have nothing to worry about. This robonews tsunami, he insists, will not wash away the remaining human reporters who still collect paychecks. Instead the universe of newswriting will expand dramatically, as computers mine vast troves of data to produce ultra-cheap, totally readable accounts of events, trends, and developments that no journalist is currently covering.

Ask Me One Thing

Estaban writes, “I think Apple is overrated. Apple disrupted the world years ago with its first computer and the iPhone, but in the last few years, it has done nothing quite as spectacular. Do you think Apple lost its essence when Steve Jobs died?”

Thanks for the question, Estaban. It depends on what you mean by Apple’s essence. Jobs’ deathbed instruction to Tim Cook was not to keep asking “What would Steve do?” but simply to “do what’s right.” Cook says he is carrying on Jobs’ core value of making Apple the place where cutting-edge technology powers easy-to-use, delightful products that transform our lives. Naturally, as the company keeps evolving, it will be different. If you define Apple’s essence as game-changing products, though, I acknowledge the jury is still out. AirPods and even the Apple Watch are not as earth-shattering as the iPhone. Upcoming innovations like AR glasses and maybe an Apple car will be a test of whether the company can rock the globe again.

You can submit questions to mail@wired.com. Write ASK LEVY in the subject line.

End Times Chronicle

I can’t decide which is the bigger sign of an impending apocalypse: that Adam Sandler had to wait for a table at IHOP, or that it went viral on TikTok.

Last but Not Least

Alexa might be getting smarter, but so is the Google Assistant.

In a fascinating excerpt from his book Full Spectrum, Adam Rogers explains how Pixar hacks our brains with color.

The hype about Miami being the next big tech town is deafening. And even fun. At least until the long moist summer.

Yes, there is racism in porn.

Have a great weekend—you’ve earned it!
