Making HAL Your Pal

What happens when artificial intelligence becomes far smarter than humans? What will keep it friendly? The Singularity Institute says it has the answers for what happens during the next stage of humanity's evolution. By Declan McCullagh.

Eliezer Yudkowsky has devoted his young life to an undeniably unusual pursuit: planning for what happens when computers become far smarter than us.

Yudkowsky, a 21-year-old researcher at the Singularity Institute, has spent the last eight months writing an essay that's half precaution, half thought exercise, and entirely in earnest.

This 750 KB treatise, released Wednesday, is not so much speculative as predictive. If a computer becomes sufficiently smart, the argument goes, and if it gains the ability to harm humans through nanotechnology or some means we don't expect, it may decide it doesn't need us or want us around.

One solution: Unconditional "friendliness," built into the AI as surely as our genes are coded into us.

"I've devoted my life to this," says Yudkowsky, a self-proclaimed "genius" who lives in Atlanta and opted out of attending high school and college.

It's not for lack of smarts. He's a skilled, if verbose, writer and an avid science-fiction reader who reports he scored a perfect 1600 on his SATs.

Yudkowsky's reason for shunning formal education is that he believes the danger of unfriendly AI to be so near -- as early as tomorrow -- that there is no time for a traditional adolescence. "If you take the Singularity seriously, you tend to live out your life on a shorter time scale," he said.

Mind you, that's "Singularity" in capital letters. Even so-called Singularitarians like Yudkowsky admit that the term has no precise meaning, but a commonly accepted definition is a point when human progress, particularly technological progress, accelerates so dramatically that predicting what will happen next is futile.

The term appears to have been coined by John von Neumann, the great mathematician and computer scientist, who used it to refer not to superhuman intelligence but to the ever-accelerating pace of science and technology.

Science-fiction author Vernor Vinge popularized the concept in the 1980s, capitalizing the word and writing about whether mankind would approach the Singularity by way of machine intelligence alone or through augmented mental processes. Predictions vary wildly about what happens at the Singularity, but the consensus seems to be that life as humanity currently knows it will come to a sudden end.

Vinge is the closest thing Singularitarians have to a thought leader, spokesman and hero. He offers predictions based on measures of technological progress such as Moore's Law, and sees the Singularity as arriving between 2005 and 2030 -- though some Vinge aficionados hope the possibility of uploading their brains into an immortal computer is just around the corner.

One of them is Yudkowsky, who credits Vinge with turning him on to the Singularity at age 11. "I read True Names," he said, referring to a Vinge novella. "I got to page 47 and found out what I was going to be doing for the rest of my life."

Since then, Yudkowsky has become not just someone who predicts the Singularity, but a committed activist trying to speed its arrival. "My first allegiance is to the Singularity, not humanity," he writes in one essay. "I don't know what the Singularity will do with us. I don't know whether Singularities upgrade mortal races, or disassemble us for spare atoms.... If it comes down to Us or Them, I'm with Them."

His life has included endless theorizing -- little programming, though -- about friendly AI.

"Any damn fool can design an AI that's friendly if nothing goes wrong," Yudkowsky says. "This is an AI in theory that should be friendly if everything goes wrong."

Of course, some of the brightest people in the world, including Nobel laureates, have spent decades researching AI -- friendly or not -- and have failed to realize their dreams. AI, it seems, is always just a decade or less away from becoming reality -- and has been for the last 40 years.

The difference today? The Singularitarian movement. Back when the late Herb Simon co-invented the humble General Problem Solver in the late 1950s, there wasn't a crowd of eager geeks cheering his efforts and hoping to dump their brains into his technology.

Yudkowsky says he hopes to show his essay to the AI community "and maybe even branch out into the cognitive science community and maybe get some useful comments that can be incorporated into the document."

The only problem is that academics don't seem interested. When asked for comment, one well-known researcher said in response to the essay: "Worthless speculation. Call me when you have running code."

Alon Halevy, a faculty member in the University of Washington's computer science department and an editor at the Journal of Artificial Intelligence Research, said he's not worried about friendliness.

"As a practical matter, I'm not concerned at all about AI being friendly or not," Halevy said. "The challenges we face are so enormous to even get to the point where we can call a system reasonably intelligent, that whether they are friendly or not will be an issue that is relatively easy to solve."

Rules confining smart computers to human-approved behavior, of course, are nothing new. The most famous example is the Three Laws of Robotics, written by Isaac Asimov, one of the fathers of science fiction.

To Asimov, only three laws were necessary: (1) A robot may not injure a human being, or, through inaction, allow a human being to come to harm; (2) A robot must obey orders given it by human beings, except where such orders would conflict with the First Law; (3) A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.
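What makes the laws interesting is their strict precedence: each rule yields to the ones before it. As a purely illustrative sketch -- not anything proposed by Asimov or the Singularity Institute -- the ordering could be expressed as a prioritized check, with hypothetical predicates standing in for judgments no real system can yet make:

```python
# Illustrative sketch only: Asimov's Three Laws as a priority-ordered rule check.
# The predicates (harms_human, disobeys_order, endangers_self) are hypothetical
# placeholders for judgments far beyond any system described in this article.

def evaluate_action(action, harms_human, disobeys_order, endangers_self):
    """Return (permitted, reason) for an action under the Three Laws.

    Laws are checked in order; a later law never overrides an earlier one.
    """
    # First Law: a robot may not injure a human being, or through
    # inaction allow a human being to come to harm.
    if harms_human(action):
        return False, "violates First Law"

    # Second Law: obey human orders, except where that would conflict
    # with the First Law (already enforced above).
    if disobeys_order(action):
        return False, "violates Second Law"

    # Third Law: protect its own existence, as long as that does not
    # conflict with the First or Second Laws (already enforced above).
    if endangers_self(action):
        return False, "violates Third Law"

    return True, "permitted"
```

The ordering is the whole point: the code never even asks about self-preservation until the human-safety and obedience checks have passed.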

In 1993, Roger Clarke, a fellow at the Australian National University, wrote an essay wondering how Asimov's laws applied to information technology.

One conclusion: "Existing codes of ethics need to be re-examined in the light of developing technology. Codes generally fail to reflect the potential effects of computer-enhanced machines and the inadequacy of existing managerial, institutional and legal processes for coping with inherent risks."

Yudkowsky takes it a step further, writing that he believes AI "will be developed on symmetric-multiprocessing hardware, at least initially." He said he expects Singularity could happen in the very near future: "I wouldn't be surprised if tomorrow was the Final Dawn, the last sunrise before the Earth and Sun are reshaped into computing elements."

When one researcher booted up a program he hoped would be AI-like, Yudkowsky said he believed there was a 5 percent chance the Singularity was about to happen and human existence would be forever changed.

After another firm announced it might pull the plug on advanced search software it created, Yudkowsky wrote on Sunday to a Singularity mailing list: "Did anyone try, just by way of experimentation, explaining to the current Webmind instantiation that it's about to die?"

That kind of earnest hopefulness comes more from science fiction than computer science, and in fact some researchers don't even think the traditional geek-does-programming AI field is that interesting nowadays. The interesting advances, the thinking goes, are taking place in cognitive and computational neuroscience.

Yudkowsky seems undeterred, saying he wants the Singularity Institute's friendly AI guidelines eventually to become the equivalent of the Foresight Institute's nanotechnology guidelines. Released last year, those guidelines include principles saying that nanobots should not be allowed to replicate outside a laboratory, and that only companies agreeing to follow the rules should receive nanotech hardware.

"The Singularity Institute is not just in the business of predicting it, but creating it and reacting to it," he says. "If AI doesn't come for another 50 years, then one way of looking at it would be that we have 50 years to plan in advance about friendly AI."

Given humanity's drive to better itself, it seems likely that eventually -- even if the date is hundreds or thousands of years off -- some form of AI will exist.

Even then, skeptics like John Searle, a professor at the University of California at Berkeley, argue that such a machine would merely be manipulating symbols -- and would lack any true understanding of their meaning.

Yudkowsky, on the other hand, sees reason for urgency in developing Friendly AI guidelines.

"We don't have the code yet but we have something that is pretty near the code level," Yudkowsky said, talking about the Institute's work. "We have something that could be readily implemented by any fairly advanced AI project."

Like a character from science fiction, Yudkowsky sees his efforts as humanity's only hope.

In an autobiographical essay, he writes: "I think my efforts could spell the difference between life and death for most of humanity, or even the difference between a Singularity and a lifeless, sterilized planet... I think that I can save the world, not just because I'm the one who happens to be making the effort, but because I'm the only one who can make the effort."