Superhuman Hearing

NEURAL NETS

Theodore Berger and Jim-Shih Liaw have heard the complaints about voice-recognition software: It's a chronic underachiever, often unable to distinguish the same word spoken by different people and hypersensitive to ambient noise. Their response: Turn on, tune in, and listen up. The two University of Southern California biomedical engineers have designed a system they say not only outperforms conventional voice-recognition products but understands the spoken word better than humans do. The secret? A neural network that mimics the way the brain interprets speech.

Past efforts at voice decoding usually relied on brute-force computing, breaking down speech patterns into tiny chunks of data. But such systems often failed to understand a wide range of speakers, because human speech varies greatly in timbre, phrasing, and intonation.
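To make the contrast concrete, here is a minimal sketch (in Python) of the conventional chop-it-up approach: the waveform is sliced into short, fixed-length frames, and each frame is analyzed in isolation. The frame and hop sizes below are illustrative assumptions, not figures from any particular product.

```python
import numpy as np

def frame_signal(signal, frame_len=400, hop=160):
    """Chop a waveform into short, overlapping frames.

    At a 16 kHz sampling rate, frame_len=400 and hop=160 correspond
    to the 25 ms windows and 10 ms hops typical of frame-based
    recognizers.
    """
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    return np.stack([signal[i * hop : i * hop + frame_len]
                     for i in range(n_frames)])

# One second of stand-in audio at 16 kHz yields 98 overlapping frames.
audio = np.random.randn(16000)
frames = frame_signal(audio)
print(frames.shape)  # (98, 400)
```

Because each frame covers only a few hundredths of a second, differences in timbre, phrasing, and pacing can scatter the same word across very different frame sequences, which is exactly the weakness described above.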

Berger and Liaw instead employed a dynamic network of artificial neurons. Rather than being programmed, these neural nets "learn" to perform tasks; because the network imitates the brain instead of just chopping up words, it's better able to search for underlying patterns.
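The article doesn't spell out the network's internals, so what follows is only a generic sketch of a "dynamic" neural element, a leaky integrate-and-fire neuron, whose output depends on the temporal pattern of its input rather than on any one sample. All parameters are illustrative assumptions, not values from the Berger-Liaw system.

```python
import numpy as np

def lif_response(inputs, tau=20.0, threshold=1.0, dt=1.0):
    """Leaky integrate-and-fire neuron: the membrane potential decays
    with time constant tau, accumulates input, and emits a spike (1)
    whenever it crosses threshold, then resets."""
    v, spikes = 0.0, []
    for x in inputs:
        v += dt * (-v / tau + x)   # leaky integration of the input
        if v >= threshold:
            spikes.append(1)
            v = 0.0                # reset after firing
        else:
            spikes.append(0)
    return np.array(spikes)

# Weak drive for 50 steps, then stronger drive for 50 steps.
drive = np.concatenate([np.full(50, 0.08), np.full(50, 0.2)])
spikes = lif_response(drive)
print(spikes[:50].sum(), spikes[50:].sum())  # firing rate tracks the drive
```

The point of the sketch is the contrast with frame-based processing: the neuron's state carries history, so its response to a word depends on the whole trajectory of the sound, not on isolated slices.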

"We analyze the whole word, not a tiny slice of it," explains Berger, "and we change the dynamics of our system until it matches what is common to many speakers." (Visit www.usc.edu/ext-relations/news_service/real/real_video.html to see the project in action.)

In tests involving several words and speakers, including some speaking English as a second language, the Berger-Liaw network easily surpassed the human capacity to recognize words, even when confronted with background noise 560 percent louder than the speaker's voice. "The human cognitive-listening process is exquisitely tuned, and it's rare when an artificial system can outperform it," says Joel Davis of the US Office of Naval Research, one of the groups funding the project.
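For scale, that figure can be restated as the signal-to-noise ratio engineers usually quote. Reading "560 percent louder" as noise power 5.6 times the signal power is an assumption; the arithmetic under either reading:

```python
import math

# Assumption: "560 percent louder" means noise power is 5.6x signal power.
print(round(10 * math.log10(1 / 5.6), 1))   # -7.5 dB signal-to-noise

# If the ratio is amplitude rather than power, the figure is lower still.
print(round(20 * math.log10(1 / 5.6), 1))   # -15.0 dB
```

Either way, the voice sits well below the noise, which is what makes the comparison with human listeners striking.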

The results still have to be replicated with a larger vocabulary, but several major communications companies are already interested in Berger and Liaw's efforts, while the military is keeping an eye (and an ear) on them.
