An 8-year-old voice-synthesizing technology that has so far escaped the typical high-tech nano-lifespan is helping make the Web more accessible to the blind and dyslexic.
Digital Equipment Corporation's DECTalk is the voice behind pwWebSpeak, and will soon come to computers via sound cards to help meet the growing demand for voice-capable applications. The audience for this technology is growing beyond blind and dyslexic users because other people, too, see an advantage to having a computer that reads to them, said John Churhill, vice president of operations for the Center for the Blind and Dyslexic.
"DECTalk is still the most advanced speech synthesis available today, but it still sounds like a machine," said Larry Goldberg, director of the National Center for Accessible Media.
The journey to bring voice to computers has been a long and tedious one, mostly because humans have such a spontaneous way of speaking - ranging from coos of delight at things that melt our hearts to screams of indignation at things that boil our blood. But a vanilla computer cannot feel; rather, it is "like a mouth without a brain," said Bathsheba Malsheen, vice president of speech and audio business at Voxware.
Without a brain, the computer doesn't know, for instance, how to form the "o" or "m" sounds when saying words with those letters. Humans round or close their lips to make these sounds without thinking. For a computer, the same operations take memory and processing power.
DECTalk, which comes as either a standalone box or an add-in board for a computer, attempts to give the computer a bit of a brain when it comes to speech.
To speak, a human or a device must first understand phonemes, the basic building blocks of speech. DECTalk is programmed to generate the most basic English phonemes, which DEC engineers determined number roughly 40. In addition, the technology has an understanding of the rules of English speech. But English is not always logical, so DEC included a list of exceptions that users can customize. "DECTalk can trip up on proper names, which are often nonstandard English or of foreign origin," said Jim Fruchterman, president of Arkenstone, a nonprofit organization that develops a software driver that pwWebSpeak and other programs use to access the DECTalk board in a PC.
For example, Fruchterman (frook-ter-man) said DECTalk would ordinarily pronounce the "ch" in his last name as in the word "chalk." He gave DECTalk a phonetic spelling of his name so that it pronounces the "ch" as a hard "c," as in "cat."
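The interplay between general pronunciation rules and a user-editable exception list can be sketched in a few lines of Python. This is only an illustration of the idea described above, not DECTalk's actual rule set or interface: the rule table, the phoneme symbols, the function name to_phonemes, and the transcription of "Fruchterman" are all assumptions made for the example.

```python
# Toy letter-to-sound rules (hypothetical, far smaller than any real rule set).
# Two-letter patterns are tried before single letters.
RULES = {"ch": "CH", "ck": "K", "a": "AE", "c": "K", "l": "L", "k": "K", "t": "T"}


def to_phonemes(word, exceptions):
    """Return a list of phoneme symbols for `word`.

    The user-supplied exception list is consulted first; only words that
    are not listed there fall through to the general rules.
    """
    key = word.lower()
    if key in exceptions:
        return list(exceptions[key])

    phonemes = []
    i = 0
    while i < len(key):
        for size in (2, 1):
            chunk = key[i:i + size]
            if chunk in RULES:
                phonemes.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            # No rule matched: fall back to a crude "best guess".
            phonemes.append(key[i].upper())
            i += 1
    return phonemes


# Rough, illustrative transcription of "frook-ter-man" (an assumption).
user_exceptions = {
    "fruchterman": ["F", "R", "UW", "K", "T", "ER", "M", "AH", "N"],
}

by_rule = to_phonemes("Fruchterman", {})                    # "ch" as in "chalk"
by_exception = to_phonemes("Fruchterman", user_exceptions)  # "ch" as a hard "c"
```

With an empty exception list the rules produce the "chalk" sound for the name; once the entry is added, the lookup short-circuits the rules and the hard "c" comes out instead.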
After DECTalk processes the phonemes, coming up with "best guesses" for the sounds it doesn't have on a list, the text is sent on to the voice synthesizer, a series of cascading filters that help mimic the length and resonance of the human vocal tract.
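The "cascading filters" idea can be illustrated with a short Python sketch. The article does not describe DECTalk's filter design, so this uses a standard two-pole digital resonator of the kind common in formant synthesizers, chained in series. The function names, the sample rate, the pitch, and the formant frequencies and bandwidths are all illustrative assumptions.

```python
import math


def resonator(signal, freq_hz, bandwidth_hz, sample_rate):
    """Pass `signal` through one two-pole resonator (one vocal-tract resonance)."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c  # normalizes the gain to 1 at 0 Hz
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(formants, pitch_hz=100, sample_rate=10000, duration_s=0.05):
    """Feed a pulse train (a crude stand-in for the vocal cords) through
    the resonators one after another, so each filter shapes the output
    of the previous one."""
    n = round(sample_rate * duration_s)
    period = sample_rate // pitch_hz
    signal = [1.0 if i % period == 0 else 0.0 for i in range(n)]
    for freq_hz, bandwidth_hz in formants:
        signal = resonator(signal, freq_hz, bandwidth_hz, sample_rate)
    return signal


# Approximate, illustrative (frequency, bandwidth) pairs for an "ah"-like vowel.
samples = synthesize_vowel([(700, 130), (1220, 70), (2600, 160)])
```

Changing the list of frequency and bandwidth pairs changes which vowel the output resembles, which is roughly how a series of filters can stand in for the changing shape of a vocal tract.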
For much of its existence, DECTalk has been available to a small audience, mostly because of its high cost and lack of compatibility with computer applications. But Fruchterman's company is helping make it more accessible - and affordable. The software driver Arkenstone has developed has enabled developers of Sound Blaster-type sound cards, including one from Creative Labs, to support DECTalk.
Fruchterman said his driver will be to voice-enabled applications what printer drivers are now to word processors and page layout programs. Users will choose voices, dialects, and accents the way people choose fonts, sizes, and styles and send them to the printer. If the device supports that sound, then that's what the user will hear, said Fruchterman.
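The printer-driver analogy can be sketched as a small Python interface: the application asks for a voice much as it would ask for a font, and the device either honors the request or falls back to what it has. Every name here (VoiceRequest, SpeechDevice, the voice labels) is hypothetical and is not taken from Arkenstone's driver.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceRequest:
    """What an application asks for, like a font name and style."""
    voice: str
    language: str = "en"


class SpeechDevice:
    """Stand-in for one speech device sitting behind the driver."""

    def __init__(self, name, voices, default_voice):
        self.name = name
        self.voices = set(voices)
        self.default_voice = default_voice

    def resolve(self, request):
        # If the device supports the requested voice, use it;
        # otherwise fall back to the device default.
        if request.voice in self.voices:
            return request.voice
        return self.default_voice

    def speak(self, text, request):
        # A real driver would hand audio to hardware; this sketch only
        # reports which voice would be used.
        return f"[{self.name}:{self.resolve(request)}] {text}"


device = SpeechDevice("demo-board", voices=["adult-male", "adult-female"],
                      default_voice="adult-male")
supported = device.speak("Hello", VoiceRequest("adult-female"))
fallback = device.speak("Hello", VoiceRequest("child"))
```

The application code stays the same whichever device is installed, which is the point of the comparison with word processors and printers.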
A choice of languages is also on the way, said Edward Bruckert, product engineer for DECTalk. He said the company is working on a Spanish version.