Verbmobil, a coalition of German university and industry research labs, is on the cutting edge of speech-to-speech research. This chart shows the primary method of Verbmobil's translation system, which can handle conversations about travel and meeting arrangements. The system also uses three other techniques: an example-based system, a statistical probability approach, and a model based on determining the type of "dialog act": scheduling a meeting, canceling a meeting, and so on.
- Matt Steinglass (mattsteinglass@csi.com)
"Has he got a meeting with Bill in, uh, Hamburg on May fifteenth?"
Speech Recognition Modules
Acoustic Probability Module Runs the digital waveform through a vast compilation of speech samples to identify the most likely phoneme strings.
Phonetic Recognition Module Suggests words by simply guessing at their pronunciation, assigning a probability to each phoneme matchup. Because no two speakers or utterances are exactly alike, the computer can never be 100 percent sure. Filters out nonsense phonemes, like "uh."
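The filtering step can be pictured with a small sketch. It is a simplification that works on whole words rather than phonemes, and the filler list and the `drop_fillers` name are assumptions made for illustration, not part of Verbmobil.

```python
# Hypothetical filler list; a real system would work on phoneme strings.
FILLERS = {"uh", "um", "er"}

def drop_fillers(words):
    """Return the words with hesitation sounds removed.

    Punctuation attached to a token is ignored when checking it.
    """
    return [w for w in words if w.strip(",.?!").lower() not in FILLERS]

utterance = "Has he got a meeting with Bill in, uh, Hamburg on May fifteenth?"
cleaned = drop_fillers(utterance.split())
```

Here `cleaned` holds the same words as the utterance minus the "uh," token.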
Language-Model Module Tests word-order probability, which measures the likelihood of particular word sequences. The sequence "Is he got," for example, becomes a lot less likely.
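One common way to score word order is an n-gram model, which estimates how likely each word is given the words just before it. The sketch below is a toy trigram model with add-one smoothing; the four-sentence corpus, the function names, and the choice of trigrams are all assumptions for illustration, and the source does not say which kind of model Verbmobil uses.

```python
from collections import Counter

# Tiny made-up corpus; not Verbmobil's training data.
CORPUS = [
    "has he got a meeting",
    "has she got a meeting",
    "is he in hamburg",
    "is he free on monday",
]

def train(sentences, n=3):
    """Count n-grams and their (n-1)-word contexts."""
    context_counts, ngram_counts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = sentence.split()
        vocab.update(tokens)
        words = ["<s>"] * (n - 1) + tokens  # pad with start markers
        for i in range(n - 1, len(words)):
            context = tuple(words[i - n + 1:i])
            context_counts[context] += 1
            ngram_counts[context + (words[i],)] += 1
    return context_counts, ngram_counts, vocab

def sequence_probability(text, model, n=3):
    """Multiply the smoothed probability of each word given its context."""
    context_counts, ngram_counts, vocab = model
    words = ["<s>"] * (n - 1) + text.split()
    p = 1.0
    for i in range(n - 1, len(words)):
        context = tuple(words[i - n + 1:i])
        seen = ngram_counts[context + (words[i],)]
        # Add-one smoothing keeps unseen sequences unlikely, not impossible.
        p *= (seen + 1) / (context_counts[context] + len(vocab))
    return p

model = train(CORPUS)
p_has = sequence_probability("has he got", model)
p_is = sequence_probability("is he got", model)
```

With this corpus, `p_has` comes out larger than `p_is`, because "got" was seen after "has he" but never after "is he."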
Prosodic Analysis Module Recognizes rising and falling tones and asks, "Is this sentence a question or a statement? Is there an emphasis on a certain word?" It also hears pauses that segment phrases, enabling it to distinguish, for example, between "Ed said Sue did it" and "Ed, said Sue, did it."
Syntactic/Semantic Analysis Module Takes the most probable sentence guess so far - "Has he got a meeting with Bill in Hamburg on May 50?" - and, using four analytical strategies, tries to determine its meaning. It then performs the same analyses on the next-most-probable sentence guesses.
Dialog Semantics Module Combines these different interpretations into one expression in Verbmobil Interface Terms, a language-neutral representation of the sentence's meaning and structure.
Dialog and Context Evaluation Module Keeps track of what's happened in the conversation so far. It asks, "Who does 'he' refer to?" It also knows certain things about the real world - chiefly things having to do with travel and meetings. In this case, it knows there's no such thing as May 50. So it switches to the next best sentence guess: "Has he got a meeting with Bill in Hamburg on May 15?"
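The fallback from "May 50" to "May 15" can be sketched as a check over a ranked list of guesses. Everything here is an assumption for illustration: the scores are invented, the `pick_best` and `is_valid_date` names are hypothetical, and real-world knowledge is reduced to a single calendar check.

```python
from datetime import date

MONTHS = {
    "January": 1, "February": 2, "March": 3, "April": 4,
    "May": 5, "June": 6, "July": 7, "August": 8,
    "September": 9, "October": 10, "November": 11, "December": 12,
}

def is_valid_date(month_name, day, year=2000):
    """Return True if the day exists in that month (2000 is a leap year)."""
    try:
        date(year, MONTHS[month_name], day)
        return True
    except (KeyError, ValueError):
        return False

def pick_best(hypotheses):
    """Return the highest-scoring sentence whose date makes sense.

    Each hypothesis is (score, sentence, month_name, day).
    Returns None if no hypothesis passes the check.
    """
    ranked = sorted(hypotheses, key=lambda h: h[0], reverse=True)
    for _score, sentence, month_name, day in ranked:
        if is_valid_date(month_name, day):
            return sentence
    return None

# Invented scores: the recognizer slightly prefers the impossible date.
guesses = [
    (0.48, "Has he got a meeting with Bill in Hamburg on May 50?", "May", 50),
    (0.45, "Has he got a meeting with Bill in Hamburg on May 15?", "May", 15),
]
best = pick_best(guesses)
```

The top-scoring guess fails the calendar check, so `best` is the May 15 sentence.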
German Module Pulls up the corresponding German words for the Verbmobil Interface Terms.
Sentence Generation Module Rearranges the words into a German sentence structure, conjugates verbs, and declines nouns and adjectives.
Speech Synthesis Module Reads out the sentence with emphasis on the proper words by consulting content information as well as a phoneme database.
"Hat er eine Besprechung mit Bill am fünfzehnten Mai in Hamburg?"