Researchers at the Massachusetts Institute of Technology have developed wearable tech that could help avoid crossed wires and miscommunication.
Using artificial intelligence to detect the "tone" of a conversation based on speech patterns, the researchers were able to use a wearable band to “classify the overall emotional nature of the subject’s historic narration”. More simply, the band could spot signs of sadness, anger, boredom and so on from a person's voice.
The project, by Tuka AlHanai and Mohammad Mahdi Ghassemi at MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), is one of many 'social coaching' efforts designed to help people with chronic social disorders become more comfortable with the complexities of communication.
Day-to-day life is underpinned by social interactions that play a key role in our mental health and development. While the linguistic element of communication is learned through repetition of words and phrases, tonality and intonation often make the meaning of a conversation harder to ascertain.
The tone of a conversation can drastically change its emotional intent, conveying sadness, aggression or happiness through nuanced shifts in pitch and intonation. While many of us pick these cues up naturally throughout life, they are particularly difficult for people with certain social disorders to identify, and harder still for a machine.
Those with anxiety disorders, Asperger’s syndrome or other chronic social disorders can struggle to determine the emotional intent of a conversation, particularly if it carries "multiple levels" of emotion, such as a happy story with elements of frustration. An AI able to navigate these nuanced social cues could improve quality of life for these groups.
During tests, participants in the project were given a Samsung Simband, a wearable device that records so-called "high-resolution" physiological signals. The Simband is modular and supports multiple sensors, so it can track a large amount of data; combined with audio captured on iPhones, this formed the basis for recording the emotional content of conversations between the volunteers.
To build the neural network, the researchers recorded physiological changes among the participants, from facial expressions to changes in speech, to discover how different emotions were physically expressed. Using this data, the researchers and the AI categorised the results as positive, negative or neutral. These groups were then broken down into finer variations: 'negative' speech, for example, could contain a mix of sadness, loathing, fear or boredom.
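As a loose illustration of that coarse-to-fine grouping, and not the CSAIL team's actual pipeline (which trained a neural network on Simband and iPhone recordings), the sketch below shows how fine-grained emotion labels might be collapsed into positive, negative and neutral classes and a simple classifier trained on combined audio and physiological features. Every feature, label and number here is a made-up placeholder.

```python
# Hypothetical sketch only: collapses fine-grained emotion labels into three
# coarse classes and trains a simple classifier on stand-in feature vectors.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Assumed mapping from fine-grained emotions to the three coarse groups.
COARSE_LABEL = {
    "happiness": "positive",
    "calm": "neutral",
    "sadness": "negative",
    "loathing": "negative",
    "fear": "negative",
    "boredom": "negative",
}

rng = np.random.default_rng(0)

# Stand-in for real data: each row concatenates audio features (e.g. pitch,
# energy) with physiological features (e.g. heart rate, skin temperature).
n_samples, n_audio_feats, n_physio_feats = 600, 8, 4
X = rng.normal(size=(n_samples, n_audio_feats + n_physio_feats))
fine_labels = rng.choice(list(COARSE_LABEL), size=n_samples)
y = np.array([COARSE_LABEL[label] for label in fine_labels])

# Train and evaluate a three-way classifier on the coarse labels.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"held-out accuracy: {clf.score(X_test, y_test):.2f}")
```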
The unique nature of the project, the researchers claim, comes from its use of so-called 'natural data'. Instead of being asked to act out various emotions, participants were allowed to tell a story of their own choosing, from which facial expressions and vocal tones could be collected. While the project still has room for development, AlHanai and Ghassemi's results so far indicate that “real-time emotional classification of natural conversation [is] possible with high fidelity.”
This article was originally published by WIRED UK