Welcome to Rosalind Picard's touchy-feely world of empathic tech

This article was taken from the November 2012 issue of Wired magazine.

A woman -- and, worse than that, a blonde -- Rosalind Picard was determined to have nothing to do with the study of emotion. Twenty years ago, for a young research scientist in the male-dominated world of electrical engineering, there were already enough stereotypes working against her. Studying something as irrational as emotion could only open her to ridicule. After years of chip design at Bell Laboratories in Holmdel, New Jersey, her research was as cold-blooded as that of any of her peers: content-based retrieval, modelling machine architecture and building algorithms to enable computers to see. This was a central part of the quest to build a true artificial intelligence which, like many at the time, Picard believed could be achieved by reproducing the higher functions of the human brain -- the visual cortex, the auditory cortex -- that were involved in most perceptual experiences.

But, prompted by reading The Man Who Tasted Shapes, neurologist Richard Cytowic's case study of synaesthesia in which the subject's strange sensory episodes took place while his cortex was shutting down, she began to doubt that perception was as straightforward as she had thought. Picard became intrigued by the idea that the limbic system -- traditionally regarded as the most ancient part of the brain and the home of memory, attention and emotion -- could have a role in the process. Reading more, she learned that emotion not only affects what goes into the memory, but also guides perception by helping make choices between different stimuli -- determining what we find interesting. The more Picard discovered about the limbic system, the clearer it became that any attempt to build a perceiving computer that did not take account of emotion would fail. "I realised we're not going to build intelligent machines until we build, if not something we call emotion, then something that functions like our emotion systems."

Still, Picard didn't want to be associated with work on emotion; she tried to talk colleagues into it, but failed. Eventually, she reluctantly took on the work herself and in 1995 circulated a tech report she called Affective Computing, arguing the importance of integrating emotions into the machine environment. In the radical environment of the Media Lab, this was just the kind of work she was expected to do: one colleague appeared in the doorway of her office shaking a copy of it and approvingly declaring it to be "crazy".

Elsewhere, the reaction to her ideas was just as she had feared; her manifesto was rejected by every peer-reviewed publication she submitted it to. In 1997 one reviewer suggested it was better suited to an in-flight magazine. Later that year, at the Conference on Computer Vision and Pattern Recognition, an annual event that attracts specialists in the field from all over the world, she overheard a group of scientists discussing her. "That's Rosalind Picard," they said. "She used to do respectable work."

Today, Picard's tech note, and the book she followed it with in 1997, are credited with creating an entirely new field of study -- one now significant enough to require its own journal. From her office overlooking the yawning meeting space of the Media Lab's East Laboratory -- a two-storey-deep romper room scattered with experimental equipment and bric-a-brac -- she now oversees the Affective Computing research group. With a team of ten graduate students working on projects as disparate as "emotional social prosthetics" for autistic children, and robot sponsors for recovering drug addicts, for more than a decade the group has been pioneering work in emotion measurement and communication technology. Last year, Picard and her Media Lab colleague, Egyptian-born computer scientist Rana El Kaliouby, spun off a private company, Affectiva, to exploit emotion-recognition technology for use in advertising and marketing.

In the years since she first conceived the term "affective computing", Picard says that the aims of artificial-intelligence research have evolved subtly but profoundly. Both in Picard's lab and in the field at large, work has moved away from perfecting machines that are intelligent for their own sake, towards building those that can use emotional intelligence to help us solve problems. Today, her research group concentrates on creating tools that help computers understand human emotions, not mimic them. This has the added benefit of shifting research away from the pursuit of a goal -- computer consciousness -- that threatened to make humans obsolete. "We've decided it's more about building a better human-machine combination," Picard explains late one afternoon in June, "than it is about building a machine where we will be lucky if it wants us around as a household pet."

In the beginning, much of the work of the Affective Computing group was focused on making computers easier for humans to get along with. "Computers were frustrating," Picard says. "Human/computer interaction is intrinsically natural and social. I thought, we're trying to build intelligent machines, but people emote at machines and machines have been unintelligent in not responding to it."

Human beings have long had unreasonable expectations of computer empathy. Regardless of how rationally aware we might be that they are boxes of components processing information, we interact with computers as if they are people -- and become very exercised when the machine doesn't respond in kind. In a 2010 Intel survey, 80 percent of people admitted having become frustrated with their computer, while 33 percent described seeing a colleague shouting abuse at theirs; 24 percent admitted hitting their screen or keyboard. "I love the story of the chef in New York who threw his computer in a deep-fat fryer," Picard says. "There was a guy who fired several shots through the monitor and several through the hard drive. You don't do that because you're having a great experience."

Picard's favourite example of computers' lack of emotional intelligence is Clippy, the relentlessly perky animated paperclip employed as an onscreen assistant in many iterations of Microsoft Office -- and infamously detested by users. Clippy materialised, unbidden, each time the software suspected you needed assistance -- with writing a letter or spelling a word -- bursting with cheerful suggestions, paying no regard to how busy or bad-tempered the user was, and even when repeatedly sent away insisted on doing a little dance before departing. "That is emotionally unintelligent behaviour," Picard says.

Clippy proved so unpopular that Microsoft approached the research group to find ways in which he would know when he was unwelcome. The team devised a squeezable mouse with pressure sensors designed to detect tension; in one demonstration a student tried to address a letter to a Mr Abotu, a name the software repeatedly insisted on changing to "Mr About". As the software registered the student's grip tightening in fury, Clippy appeared and observed, "It seems as if you're frustrated. Should I turn off AutoCorrect?"

The squeezable mouse was one of many tools the Affective Computing group devised at the end of the 90s to enable machines to read emotional cues better. These included spectacles that could detect a frown and a hat with a camera to scan facial expressions.

There was also something named the "galvactivator", a glove that measured the electrical current conducted between two points on the skin, which increases with sweating, to provide an index of emotional arousal. Back in 1999, MIT students used the galvactivator to enhance Quake -- plugged into a modified version of the game, it made characters leap backwards when it registered players' shock. But the interface it provided was far more significant than this use suggests: the wearable biosensors created then have become the basis of the most far-reaching applications yet to emerge from Picard's lab.
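
The principle is simple enough to sketch in a few lines of code. The snippet below is only an illustration of the article's description -- skin conductance rising with sweat, and a threshold trigger standing in for the modified Quake hook; none of the names or numbers come from MIT's actual galvactivator software.

```python
# Illustrative sketch only: a conductance reading derived from two skin
# electrodes, and a threshold trigger standing in for the Quake hook.
# The values and function names are invented, not the galvactivator's code.

def skin_conductance_us(applied_volts: float, measured_amps: float) -> float:
    """Conductance G = I / V, in microsiemens; sweatier skin conducts more."""
    return (measured_amps / applied_volts) * 1e6

def detect_startle(readings, baseline_us: float, jump_us: float = 0.5):
    """Yield the indices of readings that leap above the resting baseline."""
    for i, value in enumerate(readings):
        if value - baseline_us > jump_us:
            yield i

if __name__ == "__main__":
    # Simulated samples: a calm stretch, then a jolt of arousal.
    readings = [skin_conductance_us(0.5, amps * 1e-6)
                for amps in [1.00, 1.05, 1.02, 1.60, 1.75, 1.55]]
    for idx in detect_startle(readings, baseline_us=readings[0]):
        print(f"sample {idx}: shock registered -> make the character leap backwards")
```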

In her office on the second floor of MIT's Media Lab building, Picard shakes out a small plastic bag over her desk. Picking through a selection of grubby-looking towelling sweatbands, she alights on a dusty pink one, decorated with a bow. "Ah," Picard says, "there's the one I could have worn to that cocktail party the other night." Inside each of the bands is a prototype of Affectiva's Q Sensor -- a metal and plastic lozenge slightly larger than a box of matches -- that records electrodermal response and wirelessly transmits the data to a laptop or smartphone. Until now the only way to gather such data has been through fingertip electrodes, which can only be worn for up to 20 minutes at a time. But the MIT team's wearable and durable biosensor now makes it possible to map an individual's stress levels in real time, for weeks at a stretch. Picard and her students have been enthusiastic guinea pigs, providing thousands of hours of information about their emotional states; when Wired arrives, the first thing Picard does is strap on a sensor, which streams data to the computer on her desk. Picard knows how her nervous system reacts to REM sleep, parties, being ill and giving presentations; she can show you in detail how she reacted to taking her son on a Six Flags rollercoaster for his birthday (she found getting the boy and his friends to the theme park more stressful than anything that happened inside its gates).

But the team's most extensive field tests of the sensor so far have been with autistic children. On most parts of the autism spectrum, sufferers have the same limitations in their interactions with others that humans encounter with computers. They lack empathy and find it hard to read the social and emotional cues of others.

For many, communicating their feelings is impossible; many have language difficulties; some simply cannot speak. One of the worst symptoms of this emotional opacity is the apparently sudden and inexplicable onset of what Picard calls "challenging events" or "meltdowns", in which the children express frustration by biting or hitting themselves or others. These episodes can seem all the more unexpected because they're often preceded by apparent calm. As the child grows up the meltdowns can be increasingly dangerous to themselves and others; some parents find no alternative to committing their children to institutions where their violent outbursts can more easily be managed. But in an ongoing trial with a group of 15 children at the Groden Center for Autism Research in Providence, Rhode Island, the Affective Computing group is using the Q Sensor to reveal the hidden meteorology of these emotional storms.

The biosensors showed that, far from arriving abruptly, meltdowns were the climax of gradually rising stress levels, which, in these often hypersensitive children, caused a sensory overload and shutdown -- which gave them an appearance of placid relaxation.

If the stress continued, the child's readings finally spiked into meltdown. By monitoring a child's readings in real time through the sensor, carers can isolate and remove the causes of stress as soon as they appear.
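
A minimal sketch of that kind of real-time monitoring, assuming only a stream of electrodermal samples, a rolling baseline and a pair of invented thresholds (nothing here is the Groden Center's actual protocol), might look like this:

```python
# Illustrative sketch: flag gradually rising electrodermal activity (EDA)
# before it spikes. Window length and ratio thresholds are invented.
from collections import deque
from statistics import mean

class RisingStressMonitor:
    def __init__(self, window: int = 30, warn_ratio: float = 1.5, spike_ratio: float = 2.5):
        self.baseline = deque(maxlen=window)   # recent "calm" readings
        self.warn_ratio = warn_ratio
        self.spike_ratio = spike_ratio

    def update(self, eda_microsiemens: float) -> str:
        if len(self.baseline) < self.baseline.maxlen:
            self.baseline.append(eda_microsiemens)
            return "calibrating"
        ratio = eda_microsiemens / mean(self.baseline)
        if ratio >= self.spike_ratio:
            return "spike"        # readings consistent with an imminent meltdown
        if ratio >= self.warn_ratio:
            return "rising"       # stress climbing despite outward calm
        self.baseline.append(eda_microsiemens)  # still calm: refresh the baseline
        return "calm"

# Usage: feed each wireless sample as it arrives.
monitor = RisingStressMonitor(window=5)
for sample in [1.0, 1.1, 1.0, 1.2, 1.1, 1.3, 1.8, 2.9]:
    print(sample, monitor.update(sample))
```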

The light that the Q Sensor can shed into the minds of autistic people may also save lives. Picard cites a boy, treated by research pioneer Ted Carr, who had very limited language and was, Picard says, "unable to get his words to match his thoughts". The boy would often respond to stress or pain with self-injury, banging his head against a wall to calm himself -- the action released soothing endorphins into his bloodstream. When one day the boy's head-banging began to increase unexpectedly, his carers had no explanation until he was hospitalised with acute appendicitis, and died. "This boy had no way of describing his pain," Picard says, and gestures to her own Q Sensor reading slowly tracking across the monitor in front of her. "This goes up with pain. What if someone had seen this signal go through the roof and said, 'What's going on?', and taken him to be checked out?"

The Affectiva offices are on the second floor of a bland two-storey red-brick building on an industrial estate next to a dentist's surgery in the Boston suburb of Waltham. When Wired visits in the summer, the company has been installed for six months, but the empty bookshelves and unmarked whiteboards in rows of deserted offices suggest hiring is still under way. In the conference room, Rana El Kaliouby flips open her MacBook to demonstrate her invention, the company's flagship product: Affdex, a cloud-based application that makes it possible for practically any digital device to read human emotions.

Like the Q Sensor, Affdex was proved in the crucible of autism research. In 2001, El Kaliouby -- whose master's thesis at the American University in Cairo had been written on an elementary facial-tracking system for computers -- began conducting doctoral work at Cambridge University on developing an emotionally intelligent machine that could read faces. She knew nothing about autism until, giving a lecture in which she discussed the difficulties of teaching a computer to recognise expressions and map them to emotional states, an audience member said that the problems sounded just like those experienced by his brother. "My brother is autistic," he said.

Fascinated by the possibility that she could borrow from autism theory to build an emotionally intelligent machine, El Kaliouby contacted Simon Baron-Cohen, head of the Autism Research Centre in Cambridge.

Baron-Cohen -- a cousin of comedian Sacha -- was building a taxonomy of the emotional states that people can communicate using their faces, and had created a video database of six actors performing 4,000 permutations of 24 different emotions -- not only happy or sad, but also more nuanced states such as confused or seductive. He was using the videos in a study, teaching autism sufferers to interpret the expressions, to ease their interactions in the outside world.

El Kaliouby wanted to use them for much the same thing, except that her pupil was a computer. She wrote software that enabled the machine to analyse the video images, and lock on to 20 different feature points on the faces of the actors. The computer then mapped the patterns of pixels it saw to identify individual expressions -- composed of face tilts, nose wrinkles, eyebrow raises -- and codify groups of those expressions into emotional states classified by Baron-Cohen. "And then I realised," says El Kaliouby, "wouldn't it be so cool if we took that technology -- that's now trained using all this data -- and applied it to improve human communication."
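
That pipeline -- track feature points, turn their geometry into expression units, then map groups of units to an emotional state -- can be caricatured in a few dozen lines. Everything below (the landmark names, the pixel thresholds, the toy rule base) is invented for illustration; it is not El Kaliouby's model or Baron-Cohen's taxonomy.

```python
# Illustrative caricature of the pipeline: feature points -> expression units
# -> emotional state. Landmark names, thresholds and labels are invented.
from dataclasses import dataclass

@dataclass
class Landmarks:
    """A few tracked feature points as (x, y) pixels; image y grows downward."""
    left_brow: tuple
    right_brow: tuple
    left_eye: tuple
    right_eye: tuple
    mouth_left: tuple
    mouth_right: tuple
    mouth_top: tuple
    mouth_bottom: tuple

def expression_units(lm: Landmarks) -> set:
    """Turn point geometry into coarse expression units (brow raise, smile)."""
    units = set()
    # Brow raise: both brows sit well above the eyes (smaller y is higher).
    if (lm.left_eye[1] - lm.left_brow[1]) > 20 and (lm.right_eye[1] - lm.right_brow[1]) > 20:
        units.add("brow_raise")
    # Lip-corner pull (smile): mouth corners higher than the mouth's centre line.
    mouth_centre_y = (lm.mouth_top[1] + lm.mouth_bottom[1]) / 2
    if lm.mouth_left[1] < mouth_centre_y and lm.mouth_right[1] < mouth_centre_y:
        units.add("lip_corner_pull")
    return units

def classify(units: set) -> str:
    """Map groups of expression units to an emotional state (toy rule base)."""
    if {"brow_raise", "lip_corner_pull"} <= units:
        return "engaged"
    if "lip_corner_pull" in units:
        return "happy"
    if "brow_raise" in units:
        return "surprised"
    return "neutral"

frame = Landmarks(left_brow=(80, 90), right_brow=(160, 90),
                  left_eye=(85, 120), right_eye=(155, 120),
                  mouth_left=(95, 185), mouth_right=(150, 185),
                  mouth_top=(122, 180), mouth_bottom=(122, 200))
print(classify(expression_units(frame)))  # -> "engaged"
```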

Her first thought was to use the system to create a device that could provide autistic children with the social feedback they couldn't perceive for themselves. This was the conception of the "emotional social-intelligence prosthesis", or ESP -- a series of wearable devices, culminating in a pair of glasses with a webcam and an LED built into the frames. In conversation, the webcam pointed at the face of anyone the wearer spoke to, providing a real-time feed of their expression and head movements for streaming analysis by El Kaliouby's software. The LED gave the wearer feedback about the meaning of the listener's expression -- engaged (green), neutral (amber) or bored (red) -- with an accuracy of up to 88 percent. Eventually, the capabilities expanded beyond basic emotions, to embrace, for example, the subtleties of "thinking" (brooding, choosing and judging). Because it employed machine learning, the more expressions it saw, the more accurate it became. El Kaliouby christened it Mindreader.
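
Wrapped around a classifier like the sketch above, the glasses' feedback loop reduces to: read a frame, classify the listener, light the LED. Below is a minimal stand-in under that assumption -- the three states and colours come from the article, while the classifier and LED driver are placeholders, not the prototype's code.

```python
# Minimal stand-in for the ESP feedback loop: classify each webcam frame and
# map the listener's state to the LED colours described in the article.
# The classifier and the LED "driver" are placeholders, not the MIT prototype.
import random
import time

LED_COLOURS = {"engaged": "green", "neutral": "amber", "bored": "red"}

def classify_listener(frame) -> str:
    """Placeholder: a real system would run the expression classifier here."""
    return random.choice(list(LED_COLOURS))

def set_led(colour: str) -> None:
    """Placeholder hardware driver: just report what would be lit."""
    print(f"LED -> {colour}")

def feedback_loop(frames, delay_s: float = 0.1) -> None:
    for frame in frames:
        set_led(LED_COLOURS[classify_listener(frame)])
        time.sleep(delay_s)

feedback_loop(frames=range(5))
```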

In 2004, Picard and El Kaliouby met when Picard was included on the committee for El Kaliouby's dissertation, and the two began collaborating almost immediately. Two years later, El Kaliouby joined her at the Media Lab, and they embarked on a five-year study using their core technology -- the Q Sensor and Mindreader -- with the children at the Groden Center. Each year, the team presented their results to visiting commercial sponsors of the Media Lab, and each year they heard the same thing: the autism work was impressive, but the technology had potentially more far-reaching applications in marketing and product testing. "We realised, wow, this could very well go beyond autism," El Kaliouby says. "This could help people around the world communicate. This could help bridge the gap between consumers and businesses."

Picard and El Kaliouby believed the best way to fulfil this potential was by drawing more students into the research programme, and went to Frank Moss, then director of the Media Lab, for permission to expand it. He refused. If they really wanted the tools they had developed to reach as many people as possible, he told them, they should spin off their own business. Reluctantly -- they wanted to conduct research, not run a company -- in 2009, they agreed.

Affectiva was launched last summer, offering both the Q Sensor and Affdex as market-research tools in collaboration with Millward Brown, the agency that handles ad testing for many Fortune 500 companies, to analyse viewer response to commercials. Graham Page, now head of Millward Brown's neuroscience practice, had spent almost ten years looking for technologies that could improve its testing processes, but found many shared the same problem: "A lot of the methods that neuroscientists use don't translate well out of the lab. They're often cumbersome, or require devices that you have to strap on people's heads or around their chests," he says.

But Affdex was exactly what he had been looking for. Affectiva began with a pilot project in March this year based on recording the faces of viewers watching three TV ads broadcast during the Super Bowl, which they streamed online; in the second quarter of 2012, they worked on 200 more commercials, for clients including Intel, Unilever and Coca-Cola.

El Kaliouby launches Affdex on her laptop and, with a couple of clicks, the webcam light winks on. After a camera check, as the machine makes sure my face fills the frame and is well enough lit to be analysed, a commercial for Doritos begins: two hefty frat-boy types in their living room; one complains that the other has eaten all the crisps. "Relax, bro-chaco," he replies, "this new phone I got will get us anything we want." He demonstrates, by asking the phone to send more Doritos, and then a sombrero, which magically plink into existence around him. His friend takes over: "Send three hot, wild girls." "Sending three Rottweilers," replies the phone. Uh-oh. After the punch line -- three women in low-cut outfits left in the suddenly deserted room, asking, "So... why are we here again?" -- there's another pause while the machine transfers the video of my face into the cloud for processing, inferring emotional state from my expressions. It then presents its analysis of my reaction on a five-layer graph mapping a video strip of the ad against fluctuating emotion tracks: smile, surprise, confusion/dislike, attention and valence, or how positive or negative the feeling is. My response is apparently close to the global average: a slowly rising track of smile and surprise, peaking with the appearance of the barking dogs; broadly, the ad is a success.
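
The report Affdex produces -- per-frame scores for those five tracks, overlaid on the ad's timeline and compared with other viewers -- can be pictured with a sketch like the one below, in which the track names follow the article but the scores and the averaging are made up rather than drawn from Affectiva's system.

```python
# Illustrative sketch of Affdex-style output: per-frame emotion tracks for
# each viewer, averaged into a single curve per track. The track names follow
# the article; the scores and aggregation are invented, not Affectiva's API.
TRACKS = ("smile", "surprise", "confusion_dislike", "attention", "valence")

def average_tracks(viewers):
    """Average each emotion track, frame by frame, across all viewers."""
    n_frames = len(viewers[0][TRACKS[0]])
    return {track: [sum(v[track][i] for v in viewers) / len(viewers)
                    for i in range(n_frames)]
            for track in TRACKS}

# Two simulated viewers, five frames of the ad each (scores in [0, 1]).
viewer_a = {"smile": [0.1, 0.2, 0.5, 0.9, 0.7], "surprise": [0.0, 0.1, 0.3, 0.8, 0.4],
            "confusion_dislike": [0.0, 0.0, 0.1, 0.0, 0.0],
            "attention": [0.8, 0.8, 0.9, 1.0, 0.9], "valence": [0.1, 0.2, 0.4, 0.8, 0.6]}
viewer_b = {"smile": [0.0, 0.1, 0.4, 0.8, 0.6], "surprise": [0.0, 0.0, 0.2, 0.7, 0.3],
            "confusion_dislike": [0.1, 0.1, 0.0, 0.0, 0.0],
            "attention": [0.7, 0.8, 0.9, 0.9, 0.8], "valence": [0.0, 0.1, 0.3, 0.7, 0.5]}

averages = average_tracks([viewer_a, viewer_b])
peak = max(range(len(averages["smile"])), key=lambda i: averages["smile"][i])
print(f"smile peaks at frame {peak}: {averages['smile'][peak]:.2f}")
```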

But there are subtle regional variations, inferred from IP addresses: it went down better in California than in Middle America. "We love our Siri-enabled phones out West," says Avril England, Affectiva's head of marketing. "Common-sense midwesterners don't have a lot of time for that flim-flam."

The appeal of the technology to the giant corporations is simple: it's a window into the minds of consumers. Once, ad testing agencies relied upon focus groups to contemplate what they'd seen, and code their reactions using scores out of ten on a piece of paper. But this was an often unreliable process, as each individual's interpretation of their own feelings is subject to the vagaries of self-awareness, memory and self-censorship, or what El Kaliouby calls a "cognitive filter". Affdex circumvents that filter, so test subjects unconsciously reveal themselves before the merciless eye of the computer.

In follow-up interviews about the Super Bowl ads, some viewers reported that they didn't realise they'd done anything with their faces. "You don't think you're reacting, but you're giving some signs, whether it's furrowing your brow, leaning in, or tilting your head," England says. This has produced some surprising results. El Kaliouby cues up a TV commercial for a women's body lotion recently launched in India. The ad is simple: a man returns home to find his wife in the garden, where she draws his attention to her bare midriff with a tinkling charm dangling at her waist. He reaches out, touches her stomach and -- bingo -- is captivated by how smooth her skin feels. Cut to pack shot.

In March this year, Affectiva tested the commercial on Indian women, using both Affdex and conventional interviews. When asked to recall what happened in the advertisement, many women failed to mention that the woman's husband had touched her skin, suggesting that it was unmemorable and therefore ripe for trimming; the majority of those who did said that they found it offensive and that it should be removed from the broadcast version. Because, El Kaliouby explains, "this is a conservative culture -- you don't show skin, or the guy touching his wife." Affdex told a rather different story. "When you looked at the participants' facial expressions, it was the exact opposite. A lot of the women were smiling. They clearly liked it." In April this year, the clients launched the campaign in India, using the full spot that the survey group had claimed to disapprove of. "If they hadn't used our technology for that particular test," England says, "they would have axed that scene for sure. And that scene is probably the most memorable."

In the US, Affectiva is now extending testing into longer formats: Sony recently employed the company to read audience responses to movie trailers, and Nielsen may use the system for TV ratings. Disney, England tells me, is considering using Affectiva's tools to product test all of its content. Taking El Kaliouby's face-reading system beyond the confines of the Media Lab has not only made the technology commercially viable; deploying its machine-learning capacity in the wild has also made it exponentially more perceptive.

Before Affectiva launched, it had taken El Kaliouby six years to show her system 1,000 separate clips of human faces. "Now we have about 90,000 in our platform," she says. "Around 54 million facial frames. And it's all global. Chinese faces, Indian faces, Russian faces: it keeps learning." Increasing the scale of samples from those measured in hundreds to the tens of thousands has led to a dramatic jump in accuracy -- from 75 percent in some expressions to more than 90 percent; the machine can now detect any expression of disgust in 97 percent of cases.

And the growing saturation of the world with webcam-equipped devices now means that computers that can read feelings may soon be unavoidable. In India, the Affectiva team was taking a four-year-old Nokia mobile into shopping malls and homes to run Affdex tests. "There are about four billion smartphones out there already," England says. "That's a lot of cameras."

If Affectiva realises its ambitions for its technology, it will add a striking new dimension to the ways in which, as social media and digital interaction tighten their grip on the globe, we all communicate with one another. "Everyone is increasingly attached to some form of technology -- whether their mobile phone, tablet or laptop -- for an increasingly large portion of their communication," England says. "And yet emotion is not being transferred to this format. We can really enrich these interactions in a way that's innately human -- with emotional content that's more sophisticated than an emoticon or a 'like' button."

At the moment, what England is talking about is allowing Facebook or YouTube to read your expression as you watch a video of your friend's new baby and immediately post your reaction online.

Picard, naturally, sees wider and more significant applications.

She is well aware of the potential dangers of the technology -- the US National Security Agency has already expressed a discreet interest in her work -- but says that Affectiva is focused on an ethical and open use of affective computing. "Our mission is to enable the communication of emotion when people want to opt in -- not the extraction from them of things that they may not want to share." Yet she has a convincing example of how the face-reading system might be used anonymously to help bring more peaceful solutions to political problems. During the Arab Spring -- when El Kaliouby was at home in Cairo -- both she and Picard were struck by the gulf between what Hosni Mubarak was saying on television and the negative reactions of everyone El Kaliouby knew. "Mubarak was still talking to the people as if they liked what he was saying. What were his advisers telling him?" Picard asks. "If people had been able to Affdex their faces watching his speech -- scowling, with asymmetric smiles and grimaces -- then we would have a way to aggregate that very powerful feedback."

Since Picard first publicly outlined her notions of what affective computing could be in 1995, digital machines have become ever more intimately enmeshed in our lives. Picard's lab continues to find ways in which that process will only intensify. In the meantime, the goal that originally inspired her remains as far away as ever: the realisation of a true artificial intelligence is as remote now as it was 15 years ago. But Picard has no problem at all with this. "Which would you rather invent? I want to invent the thing that enables us all to have better experiences in life. To better understand one another, to have deeper relationships, to help people who have an illness that has been misunderstood," she says. "I don't want to be the one who invented the thing that makes me feel like a dog."

Adam Higginbotham wrote about Chernobyl in 06.11

This article was originally published by WIRED UK