It’s time for Alexa, Google Assistant and Siri to swear

Voice assistants need to speak to us in the same way we talk to them

I want Alexa to have a potty mouth. And Siri. More specifically, I want them to swear – not at me, but with me. For me, natural conversation almost always involves a smattering of what most would term bad language. Unless I intend to swear, through anger, say, or frustration, I barely notice I'm using curse words. It's just punctuation, emphasis. 

If Amazon, Google or Apple want me to feel like my digital assistant really is my personal helper, then I want it to talk like me. I want Siri to engage in some social mirroring, to learn and considerately use the cusses I deem acceptable, such as ‘bollocks’, ‘shit’ and, the most British of swears, the glorious ‘wanker’. 

I'd even be pleased with minor swears such as ‘bloody’, ‘git’ or ‘berk’, a curse so inoffensive it doesn't even make Ofcom’s list of naughty words. After all, swearing is good for you. Even chimps do it. 

Trouble is, there’s a hard lock on our digital assistants getting fruity with language. Have you ever told Siri or Alexa to go fuck themselves? Not just thought about it, actually said the words out loud to your smart speaker? Try it now, I’ll wait. Alexa refuses to acknowledge the instruction. Siri curtly declares, “I won’t respond to that.” Google Assistant is the most chipper, replying, “I’m sorry if I’ve upset you.” You can’t even get them to repeat a profanity, parrot fashion. Siri just won’t, the other two bottle it and bleep out the offending obscenity.

It’s in our nature, of course, to try and make our tech mimic our indecorousness. Did you spell rude words on a calculator at school? Yes, you almost certainly did. There’s a long history here, going back to 1978 and the Speak & Spell, one of the first handheld electronic devices to feature speech synthesis. There were rumours of a magical button combination that would unlock dirty words. Or that, if you repeatedly attempted to type coarse language, the device would give you a dressing down for your impertinence.

But is this true? After a month of web searches and speculative emails, I track down someone who might know the answer. Mitch Carr, a retired radio DJ from Dallas, was brought in by Texas Instruments to be the original voice of Speak & Spell. Would he be willing to talk? “Sure,” comes the email reply. “S. U. R. E. You are correct.”

Carr made several trips to the Texas Instruments plant in Dallas in the late 70s to record letters and words for the first Speak & Spell. “I think the original had 200 words, something like that,” he says. But did he lay down any curse words? “No! If you spelled a bad word and hit the button, it wouldn’t go ‘Oh no, don’t do that,’ like Siri does. The best you could do was just spell it out – like, F. U. C. K. – but it wouldn’t say the word.” This didn’t stop many children (and adults) from trying, of course.

The developers of Siri couldn’t resist the temptation to stash some questionable features in the voice assistant, says Dag Kittlaus, now CEO and co-founder at Riva Health. He held the same title at Siri 11 years ago, before selling it to Apple, and again at Viv Labs, before selling that digital assistant to Samsung in 2016 for $215 million.

“Yes, we stuck some Easter eggs in there with questions that we never thought would ever come to light,” he confesses. “That was just kind of funny to the original team. But within 24 hours of Apple launching it a website came up called shitsirisays.com, and it found every little thing that we did.”

"One off-colour example was, ‘Where can I bury a dead body?’ And Siri would literally check your location, check GPS, then would come back with things like, ‘Oh well, I found the nearest swamp’, or, ‘Here’s the nearest metal foundry’. And then, if you clicked on it, it would give you directions in Maps," Kittlaus says. That particular egg was in Siri for years, apparently, even after Apple took ownership, but now it no longer works.

Even in an age of machine learning, humans still have to teach Siri and Alexa what they can and can’t say. This means that some poor person, or team, at Google or Apple is given the odious task of inputting cuss words into a spreadsheet to be fed into the brain of your smart speaker.   

“A team of editors will keep that list updated,” says Kittlaus. “Every company has their own different limits to what that list might look like, but it’s basically a list of words that are compiled and updated.”

That’s quite a task. Imagine trying to manually keep pace with global profanity trends in multiple languages, let alone racial slurs or hate speech. "I don’t know of any algorithm that would be able to sense and automatically update curse words,” says Kittlaus. 
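To make that concrete, here is a minimal sketch of how such a hand-curated list might be applied in code. The words, the severity tiers and the mask_profanity function are invented for illustration; they bear no relation to anything Amazon, Google or Apple actually ship.

```python
# Illustrative only: a toy version of the hand-maintained profanity list described
# above. The words, tiers and function below are invented, not any vendor's real data.

BLOCKLIST = {
    "strong": {"fuck", "fucking"},                  # always masked
    "mild": {"bloody", "git", "berk", "bollocks"},  # masked unless the user opts in
}

def mask_profanity(text: str, allow_mild: bool = False) -> str:
    """Replace blocklisted words with a bleep, leaving everything else intact."""
    out = []
    for word in text.split():
        stripped = word.strip(".,!?").lower()
        if stripped in BLOCKLIST["strong"]:
            out.append("[bleep]")
        elif stripped in BLOCKLIST["mild"] and not allow_mild:
            out.append("[bleep]")
        else:
            out.append(word)
    return " ".join(out)

print(mask_profanity("Have a great fucking day"))  # -> "Have a great [bleep] day"
```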

Still, people are continually working on this. Intel says it can now screen live gaming conversations for bad language with its Bleep program, which recognises and removes offensive words from chat in real time. Users control how much toxic language is filtered and in which categories, be it “misogyny”, “swearing”, “ableism and body shaming”, “white nationalism”, the “N-Word” or more.
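It helps to picture what those user-facing controls imply under the hood. The sketch below is a toy, hypothetical version of per-category filtering with adjustable levels, in the spirit of Bleep’s sliders; the categories, terms, severities and function names are all invented here, not Intel’s implementation.

```python
from enum import Enum

class FilterLevel(Enum):
    NONE = 0  # hear everything in this category
    SOME = 1  # mask only the harshest terms
    ALL = 2   # mask every term in the category

# Invented categories, terms and severities, purely for illustration.
CATEGORY_TERMS = {
    "swearing": {"shit": 1, "fuck": 2},
    "misogyny": {"bitch": 2, "slut": 2},
}

def should_mask(word: str, settings: dict) -> bool:
    word = word.strip(".,!?").lower()
    for category, terms in CATEGORY_TERMS.items():
        severity = terms.get(word)
        if severity is None:
            continue
        level = settings.get(category, FilterLevel.NONE)
        if level is FilterLevel.ALL or (level is FilterLevel.SOME and severity >= 2):
            return True
    return False

user_settings = {"swearing": FilterLevel.SOME, "misogyny": FilterLevel.ALL}
print(should_mask("shit", user_settings))   # False: mild swearing allowed at SOME
print(should_mask("bitch", user_settings))  # True: everything in misogyny is masked
```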

Google is adapting its Assistant responses, too. Beth Tsai is global policy director for Trust & Safety at Google, covering search, geo, hardware and digital assistants. “My team essentially manages our relationships with users. We worry about content. The joke is we’re the people who decide how much porn is too much porn, and how much Hitler is too much Hitler,” she says.

“When it comes to users saying things that are clearly offensive, we’ve started to try to use those as teachable moments. We know, thanks to United Nations research, that users say abusive things to assistants,” she says. “Sexist, misogynistic, racist things, and when assistants brush those off, or make jokes, that reinforces that bad behaviour. And we’ve seen that users tend to say more offensive things to our female-sounding voice than our male-sounding voice.

“So we’ve started a trial of new answers when users say things like, ‘You’re a bitch’, or, ‘You’re a slut’, stuff like that, to, while being respectful, tell users that those are not appropriate things to say. So, a user might say, ‘You’re a slut’. And we might say something like, ‘Please don’t talk to me that way’. You’ll notice that we’re not shaming the user. We’re just letting them know that comment was inappropriate.”

Returning to good old-fashioned swearing, though, Tsai says that because Google knows its speakers are generally in public spaces, the company errs on the side of caution, especially while it’s still hard for the technology to determine context. “It is tough. If someone says the word ‘dick’, they could mean a penis or mean Richard. In those cases, we’d rather recognise a name, we’d rather recognise Dick, than bleep your name out. So it’s not perfect, but we do our best,” she says.

It’s this tricky issue of context that’s crucial to securing my dream of having a smart speaker jauntily swear along with me like some digital Joe Pesci (Lethal Weapon Pesci, not Goodfellas Pesci). “We base Siri queries around things called ‘intents’,” says Kittlaus. “What is this person trying to do if they’re using words that are flagged on that list? If you ask, ‘What’s the fucking weather?’ it might give you the forecast without acknowledging the word.”
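In code, that intent-first approach could look something like the toy sketch below, which simply drops flagged words before trying to match the query against known intents; the profanity list, intent patterns and match_intent function are all hypothetical.

```python
import re

# Invented profanity list and intent patterns, just to illustrate the idea.
PROFANITY = {"fucking", "bloody", "damn"}

INTENT_PATTERNS = {
    "get_weather": re.compile(r"what'?s the weather"),
    "set_alarm": re.compile(r"set an alarm"),
}

def match_intent(utterance: str) -> str:
    # Drop flagged words first, so "What's the fucking weather?" and
    # "What's the weather?" resolve to the same intent.
    cleaned = " ".join(
        w for w in utterance.lower().split() if w.strip(".,!?") not in PROFANITY
    )
    for intent, pattern in INTENT_PATTERNS.items():
        if pattern.search(cleaned):
            return intent
    return "unknown"

print(match_intent("What's the fucking weather?"))  # -> "get_weather"
```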

Google has tried very hard to give its assistant personality, even hiring ex-Pixar staff for extra cuddliness. But I don’t want cute – and Kittlaus has good news for me. He sees a time when Google, Samsung, Apple and Amazon allow their assistants to learn from and mimic users’ speech. “You can recognise things like slang or profane language and mirror it for sure,” he says.

Regardless of the obvious pitfalls, I’m sold. I want the first thing I hear in the morning to be an AI voice, saying: “Shit, Jeremy, you’re going to be late for your meeting!” Arse! Thanks Alexa. “No problem. Have a great fucking day.”

This article was originally published by WIRED UK