Echo Custom Wake Words: Why Isn’t This A Thing Yet?

Post Updated 10/12/18 – One of the most frequent complaints I see on Echo user discussion boards is about the lack of support for custom wake words. As of this writing, you can either keep the default wake word, Alexa, or change it to Echo, Amazon or Computer in the Alexa app. Note that wake word options may vary by geographic region.

Voice Recognition Tech: It’s All About The Phonemes
Before I can get into the whys and wherefores of wake words, I need to provide a little background on phonemes because these are the foundation of voice recognition tech.

Phonemes are the distinct units of sound that characterize words used in spoken languages. Phonemes can be represented in writing as phonetic spellings. For example, the phonemes that make up the wake word “Alexa” could be represented, loosely and grouped by syllable, as:

uh + leks + uh

Voice recognition tech is entirely dependent on the careful use of phonemes, and the user’s proper pronunciation of them. Voice-activated services and devices can only interpret spoken requests that can be matched against that service’s or device’s repository of phonemes, which are mapped to specific words either using real-time calculations or by checking against a database.
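To make the "checking against a database" idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the tiny LEXICON table, the informal ARPAbet-style phoneme labels, and the decode function are hypothetical, and real recognizers rely on statistical models rather than a simple lookup table like this one.

```python
# Hypothetical pronunciation table: phoneme sequence -> word.
# The labels are informal ARPAbet-style symbols chosen for this example only.
LEXICON = {
    ("AH", "L", "EH", "K", "S", "AH"): "alexa",
    ("P", "L", "EY"): "play",
    ("M", "Y", "UW", "Z", "IH", "K"): "music",
    ("S", "T", "AA", "P"): "stop",
}
MAX_LEN = max(len(key) for key in LEXICON)


def decode(phonemes):
    """Greedy longest-match lookup; phonemes with no match become '?'."""
    words = []
    i = 0
    while i < len(phonemes):
        # Try the longest possible chunk first, then shrink it.
        for size in range(min(MAX_LEN, len(phonemes) - i), 0, -1):
            chunk = tuple(phonemes[i:i + size])
            if chunk in LEXICON:
                words.append(LEXICON[chunk])
                i += size
                break
        else:
            # Nothing in the table starts here, so skip one phoneme.
            words.append("?")
            i += 1
    return words


heard = ["AH", "L", "EH", "K", "S", "AH",
         "P", "L", "EY",
         "M", "Y", "UW", "Z", "IH", "K"]
print(decode(heard))  # ['alexa', 'play', 'music']
```

Notice that anything outside the table comes back as "?". That is the toy version of a request the device simply can't interpret.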

It’s like the way adults try to figure out a toddler’s first words and sentences. It took a while for me to realize my son’s first words were “kitty cat”, because what I heard him saying were the phonemes kee and kah. Eventually I was able to match his “kee kah” to the words “kitty cat”, but it took repeated hearings and him pointing at an actual cat for me to finally clue in.

Voice recognition tech isn’t so very different: Alexa hears your phonemes and compares them to available data to arrive at what you said, or might have said.

What Does This Have To Do With Wake Words?
There’s a reason why it’s taken so long for voice recognition tech to become widely available, and it’s the same reason why voice recognition tech is still imperfect: speech recognition engineering is HARD. There are people who’ve made entire careers and PhD theses out of studying specific aspects of speech recognition.

Speech recognition engineering is hard because even among people who theoretically speak the same language, pronunciations can vary widely. Where I’d say I wash my clothes, my dad would say he “warshes” them, for example. In some parts of the U.S. people barely pronounce certain consonants, while in other parts they’re particularly sharp.

In many cases, the way some people pronounce certain words makes it sound like they’re saying entirely different words to others who are not familiar with the different pronunciation. Someone who hears my dad say, “I’m going to do the warsh,” might think he said, “I’m going to do the marsh,” for example, and conclude he’s off to fish some local swamp.

Now add regional and international accents to the mix and you begin to appreciate the scope of the challenge in getting a machine to match spoken phonemes to actual words.

The study and analysis of wake words is an entire category of voice recognition engineering unto itself. Consider this excerpt, “Wake Up Word Recognition”, from the book Speech Technologies, ISBN 978-953-307-996-7. It comes from a chapter written by Veton Kepuska:

[A Wake Up Word] has the following unique requirement: Detect a single word or phrase when spoken in an alerting context, while rejecting all other words, phrases, sounds, noises and other acoustic events with virtually 100% accuracy including the same word or phrase of interest spoken in a non-alerting (i.e. referential) context.

Doesn’t sound so simple now, does it? In choosing wake words, Alexa engineers had to find words that were not only easy for the user to pronounce and remember, but were also unusual enough that they’re not commonly used at the start of sentences. Remember: emphasis plays a part in this too, hence that “virtually 100% accuracy including the same word or phrase of interest spoken in a non-alerting (i.e. referential) context” bit.

The combination of phonemes uh leks uh isn’t similar to many other English words at all, so it’s a good choice. The most common ‘false wake’ I see from my own Echos is from the word “next”, and even then, only when that word is spoken with emphasis.

Yeah, But That Still Doesn’t Explain Why I Can’t Pick My Own Wake Word
Anyone who isn’t a voice recognition engineer isn’t used to thinking in terms of phonemes, and isn’t likely to choose wake words that are unique enough to avoid false wakes.

Don’t believe it? Think of a likely wake word you might use. Now go to the RhymeZone Rhyming Dictionary and Thesaurus, search on your word and check out all the words whose phonemes are a very close match for your proposed wake word.

Now do the same search on the same site with the word “alexa”. There’s not a single exact match for a complete word that rhymes. There are syllable matches, but once those syllables are combined with other syllables to make an actual word, a false match is prevented by two things: the total number of syllables in the complete word, and the likelihood that the phonemes uh leks uh will not be spoken with emphasis within that word.
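The rhyming-dictionary test can be roughly approximated in code by counting how many phoneme edits separate a proposed wake word from other common words. The Python sketch below is only an illustration under stated assumptions: the VOCAB list, the informal ARPAbet-style phoneme spellings, the candidate word “kitty” and the max_distance cutoff are all hypothetical, and edit distance is a stand-in I picked for the example, not the method Amazon or RhymeZone actually uses.

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,             # drop a phoneme from a
                curr[j - 1] + 1,         # add a phoneme from b
                prev[j - 1] + (x != y),  # swap one phoneme (free if equal)
            ))
        prev = curr
    return prev[-1]


# Hypothetical everyday vocabulary with informal phoneme spellings.
VOCAB = {
    "city": ["S", "IH", "T", "IY"],
    "pretty": ["P", "R", "IH", "T", "IY"],
    "kitchen": ["K", "IH", "CH", "AH", "N"],
    "next": ["N", "EH", "K", "S", "T"],
    "music": ["M", "Y", "UW", "Z", "IH", "K"],
}


def near_matches(candidate, vocab, max_distance=1):
    """Return the words within max_distance phoneme edits of the candidate."""
    return sorted(
        word for word, phonemes in vocab.items()
        if edit_distance(candidate, phonemes) <= max_distance
    )


ALEXA = ["AH", "L", "EH", "K", "S", "AH"]
KITTY = ["K", "IH", "T", "IY"]

print(near_matches(KITTY, VOCAB))  # ['city']
print(near_matches(ALEXA, VOCAB))  # []
```

In this toy vocabulary, “kitty” sits one phoneme away from “city”, while “alexa” has no close neighbors at all. Even “next” is three edits away, which fits with it causing only the occasional false wake.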

Okay, But I’m Smart Enough To Come Up With Several Options That Won’t Cause False Wakes
Fine, and I’ll take your word for it. But what about everyone else?

Lots of consumers have already jumped to wrong conclusions or incorrect assumptions about some aspect of the Echo being “broken” based on their ideas of how they think the technology should work instead of a true understanding of how it does work.

If Amazon were to give Echo owners the ability to program in their own custom wake words, it’s likely the great majority of those words would lead to many false wakes. Considering how angry many Echo owners get just from false wakes caused by an Echo television ad, doesn’t it seem likely that false wakes caused by poor wake word choices would cause at least as much rage among those affected? And doesn’t it also seem likely those angry consumers, who have no background in speech recognition, wouldn’t know the problem was their own fault and would blame it on Amazon?

More Wake Words May Come, But Don’t Get Too Excited
When Echo order invitations first went out, Amazon famously said more wake words would be added “in the future”, and they have added “Computer” and “Echo” as wake word options. But they’ve been pretty mum on the subject since then. In any event, for the reasons I’ve outlined here, I doubt it will ever be possible for Echo owners to choose any wake word they please.

A more likely scenario is that Amazon’s Alexa engineers will come up with a list of options to choose from, consisting of words they’ve vetted as unlikely to lead to many false wakes.