Publication Date: 11/2/16 – Just to remind everyone, since next Tuesday is Election Day here in the States I’m running what would’ve been next Tuesday’s AlexaDev Tuesday post today, on a Thursday.
Push is something many devs have been wanting to implement, so today I’ve got a guest post from Matt Farley sharing his project: a workaround for Alexa push notifications.
Take it away, Matt!
Like everyone else, I’ve been waiting for true Push Notifications since the Echo was released. This weekend I was able to put together a very high-quality workaround thanks to the work of John Graves and Miguel Mota. I have built upon their examples with a major improvement that results in higher-quality Text-to-Speech (the response from the Alexa Voice Service).
The YouTube demo consists of 4 example push notifications:
When I receive an SMS text, it is read by Alexa.
When I simulate physically “coming home” by turning on my cell phone’s WiFi so it logs onto the network, Alexa greets me.
Pressing my kids’ Amazon Dash buttons launches their favorite cartoons on the TV. (I’m not manually controlling the TV; it’s all done via scripted/automated IR blasters on an Ubuntu HTPC.)
The Automatic.com cellular dongle in our minivan lets us know when my wife has just pulled onto our street.
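The Dash-button trick above works because a Dash button only joins the WiFi network at the moment it is pressed, so its DHCP request doubles as a “button pressed” event. Here is a minimal sketch of how a server could spot that in a dnsmasq-style log; the MAC addresses and button names are hypothetical stand-ins for your own hardware:

```python
# Hypothetical MAC addresses for each Dash button (yours will differ;
# find them in your router's client list the first time you press one).
DASH_BUTTONS = {
    "ac:63:be:11:22:33": "cartoons_button_kid1",
    "ac:63:be:44:55:66": "cartoons_button_kid2",
}

def button_for_log_line(line):
    """Return the button name if this DHCP log line is a known Dash press.

    A Dash button only associates with WiFi when pressed, so a
    DHCPDISCOVER from its MAC is effectively a button-press event.
    """
    if "DHCPDISCOVER" not in line:
        return None
    lowered = line.lower()
    for mac, name in DASH_BUTTONS.items():
        if mac in lowered:
            return name
    return None
```

The same idea covers the “coming home” demo: watch for a DHCP request from your phone’s MAC instead of a button’s.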
In the demo you will actually see several notification systems tied together:
– Desktop Notifications sent by my Linux server to all TVs
– Visual light flash automations (sent by my server to our LIFX WiFi light bulbs)
– The audio Push Notification spoken by Alexa
How the Alexa Push Notifications work (90% of the work was done by John and Miguel, cited above):
1. Register an application for Alexa Voice Service (follow Miguel’s instructions linked above)
2. Use John Graves code as a starting point to interact with AVS via command-line
3. Custom code (running on my home server) receives notifications from a number of sources. Android notifications are sent from our devices to the server using a home-grown app; the WiFi connections and Dash button presses are detected by my DHCP server, which initiates notifications in response. Bottom line — some code or event initiates a notification. It could be anything.
4. Write the text of the notification to a file on the server, e.g. /tmp/AlexasAnnouncement.txt
5. Send a canned, pre-recorded .wav to AVS; in my case it’s a recording of me saying “tell <skill/app> push notification”
6. My custom Alexa skill receives the words “push notification”, then –
7. Reads and responds with the text from step 4
8. The command in step 5 receives the .mp3 response from AVS (which is the text that was put in AlexasAnnouncement.txt)
9. That .mp3 is then played on the Echo over Bluetooth.
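Steps 6 and 7 are the skill side of the loop: the skill hears “push notification” and simply speaks whatever text step 4 left in the file. A minimal sketch of that handler, returning the standard Alexa custom-skill response envelope (the fallback message is my own placeholder, not from the original post):

```python
# Path used in step 4; the notification writers and the skill backend
# must agree on it.
NOTIFICATION_FILE = "/tmp/AlexasAnnouncement.txt"

def handle_push_notification_intent(path=NOTIFICATION_FILE):
    """Steps 6-7: respond to 'push notification' with the file's text."""
    try:
        with open(path) as f:
            text = f.read().strip()
    except FileNotFoundError:
        # Placeholder fallback in case nothing has been queued.
        text = "There is no pending notification."
    # Standard Alexa custom-skill response structure.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }
```

Because the text reaches Alexa as plain skill output rather than audio, AVS renders it with its normal punctuation and intonation — the key quality improvement discussed below.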
“Some text to be spoken” > saved to .txt
Pre-recorded .wav (“tell <skill/app> push notification”) > AVS > Alexa Skill
Alexa Skill responds to “push notification” by sending the contents of the .txt to AVS
AVS responds with > .mp3 data stream of Alexa speaking the text with proper enunciation
.mp3 played > Bluetooth > Echo speaker
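The server’s side of that flow (steps 4–5) boils down to “write the file, then play the canned .wav into the AVS command-line client.” A hedged sketch — `avs_client` and the .wav path are hypothetical stand-ins for John Graves’ command-line tool and your own recording:

```python
import subprocess

NOTIFICATION_FILE = "/tmp/AlexasAnnouncement.txt"
# Hypothetical path to the canned "tell <skill/app> push notification" recording.
TRIGGER_WAV = "/home/server/push_trigger.wav"

def push_notification(text, send=True):
    """Steps 4-5: save the notification text, then fire the canned trigger.

    Writing the file first guarantees the skill reads the new text when
    the trigger phrase arrives at AVS.
    """
    with open(NOTIFICATION_FILE, "w") as f:
        f.write(text)
    # 'avs_client' stands in for the command-line AVS client; swap in
    # whatever invocation your setup actually uses.
    cmd = ["avs_client", "--input", TRIGGER_WAV]
    if send:
        subprocess.run(cmd, check=True)
    return cmd
```

Any event source — the DHCP watcher, an SMS relay, a security camera — just calls `push_notification("...")` and the rest of the chain takes over.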
Everything above is completely scripted/automated. So when we receive notifications from our phones, desktops, Dash buttons, etc., or are notified that someone’s at our front door (via security camera), the server in the house writes some text to a file that gets spoken out loud by our trusty Echo (in addition to flashing lights and popups on the PCs and TVs).
Question: Why not just send the text directly to AVS using pico2wav (TTS) and play the response, like John Graves does in his original code? i.e., why are you creating an Alexa Skill to be notified of a notification and read it from a text file? (Which adds several loops to the workflow versus John’s original.)
Answer: I originally started by just using John’s pico2wav approach to convert my notification into a computer-spoken .wav file for AVS to interpret (e.g., “simon says …”). However, I found that AVS often misunderstood the pico2wav files. For example, if my notification was something like “Where are you?”, the quality of the pico2wav output might result in AVS hearing “We are who?” You also lose the punctuation and intonation Alexa is capable of when you send her text in a skill, as opposed to a computer-generated .wav of synthesized speech.
I’ve had this going for a few days now and am very surprised how well it works. And I can finally quit worrying about when Amazon will give us native push notifications.
Now… if they’ll just give us hardware GUIDs in our skill interactions then I won’t have to use a different Amazon account in each room of the house (so my custom skill knows which lights and TV to turn on and off without forcing the user to specify).
If you want to see more of my Alexa home automation, check out this demo of Jarvis. It’s about a year old and the functionality has increased since the video was taken.
Click here to view Matt’s original post on the Amazon Developer Forum, where you can also post questions and comments if you’re a registered user there (and you really ought to be, if you’re not already).
If you prefer, Matt has also given me permission to share his email address for anyone who’d like to contact him directly for more information.