Researchers at UC Berkeley have shown they can embed stealthy commands for popular voice assistants inside songs, prompting platforms like Siri or Alexa to carry out actions without humans getting wise.
The research, reported earlier by The New York Times, is a more actionable evolution of something security researchers have been showing great interest in: fooling Siri.
Last year, researchers at Princeton University and China’s Zhejiang University demonstrated that voice-recognition systems could be activated by using frequencies inaudible to the human ear. The attack first muted the phone so the owner wouldn’t hear the system’s responses.
The technique, which the Chinese researchers called DolphinAttack, can instruct smart devices to visit malicious websites, initiate phone calls, take a picture or send text messages. While DolphinAttack has its limitations — the transmitter must be close to the receiving device — experts warned that more powerful ultrasonic systems were possible.
That warning was borne out in April, when researchers at the University of Illinois at Urbana-Champaign demonstrated ultrasound attacks from 25 feet away. While the commands couldn’t penetrate walls, they could control smart devices through open windows from outside a building.
The technique emerging from Berkeley can hide commands to make calls or visit specific websites without human listeners being able to discern them. As smart assistants gain capabilities that let users send emails, messages and money with their voice, hidden commands like these become more worrisome.
These exploits are still in their infancy, as are the security capabilities of the voice assistants.
One takeaway is that digital assistant makers may have to get more serious about voice authentication so that they can determine with greater accuracy whether the owner of a device is the one voicing commands, and if not, lock down the digital assistant’s capabilities. Amazon’s Alexa and Google Assistant both offer optional features that lock down personal information to a specific user based on their voice pattern. Meanwhile, most sensitive info on iOS devices requires the device to be unlocked before it’s accessed.
The potential here is nevertheless frightening and something that should be addressed early and publicly. As we saw from some of Google’s demonstrations of its Duplex software at I/O this week, the company’s ambitions for its voice assistant are building rapidly, and as it begins to release Smart Display devices with its partners that integrate cameras, the potential for abuse is widening.