Computer scientists have devised a way of making computer speech recognition safer from malicious attacks — messages that sound benign to human ears but hide commands that can hijack a device, for example through the virtual personal assistants that are becoming widespread in homes or on mobile phones.
Much of the progress made in artificial intelligence (AI) in the past decade — driverless cars, playing Go, language translation — has come from artificial neural networks, programs inspired by the brain. This technique, also called deep learning when applied at a large scale, finds patterns in data on its own, without needing explicit instruction. But deep-learning algorithms often work in mysterious ways, and their unpredictability opens them up to exploitation.
As a result, the patterns that AI uses to, say, recognize images might not be the ones humans use. Researchers have been able to subtly alter images and other inputs so that to people they look identical, but to computers they differ. Last year, for example, computer scientists showed that by placing a few innocuous stickers on a stop sign, they could convince an AI program that it was a speed-limit sign. Other efforts have produced glasses that make facial-recognition software misidentify the wearer as actress Milla Jovovich. These inputs are called adversarial examples.
Sounds suspicious
Audio adversarial examples exist, too. One project altered a clip of someone saying, “Without the data set, the article is useless” so that it was transcribed as, “Okay Google, browse to evil.com.” But a paper presented on 9 May at the International Conference on Learning Representations (ICLR) in New Orleans, Louisiana, offers a way of detecting such manipulations.
Bo Li, a computer scientist at the University of Illinois at Urbana-Champaign, and her co-authors wrote an algorithm that transcribes a full audio clip and, separately, just one portion of it. If the transcription of that single piece doesn’t closely match the corresponding part of the full transcription, the program throws a red flag — the sample might have been compromised.
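As a rough illustration of that kind of consistency check, here is a minimal sketch in Python. It is not the authors' code: the transcribe function stands in for any speech-to-text model, and the choice of segment and the similarity threshold are illustrative assumptions.

from difflib import SequenceMatcher

def looks_adversarial(audio, transcribe, threshold=0.5):
    # Transcribe the whole clip, then separately transcribe just one portion of it.
    full_text = transcribe(audio)
    half = len(audio) // 2                      # here: the first half of the samples
    piece_text = transcribe(audio[:half])

    # Compare the portion's transcription with the corresponding part of the
    # full transcription; a large mismatch is the red flag described above.
    expected = full_text[: len(full_text) // 2]
    similarity = SequenceMatcher(None, piece_text, expected).ratio()
    return similarity < threshold               # True -> possibly manipulated

A real detector would align the portion with the full transcription more carefully (for example, using time stamps from the recognizer) rather than simply splitting the text in half; the split here is only to keep the sketch short.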
The authors showed that for several types of attack, their method almost always detected the meddling. Further, even if an attacker was aware of the defence system, attacks were still caught most of the time.
Li says that she was surprised by the method’s robustness, and that — as often happens in deep learning — it is unclear exactly why it works.
by Matthew Hutson, Nature | Read more:
[ed. We're already playing catch-up.]