Are you (Google) home? Detecting users’ presence through traffic analysis of smart speakers

Caputo D.; Verderame L.; Merlo A.; Ranieri A.; Caviglione L.
2020-01-01

Abstract

Smart speakers and voice-based virtual assistants are core building blocks of modern smart homes. For instance, they are used to retrieve information, interact with other devices, and command a variety of Internet of Things (IoT) nodes. To this end, smart speakers and voice-based assistants typically rely on cloud architectures: the user's vocal commands are sampled, sent through the Internet for processing, and the results are transmitted back for local execution, e.g., to perform an automation task or activate an IoT device. Even if privacy and security are enforced by means of encryption, features of the traffic, such as the throughput, the size of protocol data units, or the IP addresses, can leak important information about the habits of the users as well as the number and type of IoT nodes deployed. In this perspective, the paper showcases the risks posed by machine learning techniques that can be used to develop black-box models to automatically classify traffic and implement privacy-leaking attacks. We prove that such traffic analysis makes it possible to detect the presence of a person in a house equipped with a Google Home device, even if that person does not interact with the smart device. Experimental results collected in a realistic scenario are presented, and possible countermeasures are discussed.
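The traffic features named in the abstract (throughput, size of protocol data units, IP addresses) can be computed from packet metadata alone, without decrypting any payload. The following is a minimal sketch of that idea, not the authors' pipeline: the record layout `(timestamp, size, remote_ip)`, the function name `window_features`, and the 10-second window are all assumptions made for illustration. Feature vectors of this kind are what a traffic classifier would typically take as input.

```python
from collections import defaultdict
from statistics import mean


def window_features(packets, window_s=10.0):
    """Compute per-window traffic features from packet metadata.

    `packets` is an iterable of hypothetical records
    (timestamp in seconds, size in bytes, remote IP as a string).
    Returns a dict mapping window index to a feature dict.
    """
    # Group packets into fixed-length time windows.
    buckets = defaultdict(list)
    for ts, size, ip in packets:
        buckets[int(ts // window_s)].append((size, ip))

    features = {}
    for idx, items in sorted(buckets.items()):
        sizes = [size for size, _ in items]
        features[idx] = {
            # Bits observed in the window divided by the window length.
            "throughput_bps": 8 * sum(sizes) / window_s,
            # Average packet size, a proxy for PDU size.
            "mean_pdu_size": mean(sizes),
            # Number of distinct remote endpoints contacted.
            "distinct_ips": len({ip for _, ip in items}),
        }
    return features


if __name__ == "__main__":
    # Toy, made-up records used only to exercise the function.
    sample = [
        (0.5, 100, "10.0.0.1"),
        (1.0, 300, "10.0.0.2"),
        (12.0, 200, "10.0.0.1"),
    ]
    for window, feats in window_features(sample).items():
        print(window, feats)
```

Fixed windows keep the sketch simple; the window length and the choice of features would need to be tuned against real captures.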
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11567/1011510