On April 11, Bloomberg published an investigation into a team at Amazon that listens to recordings of users’ conversations with the Alexa voice assistant and helps it recognize and process requests correctly. The account is based on seven Amazon employees who spoke anonymously.
In fact, employees of Amazon, Google, Apple and Yandex analyze only a tiny fraction of the recordings, and there is no evidence that all conversations are continuously sent in full to the companies’ servers.
The Bloomberg investigation in brief
- Several thousand Amazon workers around the world listen to recordings made by users of the voice assistant. They transcribe the audio, split it into semantic blocks, annotate it and feed the results back into the system. This is how the Alexa team improves the quality of speech recognition.
- The teams are located in Boston, Romania and Costa Rica, and each worker listens to about a thousand audio recordings in a 9-hour working day. They sign a non-disclosure agreement (NDA) and work as independent contractors, not as employees of the company. The rented offices carry no Amazon branding either.
- Most of the work is routine. For example, one worker tracked mentions of Taylor Swift and annotated the recordings to indicate that the user meant the singer. The workers also check the quality of Alexa’s automatic transcription of a request and evaluate the interaction between machine and person: what the user asked and how useful Alexa’s answer was.
- Sometimes the workers come across people’s personal information: bank card numbers, account numbers, phone numbers or addresses. Such recordings are marked as “critical”. They also come across recordings made accidentally, without the Alexa wake word, and these have to be marked as well.
- When a recording contains what sounds like a crime or sexual harassment, Amazon generally tries not to interfere, but it sometimes works with the authorities.
- The workers have dedicated chat rooms where they ask each other for help with unclear recordings and share funny audio, for example recordings of people singing badly in the shower.
- The workers are not given customers’ personal data as such, but they can see device serial numbers, names and account numbers. In the equivalent departments at Google and Apple, the data is completely anonymized and the audio may be partially distorted.
According to Bloomberg, companies often leave out the human role in machine learning. In its marketing materials, for example, Amazon says that Alexa “lives in the cloud and is always getting smarter,” and users take this to mean a fully automatic “artificial intelligence” that learns without human involvement.
Users do not know that people may be listening to them, but they worry about it and ask: “Alexa, is someone else listening to us?” and “Do you work for the NSA?”, Bloomberg writes.
Why smart devices keep user conversations
Smart devices technically always “hear” but do not “listen” to users’ conversations. They continuously record short pieces of audio in order to detect the wake word: “Alexa”, “Hey, Siri”, “Listen, Alice”, “OK, Google”. If the wake word is detected, the recording is saved and sent to the speech recognition service, and a dialogue with the voice assistant begins.
If the wake word is not found in a piece of audio, that piece is not saved. According to the BBC, no evidence has so far been found that all conversations are continuously sent in full to the companies’ servers.
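The gating behaviour described above can be illustrated with a minimal Python sketch. It is not any vendor’s implementation: the names `WAKE_WORDS`, `BUFFER_CHUNKS` and `process_stream` are hypothetical, audio chunks are represented as already-decoded text, and an empty string stands for silence. A real device matches the wake word against the audio signal itself.

```python
from collections import deque

# Illustrative values only; real devices match audio, not text.
WAKE_WORDS = ("alexa", "ok, google")
BUFFER_CHUNKS = 3  # hypothetical size of the short rolling buffer


def process_stream(chunks):
    """Return the requests that would be sent to the server.

    `chunks` stands in for short pieces of audio, given here as
    decoded strings; an empty string stands for silence.
    """
    rolling = deque(maxlen=BUFFER_CHUNKS)  # older chunks fall out and are lost
    uploads = []
    current = None  # chunks of the request being recorded, None when idle

    for chunk in chunks:
        if current is not None:
            if chunk == "":  # silence ends the request
                uploads.append(" ".join(current))
                current = None
            else:
                current.append(chunk)
            continue

        # Idle: keep only a short window and look for a wake word in it.
        rolling.append(chunk)
        window = " ".join(rolling).lower()
        if any(word in window for word in WAKE_WORDS):
            current = []     # start saving what follows the wake word
            rolling.clear()  # the pre-activation audio is discarded

    if current:  # stream ended in the middle of a request
        uploads.append(" ".join(current))
    return uploads
```

In this sketch, a stream such as `["private chat", "alexa", "what time is it", "", "more private talk"]` yields only the request spoken after the wake word; everything said before activation or after the request ends stays in the short buffer and is overwritten.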
Mistakes happen. For example, Alexa’s two most popular wake words, “Echo” and “Alexa”, are sometimes falsely detected in other languages. In French, the phrase “avec ça” (“with that”) can be mistaken for “Alexa”, and in Spanish “hecho” (“fact”) can be mistaken for “Echo”; either can activate the voice assistant and send a recording to Amazon’s servers.
Why companies analyze requests manually
The developers of all voice assistants manually transcribe small volumes of anonymized audio to improve the quality of speech recognition. This is one of the main ways to reduce the number of errors voice assistants make, and it helps them handle dialects, accents and unclear speech, as well as regional and distorted expressions.
The transcriptions give developers a basis for further, more independent training of the voice assistants. This is called “active learning,” Wired writes. The system identifies the areas where it needs input from a human expert, asks the expert for the correct answer and learns from it.
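A minimal Python sketch of one common form of active learning, uncertainty sampling, is shown below. It is an illustration, not the method any of the companies is known to use: the function names, the `confidence` field and the `ask_human` callback are all hypothetical.

```python
def select_for_review(predictions, budget):
    """Pick the `budget` requests the recognizer is least sure about.

    Each prediction is a dict with a hypothetical `audio_id` and the
    model's `confidence` (0.0 to 1.0) in its own transcription.
    """
    ranked = sorted(predictions, key=lambda p: p["confidence"])
    return ranked[:budget]


def add_human_labels(training_set, selected, ask_human):
    """Ask a human for the correct transcript of each selected request
    and append the (audio_id, transcript) pairs to the training set."""
    for item in selected:
        training_set.append((item["audio_id"], ask_human(item["audio_id"])))
    return training_set
```

Only the low-confidence requests reach a human annotator, which is consistent with the companies’ claim that a small fraction of recordings is reviewed manually; the newly labelled examples are then used in the next round of training.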
Amazon’s response and the privacy policies of voice assistant developers
Alexa’s terms of use state that voice recordings are used to “answer questions, complete queries and improve your interaction and our services.” They do not mention that people analyze the audio recordings directly.
Amazon’s statement to Bloomberg says that employees listen to only a very small number of recordings and that the company takes data privacy and security very seriously. According to the statement, employees cannot use the recordings they receive or identify users in any way.
“This information helps us train our speech recognition and natural language understanding systems so that Alexa better understands requests and provides quality service for everyone. Amazon has strict technical and operational security measures, as well as a zero tolerance policy regarding abuse of our system.”
Google has also acknowledged that its employees can analyze audio recordings of interactions with Google Assistant, which is built into Android smartphones and Google Home smart speakers. Google says that before analysis it distorts the audio to mask the user’s voice and removes all personal information.
Google Home devices also have a physical mute button.
Apple stores Siri voice recordings for six months under a random identifier. The identifier is then removed, but the recordings continue to be stored to “improve voice recognition”.
At Yandex, anonymized voice recordings contain what the user says to the voice assistant after activation. These recordings are sent to Yandex’s servers so that the requested command can be recognized and the corresponding assistant function carried out.
According to Yandex representative Matvey Kireev, the company’s employees receive no more than one message from any given user, and the number of requests processed this way is “extremely small” compared with the total number of requests to Alice.
“Modern machine learning is arranged in such a way that at some stages it requires the help of a person. It is thanks to this, in particular, that today we can talk to voice assistants, and they understand us and are able to respond well.
“For example, if Alice recognizes something incorrectly and gives a wrong answer, our employees can look at that single request to help train Alice not to repeat the mistake. These employees receive no more than one message from a user, and they have no way of finding out who said a particular phrase. The total number of requests handled in this way is extremely small compared to the millions of requests to Alice per day.
“We take the security of our users’ data seriously. All information that we receive from people communicating with Alice is encrypted and stored in anonymized form. We have no way of hearing what a person says before they activate Alice by voice or with a button.”
Matvey Kireev, Yandex representative
One of the tools Yandex uses to process voice recordings is its crowdsourcing service Yandex.Toloka. The tasks there include checking audio recordings for correctness, transcribing words and rating the quality of Alice’s responses.
How to avoid having conversations with voice assistants recorded
Alexa’s privacy settings do not let you opt out of the recording and analysis of voice messages entirely, but you can stop your recordings from being used to “develop new features”. You can also listen to and delete previous voice recordings.
Google lets you listen to and delete voice recordings on the “My Activity” page. You can also turn off the recording of conversations with the Assistant under “Voice Control History”.
Apple does not offer a way to listen to recordings of conversations with Siri, since they are anonymized and are not tied to a user identifier or Apple ID. To delete the voice recordings Siri has created on the device, you need to go to the Siri menu in the settings and turn Siri off, and then turn off voice input in the keyboard settings.
In the Yandex application settings, you can disable voice activation of the assistant. The Yandex.Station also has a hardware microphone mute button that prevents it from listening in standby mode.