The stated purpose of the transcription is to help the company improve Google Home's speech recognition. That's certainly valid, but the company found itself in hot water when a whistle-blower who works for a Dutch subcontractor came forward with some disturbing information.
According to the whistle-blower, the recordings he heard contained a wide range of personal information, including addresses, bedroom talk, business calls, domestic violence and conversations between parents and children.
Even worse, in a survey of a thousand recordings, 153 should never have been recorded at all because the "Ok Google" activation phrase was never spoken.
Google has responded to the revelation, saying that only about 0.2 percent of all audio clips recorded by Google Home's smart speakers are reviewed by third-party partners. The company added:
"We partner with language experts around the world to improve speech technology by transcribing a small set of queries. This work is critical to developing technology that powers products like Google Assistant.
We just learned that one of these reviewers had violated our data security policies by leaking confidential Dutch audio data. Our Security and Privacy Response teams have been activated on this issue, are investigating, and we will take action. We are conducting a full review of our safeguards in this space to prevent misconduct like this from happening again."
Blaming the whistle-blower is a curious response, but this is an admittedly thorny issue with multiple angles to consider. Perhaps the most straightforward approach would be to keep such analysis in-house and to get to the bottom of why more than 150 recordings were made without the "Ok Google" activation phrase being spoken in the first place.
In any case, if you use the technology, be aware that someone may be listening.