
Google’s Translatotron Can Convert Speech In The Voice Of The Speaker

Speaking a different language might be getting simpler. Google is touting Translatotron, a first-of-its-kind translation model that can directly translate speech from one language into another while maintaining the cadence and voice of the speaker. The tool skips the usual step of converting speech to text and back to speech, which can often introduce errors along the way. Instead, the end-to-end method directly converts a speaker's voice into another language. The firm expects the development to unlock future progress using the direct conversion model.

According to Google, Translatotron uses a sequence-to-sequence network structure that takes a voice input, processes it as a spectrogram (a visual representation of the frequencies in the audio over time), and generates a new spectrogram in the target language. The outcome is a much quicker conversion with less chance of something getting lost in the process. The tool also works with an optional speaker encoder component, which helps preserve the speaker's voice. The converted speech is still synthetic and sounds a little robotic, but it can effectively retain some elements of the original voice.
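To make the spectrogram step more concrete, here is a minimal Python sketch of how raw audio samples can be turned into a magnitude spectrogram with a short-time Fourier transform. This is an illustration only and not Google's implementation: the function name `spectrogram`, the frame length, the hop size, and the synthetic 1,000 Hz test tone are all assumptions chosen for the example, and a real system would use an optimized FFT library rather than the plain loop shown here.

```python
import cmath
import math


def spectrogram(samples, frame_len=64, hop=32):
    """Return a magnitude spectrogram as a list of rows.

    Each row is one time frame; each column is one frequency bin.
    (Hypothetical helper for illustration, not part of Translatotron.)
    """
    # A Hann window tapers each frame to reduce spectral leakage.
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    # For real-valued input only the non-negative frequencies are needed.
    n_bins = frame_len // 2 + 1
    rows = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive discrete Fourier transform of one windowed frame.
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            row.append(abs(coeff))
        rows.append(row)
    return rows


# Assumed test input: 0.1 seconds of a 1,000 Hz tone sampled at 8,000 Hz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(800)]

spec = spectrogram(tone)

# Find the frequency bin that carries the most energy across all frames.
strongest_bin = max(range(len(spec[0])), key=lambda k: sum(row[k] for row in spec))
strongest_hz = strongest_bin * SAMPLE_RATE / 64
```

Each row of `spec` describes which frequencies are present during one short slice of the audio. A speech-to-speech model of the kind described above would take a grid like this as input and predict a new one for the target language, which a separate stage then turns back into audible sound.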

On a similar note, for most consumers, voice assistants are useful tools. But for the millions of people with speech impairments caused by neurological conditions, voice assistants can be yet another frustrating hurdle. Google wants to change that. At its I/O developer conference last week, Google disclosed that it is training AI to better understand diverse speech patterns, such as impaired speech caused by conditions like ALS or brain injury.

Through Project Euphonia, Google partnered with the ALS Residence Initiative (ALSRI) and the ALS Therapy Development Institute (ALS TDI). The idea was that if friends and family of people with ALS can understand their loved ones, then the firm can train computers to do the same. It just needed to present its AI with enough samples of impaired speech patterns.