Google's Translatotron can translate speech in the speaker's voice


Speaking another language may be getting easier. Google is showing off Translatotron, a first-of-its-kind translation model that can directly convert speech from one language into another while maintaining a speaker’s voice and cadence. The tool forgoes the usual intermediate step of converting speech to text and back to speech, which can introduce errors along the way. Instead, the end-to-end technique translates a speaker’s voice directly into another language. Google hopes the work will open the door to further systems built on direct speech-to-speech translation.

According to Google, Translatotron uses a sequence-to-sequence network that takes a voice input, processes it as a spectrogram — a visual representation of frequencies — and generates a new spectrogram in the target language. The result is a faster translation with less chance of something getting lost along the way. An optional speaker encoder component helps preserve the original speaker’s voice. The translated speech is still synthesized and sounds a bit robotic, but it can carry over some qualities of the speaker’s voice. You can listen to samples of Translatotron’s attempts to maintain a speaker’s voice as it completes translations on Google Research’s GitHub page. Some are certainly better than others, but it’s a start.
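Google hasn’t released Translatotron’s code, but the spectrogram representation it operates on is standard signal processing. As a rough, hypothetical illustration (the tone, sample rate, and window size here are arbitrary choices, not Google’s), this sketch converts a second of audio into the kind of time-frequency matrix a spectrogram-to-spectrogram model would map between:

```python
import numpy as np
from scipy.signal import spectrogram

# Synthesize one second of a 440 Hz tone at a 16 kHz sample rate,
# standing in for a snippet of recorded speech.
fs = 16000
t = np.arange(fs) / fs
audio = np.sin(2 * np.pi * 440 * t)

# Each column of Sxx holds the signal's power across frequency bins
# for one short window of audio; a model like Translatotron maps a
# matrix like this directly to a new one in the target language.
freqs, times, Sxx = spectrogram(audio, fs=fs, nperseg=512)

# Sanity check: the strongest frequency bin should sit near 440 Hz.
peak_hz = freqs[Sxx.mean(axis=1).argmax()]
print(Sxx.shape, peak_hz)
```

The appeal of working on spectrograms rather than text is visible here: the representation keeps the frequency content that carries a speaker’s vocal character, which a text transcript throws away.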

Model architecture of Translatotron

Google has been fine-tuning its translations in recent months. Last year, the company introduced region-based accents in Google Translate, which can speak a variety of languages with local pronunciations, and added more languages to its real-time translation feature. Earlier this year, Google Assistant got an “interpreter mode” for smart displays and speakers that can translate between 26 languages.


Google trains its AI to accommodate speech impairments


For most users, voice assistants are helpful tools. But for the millions of people with speech impairments caused by neurological conditions, voice assistants can be yet another frustrating challenge. Google wants to change that. At its I/O developer conference today, Google revealed that it’s training AI to better understand diverse speech patterns, such as impaired speech caused by brain injury or conditions like ALS.

Through Project Euphonia, Google partnered with the ALS Therapy Development Institute (ALS TDI) and the ALS Residence Initiative (ALSRI). The idea was that if friends and family of people with ALS can understand their loved ones, then Google could train computers to do the same. It simply needed to present its AI with enough examples of impaired speech patterns.

So, Google set out to record thousands of voice samples. One volunteer, Dimitri Kanevsky, a speech researcher at Google who learned English after becoming deaf as a child in Russia, recorded 15,000 phrases. Those were turned into spectrograms — visual representations of sound — and used to train the AI to understand Kanevsky.

This is still a work in progress, and for now, Google is working to bring it to people who speak English and have impairments typically associated with ALS. It’s calling for volunteers, who can fill out a short form and record a set of phrases. Google also wants its AI to translate sounds and gestures into actions, such as speaking commands to Google Home or sending text messages. Eventually, it hopes to develop AI that can understand anyone, no matter how they communicate.


Google Translate adds real-time translations for 13 new languages



Google announced this week that its Translate app for iOS and Android can now recognize 13 new languages through your smartphone’s camera. The update, which includes support for Arabic and Hindi, is rolling out to Translate users worldwide, per VentureBeat.

In addition to Arabic and Hindi, the app now supports Bengali and Punjabi—four of the top 10 most spoken languages in the world, according to Ethnologue. Translate also added support for Gujarati, Kannada, Malayalam, Marathi, Nepali, Tamil, Telugu, Thai, and Vietnamese.

Google Translate’s “See” and “Snap” features allow you to point your camera at a sign or menu and watch the app translate the text in real time, or take a quick picture and let the app process any translatable text for you. Travelers can access the feature through the Translate app by tapping the camera icon.

Google has added significant support for new languages since launching visual translations back in 2015. The app now supports nearly 50 languages, and earlier this month added support for local accents to make hearing spoken translations easier on the ears.

The real-time translations are made possible thanks to Neural Machine Translation (NMT), a machine-learning technique that completes translations by predicting the likelihood of a sequence of words. Google also uses the technique to power offline translations.


Google Translate for iOS can speak in your local accent



Until now, using Google Translate on your iPhone has meant listening to the same pronunciation for translations no matter where you live. That’s not very considerate, and potentially a problem if you live in countries where foreign accents could make comprehension difficult. You won’t have that issue from now on — an update to Google Translate has added speech output in local versions of multiple languages, including English, Bengali, French and Spanish. You can hear English results with an Indian accent, for instance, or listen to French with a Canadian spin.

Android has included these speech options for a while. It’s hard to knock this addition, though. Even if you don’t have trouble with variations, it’s good to hear translations that reflect local cultures instead of a one-size-fits-all response.
