New software lets English speakers talk in Mandarin in their own voice
Microsoft has demoed new software that translates spoken English into Mandarin in real time, reproducing the speaker’s own voice.
In a video posted online, Rick Rashid, head of Microsoft Research, demonstrated how the software works. The clip shows Rashid speaking in English, with a Mandarin translation spoken a few seconds later that copies the tonality of his own voice.
He said the company hopes to have “systems that can completely break down language barriers” in the next few years.
At the core of the software is a technology called Deep Neural Net (DNN) translation. The technology has already been offered as a commercial service called inCus.
The system records the words in English, translates them into Mandarin and then matches them to a recorded set of words in that language.
Rashid said in a blog post that Microsoft was poised to take the system one step further.
The demo was performed at Microsoft Research Asia's 21st Century Computing event. Rashid said his team had been able to reduce the word error rate for speech recognition by more than 30 per cent compared with previous methods.
“This means that, rather than having one word in four or five incorrect, now the error rate is one word in seven or eight,” said Rashid.
“While still far from perfect, this is the most dramatic change in accuracy since the introduction of hidden Markov modelling in 1979, and as we add more data to the training we believe that we will get even better results.”
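For readers unfamiliar with the metric, word error rate is conventionally the number of word substitutions, deletions and insertions needed to turn a system's transcript into the correct one, divided by the number of words in the correct transcript. The Python sketch below illustrates that standard definition and the arithmetic behind Rashid's figures. It is not Microsoft's code; the function names and the sample sentences are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(old_rate: float, new_rate: float) -> float:
    """Fraction by which the error rate fell, relative to the old rate."""
    return (old_rate - new_rate) / old_rate


# Illustrative sentences (hypothetical): one wrong word and one dropped word
# out of six gives a word error rate of 2/6.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")

# "One word in five" versus "one word in eight", taken from the quoted ranges.
drop = relative_reduction(1 / 5, 1 / 8)
```

Taking one end of each quoted range, a fall from one error in five words (20 per cent) to one in eight (12.5 per cent) is a relative reduction of 37.5 per cent, which is in line with the "over 30 per cent" figure.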
He said that we would not have to wait until the 22nd century for a usable equivalent of Star Trek’s universal translator.
“We can also hope that as barriers to understanding language are removed, barriers to understanding each other might also be removed,” he said.
The demo can be watched in the video below.