Artificial intelligence can do more and more things, such as image recognition, natural language understanding, and playing Go. In fact, AI has become so strong at Go that human players no longer want to play against it.
While Google pursues Go-playing AI, Baidu continues to push ahead in speech and image recognition.
Previously, Baidu launched a tool called SwiftScribe, which converts speech to text and is a boon for journalists. Besides speech to text, Baidu AI also has text-to-speech software called Deep Voice. According to The Verge, this AI sounds almost exactly like a real person when it speaks. However, the system can only learn one voice at a time, and it needs hours of audio, or even more, to learn it.
Recently, Baidu AI upgraded the software and launched Deep Voice 2. It can learn what distinguishes one person's voice from other people's using 1.5 hours of audio, and a single system can learn hundreds of accents, that is, imitate hundreds of people.
The Verge notes that Siri can also mimic regional accents. But Siri's approach takes a lot of time: every time it learns a new voice and accent, a person has to record thousands of hours of audio. Afterwards, engineers have to spend a long time to
Deep Voice 2 works differently. It first learns what hundreds of speakers have in common, building a basic model of human speech, and then adjusts that model according to each speaker's distinguishing characteristics, such as tone and accent. The system does not require manual adjustments.
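The idea of one shared model plus small per-speaker adjustments can be sketched in a few lines. This is only a toy illustration, not Baidu's implementation: the class name `MultiSpeakerModel`, the dimensions, and the random (untrained) weights are all assumptions made up for the example. It shows one common way to structure such a system, where most parameters are shared and each speaker contributes only a short vector.

```python
import random


class MultiSpeakerModel:
    """Toy model: one set of shared weights plus a small vector per speaker.

    Hypothetical illustration only; a real system would train these values.
    """

    def __init__(self, num_speakers, feature_dim, embedding_dim, seed=0):
        rng = random.Random(seed)

        def random_matrix(rows, cols):
            return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

        # Shared by every speaker: stands in for the basic model of human speech.
        self.shared_weights = random_matrix(feature_dim, feature_dim)
        # Also shared: maps a speaker vector to an offset on the output features.
        self.speaker_projection = random_matrix(feature_dim, embedding_dim)
        # One small vector per speaker: stands in for tone, accent, and so on.
        self.speaker_embeddings = random_matrix(num_speakers, embedding_dim)

    def forward(self, text_features, speaker_id):
        """Return output features for the text, shifted toward one speaker."""
        embedding = self.speaker_embeddings[speaker_id]
        output = []
        for shared_row, projection_row in zip(self.shared_weights, self.speaker_projection):
            base = sum(w * x for w, x in zip(shared_row, text_features))
            offset = sum(p * e for p, e in zip(projection_row, embedding))
            output.append(base + offset)
        return output


model = MultiSpeakerModel(num_speakers=3, feature_dim=4, embedding_dim=2)
text_features = [0.5, -0.2, 0.1, 0.9]  # made-up stand-in for encoded text
voice_a = model.forward(text_features, speaker_id=0)
voice_b = model.forward(text_features, speaker_id=1)
print(voice_a != voice_b)  # same text, different speaker vector
```

In a design like this, adding a new speaker means learning one more short vector rather than building a whole new model, which is consistent with needing far less audio per voice.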
So what is the ability to imitate different people good for?
Baidu believes this technology could be applied to smart voice assistants. Users talk to their assistants or give them commands by voice, and the assistants need to reply by voice too. If each voice assistant had a different voice, it would feel far more personalized, instead of every Siri sounding the same as it does now.
Many people now like to listen to audiobooks on their way to work, and this is another area of application. With this technology, each character in an audiobook could have its own voice with matching emotion and mood, which would sound much more vivid.
This technology can also be used in voice-based customer service. According to 36Kr, Baidu has a dedicated voice customer service team. Earlier, when Baidu and China Unicom signed a cooperation agreement, Robin Li said Baidu would help Unicom build smart customer service in the future. He said that a future in which artificial intelligence tops up our accounts and changes our plans for us may already be faintly visible. At the previous Baidu World Conference, Robin Li also demonstrated one application of voice recognition: telemarketing. If the speaker has a different tone and accent each time a customer service call comes in, it will feel more like talking to a real person.
In addition, readers who have used voice navigation will know that it comes with different voice packs. With the capability described above, you could have your family, children, or friends record a voice pack for you. If there is a celebrity you love, you could download audio of their songs, interviews, or speeches and let the AI learn from it. In the future, your car would then guide you in the voice of your favorite person.
Well, now that the voice lovers have had their treat, let's see what other companies have done in this area.
Baidu is not the only giant exploring this field. Last September, Google's DeepMind team released a speech synthesizer called WaveNet, which greatly improved sound quality over traditional speech synthesis systems.
The field also has a large number of start-ups. Last month, Lyrebird, a Canadian start-up, released a new system that can imitate many famous figures using just one minute of voice data.
With the industry this advanced, and AI gradually learning to hold a back-and-forth conversation with people, won't it be not only customer service agents but also radio hosts with lovely voices who end up out of work?