Amazon AWS Chief Scientist: Speech Recognition Breakthroughs

Via: 博客园     Time: 2018/1/28 16:36:27     Reads: 1019


By Xue Fang, Tencent "Frontline"

On the morning of January 28, 2018, EmTech China, the emerging-technology summit of MIT Technology Review, officially opened at the Beijing World Trade Center Hotel. Animashree Anandkumar, chief scientist at Amazon AWS, delivered a speech.

The full text of the speech follows:

Multi-domain technology has become a leader among cutting-edge technologies, and I was fortunate to study this topic during my doctoral and postdoctoral work. Today I will talk about machine learning, and about how to study and quantify it.

Deep learning passes data through many layers, sometimes hundreds. This kind of machine learning also runs across different GPUs, machines, and devices, which requires networking technology. Multi-domain models can help us handle applications in science, engineering, and many other fields. We are constantly looking for solutions for multi-domain machine learning models, and for ways to compute multi-domain applications in the cloud.

Deep learning spans many areas. First there is image understanding, where the basic task is to identify different objects. Identifying objects in a picture is easy for humans but extremely difficult for machines. Our systems, however, have improved greatly and perform better than ever before.

After that came a breakthrough in speech recognition. Deep learning also plays a part in natural language processing for different languages. Different languages have different structures, so how can we handle them automatically and understand them?

Human beings use language to communicate in different contexts: listening, speaking, reading, and writing. Language behaves differently in each of these. How should a machine deal with different languages? This is a challenge for deep learning.

Another area is driverless vehicles. How to improve their performance, how to identify obstacles, how to give them good vision, and how to make immediate decisions are among the problems driverless technology has to solve, and they are where deep learning can play a role.

Next, I will share how we run deep learning models today. Deep learning has a wide range of applications and some specialized projects, and it is being applied on more and more kinds of hardware infrastructure. MXNet is one such deep learning engine. It was first developed by university researchers, and we are now developing it at AWS.

The advantages of this engine are obvious. It is very flexible in how a network is built: in the programming process, the expressions, the feature descriptions, and the style. That convenience improves programmer efficiency. It also provides good language support, and the front ends and back end are connected automatically, which further improves programming efficiency.

The network holds some fixed data, and interconnected layers link the input to the output. Some specialized programs are relatively easy to write, but the flow of the language is longer and more symbols have to be written. The computations have a definite order and depend on one another in sequence, so we build a graph of them to parallelize automatically. Memory is also managed automatically, which makes the code run more efficiently.
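The idea of deriving parallelism from the order of computations can be sketched in a few lines. This is a minimal illustration in plain Python, not MXNet's actual scheduler; the operation names and the `parallel_stages` helper are hypothetical. Operations whose inputs are all ready land in the same stage and could, in principle, run at the same time.

```python
def parallel_stages(deps):
    """Group operations into stages; ops within one stage do not depend on each other.

    deps maps each operation name to the set of operations it depends on.
    Every dependency is assumed to appear as a key of deps as well.
    """
    remaining = {op: set(d) for op, d in deps.items()}
    stages = []
    while remaining:
        # An op is ready once all of its dependencies have been scheduled.
        ready = sorted(op for op, d in remaining.items() if not d)
        if not ready:
            raise ValueError("dependency cycle or unknown dependency")
        stages.append(ready)
        for op in ready:
            del remaining[op]
        for d in remaining.values():
            d.difference_update(ready)
    return stages


# Hypothetical toy graph: two independent branches joined by an add.
graph = {
    "load": set(),
    "conv_a": {"load"},
    "conv_b": {"load"},
    "add": {"conv_a", "conv_b"},
}
stages = parallel_stages(graph)
```

Here `conv_a` and `conv_b` end up in the same stage because neither needs the other's result, which is the kind of opportunity an automatic scheduler can exploit.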

We also use multi-GPU training to improve efficiency. A single machine can hold multiple GPUs, which lets us parallelize over the data and process a lot of it at the same time. The data arrives over the network at the CPU level and is then divided further down among the GPUs.

The results of GPU processing then need to be gathered and integrated, which also raises our efficiency. MXNet can integrate the computed results on the GPUs, and the cost of doing so is relatively low. At the same time, we have improved MXNet's performance. After GPUs are added, the input and output efficiency will be reversed. This runs on AWS infrastructure, including B2X and B22X.
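The split-then-integrate pattern described above can be illustrated without any GPU at all. The following is a minimal sketch in plain Python, assuming a one-parameter linear model and made-up data; the function names are hypothetical and this is not how MXNet implements it. Each shard stands in for the slice of a batch that one GPU would process.

```python
def gradient(w, batch):
    """Gradient of the mean squared error for the model y = w * x on one batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_gradient(w, batch, num_devices):
    """Split a batch into equal shards, compute per-shard gradients, average them."""
    if len(batch) % num_devices != 0:
        raise ValueError("batch size must be divisible by num_devices")
    shard_size = len(batch) // num_devices
    shards = [batch[i * shard_size:(i + 1) * shard_size] for i in range(num_devices)]
    # In a real system each shard's gradient would be computed on its own GPU.
    local_grads = [gradient(w, shard) for shard in shards]
    # Aggregation: averaging equal-sized shards reproduces the full-batch gradient.
    return sum(local_grads) / num_devices


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # made-up (x, y) pairs
full = gradient(0.5, batch)
sharded = data_parallel_gradient(0.5, batch, num_devices=2)
```

Because the shards are the same size, the averaged per-shard gradient matches the gradient over the whole batch, so splitting the data changes where the work happens but not the result.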

MXNet reaches an efficiency of up to 91% across all of these networks, including ResNet, Inception v3, and AlexNet. That is on a single machine with multiple GPUs. In the multi-machine case, each machine has 16 GPUs, and once the machines are combined, all the data passing over the network can hurt efficiency. Our efficiency does not drop much, however, because MXNet is built to stay efficient in this setting. So we can do distributed multi-machine training.
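The talk does not say how the efficiency figure is defined. One common definition, assumed here, compares the throughput measured on n GPUs with ideal linear scaling from a single GPU. The throughput numbers below are hypothetical, chosen only so that the result comes out at 91%.

```python
def scaling_efficiency(throughput_1, throughput_n, n):
    """Measured throughput on n GPUs as a fraction of ideal linear scaling."""
    if n < 1 or throughput_1 <= 0:
        raise ValueError("need n >= 1 and a positive single-GPU throughput")
    return throughput_n / (n * throughput_1)


# Hypothetical measurements in images per second, not figures from the talk.
eff = scaling_efficiency(throughput_1=100.0, throughput_n=1456.0, n=16)
```

Under this definition, an efficiency of 1.0 would mean that 16 GPUs deliver exactly 16 times the single-GPU throughput.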

These techniques can now be applied in several scenarios and in our multi-GPU and CPU frameworks. We also want to offer this technology to our customers and let them know that our distributed training comes with a good technical package, one that helps with network compression and decompression and provides good technical services.

All of these frameworks can be used on our machine learning platform, SageMaker. It is a platform for multi-machine learning, and all the distributed deep learning frameworks can run on it, such as TensorFlow and MXNet. Our platform supports all frameworks, not only MXNet, because we want to give our users more flexibility.

In addition, DeepLens, which we released recently, is the first deep learning camera. It can provide many services, such as language, sentences, computer vision, and so on. Users do not need to train their own models; they can use our services.
