Open source projects for advancing in machine learning: from a StarCraft II deep learning environment to neural machine translation

Via: 博客园 (Cnblogs)     Time: 2018/1/12 17:49:44

This article is compiled mainly from the GitHub article "Getting started with machine learning".

Machine learning is the practice of studying data, generalizing from it, and making predictions. In recent years, with growing volumes of data, better algorithms, and more powerful hardware, machine learning has advanced rapidly and keeps extending into new fields. From pattern recognition to video games, developers have built a variety of fun applications by training AI algorithms:

MarI / O

Source Address:https://pastebin.com/ZZmSNaHX

A program, built with neural networks and genetic algorithms, that can play "Super Mario World."

Richard-An / Wechat_AutoJump

GitHub Address:https://github.com/Richard-An/Wechat_AutoJump

The right way to have an AI play the WeChat "Jump" mini-game.

lllyasviel / style2paints

GitHub Address:https://github.com/lllyasviel/style2paints

An AI tool that automatically colors cartoon line art.

tensorflow / magenta

GitHub Address:https://github.com/tensorflow/magenta

Music and art generation with machine intelligence.

jbhuang0604 / awesome-computer-vision

GitHub Address:https://github.com/jbhuang0604/awesome-computer-vision

A curated list of very good computer vision resources.

While researchers have achieved exciting results in machine learning, the field is still in its early stages.

For newcomers, understanding what machine learning is starts with three parts: input, algorithm, and output.

Input: The data that drives machine learning

Input refers to the datasets needed to run the algorithm and train the model. A dataset can contain anything, from source code to statistics:

GSA / data

GitHub Address:https://github.com/GSA/data

Assorted data from the U.S. General Services Administration.

GoogleTrends / data

GitHub Address:https://github.com/GoogleTrends/data

An index of open source data from Google Trends.

nationalparkservice / data

GitHub Address:https://github.com/nationalparkservice/data

An unofficial US National Park database.

fivethirtyeight / data

GitHub Address:https://github.com/fivethirtyeight/data

Code and data behind some of the articles on the news site FiveThirtyEight.

zalandoresearch / fashion-mnist

GitHub Address:https://github.com/zalandoresearch/fashion-mnist

A MNIST-like fashion product database.

beamandrew / medical-data

GitHub Address:https://github.com/beamandrew/medical-data

A list of medical data for machine learning.

src-d / awesome-machine-learning-on-source-code

GitHub Address:https://github.com/src-d/awesome-machine-learning-on-source-code

Links and papers related to machine learning on source code.

PAIR-code / facets

GitHub Address:https://github.com/PAIR-code/facets

Visualization tools for machine learning datasets.

Because we need these data to train machine learning algorithms, obtaining high-quality data sets is one of the biggest challenges in machine learning today.

Algorithm: How to process and analyze data

Machine learning algorithms use data to perform specific tasks. The most common types of machine learning algorithm are the following:

Supervised learning

Supervised learning uses annotated, structured data. Given the desired output for a set of inputs, the machine learns how to identify the target and how to map inputs to outputs on other learning tasks.

For example, in decision tree learning, a value can be estimated by applying a set of decision rules to the input data:

igrigorik / decisiontree

GitHub Address:https://github.com/igrigorik/decisiontree

An implementation of the machine learning decision tree algorithm.
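To make the idea of "applying a set of decision rules" concrete, here is a minimal Python sketch. It is not taken from the repository above: the tree is written by hand, and the feature names, thresholds, and leaf values are all invented for illustration. A learning algorithm would choose such rules from labeled data instead.

```python
# A hand-written decision tree. Each internal node tests one feature against a
# threshold; each leaf stores the estimated value. Every feature name,
# threshold, and leaf value here is made up for illustration.
TREE = {
    "feature": "temperature_c",
    "threshold": 20.0,
    "below": {"leaf": "stay in"},
    "at_or_above": {
        "feature": "humidity_pct",
        "threshold": 70.0,
        "below": {"leaf": "go out"},
        "at_or_above": {"leaf": "stay in"},
    },
}


def predict(tree, sample):
    """Walk from the root to a leaf, applying one decision rule per node."""
    node = tree
    while "leaf" not in node:
        is_below = sample[node["feature"]] < node["threshold"]
        node = node["below"] if is_below else node["at_or_above"]
    return node["leaf"]


if __name__ == "__main__":
    warm_and_dry = {"temperature_c": 25.0, "humidity_pct": 50.0}
    print(predict(TREE, warm_and_dry))
```

Each prediction follows exactly one path through the tree, which is why decision trees are easy to inspect: the rules that produced an estimate can be read off directly.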

Unsupervised learning

Unsupervised learning is the process of using unstructured data to discover patterns and structures. Supervised learning might take a spreadsheet as its input, while unsupervised learning might be used to make sense of a book or an article.

For example, unsupervised learning is a very popular approach in natural language processing:

keon / awesome-nlp

GitHub Address:https://github.com/keon/awesome-nlp

A list of resources specifically for natural language processing (NLP).
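As a small illustration of discovering structure without labels, the sketch below groups unlabeled numbers with the k-means algorithm. It is a generic example rather than an NLP pipeline; the data (imagined as word counts of six documents) and the function names are assumptions made for this sketch.

```python
def assign(points, centroids):
    """Put every point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    return clusters


def kmeans_1d(points, k=2, iterations=10):
    """Group unlabeled numbers into k clusters (Lloyd's k-means algorithm)."""
    # Initial centroids: the middle element of each of k equal slices of the
    # sorted data. This is a simple deterministic choice for the sketch.
    ordered = sorted(points)
    step = len(ordered) / k
    centroids = [ordered[int(step * i + step / 2)] for i in range(k)]
    for _ in range(iterations):
        clusters = assign(points, centroids)
        # Move each centroid to the mean of its cluster; keep it if empty.
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, assign(points, centroids)


if __name__ == "__main__":
    # Hypothetical word counts of six documents; no labels are provided.
    word_counts = [3, 4, 5, 48, 50, 52]
    centroids, clusters = kmeans_1d(word_counts, k=2)
    print(centroids, clusters)
```

Nothing tells the algorithm which documents are "short" or "long"; the two groups emerge from the data alone.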

Reinforcement learning

Reinforcement learning requires the algorithm to achieve a particular goal, using rewards and punishments to maximize the agent's performance.

For example, reinforcement learning can be used to develop autonomous vehicles or to teach a robot how to manufacture objects.
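The reward-and-punishment loop described above (usually called reinforcement learning) can be sketched with tabular Q-learning on a toy problem. Everything here is an assumption made for illustration: the five-cell corridor, the reward values, and the hyperparameters. It does not use any of the toolkits listed in this article.

```python
import random

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # action index 0 = step left, index 1 = step right


def step(state, action_index):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, 1.0, True   # reward for reaching the goal
    return next_state, -0.01, False    # small punishment for every other step


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _step in range(200):  # cap episode length so training terminates
            if rng.random() < epsilon:
                action = rng.randrange(2)                        # explore
            else:
                action = 0 if q[state][0] > q[state][1] else 1   # exploit
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action for every non-goal cell."""
    return ["left" if q[s][0] > q[s][1] else "right" for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent is never told that walking right is correct. It only receives rewards and punishments, and the learned action values end up favoring the path to the goal.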

openai / gym

GitHub Address:https://github.com/openai/gym

A toolkit for developing and comparing reinforcement learning algorithms.

aikorea / awesome-rl

GitHub Address:https://github.com/aikorea/awesome-rl

A list of resources dedicated to reinforcement learning.

Some projects you can use for practice:

umutisik / Eigentechno

GitHub Address:https://github.com/umutisik/Eigentechno

Principal component analysis of music loops.

jpmckinney / tf-idf-similarity

GitHub Address:https://github.com/jpmckinney/tf-idf-similarity

A Ruby gem that uses tf*idf to calculate the similarity between texts.

scikit-learn-contrib / lightning

GitHub Address:https://github.com/scikit-learn-contrib/lightning

Large-scale linear classification, regression, and ranking in Python.

gwding / draw_convnet

GitHub Address:https://github.com/gwding/draw_convnet

Python script to illustrate convolutional neural networks (ConvNet).
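One of the practice projects above compares texts using tf*idf. The idea can be sketched in Python as follows. This is an independent illustration, not the gem's implementation: the whitespace tokenizer, the plain `log(N / df)` weighting, and the sample sentences are assumptions, and real libraries often use smoothed variants.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document using tf * idf weighting."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "the cat lay on the rug",
        "stock prices fell sharply today",
    ]
    vecs = tfidf_vectors(docs)
    print(cosine_similarity(vecs[0], vecs[1]), cosine_similarity(vecs[0], vecs[2]))
```

Terms that appear in every document get a weight of zero under this weighting, so similarity is driven by the words that distinguish documents from one another.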

Some libraries and tools:

scikit-learn / scikit-learn

GitHub Address:https://github.com/scikit-learn/scikit-learn

Machine learning in Python.

tensorflow / tensorflow

GitHub Address:https://github.com/tensorflow/tensorflow

An open source software library for numerical computation using data flow graphs.

Theano / Theano

GitHub Address:https://github.com/Theano/Theano

A Python library that lets you efficiently define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays.

shogun-toolbox / shogun

GitHub Address:https://github.com/shogun-toolbox/shogun

Efficient open source machine learning tools.

davisking / dlib

GitHub Address:https://github.com/davisking/dlib

A toolkit written in C++ for machine learning and data analysis applications.

apache / predictionio

GitHub Address:https://github.com/apache/predictionio

Machine learning server for developers and machine learning engineers based on Apache Spark, HBase, and Spray.

For more on deep learning frameworks, see the article:

Welcoming PyTorch, bidding farewell to Theano: a 2017 review of deep learning framework development

Output: The final result

The output of machine learning can be a pattern that recognizes colors, a simple sentiment analysis of a web page, or an estimate with a confidence interval. In short, the output can be anything.

There are several ways to get the output:

  • Classification: assign an output label to each item in the dataset

  • Regression: given data, predict the most likely value of the variable under consideration

  • Clustering: group data with similar patterns together
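The first two kinds of output listed above (a label for each item, and a predicted value) can be illustrated with a short Python sketch. The nearest-centroid classifier, the least-squares line, and all sample data are assumptions chosen for brevity, not methods prescribed by the original article.

```python
def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


def nearest_centroid_classifier(samples, labels):
    """Classification: label a new value by the class with the closest mean."""
    sums = {}
    for value, label in zip(samples, labels):
        total, count = sums.get(label, (0.0, 0))
        sums[label] = (total + value, count + 1)
    centroids = {label: total / count for label, (total, count) in sums.items()}

    def classify(value):
        return min(centroids, key=lambda label: abs(value - centroids[label]))

    return classify


if __name__ == "__main__":
    # Regression output: a predicted number for an unseen input.
    slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
    print(slope * 5 + intercept)

    # Classification output: a label for an unseen input.
    classify = nearest_centroid_classifier(
        [1.0, 1.2, 0.8, 5.0, 5.5, 4.5],
        ["small", "small", "small", "large", "large", "large"],
    )
    print(classify(1.1), classify(4.0))
```

Both functions learn from labeled examples. Clustering differs in that it groups items without being given any labels at all.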

Here are a few examples of applications:

deepmind / pysc2

GitHub Address:https://github.com/deepmind/pysc2

DeepMind's environment for using reinforcement learning to play StarCraft II.

gokceneraslan / awesome-deepbio

GitHub Address:https://github.com/gokceneraslan/awesome-deepbio

A list of deep learning applications in the field of computational biology.

buriburisuri / ByteNet

GitHub Address:https://github.com/buriburisuri/ByteNet

A French-to-English translator built on TensorFlow using DeepMind's ByteNet.

OpenNMT / OpenNMT

GitHub Address:https://github.com/OpenNMT/OpenNMT

Open Source Neural Machine Translation on Torch.

Ready to get started with machine learning?

Use open source projects to master machine learning, and contribute your own resources as the developers below have:

Machine Learning:

josephmisiti / awesome-machine-learning


A list of machine learning frameworks, libraries, and software.

ujjwalkarn / Machine-Learning-Tutorials


Tutorials, articles, and other resources for machine learning and deep learning.

Deep learning



Some nice deep learning tutorials, projects and communities.

fastai / courses


fast.ai course.


jtoy / awesome-tensorflow

GitHub Address:https://github.com/jtoy/awesome-tensorflow

A list of TensorFlow resources (http://tensorflow.org).

nlintz / TensorFlow-Tutorials

GitHub Address:https://github.com/nlintz/TensorFlow-Tutorials

Simple tutorials for TensorFlow.

pkmital / tensorflow_tutorials

GitHub Address:https://github.com/pkmital/tensorflow_tutorials

Some TensorFlow basics and interesting applications.

Finally, Lei Feng Network's AI Yanxishe adds two lighthearted programmer repositories. May the Buddha bless you with bug-free code.

Guicai-Li / OneDay


YondoL / Buddha

