This article is mainly compiled from the GitHub article "Getting started with machine learning".
Machine learning is the study of how to learn from data, generalize from it, and make predictions. In recent years, the growth of available data, better algorithms, and more powerful hardware have driven rapid progress in machine learning and carried it into new fields. From pattern recognition to video games, developers have built a variety of fun applications by training AI algorithms:
MarI / O
A program built with neural networks and genetic algorithms that can play "Super Mario World."
Richard-An / Wechat_AutoJump
The right way to have an AI play the WeChat jump game.
lllyasviel / style2paints
An AI tool that automatically colors cartoon line art.
tensorflow / magenta
Music and art generation with machine intelligence.
jbhuang0604 / awesome-computer-vision
Very good computer vision resources.
Although researchers have produced exciting results, machine learning is still in its early stages.
If you are new to machine learning, you first need to understand its three parts: input, algorithm, and output.
Input: The data that drives machine learning
Input refers to the dataset needed by the algorithm and for training the model. A dataset can contain anything, from source code to statistics:
GSA / data
Assorted data from the U.S. General Services Administration.
GoogleTrends / data
An index of open source data from Google Trends.
nationalparkservice / data
An unofficial US National Park database.
fivethirtyeight / data
Some code and data on the news site FiveThirtyEight.
zalandoresearch / fashion-mnist
An MNIST-like fashion product database.
beamandrew / medical-data
A list of medical datasets for machine learning.
src-d / awesome-machine-learning-on-source-code
Links and research papers on machine learning applied to source code.
PAIR-code / facets
Machine learning data set visualization tools.
Because we need these data to train machine learning algorithms, obtaining high-quality data sets is one of the biggest challenges in machine learning today.
Algorithm: How to process and analyze data
Machine learning algorithms use data to perform specific tasks. The most common types of machine learning algorithm are the following:
1. Supervised learning
Supervised learning uses labeled, structured data. By specifying the desired output for a set of inputs, the machine learns how to recognize the target and how to map inputs to outputs.
For example, in decision tree learning, a value is predicted by applying a set of decision rules to the input data:
igrigorik / decisiontree
An implementation of the machine learning decision tree algorithm.
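To make the idea of decision rules concrete, here is a minimal sketch in Python. It is not the igrigorik/decisiontree implementation; it learns only a single rule (a one-level tree, often called a decision stump) by trying every threshold on every feature and keeping the one that misclassifies the fewest training rows. The toy data, the feature meanings, and the function names are all hypothetical.

```python
from collections import Counter


def majority(labels):
    """Return the most common label in a list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels):
    """Find the (feature, threshold) rule with the fewest training errors.

    Assumes at least one threshold splits the rows into two non-empty groups.
    """
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # this threshold does not split the data
            left_label, right_label = majority(left), majority(right)
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, feature, threshold, left_label, right_label)
    return best[1:]


def predict(stump, row):
    """Apply the learned rule to one input row."""
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


# Hypothetical labeled data: (hours of sunshine, humidity %) -> decision
rows = [(8, 30), (7, 40), (2, 85), (1, 90), (6, 45), (3, 80)]
labels = ["play", "play", "stay", "stay", "play", "stay"]

stump = fit_stump(rows, labels)
print(stump)
print(predict(stump, (5, 50)))
```

A full decision tree repeats this search recursively on each side of the split, and real implementations usually score splits with entropy or Gini impurity instead of a raw error count.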
2. Unsupervised learning
Unsupervised learning uses unstructured data to discover patterns and structure. Where supervised learning might take a spreadsheet as its input data, unsupervised learning might be used to make sense of a book or an article.
For example, unsupervised learning is a very popular approach in natural language processing:
keon / awesome-nlp
A list of resources specifically for natural language processing (NLP).
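As a small illustration of finding structure in text without any labels, the sketch below turns each document into a bag-of-words vector and uses cosine similarity to find the two documents that are most alike. This is only one simple building block, not a complete unsupervised learning method; the example sentences and function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def bag_of_words(text):
    """Count how often each lowercase word appears in the text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (non-empty texts)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


# Hypothetical unlabeled documents
docs = [
    "the cat sat on the mat",
    "a cat lay on the mat",
    "stock prices fell on monday",
]
vectors = [bag_of_words(d) for d in docs]

# Compare every pair of documents and keep the most similar pair
closest = max(
    combinations(range(len(docs)), 2),
    key=lambda pair: cosine(vectors[pair[0]], vectors[pair[1]]),
)
print(closest)
```

No one told the program which documents belong together; the grouping falls out of the word statistics alone.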
3. Reinforcement learning
Reinforcement learning asks the algorithm to achieve a particular goal, and uses reward and punishment to maximize the performance of the agent's behavior.
For example, reinforcement learning can be used to develop autonomous vehicles or to teach a robot how to manufacture objects.
openai / gym
A toolkit for developing and comparing reinforcement learning algorithms.
aikorea / awesome-rl
A list of resources dedicated to reinforcement learning.
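The reward-and-punishment loop described above can be sketched with one of the simplest possible agents: an epsilon-greedy player facing a multi-armed bandit. This does not use openai / gym; the payout probabilities, the +1/-1 reward scheme, and the function name are assumptions chosen for illustration.

```python
import random


def run_bandit(payout_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-looking arm, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(payout_probs)  # running average reward per arm
    counts = [0] * len(payout_probs)       # how often each arm was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(len(payout_probs))  # explore a random arm
        else:
            arm = max(range(len(payout_probs)), key=lambda a: estimates[a])  # exploit
        # Reward (+1) when the arm pays out, punishment (-1) when it does not
        reward = 1.0 if rng.random() < payout_probs[arm] else -1.0
        counts[arm] += 1
        # Incremental update of the average reward seen for this arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return estimates, counts


# Hypothetical slot machines with different chances of paying out
estimates, counts = run_bandit([0.2, 0.5, 0.8])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_arm, counts)
```

The agent is never told which arm is best. It learns that from the rewards alone, which is the same principle that larger reinforcement learning systems apply to states and actions.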
Some projects you can use for practice:
umutisik / Eigentechno
Principal component analysis on music loops.
jpmckinney / tf-idf-similarity
A Ruby gem that uses tf*idf to calculate the similarity between texts.
scikit-learn-contrib / lightning
Large-scale linear classification, regression and ranking in Python.
gwding / draw_convnet
Python script to illustrate convolutional neural networks (ConvNet).
Some libraries and tools:
scikit-learn / scikit-learn
Machine learning in Python.
tensorflow / tensorflow
An open source software library for numerical computation using data flow graphs.
Theano / Theano
A Python library that lets you efficiently define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays.
shogun-toolbox / shogun
Efficient open source machine learning tools.
davisking / dlib
A toolkit for building machine learning and data analysis applications in C++.
apache / predictionio
Machine learning server for developers and machine learning engineers based on Apache Spark, HBase, and Spray.
Output: The final result
The output of machine learning can be a pattern for recognizing colors, a simple sentiment analysis of a web page's tone, or an estimate with a confidence interval. In short, the output can be anything.
There are several ways to get the output:
Classification: Generate an output label for each item in the dataset
Regression: Given data, predict the most likely value of the variable under consideration
Clustering: Group data with similar patterns together
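The three kinds of output listed above can each be sketched in a few lines of plain Python. This is only a toy illustration on one-dimensional data: the classifier is a nearest-centroid rule, the regression is an ordinary least-squares line, and the clustering is a few rounds of k-means. All data values and function names are hypothetical.

```python
from statistics import mean


def classify(x, centroids):
    """Classification: return the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))


def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def kmeans_1d(points, centers, rounds=10):
    """Clustering: repeatedly assign points to the nearest center, then re-center."""
    for _ in range(rounds):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups


# Classification: one discrete label per item
label = classify(4.6, {"small": 2.0, "large": 9.0})

# Regression: a predicted numeric value
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept

# Clustering: groups discovered without labels
centers, groups = kmeans_1d([1.0, 1.5, 2.0, 8.0, 9.0, 10.0], centers=[0.0, 5.0])

print(label, predicted, centers)
```

Classification and regression both need known answers to learn from, while clustering only needs the data points themselves and a starting guess for the centers.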
Here are a few examples of applications:
deepmind / pysc2
DeepMind uses reinforcement learning to play StarCraft II.
gokceneraslan / awesome-deepbio
A list of deep learning applications in the field of computational biology.
buriburisuri / ByteNet
A French-to-English translator built on TensorFlow with DeepMind's ByteNet.
OpenNMT / OpenNMT
Open Source Neural Machine Translation on Torch.
Ready to get started with machine learning?
Use open source projects to master machine learning. You can also contribute your own resources, as the developers below have done:
josephmisiti / awesome-machine-learning
A list of machine learning frameworks, libraries and software.
ujjwalkarn / Machine-Learning-Tutorials
Tutorials, articles, and other resources for machine learning and deep learning.
Some nice deep learning tutorials, projects and communities.
fastai / courses
jtoy / awesome-tensorflow
A list of TensorFlow resources (http://tensorflow.org).
nlintz / TensorFlow-Tutorials
Simple tutorials for TensorFlow.
pkmital / tensorflow_tutorials
Some TensorFlow basics and interesting applications.
Finally, Lei Feng Network's AI study group attaches two code comments for programmers. May the Buddha bless your programs to be free of bugs.
Guicai-Li / OneDay
YondoL / Buddha