What is Machine Learning: A Tour of Authoritative Definitions and a Handy One-Liner You Can Use - Machine Learning Mastery


  1. Tom Mitchell in his book Machine Learning

    The field of machine learning is concerned with the question of how to construct computer programs that automatically improve with experience.



    • E: experience, what data to collect
    • T: tasks, what decisions the software needs to make
    • P: performance measure, how we will evaluate its results
  2. The Elements of Statistical Learning: Data Mining, Inference, and Prediction

    Vast amounts of data are being generated in many fields, and the statistician’s job is to make sense of it all: to extract important patterns and trends, and to understand “what the data says”. We call this learning from data.


  3. Bishop in the preface of his book Pattern Recognition and Machine Learning comments:

    Pattern recognition has its origins in engineering, whereas machine learning grew out of computer science. However, these activities can be viewed as two facets of the same field…


  4. Marsland adopts the Mitchell definition in Machine Learning: An Algorithmic Perspective

    One of the most interesting features of machine learning is that it lies on the boundary of several different academic disciplines, principally computer science, statistics, mathematics, and engineering. …machine learning is usually studied as part of artificial intelligence, which puts it firmly into computer science …understanding why these algorithms work requires a certain amount of statistical and mathematical sophistication that is often missing from computer science undergraduates.


  5. Drew Conway created a nice Venn Diagram in September 2010 that might help.

    In his explanation he comments: Machine Learning is Hacking + Math & Statistics.

  6. Jason Brownlee in What is Machine Learning: A Tour of Authoritative Definitions and a Handy One-Liner You Can Use - Machine Learning Mastery

    Machine Learning is the training of a model from data that generalizes a decision against a performance measure.



Practical Machine Learning Problems - Machine Learning Mastery

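  1. Spam Detection: identifying spam email

    • E: emails
    • T: a decision problem (classification), labeling each email as spam or legitimate
    • P: accuracy


    • Preparing the decision-making program -- training
    • The collected data -- training set
    • The program -- model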
  2. Credit Card Fraud Detection
  3. Digit Recognition: recognizing handwritten digits
  4. Speech Understanding
  5. Face Detection
  6. Product Recommendation
  7. Medical Diagnosis: match a patient's symptoms against the symptoms in a database to predict whether the patient is likely to have a disease
  8. Stock Trading
  9. Customer Segmentation: compare the historical behavior records of all users with the current user's behavior pattern to determine the type of user
  10. Shape Detection
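
The E/T/P breakdown of the spam detection problem above can be sketched in a few lines of Python. This is only an illustration: the four training emails, the word-count scoring rule, and the function names `train`, `classify`, and `accuracy` are all assumptions made for the example, not something taken from the source articles.

```python
from collections import Counter

# E (experience): a tiny, made-up training set of (email text, is_spam) pairs.
training_set = [
    ("win cash prize now", True),
    ("cheap pills win big", True),
    ("meeting agenda for monday", False),
    ("lunch on monday?", False),
]

def train(examples):
    """Training: count how often each word appears in spam vs. normal mail."""
    spam_words, ham_words = Counter(), Counter()
    for text, is_spam in examples:
        (spam_words if is_spam else ham_words).update(text.lower().split())
    return spam_words, ham_words  # the "model" is just these two counters

def classify(model, text):
    """T (task): label one email as spam (True) or normal (False)."""
    spam_words, ham_words = model
    words = text.lower().split()
    spam_score = sum(spam_words[w] for w in words)
    ham_score = sum(ham_words[w] for w in words)
    return spam_score > ham_score

def accuracy(model, examples):
    """P (performance): fraction of emails labeled correctly."""
    correct = sum(classify(model, text) == label for text, label in examples)
    return correct / len(examples)

model = train(training_set)
test_set = [("win a prize", True), ("monday meeting", False)]
print(accuracy(model, test_set))
```

The scoring rule is deliberately naive; the point is only to show where the training set, the model, and the performance measure each appear.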


  • Classification: e.g. spam detection
  • Regression: e.g. stock market prediction
  • Clustering: e.g. iPhoto grouping photos by person
  • Rule Extraction: e.g. data mining


Predictive Modeling

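Supervised learning

Given known outcomes and relationships, build a model that can recognize and decide. Example: the iris exercise, where we build a model that determines which species a flower belongs to from a few measurements of the flower. If the output is a category, the problem is a classification problem; if the output is a numeric value, it is a regression problem.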

The algorithm does the learning. The model contains the learned relationships.


  1. Sample Data: the data that we collect that describes our problem with known relationships between inputs and outputs.
  2. Learn a Model: the algorithm that we use on the sample data to create a model that we can later use over and over again.
  3. Making Predictions: the use of our learned model on new data for which we don’t know the output.

Source: Gentle Introduction to Predictive Modeling


model = algorithm(data)
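
One way to read `model = algorithm(data)` literally is as a function that takes sample data and returns another function. The sketch below uses ordinary least squares on a single input as the algorithm; the name `least_squares` and the sample points are assumptions chosen for illustration.

```python
def least_squares(data):
    """The 'algorithm': fit y = slope * x + intercept to (x, y) pairs."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    def model(x):
        """The 'model': holds the learned relationship and makes predictions."""
        return slope * x + intercept

    return model

# 1. Sample data: made-up points with a known input/output relationship.
sample_data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]

# 2. Learn a model.
model = least_squares(sample_data)

# 3. Make a prediction for an input whose output we have not seen.
print(model(10.0))
```

The learned `model` can be called over and over on new inputs without touching the sample data again, which is the distinction the notes draw between the algorithm and the model.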

A Tour of Machine Learning Algorithms

I. By learning style

Supervised Learning


Typical problems are classification and regression; typical algorithms include Logistic Regression and the Back Propagation Neural Network.

Unsupervised Learning


Typical problems include clustering, dimensionality reduction, and association rule learning.

Typical algorithms include the Apriori algorithm (association rules) and k-Means.

Semi-Supervised Learning

There is a desired prediction problem but the model must learn the structures to organize the data as well as make predictions.


II. By functional similarity

Regression Algorithms


  • Ordinary Least Squares Regression (OLSR)
  • Linear Regression
  • Logistic Regression
  • Stepwise Regression
  • Multivariate Adaptive Regression Splines (MARS)
  • Locally Estimated Scatterplot Smoothing (LOESS)

Instance-based Algorithms / winner-take-all methods / memory-based learning

An instance-based learning model is a decision problem with instances or examples of training data that are deemed important or required to the model. Focus is put on the representation of the stored instances and the similarity measures used between instances.

These methods build up a database of example data and compare new data against it using a similarity measure, in order to find the best match and make a prediction.

  • k-Nearest Neighbor (kNN)
  • Learning Vector Quantization (LVQ)
  • Self-Organizing Map (SOM)
  • Locally Weighted Learning (LWL)
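
As a rough illustration of the "store instances, compare by similarity, pick the best match" idea, here is a minimal k-Nearest Neighbor sketch. It assumes Euclidean distance as the similarity measure and majority voting among the neighbors; the function name `knn_predict` and the stored points are made up for the example.

```python
import math
from collections import Counter

def knn_predict(stored, query, k=3):
    """Predict the label of `query` by majority vote among the k stored
    instances closest to it (Euclidean distance, Python 3.8+ for math.dist)."""
    by_distance = sorted(stored, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# The "model" is nothing more than the stored (features, label) instances.
stored = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

print(knn_predict(stored, (1.1, 1.0)))
print(knn_predict(stored, (5.0, 4.8)))
```

Note that there is no separate training step here: all the work happens at prediction time, which is why these methods are also called memory-based learning.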

Regularization Algorithms

  • Ridge Regression
  • Least Absolute Shrinkage and Selection Operator (LASSO)
  • Elastic Net
  • Least-Angle Regression (LARS)

Decision Tree Algorithms

These methods construct a decision model based on the values of attributes in the data. Decisions fork in tree structures until a prediction decision is made for a given record.

  • Classification and Regression Tree (CART)
  • Iterative Dichotomiser 3 (ID3)
  • C4.5 and C5.0 (different versions of a powerful approach)
  • Chi-squared Automatic Interaction Detection (CHAID)
  • Decision Stump
  • M5
  • Conditional Decision Trees
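
The "decisions fork until a prediction is made" behavior can be shown with a hand-written tree. The sketch below only covers prediction, not how algorithms such as CART or ID3 learn the tree; the attribute names, thresholds, and labels are made up for illustration.

```python
# Internal nodes test one attribute against a threshold; leaves hold a label.
# This tree is written by hand, not learned from data.
tree = {
    "attribute": "petal_length", "threshold": 2.5,
    "left": {"label": "setosa"},
    "right": {
        "attribute": "petal_width", "threshold": 1.7,
        "left": {"label": "versicolor"},
        "right": {"label": "virginica"},
    },
}

def predict(node, record):
    """Follow the forks for one record until a leaf is reached."""
    while "label" not in node:
        go_left = record[node["attribute"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["label"]

print(predict(tree, {"petal_length": 1.4, "petal_width": 0.2}))
print(predict(tree, {"petal_length": 5.1, "petal_width": 2.0}))
```

A Decision Stump, listed above, is the special case where the tree has a single fork.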

Bayesian Algorithms

  • Naive Bayes
  • Gaussian Naive Bayes
  • Multinomial Naive Bayes
  • Averaged One-Dependence Estimators (AODE)
  • Bayesian Belief Network (BBN)
  • Bayesian Network (BN)

Clustering Algorithms

Clustering describes both the class of problem and the class of methods. All methods are concerned with using the inherent structures in the data to best organize the data into groups of maximum commonality.

  • k-Means
  • k-Medians
  • Expectation Maximisation (EM)
  • Hierarchical Clustering
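
To make "organize the data into groups of maximum commonality" concrete, here is a minimal k-Means sketch. It assumes Euclidean distance, a fixed number of iterations, and starting centroids supplied by the caller; the function name `k_means` and the sample points are made up for the example.

```python
import math

def k_means(points, centroids, iterations=10):
    """Alternate between assigning each point to its nearest centroid and
    moving each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its old centroid to avoid dividing by zero.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = k_means(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(clusters)
```

No labels are used anywhere: the grouping comes only from the structure of the data, which is what makes this an unsupervised method.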

Association Rule Learning Algorithms


  • Apriori algorithm
  • Eclat algorithm

Artificial Neural Network Algorithms


  • Perceptron
  • Back-Propagation
  • Hopfield Network
  • Radial Basis Function Network (RBFN)

Deep Learning Algorithms

Deep Learning methods are a modern update to Artificial Neural Networks that exploits abundant cheap computation. Many methods are concerned with semi-supervised learning problems where large datasets contain very little labeled data.

  • Deep Boltzmann Machine (DBM)
  • Deep Belief Networks (DBN)
  • Convolutional Neural Network (CNN)
  • Stacked Auto-Encoders

Dimensionality Reduction Algorithms


  • Principal Component Analysis (PCA)
  • Principal Component Regression (PCR)
  • Partial Least Squares Regression (PLSR)
  • Sammon Mapping
  • Multidimensional Scaling (MDS)
  • Projection Pursuit
  • Linear Discriminant Analysis (LDA)
  • Mixture Discriminant Analysis (MDA)
  • Quadratic Discriminant Analysis (QDA)
  • Flexible Discriminant Analysis (FDA)

Ensemble Algorithms


  • Boosting
  • Bootstrapped Aggregation (Bagging)
  • AdaBoost
  • Stacked Generalization (blending)
  • Gradient Boosting Machines (GBM)
  • Gradient Boosted Regression Trees (GBRT)
  • Random Forest


  • Feature selection algorithms
  • Algorithm accuracy evaluation
  • Performance measures
  • Computational intelligence (evolutionary algorithms, etc.)
  • Computer Vision (CV)
  • Natural Language Processing (NLP)
  • Recommender Systems
  • Reinforcement Learning
  • Graphical Models

Other lists










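The backpropagation algorithm was first discovered in the 1970s, but its importance was not truly recognized until the 1986 paper by David Rumelhart, Geoffrey Hinton, and Ronald Williams. Today, backpropagation is an essential part of how neural networks learn.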




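  • Step 1: Adjust your mindset (believe!)
  • Step 2: Pick a process (how to get results)
  • Step 3: Pick a tool (implementation)
  • Step 4: Practice on datasets (put in the work)
  • Step 5: Build a portfolio (show your skills)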



