Free Book: Lecture Notes on Machine Learning

Lecture notes for the Statistical Machine Learning course taught at the Department of Information Technology, University of Uppsala (Sweden). Updated in March 2019. Authors: Andreas Lindholm, Niklas Wahlström, Fredrik Lindsten, and Thomas B. Schön.

Source: page 61 in these lecture notes

Available as a PDF, here (original) or here (mirror).

Content

1 Introduction

1.1 What is machine learning all about?
1.2 Regression and classification 
1.3 Overview of these lecture notes 
1.4 Further reading 

2 The regression problem and linear regression

2.1 The regression problem
2.2 The linear regression model

  • Describe relationships — classical statistics
  • Predicting future outputs — machine learning

2.3 Learning the model from training data

  • Maximum likelihood
  • Least squares and the normal equations

2.4 Nonlinear transformations of the inputs – creating more features
2.5 Qualitative input variables
2.6 Regularization

  • Ridge regression
  • LASSO
  • General cost function regularization

2.7 Further reading
2.A Derivation of the normal equations

  • A calculus approach
  • A linear algebra approach

3 The classification problem and three parametric classifiers

3.1 The classification problem
3.2 Logistic regression

  • Learning the logistic regression model from training data
  • Decision boundaries for logistic regression 
  • Logistic regression for more than two classes 

3.3 Linear and quadratic discriminant analysis (LDA & QDA) 

  • Using Gaussian approximations in Bayes’ theorem 
  • Using LDA and QDA in practice 

3.4 Bayes’ classifier – a theoretical justification for turning p(y | x) into ŷ

  • Bayes’ classifier 
  • Optimality of Bayes’ classifier 
  • Bayes’ classifier in practice: useless, but a source of inspiration
  • Is it always good to predict according to Bayes’ classifier?

3.5 More on classification and classifiers 

  • Regularization
  • Evaluating binary classifiers 

4 Non-parametric methods for regression and classification: k-NN and trees

4.1 k-NN 

  • Decision boundaries for k-NN 
  • Choosing k 
  • Normalization 

4.2 Trees 

  • Basics 
  • Training a classification tree 
  • Other splitting criteria 
  • Regression trees

5 How well does a method perform?

5.1 Expected new data error Enew: performance in production 
5.2 Estimating Enew 

  • Etrain ≉ Enew: We cannot estimate Enew from training data
  • Etest ≈ Enew: We can estimate Enew from test data 
  • Cross-validation: Eval ≈ Enew without setting aside test data 

5.3 Understanding Enew 

  • Enew = Etrain + generalization error
  • Enew = bias² + variance + irreducible error

6 Ensemble methods

6.1 Bagging 

  • Variance reduction by averaging 
  • The bootstrap

6.2 Random forests 
6.3 Boosting 

  • The conceptual idea 
  • Binary classification, margins, and exponential loss 
  • AdaBoost 
  • Boosting vs. bagging: base models and ensemble size
  • Robust loss functions and gradient boosting 

6.A Classification loss functions 

7 Neural networks and deep learning

7.1 Neural networks for regression 

  • Generalized linear regression 
  • Two-layer neural network 
  • Matrix notation 
  • Deep neural network
  • Learning the network from data

7.2 Neural networks for classification 

  • Learning classification networks from data 

7.3 Convolutional neural networks 

  • Data representation of an image 
  • The convolutional layer 
  • Condensing information with strides 
  • Multiple channels 
  • Full CNN architecture 

7.4 Training a neural network 

  • Initialization 
  • Stochastic gradient descent 
  • Learning rate 
  • Dropout

7.5 Perspective and further reading 

A Probability theory

A.1 Random variables 

  • Marginalization
  • Conditioning

A.2 Approximating an integral with a sum 

B Unconstrained numerical optimization

B.1 A general iterative solution
B.2 Commonly used search directions 

  • Steepest descent direction 
  • Newton direction 
  • Quasi-Newton 

B.3 Further reading 

Bibliography
