
Tutorial – foundations of machine learning and data science for developers


Here is a tutorial I have created: foundations of machine learning and data science for developers.

It is based on my insights from the Enterprise AI course and the Data Science for IoT course which I teach at Oxford University.

The ultimate goal is to give developers a simple way to understand the maths and stats foundations needed for Data Science.

The tutorial is aimed at developers (or anyone interested in Data Science) who have some basic maths background.

While the tutorial draws on existing knowledge on the Web, it has three distinguishing features:

a) We explain concepts in context. Many tutorials explain one specific aspect but do not show how it fits into the wider picture.

b) We approach maths and stats from the perspective/lens of linear regression. This approach has an advantage because linear regression is familiar to many from their high school days.
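To make the linear regression lens concrete, here is a minimal sketch of fitting a straight line by ordinary least squares in plain Python. The function name `fit_line` and the data are made up for illustration and are not taken from the tutorial itself.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The slope is the covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative, invented data: hours studied vs. test score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 61.0, 63.0, 70.0]

slope, intercept = fit_line(hours, scores)
print(f"score = {slope:.2f} * hours + {intercept:.2f}")
```

Even this small example touches means, variance and covariance, which is why linear regression works well as a starting point for the wider maths and stats picture.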

c) Less is more – we discuss a small number of examples. You can always find more examples of an algorithm type on the Web.

I also address a broader question: Which Maths and Stats Techniques do You Need for Data Science? 

A knowledge of algorithms (maths and stats) is the main differentiator between traditional programming and analytics-based programming.

Having said that, it helps to start with programming and approach the maths (initially) through APIs and libraries. I find that this technique works better because more people are familiar with programming than with maths. 
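As a small illustration of meeting the maths through a library first, the sketch below uses Python's built-in `statistics` module to compute a few summary figures. The data is invented, and the choice of this particular module is my assumption for the example; any comparable library would serve the same purpose.

```python
import statistics

# Invented sample: daily page views for a small site over one week.
views = [120, 135, 128, 150, 135, 142, 135]

# Each call hides a formula you can study later, once the API feels familiar.
mean_views = statistics.mean(views)
median_views = statistics.median(views)
mode_views = statistics.mode(views)
variance_views = statistics.variance(views)  # sample variance (divides by n - 1)
stdev_views = statistics.stdev(views)        # square root of the sample variance

print("mean:", mean_views)
print("median:", median_views)
print("mode:", mode_views)
print("sample variance:", round(variance_views, 2))
print("sample std dev:", round(stdev_views, 2))
```

Once the calls are familiar, it is a short step to ask what each one computes, for example why the sample variance divides by n - 1 rather than n.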

Here is a list of maths and stats techniques useful for Data Scientists:

  • Linear Algebra (e.g. vector algebra, matrices, transformations, eigenvalues).
  • Probability Theory.
  • Optimization techniques (e.g. gradient descent).
  • Descriptive Statistics (e.g. means, modes, standard deviations, variances, distributions).
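To show how one item from this list looks in code, here is a minimal sketch of gradient descent used to fit a straight line by minimising mean squared error. The function name `gradient_descent`, the learning rate, the step count and the data are all assumptions chosen for illustration; real problems usually need these settings tuned.

```python
def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = slope * x + intercept by gradient descent on mean squared error."""
    slope, intercept = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error of the current line at each data point.
        errors = [slope * x + intercept - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error.
        grad_slope = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = (2.0 / n) * sum(errors)
        # Step downhill, against the gradient.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Invented data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

fitted_slope, fitted_intercept = gradient_descent(xs, ys)
print(f"slope = {fitted_slope:.4f}, intercept = {fitted_intercept:.4f}")
```

The loop combines several of the listed topics: the gradient comes from calculus and optimization, and the sums over data points are the kind of computation that linear algebra expresses as matrix operations.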

Techniques used in Data Science, such as data transformations, exploratory data analysis, feature engineering, ensemble strategies, and visualization (storytelling), all involve maths and stats.

Future versions of this tutorial will elaborate on this.

Comments welcome

Tutorial link is here (pdf download)

Kind regards