
Essential Math for Data Science

This article was written by Tirthajyoti Sarkar. Below is a summary. The full article (accessible from the link at the bottom) also lists courses you can take to learn the topics below, and includes numerous reader comments. We have also added a few topics that we think are important and were missing from the original article.



Statistics

  • Data summaries and descriptive statistics, central tendency, variance, covariance, correlation,
  • Basic probability: basic idea, expectation, probability calculus, Bayes theorem, conditional probability,
  • Probability distributions: uniform, normal, binomial, chi-square, Student’s t-distribution; central limit theorem,
  • Sampling, measurement, error, random number generation,
  • Hypothesis testing, A/B testing, confidence intervals, p-values,
  • ANOVA, t-test
  • Linear and logistic regression, regularization
  • Decision trees
  • Robust and non-parametric statistics
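
As a small illustration of the first two bullets, here is a minimal Python sketch that computes a mean, sample variance, covariance, and correlation, and applies Bayes' theorem. The data values and the function names (sample_covariance, pearson_correlation, bayes_posterior) are hypothetical, chosen only for this example; only the standard library statistics module is assumed.

```python
import statistics

# Hypothetical sample data, chosen only for illustration.
xs = [2.0, 4.0, 6.0, 8.0]
ys = [1.0, 3.0, 5.0, 7.0]

mean_x = statistics.mean(xs)      # central tendency
var_x = statistics.variance(xs)   # sample variance (n - 1 denominator)


def sample_covariance(a, b):
    """Sample covariance of two equal-length sequences (n - 1 denominator)."""
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def pearson_correlation(a, b):
    """Covariance scaled by both sample standard deviations."""
    return sample_covariance(a, b) / (statistics.stdev(a) * statistics.stdev(b))


def bayes_posterior(prior, likelihood, false_positive_rate):
    """P(H | E) computed from P(H), P(E | H) and P(E | not H)."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence
```

With these inputs, ys is just xs shifted by one, so the correlation comes out at 1 up to floating-point rounding. Calling bayes_posterior(0.01, 0.9, 0.05) gives roughly 0.154, which shows how a rare hypothesis can stay unlikely even after fairly strong evidence.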

Linear Algebra

  • Basic properties of matrices and vectors: scalar multiplication, linear transformation, transpose, conjugate, rank, determinant,
  • Inner and outer products, matrix multiplication rule and various algorithms, matrix inverse,
  • Special matrices: square, identity, and triangular matrices, sparse versus dense matrices, unit vectors, symmetric, Hermitian, skew-Hermitian, and unitary matrices,
  • Matrix factorization concept/LU decomposition, Gaussian/Gauss-Jordan elimination, solving the linear system of equations Ax = b,
  • Vector space, basis, span, orthogonality, orthonormality, linear least squares,
  • Eigenvalues, eigenvectors, and diagonalization, singular value decomposition (SVD)
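
To make the "solving Ax = b" item concrete, the following is a minimal sketch of Gaussian elimination with partial pivoting in plain Python. The function name solve_linear_system and the singularity tolerance of 1e-12 are assumptions made for this example; in practice a numerical library would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting.

    `a` is a square matrix given as a list of rows, `b` a list of
    right-hand-side values. The inputs are not modified.
    """
    n = len(a)
    # Build the augmented matrix [A | b] as floats.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:  # tolerance is an illustrative assumption
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]

        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]

    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x
```

For example, solve_linear_system([[2, 1], [1, 3]], [3, 5]) returns values close to [0.8, 1.4], which can be checked by substituting back into both equations.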


Calculus

  • Functions of a single variable, limit, continuity and differentiability,
  • Mean value theorems, indeterminate forms and L’Hôpital’s rule,
  • Maxima and minima,
  • Product and chain rule,
  • Taylor series, concepts of infinite series summation and integration,
  • Fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals,
  • Beta and Gamma functions,
  • Functions of multiple variables, limit, continuity, partial derivatives,
  • Basics of ordinary and partial differential equations (not too advanced)
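
As one way to connect the differentiation topics above to code, this short sketch compares a numerical derivative with the result of the chain rule for f(x) = sin(x²). The helper names (derivative, composite, composite_derivative) and the step size h are assumptions chosen for illustration.

```python
import math


def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x); h is an illustrative step size."""
    return (f(x + h) - f(x - h)) / (2 * h)


def composite(x):
    """f(x) = sin(x^2), a composition of sin(u) with u = x^2."""
    return math.sin(x ** 2)


def composite_derivative(x):
    """Chain rule: d/dx sin(x^2) = cos(x^2) * 2x."""
    return math.cos(x ** 2) * 2 * x


x0 = 1.0
numeric = derivative(composite, x0)
exact = composite_derivative(x0)
```

The two values should agree to several decimal places. The small remaining gap comes from the truncation error of the finite-difference formula, not from the chain rule.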

Discrete Math

  • Sets, subsets, power sets
  • Counting functions, combinatorics, countability
  • Basic proof techniques: induction, proof by contradiction
  • Basics of inductive, deductive, and propositional logic
  • Basic data structures: stacks, queues, graphs, arrays, hash tables, trees
  • Graph properties — connected components, degree, maximum flow/minimum cut concepts, graph coloring
  • Recurrence relations and equations
  • Growth of functions and big-O notation
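
The link between recurrence relations and growth of functions can be sketched with the halving recurrence T(n) = T(n // 2) + 1, T(1) = 1, which describes algorithms such as binary search. The function name halving_steps is hypothetical and used only for this example.

```python
def halving_steps(n):
    """Evaluate T(n) = T(n // 2) + 1 with T(1) = 1, for n >= 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 1
    while n > 1:
        n //= 2      # each step halves the problem size
        steps += 1
    return steps
```

For positive integers this equals floor(log2(n)) + 1, which Python exposes as n.bit_length(). So halving_steps(1024) is 11, while a linear scan over the same input could need up to 1024 steps; this is the difference between logarithmic and linear growth.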

Optimization, Operations Research

  • Basics of optimization: how to formulate the problem
  • Maxima, minima, convex function, global solution
  • Linear programming, simplex algorithm
  • Integer programming
  • Constraint programming, knapsack problem
  • Randomized optimization techniques: hill climbing, simulated annealing, genetic algorithms
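
As an illustration of the knapsack problem mentioned above, here is a minimal sketch that solves the 0/1 variant with dynamic programming. Note that this uses dynamic programming rather than constraint programming, and it assumes non-negative integer weights; the function name knapsack_max_value is hypothetical.

```python
def knapsack_max_value(values, weights, capacity):
    """Best total value for a 0/1 knapsack with non-negative integer weights."""
    # best[c] holds the best value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Walk capacities downward so each item is counted at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]
```

For items with values [60, 100, 120] and weights [10, 20, 30], a capacity of 50 gives 220, obtained by taking the second and third items. The running time grows with the number of items times the capacity, so this approach suits small integer capacities.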

To read the full article, click here.
