I notice that SYSTEMS Analytics, a new sub-discipline of Data Science, is starting to get some traction! There are a couple of graduate-level Analytics programs with *Systems* in their titles (Stevens Institute of Technology and University of North Carolina are the only ones I know of). A web search brings up NO books on *Systems* Analytics. With the publication of my book with *Systems* in the title, that gap has now been filled: “SYSTEMS Analytics: Adaptive Machine Learning workbook”.
My last Analytics startup, launched in 2013, explicitly used SYSTEMS Analytics in our Retail Recommendation and Uplift SaaS product; my initial bias for the Systems approach was confirmed by the success of our product. My book is partly an outcome of this experience and partly an outcome of my strong sense that Systems thinking, which has been lacking in Machine Learning (ML) until now, will add a valuable new extension to the theoretical underpinning and practice of ML.
So what is Systems Analytics? It is a merger of Systems Theory and Machine Learning. One way to quickly relate to this new field is to think of it as a “dynamical” extension to ML theory and practice. What exactly does “dynamical extension” mean? Let us restate what Machine Learning is in a manner suitable for our purpose here . . .
Machine Learning =
In plain English: What is the likely Class that the measured Attributes belong to?
In Probability speak: What is the Conditional Expectation of Class (y) given Attributes (x)? Or E[ y | x].
Systems Theory gives us a principled framework to answer the “probability speak” question. It starts with the data models.
We are very familiar with Multiple Linear Regression:
y = a0 + a1 x1 + a2 x2 + . . . + aM xM + w (A)
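To make equation (A) concrete, here is a minimal Python sketch of fitting its coefficients by ordinary least squares. It is not taken from the book (whose code is in MATLAB); the function names, the two-attribute data and the "true" coefficients 1, 2 and -3 are all made up for illustration, and the noise term w is set to zero so the fit can be checked by eye.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_linear_regression(xs, ys):
    """Least-squares estimate of [a0, a1, ..., aM] in equation (A)."""
    design = [[1.0] + list(x) for x in xs]  # the leading 1 carries the intercept a0
    k = len(design[0])
    gram = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    moment = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(gram, moment)  # normal equations


# Hypothetical noise-free data generated from y = 1 + 2*x1 - 3*x2 (w = 0)
xs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ys = [1.0 + 2.0 * x1 - 3.0 * x2 for x1, x2 in xs]
coeffs = fit_linear_regression(xs, ys)
print([round(c, 6) for c in coeffs])
```

The fitted coefficients should come out very close to the 1, 2 and -3 used to generate the data. Note that once they are estimated, they never change.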
Many of us are cool with the Auto Regressive Moving Average (ARMA) model:
y[n] = - a1[n] y[n-1] - . . . - aD[n] y[n-D] + b1[n] x1[n] + . . . + bM[n] xM[n-M+1] + e[n] (B)
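To see what the y[n-1] term in (B) adds, here is a small Python sketch (my own illustration, with made-up coefficients) that feeds the same step input to a one-attribute version of (A) and to a first-order version of (B) with constant coefficients and e[n] = 0. The names `static_model` and `dynamical_model` are hypothetical.

```python
def static_model(xs, a0=0.0, a1=2.0):
    """One-attribute case of (A) with w = 0: the output depends only on the current input."""
    return [a0 + a1 * x for x in xs]


def dynamical_model(xs, a1=-0.5, b1=1.0):
    """First-order case of (B): y[n] = -a1*y[n-1] + b1*x[n], with e[n] = 0."""
    ys, y_prev = [], 0.0  # assumed initial condition y[-1] = 0
    for x in xs:
        y_prev = -a1 * y_prev + b1 * x  # the past output feeds back into the present
        ys.append(y_prev)
    return ys


step = [0.0, 1.0, 1.0, 1.0, 1.0, 1.0]
print(static_model(step))     # [0.0, 2.0, 2.0, 2.0, 2.0, 2.0]
print(dynamical_model(step))  # [0.0, 1.0, 1.5, 1.75, 1.875, 1.9375]
```

Both models settle towards the same value of 2, but the second one has memory: it takes several samples to get there because each output carries part of the previous one.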
Equation (A) is an example of a “static” data model; equation (B) is an example of a “dynamical” model. A more elaborate and powerful data model is a *time-varying dynamical* one; the State-space model is one such data model:
s[n] = A s[n-1] + B x[n] + D q[n-1] (C)
y[n] = H[n] s[n] + r[n]
The first equation of (C) is called the “state” equation and the second, the “measurement” equation. The state equation adds “degrees of freedom”, allowing more “dynamics” and a more sophisticated form of ML.
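A quick way to get a feel for (C) is to simulate it. The sketch below is a scalar version under my own assumptions (one state, Gaussian q and r, zero initial state and zero initial noise sample); the function name `simulate_state_space` and all parameter values are illustrative and do not come from the book.

```python
import random


def simulate_state_space(xs, hs, A=0.9, B=1.0, D=1.0, q_std=0.1, r_std=0.2, seed=0):
    """Simulate a scalar version of the state-space model (C)."""
    rng = random.Random(seed)
    s_prev, q_prev = 0.0, 0.0  # assumed initial state and initial noise sample
    states, outputs = [], []
    for x, h in zip(xs, hs):
        s = A * s_prev + B * x + D * q_prev  # state equation
        y = h * s + rng.gauss(0.0, r_std)    # measurement equation with known H[n]
        states.append(s)
        outputs.append(y)
        s_prev, q_prev = s, rng.gauss(0.0, q_std)
    return states, outputs


xs = [1.0] * 5
hs = [1.0, 2.0, 3.0, 2.0, 1.0]  # time-varying measurement coefficients H[n]

# With both noise sources switched off, only the deterministic dynamics remain
states, outputs = simulate_state_space(xs, hs, A=0.5, q_std=0.0, r_std=0.0)
print(states)   # [1.0, 1.5, 1.75, 1.875, 1.9375]
print(outputs)  # [1.0, 3.0, 5.25, 3.75, 1.9375]
```

The state evolves smoothly on its own, while the measured output also follows the time-varying H[n]. Turning the noise back on (the default arguments) gives the kind of data a Bayes Filter has to work with.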
Above, we defined the ML problem as estimating the Conditional Expectation, E[y | x]. It turns out that for data models of the form in equation (C), much work has been done over the past 50+ years and powerful solutions are already in hand. So, we have an opportunity to bring this heavy machinery into ML without the heavy lifting of discovering it from scratch! This is exactly the subject matter of my book, “Systems Analytics”.
The Bayesian estimation approach to finding E[y | x] is hugely simplified for the state-space data model in equation (C). There exist “Bayes Filter” algorithms that estimate E[s | y, x] WITHOUT first obtaining the Conditional pdf explicitly. Then, it is a simple matter to obtain the Bayesian estimate we seek in ML: E[y | x] is H[n] times the estimated state s[n], since the H[n] are known, non-random quantities.
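For the linear Gaussian case, the Bayes Filter is the Kalman Filter. Here is a minimal scalar sketch of it for model (C), written in Python rather than the book's MATLAB and not copied from the book. The function name `kalman_filter`, the noise variances `q_var` and `r_var`, the initial values `s0` and `p0`, and the simulated data are all assumptions chosen for illustration.

```python
import random


def kalman_filter(ys, xs, hs, A, B, D, q_var, r_var, s0=0.0, p0=1.0):
    """Scalar Kalman filter for model (C); returns state and output estimates."""
    s_est, p = s0, p0  # initial state estimate and its error variance
    state_estimates, output_estimates = [], []
    for y, x, h in zip(ys, xs, hs):
        # Predict: push the previous estimate through the state equation
        s_pred = A * s_est + B * x
        p_pred = A * p * A + D * q_var * D
        # Update: correct the prediction using the new measurement y[n]
        gain = p_pred * h / (h * p_pred * h + r_var)
        s_est = s_pred + gain * (y - h * s_pred)
        p = (1.0 - gain * h) * p_pred
        state_estimates.append(s_est)
        output_estimates.append(h * s_est)  # output estimate H[n] * s[n]
    return state_estimates, output_estimates


# Hypothetical data simulated from a scalar version of (C)
rng = random.Random(0)
A, B, D, q_var, r_var = 0.9, 1.0, 1.0, 0.01, 0.25
xs = [1.0] * 50
hs = [1.0 + 0.5 * (n % 2) for n in range(50)]
true_states, ys, s_prev, q_prev = [], [], 0.0, 0.0
for x, h in zip(xs, hs):
    s_prev = A * s_prev + B * x + D * q_prev
    q_prev = rng.gauss(0.0, q_var ** 0.5)
    true_states.append(s_prev)
    ys.append(h * s_prev + rng.gauss(0.0, r_var ** 0.5))

s_hat, y_hat = kalman_filter(ys, xs, hs, A, B, D, q_var, r_var)
print(f"true state {true_states[-1]:.2f}, estimated state {s_hat[-1]:.2f}")
```

Notice that no conditional pdf is ever formed: the filter only carries a state estimate and its error variance from one sample to the next, and the output estimate is a single multiplication by the known H[n].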
For different cases, different “Bayes Filters” have to be used. Here is a list –
Bayes Filter algorithms:
- Linear Gaussian case – Kalman Filter.
- Mildly non-linear Gaussian case – Extended Kalman Filter (EKF).
- Non-linear Gaussian case – Cubature Kalman Filter (CKF), Unscented Kalman Filter (UKF).
- Non-linear distribution-free case – Particle Filter, Markov Chain Monte Carlo (MCMC) Filter.
The first two filters in the list, the Kalman Filter and the EKF, have a long history of successful applications, from the Apollo space missions to our everyday GPS gadgets! Pure non-linear cases are harder, but over the past decade much progress has been made.
NOW, for the most important question – why bother with the sophistication of State-space data models, Kalman Filters and such . . .? This is where the theme of *Dynamical* Machine Learning becomes important.
TODAY, almost all *business* Data Science applications of Machine Learning learn a static “mapping” between inputs and outputs using Training Set data; this map is then moved into “production”. The implicit assumption is that the relationship between the input & output (= the underlying real “system”) remains unchanged in the future during production usage! This assumption is patently untenable in real-life situations.
The other reason to start moving to Dynamical ML/ Systems Analytics is the realization that if learning is the process of “generalization from experience”, we can be more explicit and say that “generalization from past experience AND results of new action” is the true definition of learning! As such, a “static” solution will be inadequate if we want to incorporate the results of new action . . .
From a business perspective, I have often said that business solutions are not “one and done”! ML solutions should be administered like flu-shots; adjust the mix and apply on a regular basis . . . or *dynamically* learn and update your ML “map”. This is what SYSTEMS Analytics does . . .
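Here is a toy Python sketch of the "one and done" problem. It is my own illustration, not the book's code: the data is made up, and treating the regression coefficient as a slowly drifting state (A = 1, B = 0, H[n] = x[n] in model (C)) is one common formulation that I am assuming here. A static map is trained once on the first half of the data, the underlying "system" changes at the midpoint, and a scalar Kalman Filter keeps learning from each new outcome.

```python
import random


def make_drifting_data(n_samples=100, change_at=50, seed=1):
    """y = w * x + noise, where the true w jumps from 1.0 to 3.0 at `change_at`."""
    rng = random.Random(seed)
    xs = [1.0 + (n % 3) for n in range(n_samples)]
    ws = [1.0 if n < change_at else 3.0 for n in range(n_samples)]
    ys = [w * x + rng.gauss(0.0, 0.1) for w, x in zip(ws, xs)]
    return xs, ys


def static_predictions(xs, ys, n_train):
    """Fit a slope once on the training set, then freeze it for production."""
    train = list(zip(xs[:n_train], ys[:n_train]))
    slope = sum(x * y for x, y in train) / sum(x * x for x, _ in train)
    return [slope * x for x in xs]


def adaptive_predictions(xs, ys, q_var=0.01, r_var=0.01):
    """Scalar Kalman filter with the coefficient w as the state (random-walk model)."""
    w_est, p = 0.0, 1.0
    predictions = []
    for x, y in zip(xs, ys):
        p_pred = p + q_var             # A = 1: the coefficient may drift a little
        predictions.append(x * w_est)  # predict BEFORE seeing the outcome y
        gain = p_pred * x / (x * p_pred * x + r_var)
        w_est += gain * (y - x * w_est)  # learn from the result of the new action
        p = (1.0 - gain * x) * p_pred
    return predictions


def mean_squared_error(predictions, ys, start):
    errors = [(p - y) ** 2 for p, y in zip(predictions[start:], ys[start:])]
    return sum(errors) / len(errors)


xs, ys = make_drifting_data()
static_mse = mean_squared_error(static_predictions(xs, ys, n_train=50), ys, start=50)
adaptive_mse = mean_squared_error(adaptive_predictions(xs, ys), ys, start=50)
print(f"production MSE: static map {static_mse:.2f}, adaptive map {adaptive_mse:.2f}")
```

In the second ("production") half, the frozen map keeps predicting with the old coefficient, while the adaptive map makes one large error at the change and then recovers within a few samples.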
Clearly, we can only skim the surface in a brief blog like this. If you are sufficiently motivated, especially if you are an “engineering” Data Science practitioner (see “What exactly is Data Science?” for different “types” of Data Science), specialization in SYSTEMS Analytics may be in your future! A good starting point will be “SYSTEMS Analytics: Adaptive Machine Learning workbook”.
My book has two parts: PART I – Machine Learning from multiple perspectives & PART II – SYSTEMS Analytics. Here are a few key chapters – each chapter appendix has MATLAB code which can be downloaded from the book website.
Chapter 2: A quick romp through ML: Many key “traditional” ML methods are reviewed with worked-out examples.
Chapter 3: Systems Theory, Linear Algebra & Analytics BASICS – old wine in a different bottle: ML practitioners come from STEM as well as Social Sciences backgrounds; this chapter creates a common language for better collaborative work among Data Scientists of different stripes. Linear Algebra IS the lingua franca of ML, no doubt!
Chapter 4: “Modern” Machine Learning: This chapter brings you up to date with current Mathematical aspects of ML.
Chapter 6: State space model & Bayes Filter: Covers the theory, algorithms and use cases of SYSTEMS Analytics.
Chapter 7: Kalman Filter for ADAPTIVE Machine Learning: Kalman Filter algorithm details including a “recurrent” architecture for ML. Solutions developed in this chapter can be applied to Machine Learning use cases that require a static, dynamical or time-varying dynamical mapping, whether linear or non-linear!
I conclude my book by noting that while we have established the foundation of SYSTEMS Analytics, many more opportunities to extend the field in theory and business applications await the careful reader of “Systems Analytics: Adaptive Machine Learning workbook”!
PG Madhavan, Ph.D. – “Data Science Player+Coach with deep & balanced track record in Machine Learning algorithms, products & business”