"Machine Learning (ML)" and "Traditional Statistics (TS)" have different philosophies in their approaches. With "Data Science" at the forefront and getting lots of attention and interest, I would like to dedicate this post to discussing the differences between the two. I often see discussions and arguments between statisticians and data miners/machine learning practitioners about the definition of "data science", its coverage, and the required skill sets. All that is needed is to pay attention to the evolution of these fields.
There is no doubt that when we talk about "Analytics," both data mining/machine learning and traditional statistics have been players. However, there are significant differences in the approaches, applications, and philosophies of the two camps that are often overlooked.
Data mining and predictive analytics
Text processing & analysis
Graph mining
Other:
Historically, ML techniques and approaches have relied heavily on computing power. TS techniques, on the other hand, were mostly developed when computing power was not available. As a result, TS relies heavily on small samples and strong assumptions about the data and its distributions.
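As a rough illustration of this contrast, the sketch below computes a 95% interval for a mean in two ways: a classical t-interval, which leans on a normality assumption and a tabulated critical value, and a percentile bootstrap, which leans on repeated resampling. The sample values, function names, and resample count are made up for illustration. The bootstrap itself comes out of the statistics literature; it is used here only to show computation standing in for a distributional assumption.

```python
import random
import statistics


def t_interval_95(sample, t_crit):
    """Classical 95% interval for the mean; assumes roughly normal data."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / n ** 0.5  # standard error of the mean
    return mean - t_crit * sem, mean + t_crit * sem


def bootstrap_interval_95(sample, n_resamples=5000, seed=0):
    """Percentile bootstrap 95% interval for the mean; relies on resampling."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    n = len(sample)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_resamples)
    )
    lower = means[int(0.025 * n_resamples)]
    upper = means[int(0.975 * n_resamples) - 1]
    return lower, upper


# Hypothetical small sample of 10 measurements (illustrative values only).
sample = [4.1, 5.3, 3.8, 6.0, 4.7, 5.1, 4.4, 5.8, 4.9, 5.5]

# Two-sided 95% critical value of Student's t with 9 degrees of freedom,
# taken from a standard t table.
T_CRIT_DF9 = 2.262

t_ci = t_interval_95(sample, T_CRIT_DF9)
boot_ci = bootstrap_interval_95(sample)

print("t-interval:        ", t_ci)
print("bootstrap interval:", boot_ci)
```

The first function needs only a handful of arithmetic operations and a lookup table, which is what made it practical before computers. The second performs thousands of resamples, which is only practical with computing power.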
ML in general tends to make fewer prior assumptions about the problem and is liberal in its approaches and techniques for finding a solution, often using heuristics. The preferred learning method in machine learning and data mining is inductive learning. At its extreme, in inductive learning the data is plentiful, and often not much prior knowledge about the problem and data distributions exists or is needed for learning to succeed. The other side of the learning spectrum is called analytical learning (deductive learning), where data is often scarce or it is preferred (or customary) to work with small samples of it, and where there is good prior knowledge about the problem and data. In the real world, one often operates between these two extremes. Traditional statistics, on the other hand, is conservative in its approaches and techniques and often makes tight assumptions about the problem, especially about data distributions.
The following table shows some of the differences in approach and philosophy between the two fields:
| Machine Learning (ML) | Traditional statistics (TS) |
| --- | --- |
| Goal: “learning” from data of all sorts | Goal: analyzing and summarizing data |
| No rigid prior assumptions about the problem and data distributions in general | Tight assumptions about the problem and data distributions |
| More liberal in techniques and approaches | Conservative in techniques and approaches |
| Generalization is pursued empirically through training, validation and test datasets | Generalization is pursued using statistical tests on the training dataset |
| Not shy of using heuristics in search of a “good solution” | Uses tight initial assumptions about data and the problem, typically in search of an optimal solution under those assumptions |
| Redundancy in features (variables) is okay, and often helpful. Preferable to use algorithms designed to handle a large number of features | Often requires independent features. Preferable to use fewer input features |
| Does not promote data reduction prior to learning. Promotes a culture of abundance: “the more data, the better” | Promotes data reduction as much as possible before modeling (sampling, fewer inputs, …) |
| Has been faced with solving more complex problems in learning, reasoning, perception, knowledge representation, … | Mainly focused on traditional data analysis |
Comment
The statement 'Redundancy in features (variables) is okay, and often helpful' could be misleading. Feature (variable) selection was found to be important to improve predictive accuracy. Please see the brief review in the introduction and further discussion in the reference below for details.
Ah, statistics is very useful in linear, repeatable, scientific analysis, as science deals with the environment and very stable relationships. ML and other associative, nonlinear types of analysis and prediction are more applicable to human behaviors, which are not linear, not always repeatable, and have very fat tails of nonstandard behavior. That is where the math models of traditional economics, with their rational assumptions, and the nontraditional theories of behavioral economics/finance don't intersect very well. That is where the generalizations of ML work better with human behaviors than the traditional statistics of the environment.
Palu:
Thanks for the comments.
(1) Per what I wrote, the fact is that traditional statistical (descriptive and inferential) ideas were originally developed for dealing with small datasets in the days of mechanical adding machines. However, they have remained and will remain relevant even in the world of big data, and the techniques and theory are used in many disciplines. There are also new statisticians who have done great recent work in learning theory to take it beyond what is called traditional. You can google "traditional statistics" and see what you find. It is not my term. (See this one too: https://www.coursera.org/learn/reallifedatascience/lecture/nR3sM...).
(2) When I mentioned "Clustering", "Forecasting", and "Regression", I did not mean any particular algorithm in clustering, forecasting, or regression (like logistic or linear regression). The list I put there is of "General Functions" that were proposed by the software industry more than a decade ago to put some order on handling model interoperability between different systems as related to data mining, specifically PMML (Predictive Model Markup Language). Each function, like regression or clustering or forecasting, could be implemented by tens of different algorithms, some of which came out of statistics and many others from outside it.
Hi Palu,
How can you claim that your statement is valid grounds for rejecting the term "traditional statistics"?
"Is there something called traditional machine learning? If the answer is yes, then the title is meaningful. However if the answer is no, then it is meaningless."
The author described the whole blend of the framework (ML + statistics), and I don't think he fabricated any terminology; he was simply conveying what traditional statistics is without ML.
Is there something called traditional machine learning? If the answer is yes, then the title is meaningful. However if the answer is no, then it is meaningless.
Clustering is statistics.
Forecasting is statistics (prominent researchers here were Box & Jenkins et al.).
Regression is statistics.
There are authors here on Data Science Central who write misleading articles. IMO authors should check their facts before posting here, as this is a site that academics and expert analysts frequent to read the articles being posted. It's OK not to check facts, but this will get authors exposed as uninformed or ignorant.
© 2018 Data Science Central ® Powered by