Generally speaking, descriptive analytics is the process of making sense of, or adding structure to, a data set, which at times can be very large. Most discussions of “analytics” in business are actually about descriptive analytics (Bertolucci, 2013). The most obvious example is running descriptive statistics at the beginning of a study and looking at things like the range, mean, median, quartiles, skew, and kurtosis. We’re gaining a picture of how the data breaks down. In some cases that may be all we’re looking for, but in most cases we’ll want to drill down further in our understanding of the data. For example, a clustering algorithm such as k-means lets us organize the data into groups whose members appear to have something in common with each other. Sometimes these groups reflect divisions that are clear in real life, and sometimes they are something we impose on the data because it’s helpful for our analysis.
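As a rough illustration of that first descriptive pass, here is a minimal Python sketch using only the standard library. The `orders` data and the `describe` helper are made up for this example, and skew and kurtosis are computed as simple population moments, which is only one of several common conventions.

```python
import statistics

# Hypothetical order values in dollars; the single large value skews the data.
orders = [12, 15, 15, 18, 20, 22, 25, 30, 95]

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    # Skew and kurtosis as the standardized third and fourth central moments.
    skew = sum((x - mean) ** 3 for x in values) / (n * sd ** 3)
    kurtosis = sum((x - mean) ** 4 for x in values) / (n * sd ** 4)
    return {
        "range": max(values) - min(values),
        "mean": mean,
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
        "skew": skew,          # > 0 means a longer right tail
        "kurtosis": kurtosis,  # about 3 for a normal distribution
    }

summary = describe(orders)
for name, value in summary.items():
    print(f"{name:>8}: {value:.2f}")
```

Nothing here predicts anything. The output only summarizes how the existing values are spread out, which is the sense in which the step is descriptive.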

Predictive analytics, on the other hand, may also provide a look at the shape of the data, but it additionally allows us to identify a trend and make a mathematical prediction about a future event. In short, you’re analyzing the past (perhaps the very recent past, as in “real-time” data, but nevertheless the past) to forecast the future. Perhaps the easiest way to understand this is to think about regression techniques, where you’re identifying a trend line in the data: its underlying mathematical formula allows you to make a prediction about what will happen in the future under similar conditions. Anyone who has taken algebra can understand the principle: once you’ve determined the form of the model and estimated its coefficients, you simply plug in values for your independent variables and get a predicted value for the outcome (the dependent variable).
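The regression idea can be sketched in a few lines of Python. This is an illustrative least-squares fit with one predictor, not a full modeling workflow. The `ad_spend` and `units_sold` figures are invented, and `fit_line` and `predict` are hypothetical helper names.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: ad spend (in $1,000s) and units sold (in 1,000s).
ad_spend = [1, 2, 3, 4, 5]
units_sold = [2.1, 3.9, 6.2, 7.8, 10.0]

# "Determine the formula": estimate the coefficients from past data.
slope, intercept = fit_line(ad_spend, units_sold)

def predict(x):
    """Plug a new input value into the fitted line."""
    return intercept + slope * x

# Forecast the outcome for a spend level we have not observed yet.
forecast = predict(6)
print(f"units_sold = {intercept:.2f} + {slope:.2f} * ad_spend")
print(f"forecast at ad_spend = 6: {forecast:.2f}")
```

The forecast is only as good as the assumption that future conditions resemble the past ones the line was fitted on.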

Where it gets a little confusing is when we bring “inference” into the mix, which Merriam-Webster defines as “passing from statistical sample data to generalizations (as of the value of population parameters) usually with calculated degrees of certainty”. In other words, we’re generalizing from what we see in the data to something we haven’t observed, which often means making an educated guess about what may happen in the future. This kind of inference is applied broadly, both to analytics considered *predictive* and to those considered *descriptive*. For example, if Netflix uses some form of clustering to group users according to common taste, such as people who like foreign films, it is very likely to use that descriptive grouping to inform the movies that it “recommends” to you. It is, in fact, making a prediction based on descriptive data. The prediction may not have a mathematical equation attached to it (or maybe it does, as Netflix may have more advanced tricks up its sleeve), but it is nonetheless a prediction. On the other hand, if Netflix were using a regression model, it might find a relationship between one group of variables and another, and make a quantified prediction about something.
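To make the clustering-then-recommending idea concrete, here is a toy Python sketch. It is not a description of how Netflix actually works. The users, the genre shares, and the `kmeans` and `recommend_genre` helpers are all hypothetical, and the k-means start is fixed to the first `k` points so the result is repeatable.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means on equal-length tuples; returns (centroids, labels)."""
    # Deterministic start for this sketch: the first k points are the centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return centroids, labels

# Hypothetical share of viewing hours per genre for each user.
genres = ("foreign", "action", "documentary")
users = {
    "ana": (0.8, 0.1, 0.1),
    "raj": (0.1, 0.8, 0.1),
    "bo": (0.7, 0.2, 0.1),
    "cy": (0.2, 0.7, 0.1),
    "di": (0.9, 0.0, 0.1),
}

names = list(users)
centroids, labels = kmeans(list(users.values()), k=2)
cluster_of = dict(zip(names, labels))

def recommend_genre(name):
    """Suggest the genre that dominates the user's cluster centroid."""
    centroid = centroids[cluster_of[name]]
    return genres[centroid.index(max(centroid))]

for name in names:
    print(name, "-> cluster", cluster_of[name], "-> recommend", recommend_genre(name))
```

The grouping step is purely descriptive, yet `recommend_genre` turns it into a guess about what a user will want to watch next, with no regression equation involved.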

In summary, there is some overlap between descriptive and predictive techniques. The distinction resembles the one between supervised and unsupervised learning: predictive analytics, like supervised learning, makes predictions from past cases where the outcome is known, while descriptive analytics, like unsupervised learning, maps out the structure of what has happened in the past.

Bertolucci, J. (2013, January 12). Big Data Analytics: Descriptive Vs. Predictive Vs. Prescriptive. *InformationWeek*. Retrieved from: https://www.informationweek.com/big-data/big-data-analytics/big-data-analytics-descriptive-vs-predictive-vs-prescriptive/d/d-id/1113279

Definition of Inference. (n.d.). *Merriam-Webster*. Retrieved on Aug. 14, 2018 from the Merriam-Webster website: https://www.merriam-webster.com/dictionary/inference
