Suppose the big boss just gave you this assignment: “I want to get our analysts and data scientists organized around a standard set of software. Make a recommendation for which platform we should choose.”
Wow, what an opportunity to show your stuff. Wow, the nausea is starting to set in. How to begin? Well let’s make this example easy and assume that your existing analysts and data scientists are amenable to adopting a standard and aren’t going to rise up in armed rebellion at having to adopt a single system. And in truth, once you’ve got a single standard then recruiting more data science staff in the future can be more focused since experience with the system will become a prerequisite. But the hiring strategy is material for another time. Where to start looking for advice?
If this were a smaller project, or one with well-defined specs, and we didn’t personally understand all the ins and outs of the technology, then a logical approach would be to find an expert consultant and follow that advice. But this question is broader than that. It’s unlikely there’s a single consultant who is sufficiently expert in all the platforms to intelligently compare and contrast them. So logically we want to narrow our options first, and almost always our first stop will be one of the expert review organizations – OK, let’s just cut to the chase; we’re talking about Gartner.
Out of curiosity I recently undertook this challenge to see if there was any consistency of opinion. The result was much worse than I thought but for a reason you might not suspect.
First of all, if you look at all the Gartner reports that include, say, SAS, IBM, SAP, and Oracle, combined with the word ‘analytics’ or other key words associated with data science, there are eight, count them, eight separate reports Gartner produces to review them.
And all but one of these (which is apparently too new) have a Magic Quadrant calling different winners and losers. Here are some observations about what I found when trying to use these to reach a conclusion that answers the big boss’s orders.
Note: There are actually 9 but I’ve elected to omit the Gartner study on Data Quality Tools for simplicity. Clearly there is overlap between MDM and the data cleaning and transformation steps but most companies aren’t ready to see this yet and treat MDM as an add-on after analytics are fully underway.
Source: All the following image sources are Magic Quadrants from Gartner’s most current review in each category.
BI and Analytics Platforms
Based on the title this is probably where you’d start. I did. But this would be misleading. This category is more appropriate for companies considering a complete replacement of their BI system and chances are this doesn’t happen very often. Gartner acknowledges that BI has changed a lot over the last ten years. Yes it still plays its central role as the system of record and single source of the truth. But any BI system worthy of consideration today includes a healthy dose of predictive analytic and data viz capability available on a decentralized basis. Gartner says they use these criteria in evaluating vendors:
And while there may be a strong impetus for some folks to remain a one-vendor shop, just because you’ve got Oracle or SAP doesn’t mean that those are necessarily the right choices for your analytics platform.
Advanced Analytics Platforms
This study would probably be your next stop since it seems to address the specific issue we’re trying to resolve.
As you read through this I hope you’ll look back and forth at the various Magic Quadrants and note how different the outcomes can be. In this study Gartner says vendors were evaluated based on these 10 criteria:
This looks about right for our decision making until you realize that most of the vendors offer multiple products with differing levels of capability and at different price points. SAS for example has at least seven foundational product offerings each with different and overlapping capabilities in advanced analytics. And like many vendors they use these same products along with UI templates to customize them for specific market verticals under different names. Is a lower cost SAS package better than a higher cost SPSS package? Your research has only just begun.
Also note that R never appears as a rated package. This makes sense since there is no single corporate entity offering R and since it’s more like infrastructure than product. But be aware that many if not all of the vendors in this group allow R integration and can therefore claim all of R’s thousands of unique capabilities for themselves.
That’s not the same as ‘being in the package’. And it may not be the recipe for standardization that you were looking for. On the one hand you might as well just have an R shop with everyone doing their own thing – goodbye standardization and control. On the other hand, the ability to use R when the package doesn’t have the analytic technique you want is pretty powerful.
Data Integration Tools
Wait a minute! Access to multiple data sources and blending or integrating that data is integral to advanced analytics. Why is this a separate category?
Remember that data access, filtering and manipulation was the number one criterion in the Advanced Analytics rating. Why are these results different?
Gartner would tell us that the vendors in this group actually do data blending and integration for a broader range of applications than just predictive analytics, specifically for:
Well, these look suspiciously to me like the same things we look for in our analytics platform, except perhaps for synchronizing data among operational apps. And there are some unusual omissions. Where, for example, is Alteryx, which stakes its claim on data blending?
All the Rest
For the sake of brevity I’m going to lump the next three reports into a single discussion. Here’s my problem with these. First, you’d have to line up the definitions point by point to discern that these are actually different. OK if you’re a marketing guru then I’ve probably just stepped on your ontological toes, but for most of us, not so.
Second, these are largely if not completely the same as many vendors’ foundation analytic products with different UI templates and some techniques emphasized.
But my sense is that you could do essentially all of these things with a good foundational analytic package from the Advanced Analytics group above, and if you think you really need these features, then by all means do a cost/benefit study. What I find misleading is that the winners and losers in each category can be materially different on such narrow criteria. Also, what predictive analytic tools and capabilities are you giving up to get these simplified tools meant largely to be used by non-data scientists?
Integrated Marketing Management
Multichannel Campaign Management
The multichannel campaign management (MCCM) market comprises vendors that seek to orchestrate company communications and marketing offers to customer segments across channels, such as websites, mobile, social, direct mail, call centers and email. Capabilities include:
Marketing Resource Management
Marketing resource management (MRM) is a set of processes and capabilities designed to enhance a company's ability to orchestrate and optimize internal and external marketing resources. MRM applications enable companies to:
Operational Risk Management
This is another template product built on top of foundational analytic platforms and pitched to CFOs and Risk Managers in highly regulated or risk-averse verticals like financial services and insurance. Most of this requires robust forecasting, what-ifs, and optimization tools.
User Behavior Analytics
There is one final template set of applications that might better be labeled Detection of Fraud and Abuse. The category is so new that it doesn’t yet have its own Magic Quadrant, just this table.
In many companies this is the stuff of Internal Audit, a new and slow-to-evolve vertical for analytics. It turns out fraud and abuse are tough to spot, and Gartner lists lots of caveats for this group, basically warning about the need for expert use and the wide variations in performance among vendors. Like the marketing applications above, what this simplified and templated product group does can be accomplished with almost any of the foundational analytic products and a good data scientist.
TMI! Too Much Information! Is this report overload really helpful? There are several warnings here:
August 5, 2015
Bill Vorhies, President & Chief Data Scientist – Data-Magnum - © 2015, all rights reserved.
About the author: Bill Vorhies is President & Chief Data Scientist at Data-Magnum and has practiced as a data scientist and commercial predictive modeler since 2001. Bill is also Editorial Director for Data Science Central. He can be reached at: