From episode 10 of my Naked Analyst Channel on YouTube.
I think I do - and it is the ‘appification’ of analytics. By this I mean the reduction of a complex analytic activity, such as market segmentation, to a single button on your computer interface, very much like the apps on your smartphone, tablet or, increasingly, your desktop.
That’s what it looks like, but the impact is more profound: it makes it possible for analytics to be done successfully by people who may not understand how it works, but do understand why and when they need to do it.
For example, a marketer in a company can access more sophisticated views of their campaigns without needing a specialist analyst. Appification extends the range of analytic tasks that a non-specialist can do.
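To make the ‘single button’ idea concrete, here is a minimal sketch in Python. It is an illustration only and not taken from the video: the function name segment_customers, its parameters and the sample data are all hypothetical, and a plain k-means stands in for whatever method a real segmentation app would use.

```python
import random


def segment_customers(rows, k=3, seed=0, iterations=20):
    """Hypothetical 'one button': hides scaling and k-means behind one call.

    rows is a list of equal-length numeric tuples; returns one cluster
    label (0..k-1) per row.
    """
    # Min-max scale each column so no single feature dominates the distance.
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    scaled = [[(v - lo) / s for v, lo, s in zip(r, lows, spans)] for r in rows]

    # Start from k randomly chosen rows (seeded, so runs are repeatable).
    rng = random.Random(seed)
    centres = rng.sample(scaled, k)

    labels = [0] * len(scaled)
    for _ in range(iterations):
        # Assignment step: each row joins its nearest centre.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centres[j])))
            for p in scaled
        ]
        # Update step: each centre moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(scaled, labels) if lab == j]
            if members:
                centres[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels


# Hypothetical customers: (visits per month, average spend)
customers = [(1, 100), (2, 110), (1.5, 105), (50, 900), (52, 950), (51, 920)]
print(segment_customers(customers, k=2))
```

The marketer only ever sees the last line. Everything above it is the part the app hides, and it is also the part where choices such as the scaling method or the number of segments still have to be made by someone.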
This appification is made possible because of three things that have emerged in recent years:
The last enabler needs a little further explanation. R is a free programming language and software environment for statistical computing and graphics. It has thousands of contributed packages (around 10,000?) specializing in topics like econometrics, data mining, spatial analysis, and bioinformatics. Nobody knows how many R users there are, but a reliable estimate (see http://spatial.ly/2013/06/r_activity/) puts it in the millions. Many thousands of people have helped R develop over the years. I think that this sort of large-scale, self-organising open-source effort is beginning to teach the world how to use analytic algorithms.
The above is all supposition, but I can back this up with evidence. Here are four examples of algorithm markets - or at least examples that exhibit varying degrees of ‘algorithm marketness’.
In conclusion, there is still one issue needing resolution before algorithm markets take off: how will the world’s business people get data into and out of these algorithm apps? I’m not sure yet, but I think the answer will be more apps: apps that themselves appify the transformation and loading of data into and out of algorithm apps.
Confused? Well, don’t be. Like most new and shiny things, what we are talking about is just the next generation of ETL - something business intelligence people like myself have been building for the last 20 or more years.
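As a rough sketch of what such an ETL app boils down to, here is a small extract-transform-load pipeline in Python. It is an assumption-laden illustration rather than a description of any real product: the function names, the field names and the sample CSV are all hypothetical, and the ‘algorithm app’ is just any function passed in.

```python
import csv
import io


def extract(csv_text):
    """Extract: parse raw CSV text into a list of dict records."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(records, numeric_fields):
    """Transform: keep only the rows whose chosen fields are all numeric."""
    clean = []
    for rec in records:
        try:
            clean.append(tuple(float(rec[f]) for f in numeric_fields))
        except (KeyError, ValueError, TypeError):
            # Missing column, blank cell or non-numeric value: skip the row.
            continue
    return clean


def load(rows, algorithm):
    """Load: hand the cleaned rows to the algorithm app and return its result."""
    return algorithm(rows)


def run_pipeline(csv_text, numeric_fields, algorithm):
    """The 'one button' for data movement: extract, transform, then load."""
    return load(transform(extract(csv_text), numeric_fields), algorithm)


# Hypothetical export from a campaign system; customer b has a blank cell.
sample_csv = "customer,visits,spend\na,1,100\nb,,200\nc,3,300\n"
print(run_pipeline(sample_csv, ["visits", "spend"], algorithm=len))
```

Here len stands in for the algorithm app. In practice it would be something like a segmentation or forecasting function, and the transform step is where most of the real effort would go.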
What’s old is new again? Maybe, but from my perspective the future looks exciting for analytics.
My YouTube channel contains tips from a 25-year veteran of the analytic profession.
Tags: Analysis, Analytics, Applications, Big, Cloud, Computing, Data, Integration, Learning, Machine, Mining, Open, R, Self-service, Source, Statistical
Comment
Here's a free Matlab Dimensional-Reduction API that may be useful to users:
http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_...
The author's paper on the review of the algorithms is here:
"Dimensionality Reduction: A Comparative Review"
http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_...
I think that Matlab covers more ground than any other language in numerical computing. Not only does the product have extensive libraries and toolboxes of its own, but free Matlab APIs made available by academic researchers are also abundant. For example, one can find free state-of-the-art dimensionality-reduction algorithms (recently published in the literature) for Matlab on the net, not just one algorithm but many of them. One can find such algorithms in R or Python packages, but not as many variants (perhaps four at most, the standard ones like PCA, SVD, etc.). The reason is that Matlab was built from day one to be as broad as possible: it is the dominant tool in most or all engineering schools at universities around the world (with APIs in signal processing, statistics, image processing, control systems, econometrics, system identification and many more). R, by contrast, was originally developed only for statistical analysis, although it has since widened its range of applications.
A good blog, but I have to respectfully disagree that novel enabling data analytics products, platforms, and SaaS will replace data scientists through full automation of data science. Arguably, these emerging products and solutions will empower data scientists to unlock greater value from the continuously growing volume, variety, and velocity of big data, rather than replace them.
R is unlikely to make much headway in deploying real-life solutions, as it is more of a "lab language" than something that is used in productization. This is why Python is rapidly becoming the data science language of choice. Black-boxing data science with the expectation that someone from the business can simply push a button is a dangerous proposition and not something that is going to help data science ROI. There will be some tasks that users in, say, BI could handle, but unless the product is fully automated, as in HFT, business users should not be managing algorithms or their use on their business models. It is more about sitting down with domain experts and ensuring they understand what assumptions the models are making and where naive aspects of a model could fail. Business data is in constant flux, and the idea that a button can simply be pressed to act on that data requires a regular and strong effort by many data science professionals actively working to make sure that button does something valid.
© 2020 Data Science Central ®