Summary: Simpson’s Paradox is a source of risk for real-time analytics and for the citizen data scientist.
Most of us practicing the predictive arts know to look for sources of bias in our data. There are seven that are common, the first six of which are:
Guest Blog by Jose Dianes at R-Data Science
The purpose of many data science projects is to end up with a model that can be used within an organisation to solve a particular problem. If this is our case, we need to determine the right representation of that model so it can be shared in the easiest, cheapest, and most effective way. Web data products are an ideal…
A lot has been said about the value of data viz, but the folks at R2D3 have taken it to a new level, using sophisticated yet intuitive visualization techniques to teach the basics of machine learning. I was blown away by how the step-by-step visualizations on this page lead the reader to a clear understanding of machine learning, in this case focusing on decision trees.
If you are an experienced…
Guest blog by R. Bhargav
What does “Big Data” mean?
The term “big data” is self-explanatory: a collection of extremely large data sets that conventional computing techniques cannot process. The term not only refers to the data, but also to…
We have a lot to learn from our colleagues in medical research, who use the same statistical tests as we do but on much smaller samples. Hilda Bastian gives us a good review of the limits of statistical accuracy, and always accompanies it with an original…
Added by William Vorhies on August 24, 2015 at 3:24pm — No Comments
Guest blog by Fabio Souto
A curated list of awesome data visualization frameworks, libraries and software. Inspired by awesome-python.
Analytics are like a mosquito in a nudist colony. Why? Because there are so many opportunities!…
Added by William Vorhies on August 18, 2015 at 10:00am — No Comments
It is with a heavy heart that I must relay to you that Gartner has dropped “Big Data” from its 2015 Hype Cycle for Advanced Analytics and Data…
When I think of Big Data and NoSQL, I think of Big-Web-User companies like Amazon, Google, Twitter, and Netflix that amaze and entertain us by using the latest in NoSQL-based data science to bring us features that are useful and novel. Mostly that means using recommenders, NLP, IoT, and advanced search algorithms to present just the right part of their Big Data databases to us users.
There aren't many examples of more traditional companies, even…
Now that everyone is thinking about IoT and the phenomenal amount of data that will stream past us and presumably need to be stored, we need to break out a vocabulary well beyond our comfort zone of mere terabytes (about the size of a good hard drive on your desk).
In this article, Beyond Just “Big” Data, author Paul McFedries argues…
This is a guest blog from David Lefkowich, VP Sales and Marketing for FreeSight Software
Added by William Vorhies on August 12, 2015 at 6:30am — No Comments
Summary: Data Scientist may be a prestigious title, but it doesn’t reflect our area of specialization or the depth of our experience. As legions of newly minted Data Scientists are granted degrees over the next few years, the problem for both employee and employer will only grow worse.
With the explosion in undergraduate and graduate level offerings in data…
Zoher Karu is Vice President of Global Customer Optimization and Data at eBay, where he works to use data, analytics, and insights to drive growth across all customer interactions, with…
Added by William Vorhies on August 10, 2015 at 2:43pm — No Comments
Here's a different angle on a much-analyzed question at the heart of our professional activities. In this article, Steve Miller of Inquidia tackles how NoSQL has changed our traditional understanding of Predictive Analytics and Data Science. You might also look back at our previous post, How NoSQL Fundamentally Changed Machine Learning.
Here's the beginning…
Added by William Vorhies on August 7, 2015 at 7:11am — No Comments
Guest blog sent by Veronica Johnson at Investintech.com
see the original here
In this day and age of big data and information overload, data visualizations are, hands down, the most effective way of filtering out and presenting complex data.
Added by William Vorhies on August 5, 2015 at 11:00am — No Comments
See the full blog here
R allows you to create different plot types, ranging from the basic graph types like density plots, dot…
Added by William Vorhies on August 4, 2015 at 9:39am — No Comments
Guest blog by Jin Kim, VP Product Development for…
Added by William Vorhies on August 3, 2015 at 8:30am — No Comments