I've been writing a Tableau- and Alteryx-focused blog on WordPress for 1.5 years and hadn't thought of writing anything here on DSC. I just completed a two-part series that discusses solving problems using innovative approaches with Alteryx and Tableau; the two parts were my 99th and 100th blog posts. They are longer than usual but offer good insight into my background and why I write a technical blog.
My blog is focused…Continue
From episode 10 of my Naked Analyst Channel on YouTube.
I think I do - and it is the ‘appification’ of analytics. What I mean by this is the reduction of a complex analytic activity such as market segmentation, down to a single button on your computer interface. Very much like the…Continue
Graphs are everywhere, used by everyone, for everything. Neo4j is one of the most popular graph databases and can be used to make recommendations, get social, find paths, uncover fraud, manage networks, and so on. A graph database can store any kind of data using Nodes (graph data records), Relationships (connections between nodes), and Properties (named data values).
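To make the Nodes / Relationships / Properties vocabulary concrete, here is a minimal in-memory sketch in Python. It is not Neo4j's API or its storage format; the class names (`Node`, `Relationship`, `PropertyGraph`) and the methods `add_node`, `relate`, and `neighbors` are hypothetical, chosen only to illustrate the model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set


@dataclass
class Node:
    """A graph data record: an id, optional labels, and named properties."""
    node_id: int
    labels: Set[str] = field(default_factory=set)
    properties: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Relationship:
    """A directed, typed connection between two nodes, with its own properties."""
    start: int
    end: int
    rel_type: str
    properties: Dict[str, Any] = field(default_factory=dict)


class PropertyGraph:
    """Hypothetical toy container; a real graph database adds indexes and a query language."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.relationships: List[Relationship] = []

    def add_node(self, node_id: int, labels=(), **properties: Any) -> Node:
        node = Node(node_id, set(labels), dict(properties))
        self.nodes[node_id] = node
        return node

    def relate(self, start: int, end: int, rel_type: str, **properties: Any) -> Relationship:
        # Both endpoints must already exist in the graph.
        if start not in self.nodes or end not in self.nodes:
            raise KeyError("both endpoints must be added before relating them")
        rel = Relationship(start, end, rel_type, dict(properties))
        self.relationships.append(rel)
        return rel

    def neighbors(self, node_id: int, rel_type: Optional[str] = None) -> List[Node]:
        """Follow outgoing relationships from node_id, optionally filtered by type."""
        return [
            self.nodes[r.end]
            for r in self.relationships
            if r.start == node_id and (rel_type is None or r.rel_type == rel_type)
        ]
```

For example, after adding two `Person` nodes and calling `relate(1, 2, "KNOWS", since=2012)`, `neighbors(1, "KNOWS")` returns the second node. Chaining `neighbors` calls gives a crude friends-of-friends lookup, which is the kind of traversal behind the recommendation and path-finding uses mentioned above.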
A graph database can be used for connected data which is otherwise not possible with either relational or other NoSQL databases…Continue
Added by Raghavan Madabusi on September 19, 2014 at 6:31pm — No Comments
This article provides a full demo application using both the C# and R programming languages interchangeably to rapidly identify and cluster similar images. The demo application includes a directory with 687 screenshots of webpages. Many of these images are very similar, with different domain names but nearly identical content. Some images are only slightly similar, with the sites using the same general layouts but different colors and different images on certain…
Added by Jake Drew Ph.D. on June 25, 2014 at 4:00pm — No Comments
I was reading through my Twitter feed the other day and saw a comment about the R language being too ad hoc for users. It got me thinking, "Is that bad? Aren't most languages initially seen as ad hoc?".
The beauty of R as a data science tool is its "ad hocedness," in that its use can satisfy multiple interests. Initially I can see this as troublesome, in that learning the specificity of a tool's use can be daunting. But in the long run I think this benefits a…Continue
Added by Justin on May 15, 2014 at 5:04pm — No Comments
I recently added two new data analytics books from Pearson to my growing Data Science and Big Data stack:…Continue
Added by Kirk Borne on March 29, 2014 at 11:15am — No Comments
Hey Data Scientists,
I wanted to reach out about Plot.ly, a new startup for analyzing and beautifully visualizing data. We just launched a beta.
It is built for math, science, and data applications. We'd love your thoughts.
Added by Matthew Sundquist on November 9, 2013 at 10:40pm — No Comments
One of the most popular methods or frameworks used by data scientists at the Rose Data Science Professional Practice Group is Random Forests. The…Continue
Statistics.com, a provider of online education in statistics and analytics, announces a partnership with CrowdANALYTIX, a predictive modeling “managed crowdsourcing” company, offering a new online course, “Applied Predictive Analytics in partnership with CrowdANALYTIX”, which will run from Oct. 11 to Nov. 8, 2013.
The goal of this course is to teach users (who have basic knowledge of R programming, predictive analytics and statistics)…Continue
Added by Janet Dobbins on September 11, 2013 at 6:58am — No Comments
Bob Muenchen's very useful work on this topic, "SAS Dominates Analytics Job Market; R up 42%," sent me back to some 2012 work we did at Statistics.com on the subject of what employers are looking for in the way of analytics skills. First, our main results:
1. Our numbers showed a much less SAS-dominant world: 1.92 SAS jobs for every R job. Bob had found the ratio to…Continue
I found it odd there was no way to automatically deskew data in R, so I wrote a short little function to do it. It noticeably improves the performance of linear models and linear support vector machines.
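The excerpt does not show the function itself, so the following is only a sketch of one common way to deskew a variable, written in Python rather than R. It assumes "deskew" means trying a few monotone transforms and keeping the one with the smallest absolute skewness; the names `skewness` and `deskew` and the candidate set (identity, square root, log1p) are assumptions, not the author's method.

```python
import math


def skewness(values):
    """Moment-based sample skewness m3 / m2**1.5; returns 0.0 for constant data."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0
    return m3 / m2 ** 1.5


def deskew(values):
    """Return (name, transformed) for the candidate transform with the least |skewness|."""
    if not values:
        raise ValueError("values must be non-empty")
    # Shift so the minimum is zero, which keeps sqrt and log1p defined.
    low = min(values)
    shifted = [v - low for v in values]
    candidates = {
        "identity": list(values),
        "sqrt": [math.sqrt(v) for v in shifted],
        "log1p": [math.log1p(v) for v in shifted],
    }
    best = min(candidates, key=lambda name: abs(skewness(candidates[name])))
    return best, candidates[best]
```

Calling `deskew([1, 2, 3, 4, 100])` picks a compressing transform because of the long right tail, while symmetric data such as `[1, 2, 3, 4, 5]` is returned unchanged. These candidates only help with right-skewed data; a fuller version might add a Box-Cox or Yeo-Johnson search.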
Hadoop (MapReduce, in which code is turned into map and reduce jobs that Hadoop then runs) is the best-known technology used for "Big Data" because it allows an organization to store huge quantities of data at very low…Continue
Added by Michael Walker on November 7, 2012 at 3:57pm — No Comments
There is no question that the USA (in fact, most of the world) would be well served by having more quantitatively capable people working in business and government. However, the current hysteria over the shortage of data scientists is overblown. To illustrate why, I am going to use an example from air travel.
On a recent trip from Santa Fe, NM to Phoenix, AZ, I tracked the various times:
Added by Neil Raden on June 27, 2012 at 10:00am — No Comments