I created this blog to further discuss the issue of mass data assignments, a methodology that allows qualitative data events to be incorporated into metrics such as performance indicators. These assignments are routine for me now after having developed a prototype. However, I am unaware of the prevalence of this or similar techniques in the broader community. So I periodically work the topic into my blogs to help stimulate discussion. When quantitative data exists, it means that we had…
Added by Don Philip Faithful on November 22, 2014 at 8:18am —
I attended the Kellogg Alumni event last week at the Facebook campus here in the Bay Area. While I got the opportunity to meet some old friends and make some new acquaintances, I was thrilled to watch an expert panel discuss an exciting subject entitled, "Big Data does not Make Decisions - Leaders Do"; thrilled simply because the entire discussion was about what I live every day in my work, and what I strive to achieve with my team and my organization.
Added by Ram Sangireddy on November 21, 2014 at 2:26pm —
Watching Super Bowl commercials is a celebrated American tradition. With more than a hundred million viewers tuned in, the pressure to stand out on Super Bowl Sunday is sky high. That's why many companies try to make their spots just-controversial-enough. But not going overboard is easier said than done. This tone deafness was on full display in 2011 when … Continue
Added by Renette Youssef on November 21, 2014 at 1:33pm —
This exercise was done to understand the software skills that are in high demand for Data Science. Analysis was done by extracting the job postings from popular online websites. The findings are interesting. R continues to be the most popular skill, found in 70% of the postings. Python follows as a close second. Surprisingly, in spite of all the talk about "Big Data Science", SQL comes up third. This shows that traditional RDBMSs continue to be the base for machine learning work… Continue
Added by Kumaran Ponnambalam on November 21, 2014 at 1:30pm —
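The kind of tally described in the job-postings entry above (the share of postings that mention each skill) can be sketched roughly as follows. This is only an illustration: the skill list, the matching rule, and the sample postings are assumptions made up for the example, not the author's method or data.

```python
import re
from collections import Counter

# Hypothetical skill list; the original analysis may have tracked many more.
SKILLS = ["R", "Python", "SQL"]


def skill_share(postings, skills=SKILLS):
    """Return the fraction of postings that mention each skill at least once."""
    counts = Counter()
    for text in postings:
        for skill in skills:
            # Match the skill only as a standalone token, so the "R" language
            # is not confused with the letter inside ordinary words.
            pattern = r"(?<![A-Za-z0-9])" + re.escape(skill) + r"(?![A-Za-z0-9])"
            # Single-letter skills are matched case-sensitively; others are not.
            flags = 0 if len(skill) == 1 else re.IGNORECASE
            if re.search(pattern, text, flags):
                counts[skill] += 1
    total = len(postings)
    return {s: (counts[s] / total if total else 0.0) for s in skills}


if __name__ == "__main__":
    # Made-up postings, used only to exercise the function.
    sample = [
        "Experience with R and SQL required",
        "Python or R, strong sql skills",
        "Hadoop and Java developer",
        "R programmer wanted",
    ]
    for skill, share in skill_share(sample).items():
        print(f"{skill}: {share:.0%}")
```

Counting each posting at most once per skill gives a "found in N% of the postings" figure rather than a raw mention count. The token match is crude (it would also count "R" in "R&D", for instance), so a real analysis would need more careful text cleaning.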
After getting oriented to the research problems of phenology, understanding data collection and storage, and discussing the statistical methods and approaches during the past few days of our expedition to Acadia National Park, we dug into solutions and designs on day four.
Fundamentally, more complete and accurate data sets around bird migration, barnacle abundance, weather, duck population, and water resource data all help us… Continue
Added by Srivatsan Ramanujam on November 21, 2014 at 10:22am —
Logi Analytics recently published its second annual executive review of embedded analytics trends and tactics. It's called "2014 State of Embedded Analytics". In this report, they make an interesting claim:
"What's exciting to all of us at Logi Analytics is that ALL software applications are becoming analytic…
Added by Susan Bennett on November 21, 2014 at 7:32am —
SAS UK & Ireland recently ran a competition to find the region's 'top data scientist'; the competition challenge was to produce a forecast of energy demand for the UK in the year 2020 based on the data provided. Competition for this coveted award was fierce, with the winner claiming a trip to SAS Global Forum in the USA and the chance to feature their submission on the… Continue
Added by Philip Male on November 21, 2014 at 2:30am —
In this series, we provided an introduction to the project and cited specific technology improvements that could transform the way phenology is studied by using stationary camera networks and machine-based image… Continue
Added by Srivatsan Ramanujam on November 20, 2014 at 12:30pm —
According to US Census data, the following chart illustrates that if you have less than a college degree, your salary rises very little as you age.
Notice the big jump in median salary with better than college degrees.
This is a perfect example of how a single visualization can tell so many… Continue
Added by Nilesh Jethwa on November 20, 2014 at 8:45am —
In the first post of this series, we gave the background on our data science expedition to Acadia National Park, and now we are seeing its transformative potential.
As representatives from Pivotal and EMC, our goal is to help a team of phenology… Continue
Added by Srivatsan Ramanujam on November 19, 2014 at 11:42am —
The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday.
Added by Vincent Granville on November 19, 2014 at 11:30am —
While large data sets may provide significant value in certain cases, data diversity and integrating smart data points will provide more consistent actionable insights and high value intelligence leading to better decision-making.
For example, consider NFL football data. Focusing on large football game data sets is usually not helpful and often misleading, creating little,… Continue
Added by Michael Walker on November 19, 2014 at 11:13am —
Obviously, there are few topics that are as morbid as our own morbidity, and it is for this reason that we often avoid thinking about it. However, as it turns out, how we view the probability of the causes and the timing of our own demise is very important. Why? It is because these views are subconsciously affecting how we approach the health-related decisions that we are making every day.
Think of it this way. If you believed that you would likely die from an… Continue
Added by Renette Youssef on November 19, 2014 at 8:13am —
A couple of weeks ago, my team was asked to come up with a solution for an Enterprise Complaints Platform with Advanced Analytics capability for a Fortune 50 Bank. The initial scope statements were high-level requirements such as identification of high-risk complaints that were likely to be escalated to regulatory agencies, complaint root cause analysis, etc.
It quickly became apparent that while the solution did include Advanced Analytics components, what was really needed… Continue
Added by Manoj Sharma on November 19, 2014 at 5:23am —
As data scientists, we get excited about using our talents to solve problems like global climate change and worldwide environmental policy.
This week, I have the opportunity to represent Pivotal and team with other experts from EMC, Earthwatch, and Schoodic Institute to spend a week at… Continue
Added by Srivatsan Ramanujam on November 18, 2014 at 11:30am —
Three thoughts this time, for our first edition of Thoughts of the Week.
An estimate that is slightly biased but robust, model-independent, easy to compute, and easy to interpret is better than one that is unbiased but difficult to compute, mysterious, or not robust. That's one of the differences between data science and statistics.
Learning how to code, especially SQL, should be the last step in… Continue
Added by Vincent Granville on November 18, 2014 at 10:30am —
I've been writing a Tableau and Alteryx-focused blog for 1.5 years on WordPress and haven't thought of writing anything here on DSC. I just completed a two-part series that discusses solving problems using innovative approaches with Alteryx and Tableau, which were my 99th and 100th blog posts. They are longer than usual but offer a good insight into my background and why I write a technical blog.
My blog is focused… Continue
Added by Kenneth C Black on November 17, 2014 at 8:30pm —
Starred articles were potential candidates for our picture of the week published in our weekly digest. Enjoy our new selection of articles and resources (R, data science, Python, machine learning, etc.). Comments are from Vincent… Continue
Added by Amy on November 17, 2014 at 7:00pm —
They say that breaking up is hard to do. Now, data scientists know that it’s true. Neil Sedaka songs aside, they know it’s true because, in 2013, with the help of public data sourced from Twitter, they were able to track and listen in on conversations between 661 couples who were in the process of ending their relationships.
Researchers Venkata… Continue
Added by Renette Youssef on November 17, 2014 at 11:30am —
The case study presented here - including root cause analysis and solution - was performed for a digital publisher. It offers a different perspective on what data scientists are capable of. The expert involved here is not a coder, certainly not a production guy, yet is able to leverage his business acumen and domain expertise to
- Imagine dozens of scenarios and rank them by chance of occurring
- Get siloed data from various departments (finance, sales, marketing, product,…
Added by Vincent Granville on November 15, 2014 at 10:00am —