Here is how to begin your data science journey:
- Buy a book on modern data science, and avoid statistics textbooks relabeled as data science like the plague: they will lead you nowhere. Any public-domain material invented 50 years ago will lead to a job that will eventually be replaced by a robot; we are working to make this happen. If you have an analytic background, my book…
Added by Vincent Granville on November 27, 2014 at 4:30pm —
6 Comments
The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday.
Featured
Added by Vincent Granville on November 27, 2014 at 11:00am —
No Comments
Here I describe a case study: a solution based on high-level data science. By high level, I mean data science not done by statisticians, but by decision makers accessing, digging into, and understanding summary (dashboard) data to quickly make a business decision with immediate financial impact. There is also a section on smart imputation techniques, with patentable, open intellectual property that we created after investigating this problem.
This article is articulated in three…
Added by Vincent Granville on November 26, 2014 at 10:30am —
No Comments
The term “force multiplier,” in military usage, refers to an enabler, or a combination of enablers, that makes a given force more effective than that same force would be without it.
So what force multipliers can be used in the world of Data Products?
Based on our experience in curating intelligent data products for the industrial world, we feel a Data Product needs a healthy dose of two things:
1. Data Science = Seeing the unseen (…
Added by derick.jose on November 25, 2014 at 10:12pm —
1 Comment
Trying to deploy a pervasive analytics strategy for your entire organization, or even a department, is not an easy task. That said, changing the way people work has never been easy: imagine cloud computing 15 years ago. Until the average employee can physically see how this new technology can change their lives for the better, there is always friction in adoption because of traditional habits. This friction is why organizations need to make sure they have a well…
Added by TJ Laher on November 25, 2014 at 8:00am —
No Comments
Implementing a Distributed Deep Learning Network over Spark
Authors: Dr. Vijay Srinivas Agneeswaran, Director and Head, Big Data Labs, Impetus
Ghousia Parveen Taj, Lead Software Engineer, Impetus
Sai Sagar, Software Engineer, Impetus
Padma Chitturi, Software Engineer,…
Added by Dr. Vijay Srinivas Agneeswaran on November 25, 2014 at 1:30am —
6 Comments
Companies build or rent grid machines when data length doesn't fit into HDFS, or when the latency of parallel interconnects is too high in the cloud. This review explores the overlap of the two paradigms at the ends of the parallel processing latency spectrum. The comparison is almost poetic and leads to many other comparisons in languages, interfaces, formats, and hardware, but there is amazingly little overlap.
Your Laptop Is A Supercomputer
To put things in perspective,…
Added by Peter Higdon on November 24, 2014 at 4:35am —
No Comments

Introduction
Trendwatching.com is a global, independent organization that helps forward-thinking business professionals understand “the new consumer” and subsequently uncover…
Added by Susan Bennett on November 23, 2014 at 5:42pm —
No Comments
I created this blog to further discuss the issue of mass data assignments, a methodology that allows qualitative data events to be incorporated into metrics such as performance indicators. These assignments are routine for me now after having developed a prototype. However, I am unaware of the prevalence of this or similar techniques in the broader community. So I periodically work the topic into my blogs to help stimulate discussion. When quantitative data exists, it means that we had…
Added by Don Philip Faithful on November 22, 2014 at 8:18am —
No Comments
I attended the Kellogg Alumni event last week at the Facebook campus here in the Bay Area. While I got the opportunity to meet some old friends and make some new acquaintances, I was thrilled to watch an expert panel discuss an exciting subject entitled "Big Data does not Make Decisions - Leaders Do"; thrilled simply because the entire discussion was about what I live every day in my work, and what I strive to achieve with my team and my organization.
The panel…
Added by Ram Sangireddy on November 21, 2014 at 2:26pm —
No Comments
Watching Super Bowl commercials is a celebrated American tradition. With more than a hundred million viewers tuned in, the pressure to stand out on Super Bowl Sunday is sky high. That's why many companies try to make their spots just-controversial-enough. But not going overboard is easier said than done. This tone deafness was on full display in 2011 when …
Added by Renette Youssef on November 21, 2014 at 1:33pm —
1 Comment
This exercise was done to understand which software skills are in high demand for Data Science. The analysis was done by extracting job postings from popular online websites. The findings are interesting. R continues to be the most popular skill, found in 70% of the postings. Python follows as a close second. Surprisingly, in spite of all the talk about "Big Data Science", SQL comes up third. This shows that traditional RDBMSs continue to be the base for machine learning work…
Added by Kumaran Ponnambalam on November 21, 2014 at 1:30pm —
3 Comments
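The tally described in the job-postings excerpt above (the share of postings that mention each skill) can be sketched roughly as follows. This is a minimal illustration, not the author's actual method: the `SKILL_PATTERNS` table, the `skill_shares` helper, and the sample postings are all hypothetical, and a real analysis would need more careful matching (for example, the bare pattern for "R" also matches "R&D").

```python
import re
from collections import Counter

# Hypothetical skill patterns (assumption, not from the original post).
# Word boundaries keep "R" from matching inside longer words.
SKILL_PATTERNS = {
    "R": re.compile(r"\bR\b"),
    "Python": re.compile(r"\bpython\b", re.IGNORECASE),
    "SQL": re.compile(r"\bsql\b", re.IGNORECASE),
}


def skill_shares(postings):
    """Return the fraction of postings that mention each skill at least once."""
    counts = Counter()
    for text in postings:
        for skill, pattern in SKILL_PATTERNS.items():
            if pattern.search(text):
                counts[skill] += 1  # count each posting once per skill
    total = len(postings)
    return {
        skill: (counts[skill] / total if total else 0.0)
        for skill in SKILL_PATTERNS
    }


# Made-up sample postings, for illustration only.
sample_postings = [
    "Data scientist with R and SQL experience",
    "Python developer for machine learning; R is a plus",
    "Analyst: SQL, Excel",
    "Research engineer, Python and Spark",
]
shares = skill_shares(sample_postings)
```

Counting each posting at most once per skill matches the "found in 70% of the postings" style of figure quoted in the excerpt, as opposed to counting raw keyword occurrences.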
After getting oriented to the research problems of phenology, understanding data collection and storage, and discussing the statistical methods and approaches during the past few days of our expedition to Acadia National Park, we dug into solutions and designs on day four.
Fundamentally, more complete and accurate data sets around bird migration, barnacle abundance, weather, duck population, and water resource data all help us…
Added by Srivatsan Ramanujam on November 21, 2014 at 10:22am —
No Comments
Logi Analytics recently published its second annual executive review of embedded analytics trends and tactics, called "2014 State of Embedded Analytics". In this report, they make an interesting claim:
"What's exciting to all of us at Logi Analytics is that ALL software applications are becoming analytic…
Added by Susan Bennett on November 21, 2014 at 7:32am —
1 Comment
SAS UK & Ireland recently ran a competition to find the region's 'top data scientist'; the challenge was to produce a forecast of energy demand for the UK in the year 2020 based on the data provided. Competition for this coveted award was fierce, with the winner claiming a trip to SAS Global Forum in the USA and the chance to feature their submission on the…
Added by Philip Male on November 21, 2014 at 2:30am —
No Comments
In this series, we provided an introduction to the project and cited specific technology improvements that could transform the way phenology is studied by using stationary camera networks and machine-based image…
Added by Srivatsan Ramanujam on November 20, 2014 at 12:30pm —
No Comments
According to US Census data, the following chart illustrates that if you have less than a college degree, your salary increases very little as you age.
Small increase in salary for less than college degree

Notice the big…
Added by Nilesh Jethwa on November 20, 2014 at 8:30am —
2 Comments
In the first post of this series, we gave the background on our data science expedition to Acadia National Park, and now we are seeing its transformative potential.
As representatives of Pivotal and EMC, our goal is to help a team of phenology…
Added by Srivatsan Ramanujam on November 19, 2014 at 11:42am —
No Comments
The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday.
Featured
Added by Vincent Granville on November 19, 2014 at 11:30am —
No Comments
While large data sets may provide significant value in certain cases, data diversity and the integration of smart data points will provide more consistently actionable insights and high-value intelligence, leading to better decision-making.
For example, consider NFL football data. Focusing on large football game data sets is usually not helpful and often misleading, creating…
Added by Michael Walker on November 19, 2014 at 11:00am —
1 Comment