
December 2014 Blog Posts (51)

What’s Hot & What’s Not in Data Science 2015

Interesting infographics from CrowdFlower. In the hot category, I would add data plumbing, sensor data to better predict earthquakes, weather or solar flares, predictive analytics for flu and other health or environmental issues, automating data science and man-made statistical analyses, pricing optimization for medical procedures, customized drugs, car traffic optimization via sensor data,…


Added by Mirko Krivanek on December 31, 2014 at 7:30pm — 2 Comments

Rigorous Generalized L^p Variance

An article by Vincent Granville posted to Hadoop360 introduces a formal method to generalize the notion of variance based on L^p norms. Whereas the formal generalization suggested in the article did meet several desired criteria, it left other desirable criteria unmet. In particular, there was no formal connection between the generalized variance and an associated generalized mean, and there was…


Added by Bryan M. Gorman on December 31, 2014 at 12:40pm — 1 Comment

Understanding Linear Regression

Abstract: Although Linear Regression is arguably one of the most popular analytical techniques, I believe it isn’t understood well. Several fundamental assumptions are violated during application. The objective of this note is to provide an overview of the assumptions and possible fixes.

Linear regression is arguably one of the most widely used techniques in the data science world. But a comprehensive understanding of this technique is not universal and it is at a level that is…


Added by Jeevan Kumar R on December 30, 2014 at 9:00am — 3 Comments

Interactive Visualization enabled Feature Selection and Model Creation

Interactive Data Visualization or Visual Analytics

"A picture is worth a thousand words" or in the case of Data Science, we could say "A picture is worth a thousand statistics". Interactive Data Visualization or Visual Analytics has become one of the top trends in transforming business intelligence (BI) as technologies based on Visual Analytics have moved into widespread use.

Conventional charts and dashboards show conclusions but not the thinking behind them.…


Added by Mark Sharma on December 30, 2014 at 8:49am — No Comments

New Model for Scientific Research

This applies to data science research as well as any other analytic discipline. For centuries, scientific research was performed in academia, by university professors managing their own labs. Much of the research was carried out by young scientists who had just completed their PhDs. The selection process has always favored the same type of personality. The basic rule is "publish or perish", which produces the following drawbacks:

  • Re-use of old material (rather than brand new material)…

Added by Vincent Granville on December 29, 2014 at 7:00pm — 13 Comments

Some statisticians have a biased view on data science

Most statisticians are great professionals, working on various data-intensive projects, and they don't care about their job title. You can say the same about data scientists, and me in particular. However, there is a small cluster of statisticians - Andrew Gelman seems to be their leader and their only influencer - who have been challenging us, even publicly insulting us recently.…


Added by Vincent Granville on December 28, 2014 at 9:00pm — 10 Comments

Engineering a far worse attack than Sony, without hacking

To be more precise, this kind of attack would rely on business hacking, rather than computer hacking. Other attacks, some potentially massive enough to turn Google into the worst search engine, are described below.

The Sony attack

I believe that such an attack could be accomplished by an insider…


Added by Vincent Granville on December 28, 2014 at 4:00pm — 1 Comment

Common Problems with Data

When learning data science, a lot of people use sanitized datasets they downloaded from somewhere on the internet, or data provided as part of a class or book. This is all well and good, but working with “perfect” datasets that are ideally suited to the task prevents them from getting into the habit of checking data for completeness and accuracy.

Out in the real world, while working with data for an employer or client, you will undoubtedly run into issues with data that you…


Added by Randal Scott King on December 28, 2014 at 11:30am — 1 Comment

Data Science Meets Bubbly: What Data Says About Champagne Buying Patterns

Everyone loves champagne, right? But what strongly influences people’s behavior to purchase that bottle of bubbly? A growing body of research literature has found that a number of factors, including…


Added by Renette Youssef on December 24, 2014 at 3:00pm — 4 Comments

Weekly Digest - December 29

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. Articles marked with a + have interesting visualizations.



Added by Vincent Granville on December 24, 2014 at 11:00am — No Comments

Question: How to start in Big Data Analytics

Hi all, I am an Oracle DBA by profession (9 years' experience), but a mechanical engineer (in fact a postgraduate) by education. Now I want to work in the field of Big Data & Analytics, using AI techniques (most are open source nowadays, like machine learning with Mahout, Spark, etc.). However, I am not finding the right opportunity or path to do so. Nothing in my current profile fits. So I seek your expert advice and guidance: what should I do? How to reach someone who could harness my…


Added by Deepesh Chandra Shukla on December 24, 2014 at 2:00am — 2 Comments

Are Earthquakes becoming more severe?

Every data scientist worth her salt will immediately notice that the biggest earthquakes (magnitude above 9) took place in the last 60 years or so.

Northridge Earthquake

Most journalists, and even some…


Added by Vincent Granville on December 23, 2014 at 11:30am — 1 Comment

Actionable Insights from Competitive Research

Keeping your eye on your competitors is a vital strategy for helping your business grow. By watching what they're doing and looking at their successes and failures, you'll be able to keep a leg up and a competitive edge. That being said, we're going to look a little more in-depth into why you need to be incorporating competitive research into your SEO and digital marketing strategy, some metrics you should be looking at, and actionable results that you can look at to know that…


Added by Robert Cordray on December 22, 2014 at 11:30am — No Comments

What can be predicted, and what can't?

Given the right data, correctly collected and analyzed using sound predictive models, what can be predicted, and what can't be predicted no matter what?

I believe that I have an answer to this question. All systems and processes that rely on some energy source can be predicted, and the other way around. Note that energy…


Added by Vincent Granville on December 21, 2014 at 8:00pm — 1 Comment

Fallacy of Rational Prerequisite & My Fruitless Existence

Before elaborating on my fruitless existence - about my decision to avoid fruit - I want to emphasize how this blog is actually about something that I call the "Fallacy of Rational Prerequisite." There will be some misunderstanding about this term even after my prolonged explanation. I just want to state plainly at the outset that I am not proposing that people become irrational. If they are already so, I am not suggesting that they further the situation.…


Added by Don Philip Faithful on December 20, 2014 at 8:21am — No Comments

Why Media Bias Has Nowhere to Run and Hide from Data Science

When you want to see the face of biased reporting in online news, you may not have to go further than the satirical news site The Onion. Titles such as “Media Reports of Bear Attacks May Be Biased”, “Weather Channel Accused of Pro-Weather Bias”, and “Media Criticized for Hometown Sports Reporting” can make us laugh, but they can…


Added by Renette Youssef on December 19, 2014 at 10:46am — 1 Comment

The data science project lifecycle

What does the typical data science project life cycle look like?

This post looks at practical aspects of implementing data science projects. It also assumes a certain level of maturity in big data (more on big data maturity models in the next post) and data science management within the organization. Therefore the life cycle presented here differs, sometimes significantly, from purist definitions of 'science' which emphasize the…


Added by Maloy Manna on December 18, 2014 at 2:39pm — 6 Comments

Key Takeaways: Pivotal’s Top 10 2015 Predictions

On Tuesday 12/16, I attended Pivotal’s Top 10 Data Science Predictions in 2015 webinar.

The webcast was run by leaders from the Pivotal Data Science team – Annika Jimenez, Kaushik Das and Hulya Farinas – who shared their insights on the key Data Science industry trends for the coming year. The webcast came off as a bit scripted, but one could tell that these three individuals have a passion for the Data Science discipline and its future.

In this post, I’d like to take a…


Added by Anthony Dutra on December 18, 2014 at 6:56am — No Comments

Weekly Digest - December 22

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. Articles marked with a + have interesting visualizations.



Added by Vincent Granville on December 17, 2014 at 7:30pm — No Comments

Infographic: Data Science 2015 -- What's Hot & What's Not

CrowdFlower is excited to release our first “What’s Hot & What’s Not in Data Science” infographic. According to our team of data scientists, the forecast for 2015 includes data’s major impact on the Internet of Things, changes in the skills and structure of the data scientist role and heavy emphasis on finding rich data within big data.…


Added by Renette Youssef on December 17, 2014 at 8:23am — No Comments
