All Blog Posts (1,516)

New batch of ML resources and articles, March 31

Starred articles were potential candidates for our picture of the week published in our weekly digest. Enjoy our new selection of articles and resources (R, data science, Python, machine learning etc.) Comments are from Vincent…


Added by Data Science Girl on March 31, 2015 at 9:30am — No Comments

Model Management and Deployment

Have you ever wondered how corporations manage the analytical assets that are mission-critical to their business? A bank or a telecom service provider may have more than 100 predictive model assets developed over time, but faces the important issue of how to effectively manage, store, share, or archive these assets.

The next breakthrough in data analysis may not be in individual algorithms, but in the ability to rapidly combine, deploy, and…


Added by Ashish Jain on March 29, 2015 at 2:00pm — No Comments

How to Become a Data Scientist for Free

Big Data, Data Science, and Predictive Analytics are the talk of the town, and it doesn’t matter which town you are referring to: they are everywhere, from the White House hiring DJ Patil as the first chief data scientist to the United Nations using predictive analytics to forecast bombings on schools.…


Added by Zeeshan Usmani on March 28, 2015 at 4:03pm — No Comments

How a Chief Data Officer Can Make Your Data Great

Fresh data is usually pristine. It’s data in its clearest, most accurate form – straight from the customer or client. If you’ve put measures in place to cut back on data input errors, such as form validation, you can be reasonably sure that the newest records in your CRM…


Added by Martin Doyle on March 27, 2015 at 1:00am — No Comments

Weekly Digest - March 30

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday.



Added by Vincent Granville on March 25, 2015 at 8:00am — No Comments

A 5-Step Approach to Manage Marketing Data (For Manufacturers)

Good data is the driving force behind successful marketing. Data can be analyzed to determine what your customers are looking for, what will drive them to purchase, and to establish a best prospect profile. According to a report by GlobalSpec, the primary marketing goals for manufacturers are customer acquisition (43%) and lead generation (29%), with 54% planning to increase marketing spend.…


Added by Larisa Bedgood on March 25, 2015 at 5:00am — No Comments

Do data scientists need to be domain experts to deliver good analytics?

During my 30-year analytics career, prospective employers and clients have often asked me: "How can you help us with data-driven insights when you have not worked in this industry before?" I argue for greater emphasis on machine learning skills in data scientists, and for their partnership with domain experts, as an effective pathway to bring data science to a business.

Clearly, the description of data scientist as the mythical unicorn who has computer science skills,…


Added by Bhavani Raskutti on March 24, 2015 at 7:00pm — 6 Comments

The Data Science Ecosystem in One Tidy Infographic

It probably comes as no surprise, but we talk to a lot of data scientists at CrowdFlower. We like learning the tools they use, the programs that make their lives easier, and how everything works together. Today, we're really pleased to unveil the first of a three-part series about the data science ecosystem. Here it is in infographic form because, let's face it, everybody likes infographics: …


Added by Renette Youssef on March 24, 2015 at 1:36pm — 7 Comments

27 free data mining books

I received an unsolicited email today from Pedro Marcus of DataOnFocus. I usually don't even open these because of the volume I get each day, but this one was actually very interesting, so I'm sharing it with you.

Free data mining books…


Added by Mirko Krivanek on March 24, 2015 at 7:30am — 4 Comments

Can you Be a Growth Hacker Without Being a Data Scientist?

Guest blog post.

Growth hacking is turning out to be one of the hottest fields for data analysts & scientists. Although there is controversy about the term & its specific meaning, the general connotation implies a function, activity or person which is primarily focused on growing a set of metrics such as users, revenue, visits &…


Added by Vincent Granville on March 23, 2015 at 3:00pm — 2 Comments

Be as Smart as Your Devices: Learn About Big Data

When Apple CEO Tim Cook finally unveiled his company’s new Apple Watch in a widely-publicized rollout earlier this month, most of the press coverage centered on its cost ($349 to start) and whether it would be as popular among consumers as the iPod or iMac.

Nitin Indurkhya saw things differently.

“I think the…


Added by Peter Bruce on March 23, 2015 at 4:35am — No Comments

Predicting Car Prices Part 1: Linear Regression

1. Introduction:

Let’s walk through an example of predictive analytics using a data set that most people can relate to: prices of cars. In this case, we have a data set with historical Toyota Corolla prices along with related car attributes.

Let’s load in the Toyota Corolla file and check…


Added by Peter Chen on March 22, 2015 at 11:00am — 1 Comment

Predicting Car Prices Part 2: Using Neural Network

1. Introduction

This is part two of the series. In part one, we used a linear regression model to predict the prices of used Toyota Corollas. There is some overlap in the material for those reading this post for the first time. If you have already read part one on linear regression, you can safely skip to the section where I apply neural networks to the same data set.

In this post, we will…


Added by Peter Chen on March 22, 2015 at 10:30am — No Comments

Value-Liquidity Cycle

I made a recent discovery that I would like to share with the community. In my previous blog, I introduced a special algorithmic shell that distributes stocks based on their price movements (along the x-axis) and volume movements (y-axis). Using this shell, it is possible to visualize the trading behaviours of dozens of stocks simultaneously. I noticed one day that the stocks seemed to be lining up in formation. I decided to test the accuracy of my visual interpretation. Below I present the…


Added by Don Philip Faithful on March 22, 2015 at 5:22am — No Comments

Impact of target class proportions on accuracy of classification

When we try to build classification models from training data, the proportion of target classes does impact the accuracy of predictions. This is an experiment to measure the level of impact of these proportions.

Let us say you are trying to predict which visitors to your website would buy a product. You collect historical data about the visitors' characteristics and actions, and also whether they bought something or not. This is the model building data…


Added by Kumaran Ponnambalam on March 20, 2015 at 12:00pm — 5 Comments
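One way to see why target class proportions matter in the website-buyer scenario above is to look at the accuracy of a trivial model that always predicts the majority class. The sketch below is not the author's experiment: the helper names (`make_labels`, `majority_baseline_accuracy`), the class labels, and the buyer fractions are all hypothetical, chosen only to illustrate the arithmetic.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def make_labels(n_total, buyer_fraction):
    """Build a synthetic label list with the given share of buyers (illustrative only)."""
    n_buyers = round(n_total * buyer_fraction)
    return ["buy"] * n_buyers + ["no_buy"] * (n_total - n_buyers)


# Hypothetical buyer fractions; the rarer the buyers, the higher the
# accuracy of a model that never predicts a purchase.
for fraction in (0.5, 0.2, 0.05):
    labels = make_labels(1000, fraction)
    baseline = majority_baseline_accuracy(labels)
    print(f"buyers={fraction:.0%}  majority-class accuracy={baseline:.2f}")
```

With only 5% buyers, a model that predicts "no purchase" for everyone already scores 95% accuracy while finding no buyers at all, so an accuracy figure is best read against this baseline for the class proportions at hand.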

How to Ask an Analyst a Question

Asking questions is easy. It’s so easy that, as askers, we often don’t think about the quality of our questions. Poorly framed questions waste everyone’s time—yours included—because they require the answerer to make assumptions. When it comes to asking analysts to explore a problem you’re trying to solve, better questions will drive better analysis and, ultimately, more actionable answers.

Here’s an example:

Marketer: “How many people converted from paid ad…


Added by Derek Steer on March 19, 2015 at 11:49am — No Comments

Mining Web Pages in Parallel

A Visual Studio 2013 demo project including the WebpageDownloader and LinkCrawler can be downloaded here.


The US digital universe currently doubles in size approximately every three years [1]. In fact, Hewlett Packard estimates that by the end of this decade, the digital universe will be measured in ‘Brontobytes’, which…


Added by Jake Drew on March 18, 2015 at 7:00pm — 1 Comment

Weekly Digest - March 23

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday.


  • Think Big, a Teradata company, provides data science and engineering services that enable organizations to accelerate their time to value from big data. As the first and only big data services firm, Think Big’s…

Added by Vincent Granville on March 18, 2015 at 1:00pm — No Comments

Hacking Y Combinator

This idea came to me out of the blue. I was scrolling through Y Combinator's Hacker News board in search of inspiration. I noticed that some posts (very insightful ones, I thought) ended up at the bottom of the list, other posts were popular but had no comments and some triggered lots of comments but were ranked very low. I can imagine that being popular on Hacker News means a lot to a contributor: s/he gets a ton of views, the post generates…


Added by Tatiana Sorokina on March 18, 2015 at 9:30am — No Comments


© 2015   Data Science Central
