Featured Blog Posts – February 2017 Archive (79)

Exclusive Tutorial on Data Manipulation with R (50 Examples)

This is a complete tutorial on data wrangling and manipulation with R. It covers dplyr, one of the most powerful R packages for data wrangling. The package was written by Hadley Wickham, the popular R programmer who has also written many other useful R packages such as ggplot2 and tidyr. It is one of the most popular R packages to date. This post includes several examples and tips on how to use the dplyr package for cleaning and transforming data.…


Added by Deepanshu Bhalla on February 6, 2017 at 8:00am — No Comments

Corporate Self Service Analytics: 4 Questions You Should Ask Yourself Before You Start

Today’s customers are socially driven and more value conscious than ever before. Believe it or not, everyday customer interactions create a whopping 2.5 exabytes of data, which is equal to 2,500,000 terabytes, and this figure has been predicted to grow by 40 percent every year. As organisations face the…


Added by Ronald van Loon on February 6, 2017 at 8:00am — No Comments

Useful R Packages that Align with the CRISP DM Methodology

As we all know, CRISP DM stands for Cross Industry Standard Process for Data Mining. It is a process model that outlines the most common approach to tackling data-driven problems. Per the poll conducted by KDNuggets in 2014, this was and “is” one of the most popular and widely used methodologies. This method of gleaning insights from data is very dear to industry experts and data miners.

As the title suggests, I will align some of the most useful R packages with this most popular and…


Added by Sunil Kappal on February 6, 2017 at 8:00am — 1 Comment

Bridging Data Science and Business Intuition

Target Corporation’s massively profitable data science project threw the company into the news spotlight a few years back. Its story makes for a valuable case study in bridging data science and business intuition.

After having painstakingly developed a ‘golden-goose’ analytic model that could flag pregnant shoppers based on seemingly normal purchase patterns,…


Added by David Stephenson on February 5, 2017 at 10:30pm — No Comments

Machine Learning Summarized in One Picture

Here is a nice summary of traditional machine learning methods, from Mathworks.

I also decided to add the picture below, as it illustrates a method that was very popular 30 years ago but seems to have been forgotten recently: mixture of Gaussians. In the example below, it is…


Added by Vincent Granville on February 5, 2017 at 10:00pm — 10 Comments

The Weak Karate of Six Sigma

Six Sigma is a quantitative approach to solving certain types of problems. At the root of Six Sigma is an improvement methodology that can be described by the acronym DMAIC: define, measure, analyze, improve, and control [1]. Those interested in reading up on Six Sigma might consider the book for dummies, which I found fairly succinct. Those wondering what I mean by "certain types of problems" should consider how to apply the approach to their own business circumstances. I…


Added by Don Philip Faithful on February 5, 2017 at 7:40am — 4 Comments

Screencast on debugging in R using wrapper functions

Added by John Mount on February 4, 2017 at 3:30pm — No Comments

Weekly Digest, February 6

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.



Added by Vincent Granville on February 4, 2017 at 11:30am — No Comments

Plotting Multiple Columns in D3

Guest blog post by Brian Back.

Of the wide range of things you can do with D3, one of the best to make is still the time series plot. In this post, I’ll walk through the basics of making a multi-column point plot/scatter plot. We’ll use a GISS dataset from NASA; the dataset can be found …


Added by Vincent Granville on February 4, 2017 at 11:00am — No Comments

All You Need To Know About Business Models in Digital Transformation

In very simple terms, a business model is how you plan to make money from your business.

A refined version is how you create and deliver value to customers. Your strategy tells you where you want to go and the business model tells you how you are going to do it.

In this era of Industry 4.0 and digital transformation, businesses are getting disrupted faster than they get established. We all know what Apple did for music, Uber did for taxis and Airbnb did for…


Added by Sandeep Raut on February 4, 2017 at 7:00am — No Comments

Predicting House Sales

“Half the money I spend on advertising is wasted; the trouble is I don't know which half.”

– John Wanamaker

The sale of a house is a valuable event for many parties. Real estate brokers, mortgage originators, moving companies – these businesses and more would greatly benefit from being able to get out in front of their competitors in…


Added by NYC Data Science Academy on February 3, 2017 at 2:00pm — 2 Comments

How to Build a Data Science Team

Businesses today need to do more than merely acknowledge big data. They need to embrace data and analytics and make them an integral part of their company. Of course, this will require building a quality team of data scientists to handle the data and analytics for the company. Choosing the right members for the team can be difficult,…


Added by Ronald van Loon on February 3, 2017 at 6:00am — No Comments

Artificial Intelligence and Education

The development of artificial intelligence (AI) has had a huge influence on today’s society, as ongoing discussions evaluate the impacts of creating machines and computer systems that can react and perform like humans. These systems can process information in a more cognitive way, making them capable of more human-like functions like learning, decision-making, and visual perception.…


Added by Bria Pierce on February 2, 2017 at 2:00am — 2 Comments

Linear Regression Geometry

Linear regression is one of the most widely used statistical models. If Y is a continuous variable, i.e. one that can take decimal values, and is expected to have a linear relation with the X variables, this relation can be modeled with linear regression. It is usually the first model to fit when we plan to develop a model for forecasting Y or to build hypotheses about the relation of the Xs to Y.



Added by Jishnu Bhattacharya on February 1, 2017 at 8:30pm — No Comments
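
As a rough illustration of the linear regression entry above, here is a minimal sketch of an ordinary least squares fit with a single X variable. It is not taken from the post itself; the function name `fit_line` and the toy data are assumptions made for this example only.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Hypothetical helper for illustration; handles one X variable only.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy data (made up for this sketch) lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

With several X variables the same idea is usually expressed in matrix form and solved with a numerical library rather than by hand.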

Distribution of Arrival Times for Extreme Events

Most articles on extreme events focus on the extreme values. Very little has been written about the arrival times of these events. This article fills the gap.

We are interested here in the distribution of arrival times of successive records in a time series, with potential applications to global warming assessment, sport analytics, or high frequency trading. The purpose here is to discover what the distribution of these arrival times is, in absence of any…


Added by Vincent Granville on February 1, 2017 at 7:30pm — 2 Comments
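
To make the idea of "arrival times of successive records" in the entry above concrete, here is a small sketch that finds the positions at which a series reaches a new maximum. It is not the article's method; the helper names `record_times` and `simulate_record_times`, and the choice of independent uniform random values as the simulated series, are assumptions made for this illustration.

```python
import random


def record_times(series):
    """Return the 1-based positions at which a new maximum (record) occurs."""
    times = []
    best = float("-inf")
    for t, value in enumerate(series, start=1):
        if value > best:
            best = value
            times.append(t)
    return times


def simulate_record_times(n, seed=0):
    """Record arrival times for n independent uniform draws (an assumed model)."""
    rng = random.Random(seed)
    return record_times([rng.random() for _ in range(n)])


# Small hand-checkable example: records occur at positions 1, 3, 5 and 6.
print(record_times([3, 1, 4, 1, 5, 9, 2, 6]))
# Simulated example; the first observation is always a record.
print(simulate_record_times(1000))
```

Repeating the simulation over many seeds would give an empirical picture of how the arrival times are distributed under the assumed model.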

Data Science in Python: Pandas Cheat Sheet

This cheat sheet, along with explanations, was first published on DataCamp. Click on the picture to zoom in. To view other cheat sheets (Python, R, Machine Learning, Probability, Visualizations, Deep Learning, Data Science, and so on), click here.

To view a…


Added by Vincent Granville on February 1, 2017 at 11:00am — 1 Comment

Google releases massive visual databases for machine learning

This article was written by Richard Lawler. Richard's been tech obsessed since first laying hands on an Atari joystick. 

Added by Emmanuelle Rieuf on February 1, 2017 at 10:00am — No Comments

Journey Science in Telecom: Take Customer Experience to the Next Level

Journey Science, being derived from connected data from different customer activities, has become pivotal for the telecommunications industry, providing the means to drastically improve the customer experience and retention. It has the ability to link together scattered pieces of data, and enhance a telco business’s objectives. Siloed…


Added by Ronald van Loon on February 1, 2017 at 7:30am — 2 Comments

Are typos and small mistakes making your business data inaccurate?

As typos and spelling mistakes make up to 58% of data inaccuracy issues, here we look at how much even a small mistake can cost your business…

According to Experian, when it comes to data inaccuracy, much of it is down to human error, in particular, spelling mistakes. The reason for this lies in an over-reliance on manual data entry and the lack of…


Added by Martin Doyle on February 1, 2017 at 5:30am — No Comments
