February 2017 Blog Posts (85)

Selecting Forecasting Methods in Data Science

We are dealing with a plethora of data and information in the world today, and the expectation is that we predict and forecast how to gain competitive advantage from the information we have, so that we can act in advance. We look to define and furnish various methods based on gut feel, past historical data, simple mathematical averages, and many more to get an incredibly precise prediction. With advanced analytics and data science, we develop “always-on” forecasting…


Added by Kamala Kanta Mishra on February 13, 2017 at 11:30pm — 1 Comment

A Quick Guide on How to Prevail in the Graph Database Arena


There are endless discussions in the database arena about which DBMS is best suited for operational or data warehousing analytics, which one is the most efficient for online transaction processing, or which one is suitable for semantic integration. Recently, graph databases have been growing in popularity, especially in the enterprise space, and perhaps that adds to the headaches of vendors trying to differentiate themselves from the competition…


Added by Athanassios Hatzis on February 13, 2017 at 10:30pm — 2 Comments

A Discussion: IT Data, Ambiguities & Classification model performance

“Ambiguity is pervasive” is true to its definition: as more and more data is generated and system connectivity reaches its peak, data and outcomes are diverging. IT systems are evolving from “BIG DATA” to “BIGGER DATA” systems. Not all of this data is structured and easily consumable, so a challenge is posed by the nexus of technology and “Data Greed”.

Having said this, the fact is that the future is found in ambiguity and chaos. We will never have complete and perfect information or a full…


Added by Awadesh Tiwari on February 13, 2017 at 7:00pm — No Comments

23 types of regression

This contribution is from David Corliss. David teaches a class on this subject, giving a (very brief) description of 23 regression methods in just an hour, with an example and the package and procedures used for each case. 

Here you can check the webcast done for Central Michigan University. The slide deck can be found…


Added by Vincent Granville on February 13, 2017 at 5:00pm — 3 Comments

Designing the Data Management Infrastructure of Tomorrow

Today, more than ever before, organisations realise the strategic importance of data and consider it to be a corporate asset that must be managed and protected just like any other asset. Considering the strategic importance of data, an increasing number of farsighted organisations are investing in the tools, skills, and infrastructure…


Added by Ronald van Loon on February 13, 2017 at 8:00am — No Comments

Book: Python For Dummies

Python is one of the most powerful, easy-to-read programming languages around, but it does have its limitations. This general-purpose, high-level language, which can be extended and embedded, is a smart option for many programming problems, but a poor solution to others.

Python For Dummies is the quick-and-easy guide to getting…


Added by Emmanuelle Rieuf on February 12, 2017 at 1:30pm — No Comments

Deep Learning: Artificial Intelligence Is Important?

The question has probably come to mind: is Artificial Intelligence (AI) important? What are the benefits of AI for human life? To make this easy to understand, I will use a fictitious example. If the reader has ever watched the movie Spiderman-2, the antagonist, Dr. Octopus, is a genius scientist who has discovered solar fusion energy. Simply put, Dr. Octopus is able to create a clone of the sun in the lab, both in terms of form, the energy…


Added by Jeefri A. Moka on February 12, 2017 at 1:04am — No Comments

How To Interpret R-squared and Goodness-of-Fit in Regression Analysis

This article was written by Jim Frost from Minitab. He came to Minitab with a background in a wide variety of academic research. His role was the “data/stat guy” on research projects that ranged from osteoporosis prevention to quantitative studies of online user behavior. Essentially, his job was to design the appropriate research conditions, accurately generate a vast sea of measurements, and then pull out patterns and meanings from it. 

After you have fit a linear model…


Added by Emmanuelle Rieuf on February 11, 2017 at 6:00pm — No Comments

Weekly Digest, February 13

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.


  • Marketing Analytics and Data Science 2017: April 3 - 5, 2017, JW Marriott Union Square, San Francisco, CA -- Empower yourself to become more valuable in your…

Added by Vincent Granville on February 11, 2017 at 2:00pm — No Comments

From Data Analysis to Machine Learning

This article is no longer available. See here for original version.


Added by Emmanuelle Rieuf on February 11, 2017 at 10:00am — 1 Comment

Externalizing to Structural Capital

I recall somebody mentioning that a former definition of insanity is doing an action repeatedly while expecting different results. Among my interests in organizations is how, at times, many organizations make the same mistakes, or how the same mistake might be made repeatedly by a particular organization. So it is fascinating indeed when an airline facing an ice storm encounters much the same complaints from customers after a similar storm the previous year. I…


Added by Don Philip Faithful on February 11, 2017 at 9:42am — No Comments

What is Regression Analysis?

Guest blog by Kevin Gray. Kevin is president of Cannon Gray, a marketing science and analytics consultancy.

Regression is arguably the workhorse of statistics. Despite its popularity, however, it may also be the most misunderstood. Why? The answer might surprise you: There is no such thing as Regression. Rather, there are a large number of statistical methods that are called…


Added by Vincent Granville on February 10, 2017 at 1:00pm — No Comments

t-SNE algo in R and Python, made with same dataset

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. The technique can be implemented via Barnes-Hut approximations, allowing it to be applied on large real-world datasets.…


Added by Vincent Granville on February 10, 2017 at 12:30pm — 1 Comment

How to Boost Your Career in Big Data and Analytics

The world is increasingly digital, and this means big data is here to stay. In fact, the importance of big data and data analytics is only going to continue growing in the coming years. It is a fantastic career move and it could be just the type of career you have been trying to find.

Professionals who are working in this field can expect an impressive salary, with the median salary for data scientists being $116,000. Even those who are at the entry level will find high salaries, with…


Added by Ronald van Loon on February 10, 2017 at 7:00am — No Comments

Big Data changed my life. It can change yours too!

Hi, my name is Brontobyte and this is my story of how I grew up from a Byte, to Megabyte, to Gigabyte, to Brontobyte. I was born possibly in 1956 to unknown parents at an undisclosed place. All I know about my birth is that my Godfather Mr. Werner Buchholz from IBM gave me my name ‘Byte’ in July of 1956. I was told that Mr. Buchholz named me Byte (instead of bite) so I won’t be lost with all those bits and I am so thankful to him for making me feel special. So I am…


Added by Ramesh Dontha on February 10, 2017 at 6:00am — No Comments

State-of-the-Art Machine Learning Automation with HDT

The technique presented here blends non-standard, robust versions of decision trees and regression. It has been successfully used in black-box ML implementations.

In this article, we discuss a general machine learning technique to make predictions or score transactional data, applicable to very big, streaming data. This hybrid technique combines different algorithms to boost accuracy, outperforming each algorithm taken separately, yet it is simple enough to be reliably…


Added by Vincent Granville on February 9, 2017 at 10:00pm — 6 Comments

Digital Transformation and high-tech Robo-Advisor - do you need one?

How many times have you listened to the advice of a friend, a colleague, or someone you know to invest in the stock market? Many people have gained and lost their fortunes through this guesswork, and now the younger generation is more scared to hand over their hard-earned money to someone for investing.
Until recently, you had two options for investing: either hire a human…

Added by Sandeep Raut on February 9, 2017 at 6:00pm — No Comments

Thursday News: AI, Data Cleaning, R, Outliers, Machine Learning, DataViZ

Here is our selection of featured articles and resources posted since Monday.


Added by Vincent Granville on February 9, 2017 at 10:00am — No Comments

Pandasql: Make python speak SQL

This article was written by Yhat. 


One of my favorite things about Python is that users get the benefit of observing the R community and then emulating the best parts of it. I'm a big believer that a language is only as helpful as its libraries and tools.

This post is about pandasql, a Python package…


Added by Emmanuelle Rieuf on February 9, 2017 at 8:30am — 1 Comment

Data Science: Identifying Variables That Might Be Better Predictors

I love the simplicity of the data science concepts as taught by the book “Moneyball.” Everyone wants to jump right into the real meaty, highly technical data science books, but I recommend that my students start with “Moneyball.” The book does a great job of making the power of data science come to life (and the movie doesn’t count, as my wife saw it and “Brad Pitt is so cute!” was her only takeaway…ugh). One of my favorite lessons out of the book is the…


Added by Bill Schmarzo on February 9, 2017 at 5:30am — No Comments

