All Blog Posts (5,704)

Bill Schmarzo's Retrospective: Data Science, ML, Big Data Analytics, and more

Bill Schmarzo, also known as the “Dean of Big Data”, is CTO at Hitachi Vantara and former CTO at Dell EMC. He authored a series of articles on analytic applications and is on the faculty of TDWI, teaching a course on “Thinking Like A Data Scientist”. Bill is the author of “Big Data: Understanding How Data Powers Big Business” and “Big Data MBA: Driving Business…


Added by Vincent Granville on January 22, 2019 at 9:00am — No Comments

Advice to a fresh graduate for getting a job in AI/ Data Science


After a recent webinar, I was asked for advice on getting a job in AI as a fresh graduate.

This is a good question, and one that is not often answered.

Here are my thoughts:



  • Firstly, AI is a vast topic. Everyone has a limited view on AI based on their personal…

Added by ajit jaokar on January 21, 2019 at 2:22pm — No Comments

Doctors are from Venus, Data Scientists from Mars – or Why AI/ML is Moving so Slowly in Healthcare

Summary: The world of healthcare may look like the most fertile field for AI/ML apps but in practice it’s fraught with barriers.  These range from cultural differences, to the failure of developers to really understand the environment they are trying to enhance, to regulatory and logical Catch 22s that work against adoption.  Part 3 of 3.



Added by William Vorhies on January 21, 2019 at 7:59am — No Comments

Weekly Digest, January 21

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this…


Added by Vincent Granville on January 20, 2019 at 9:00am — No Comments

How Do You Win the Data Science Wars?  You Cheat By Doing The Necessary Pre-work!

“If” by Rudyard Kipling

If you can keep your head when all about you

     Are losing theirs and blaming it on you,  

If you can trust yourself when all men doubt you,

    But make allowance for their doubting too;  

If you can wait and not be tired by waiting,

    Or being lied about, don’t deal in lies,

Or being hated, don’t give way to…


Added by Bill Schmarzo on January 19, 2019 at 6:56am — No Comments

Stocks, Significance Testing & p-Hacking: How volatile is volatile?

October is historically the most volatile month for stocks, but is this a persistent signal or just noise in the data?

Stocks, Significance Testing & p-Hacking. Follow me on Twitter for more. Over the past 32 years, October has been the most volatile month on average for the S&P 500 and December the least. In this article we will use simulation to assess the…


Added by Patrick David on January 18, 2019 at 5:30am — No Comments

Your Guide to Natural Language Processing (NLP)

How machines process and understand human language

Everything we express (either verbally or in writing) carries huge amounts of information. The topic we choose, our tone, our selection of words: everything adds some type of information that can be interpreted and from which value can be extracted. In theory, we can understand and even predict human behaviour using that information.…


Added by Diego Lopez Yse on January 18, 2019 at 3:46am — No Comments

The #10yearschallenge: What are the #datascience opportunities?

Massive amounts of unstructured and semi-structured data, in the form of images and text, have been shared on #socialmedia platforms at unprecedented rates in the last few days.

Potential questions that should be on the minds of AI or #machinelearning initiates are: What research opportunities can be unlocked from these data? What commercial or business value can be derived from committing to creating #AI-centric products and services from these opportunities? Aren't there beneficial…


Added by Michael Akinwumi on January 17, 2019 at 9:36pm — No Comments

Supervised vs Unsupervised Learning...Whats the Big Deal?

Leveraging the abbreviation "vs" in and of itself begins to stir the insides of the ever-faithful, "until death do we part" neural network enthusiasts because, let's face it, they are riding a wave that is driving stock prices in both the private commercial and public commercial sectors. Most of the applications talked about today that leverage the all-too-mysterious but oh-so-exciting "AI" or "Artificial Intelligence" are implementing supervised learning approaches to solve their…


Added by Grant T on January 17, 2019 at 1:30pm — No Comments

Thursday News: Math of Deep Learning, Python, DataViz, R, NLP, AI in Healthcare, Books

Here is our list of featured articles and technical contributions posted since Monday.



Added by Vincent Granville on January 17, 2019 at 9:35am — No Comments

3 Ways How AI Will Augment the Human Workforce

The question in the AI market is no longer about whether AI can affect the workplace and the human workforce. Instead, the raging curiosity in the market revolves…


Added by Ronald van Loon on January 17, 2019 at 1:06am — No Comments

Five Latest Machine Learning eBooks

Working with real data means that you need to get real insight for your business; don't let the essential information slip out of your grasp - the world of machine learning awaits you!

Here is a list of Packt’s latest Machine Learning eBooks to help you stay updated with the technology.…


Added by Packt Publishing on January 16, 2019 at 11:33pm — No Comments

900 Most Popular DS & ML Articles in 2018

Not all these contributions were from 2018, but the few selected below were among the most visited in 2018. Some were heavily featured, so it does not mean that they represent the average DSC interest. A bigger list featuring 900+ most popular articles can be found here. I am still working on categorizing them, and may hire an intern to work on this project, using…


Added by Vincent Granville on January 16, 2019 at 2:51pm — No Comments

IBM i2 is Big Data

IBM, ranked 92nd on the Fortune Global 500 list in 2018 according to Fortune Magazine's own website, and standing, as we know, for International Business Machines, a company that started in 1911, has come up with a piece of software that is being tagged by some experts in Information…


Added by Marcia Ricci Pinheiro on January 16, 2019 at 11:30am — 1 Comment

Tableau in 10 Minutes: Step-by-Step Guide

Tableau is an innovative, enterprise-class business intelligence system that can be used in both conventional and complex investigations: from visualization of questions and answers to complex data analysis (trends, correlations, and statistics). The convenience of the system is that it recognizes the data in any format,…


Added by Igor Bobriakov on January 16, 2019 at 12:30am — No Comments

Exploit the Economics of Artificial Intelligence with Design Thinking and Data Science

In my most recent blog “Design Thinking Humanizes Data Science”, I discussed how Design Thinking and Data Science complement each other.  They are not just two sides of the same coin, but the same side of the same coin in their objectives to “diverge before converging” in driving business stakeholder collaboration with respect to identifying, brainstorming and…


Added by Bill Schmarzo on January 15, 2019 at 12:24pm — No Comments

How I used NLP (Spacy) to screen Data Science Resumes

Writing a resume is very tricky. A candidate faces many dilemmas:

  • whether to state a project at length or just mention the bare minimum
  • whether to mention many skills or just mention his/her core competency skill…

Added by Venkat Raman on January 15, 2019 at 2:30am — 1 Comment

Pancake: A Python package for model stacking

In a previous post, I discussed model stacking, a popular approach in data science competitions for boosting predictive performance. Since then, the post has attracted some attention, so I have decided to put together a Python package that provides a simple API to stack models with minimal effort.

In this post, I will present the …


Added by Burak Himmetoglu on January 14, 2019 at 10:42pm — No Comments

The Mathematics of Data Science: Understanding the foundations of Deep Learning through Linear Regression


Note: This is a long post, but I kept it as a single post to maintain continuity of the thought flow.

In this longish post, I have tried to explain Deep Learning starting from familiar ideas like machine learning. This approach forms a part of my forthcoming book. You can connect with me on LinkedIn to know more about the book. I have used this approach in my teaching. It is based on ‘learning by…


Added by ajit jaokar on January 14, 2019 at 12:29pm — 1 Comment

19 Controversial Articles about Data Science

Here you will find unusual perspectives and opinions about data science. These articles may resonate well with senior data scientists and decision makers who manage the data science budget in their companies, but not so well with junior data scientists (assuming it is possible to be both junior and a data scientist). Professionals in fields such as operations research, physics, engineering, FinTech, econometrics, and biostatistics might find this content refreshing, even if they don't call…


Added by Vincent Granville on January 14, 2019 at 11:59am — No Comments

© 2019 Data Science Central ®