All Blog Posts (5,793)

Python machine learning libraries


This blog is part of "Learn machine learning coding basics in a weekend". We recommend the book Python Data Science Handbook by Jake…


Added by ajit jaokar on February 19, 2019 at 1:30pm — No Comments

Tutorial: Statistical Tests of Hypothesis

This article is a solid introduction to statistical testing for beginners, as well as a reference for practitioners. It includes numerous examples, illustrations, and definitions for concepts such as rejecting the null hypothesis, one-sample hypothesis testing, P-values, critical values, and Bayesian hypothesis testing. It has references to additional topics, such as:

  • What is Ad Hoc Testing?
  • What is a Rejection Region?
  • What is a Two Tailed…

Added by Capri Granville on February 19, 2019 at 9:30am — No Comments

How to Stabilize Data Systems, to Avoid Decay in Model Performance

Here we describe a simple methodology to produce predictive scores that are consistent over time and compatible across various clients, to allow for meaningful comparisons and consistency in actions resulting from these scores, such as offering a loan. Scores are used in various contexts, such as web page rankings in search engines, credit score, risk score attached to loans or credit card transactions, the risk that someone might become a terrorist, and more. Typically a score is a function…


Added by Vincent Granville on February 18, 2019 at 9:30pm — No Comments

From Optimization to Prescriptive Analytics

Summary:  True prescriptive analytics requires the use of real optimization techniques that very few applications actually use.  Here’s a refresher on optimization with examples of where and how they’re best used.


Predictive analytics and optimization have gone hand in hand since the very beginning.  But in…


Added by William Vorhies on February 18, 2019 at 10:01am — 1 Comment

Should you Add your Coursera, Udacity, or DataCamp Training to your Resume?

It all depends on the classes that you attended. Some are worth listing, some are best not to mention. Here I review a few of these data science curricula and the impression they can make on hiring managers, depending on your profile, your work experience, and the strength (or lack thereof) of these programs.



Added by Vincent Granville on February 18, 2019 at 8:30am — No Comments

Image Recognition with Keras: Convolutional Neural Networks

Image recognition and classification is a rapidly growing field in the area of machine learning. In particular, object recognition is a key feature of image classification, and the commercial implications of this are vast.

For instance, image classifiers will increasingly be used to:

  • Replace passwords with facial recognition
  • Allow autonomous vehicles to detect obstructions
  • Identify geographical features from satellite imagery



Added by Michael Grogan on February 17, 2019 at 11:00am — No Comments

Weekly Digest, February 18

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.  



Added by Vincent Granville on February 17, 2019 at 9:30am — No Comments

Math You Don't Need to Know for Machine Learning

Grab a copy of The Elements of Statistical Learning ("the machine learning bible") and you might be a little overwhelmed by the mathematics. For example, this equation (p.34), for a cubic smoothing spline, might send shivers down your spine if math isn't your forte:…


Added by Stephanie Glen on February 17, 2019 at 7:34am — No Comments

How to Configure the Number of Layers and Nodes in a Neural Network

This article was written by Jason Brownlee

Artificial neural networks have two main hyperparameters that control the architecture or topology of the network: the number of layers and the number of nodes in each hidden layer. You must specify values for these parameters when configuring your network. The most reliable way to configure…


Added by Andrea Manero-Bastin on February 17, 2019 at 2:00am — No Comments
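
To make the two hyperparameters named in the entry above concrete, here is a minimal sketch in plain Python. It assumes a fully connected feed-forward network with one bias per node; the function names and the example sizes are hypothetical and are not taken from the article. It shows how the choice of layer count and nodes per hidden layer determines the number of trainable parameters.

```python
def layer_shapes(n_inputs, hidden_nodes, n_outputs):
    """Return (fan_in, fan_out) pairs, one per weight matrix in a dense network."""
    sizes = [n_inputs, *hidden_nodes, n_outputs]
    return list(zip(sizes[:-1], sizes[1:]))


def count_parameters(n_inputs, hidden_nodes, n_outputs):
    """Count weights plus one bias per node across all layers."""
    return sum(
        fan_in * fan_out + fan_out
        for fan_in, fan_out in layer_shapes(n_inputs, hidden_nodes, n_outputs)
    )


# Hypothetical architectures: same input and output sizes, different hidden layouts.
for hidden in ([8], [8, 8], [16, 16, 16]):
    print(hidden, count_parameters(4, hidden, 3))
```

Comparing parameter counts like this is only a rough guide to capacity; it does not replace testing candidate architectures on held-out data.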

Redefining the Role of Data

It wasn’t too long ago when somebody said to me, “You do reports when you get to doing them.”  To me, this position is most defensible if the reports are for bookkeeping purposes.  I pointed out one day that my reports are for management purposes; and for this reason timeliness is important.  For instance, when one is driving a car, and it is necessary to turn at the next right, turning at the next right five lights later is fairly relevant.  Timing counts.  The “active” process of driving…


Added by Don Philip Faithful on February 16, 2019 at 11:33am — No Comments

How the Economics of Data Science is Creating New Sources of Value

There are several technology and business forces in play that are going to derive and drive new sources of customer, product and operational value. As a set up for this blog on the Economic Value of Data Science, let’s review some of those driving forces.

  • Artificial Intelligence holds the economic potential to drastically drive industry and business model disruptions. But having AI technologies is not sufficient, especially when most AI technologies are open source and…

Added by Bill Schmarzo on February 16, 2019 at 5:32am — No Comments

Wheel Of Fortune - Bayesian Inference


Previously, I tackled the Gambler's Ruin problem using conditional probability and difference equations as well as visualising the simulations of the problem in a random walk style using Python/Pygame. This can be found here: …


Added by Tansel Arif on February 15, 2019 at 9:52am — No Comments

The Rise of Strategy Analytics

If there was an AI winter, we are clearly at the peak of its summer. I do not know if we will ever build something like Skynet, but we are going to build much simpler things that will change the course of our lives. This shows a new application of analytics in the field of finance. A radical new approach that lets a new face of finance flourish, never seen…


Added by Ramon Serrallonga on February 15, 2019 at 6:06am — No Comments

Thursday News: DL, NLP, AI, Statistical Tests, Bayesian Reasoning, and more

Here is our selection of featured articles and technical resources posted since Monday. Enjoy the reading!



Added by Vincent Granville on February 14, 2019 at 12:30pm — No Comments

5 Minute Analysis: Simplifying Iowa Liquor Sales

In this 5 Minute Analysis we'll preprocess, map, and explore complicated sales data for liquor stores in Iowa. Then we’ll extract the relevant latitude and longitude from a problematic column of the data and discover the city with the most sales. Next we’ll filter the data to that city and prepare the data for easy loading into Business Analysis tools such as Tableau and PowerBI. Finally…


Added by Benjamin Waxer on February 14, 2019 at 9:32am — No Comments

IMPACT: Choose a domain which enables you to create scalable solutions to meaningful global problems

This is the first article in what will be a three-part series: 

"How to make your mark on the world as a talented, socially conscious data scientist."

  • In this article, we discuss how a socially-conscious data scientist might choose a domain to make the greatest impact.
  • The next article…

Added by Marshall Lincoln on February 13, 2019 at 5:50pm — No Comments

A Plethora of Original, Not Well-Known Statistical Tests

Many of the following statistical tests are rarely discussed in textbooks or in college classes, much less in data camps. Yet they help answer a lot of different and interesting questions. I used most of them without even computing the underlying distribution under the null hypothesis, but instead, using simulations to check whether my assumptions were plausible or not. In short, my approach to statistical testing is model-free, data-driven. Some are easy to implement even in Excel. Some of…


Added by Vincent Granville on February 13, 2019 at 3:30pm — No Comments
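
The entry above describes testing by simulation rather than by deriving the distribution under the null hypothesis. As a minimal sketch of that general idea, the following uses a standard two-sample permutation test on the difference in means. This is a swapped-in textbook technique, not one of the article's tests, and the function name, parameters, and sample data are hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate a two-sided p-value for a difference in means by reshuffling labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)

    def mean_gap(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = mean_gap(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        # Under the null hypothesis the group labels are exchangeable.
        if mean_gap(pooled[:n_a], pooled[n_a:]) >= observed:
            hits += 1
    # The +1 terms count the observed split itself and keep the estimate above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical samples with clearly different means.
print(permutation_p_value([10, 11, 12, 13, 14], [0, 1, 2, 3, 4]))
```

No distributional assumption is needed here beyond exchangeability of the labels, which is what makes this kind of test easy to adapt to unusual statistics.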

An Introduction to Bayesian Reasoning

You might be using Bayesian techniques in your data science without knowing it! And if you're not, then it could enhance the power of your analysis. This blog post, part 1 of 2, will demonstrate how Bayesians employ probability distributions to add information when fitting models, and reason about uncertainty of the model's fit.

Grab a coin. How fair is the coin? What is the probability…


Added by Sean Owen on February 13, 2019 at 8:00am — 2 Comments
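
The coin question in the entry above can be sketched with the conjugate Beta-Binomial update. This assumes a Beta prior over the probability of heads, which is a common textbook choice and not necessarily the one the post uses; the function names and the flip counts are hypothetical.

```python
def update_beta(prior_a, prior_b, heads, tails):
    """Return the Beta posterior parameters after observing coin flips."""
    return prior_a + heads, prior_b + tails


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution: the expected probability of heads."""
    return a / (a + b)


# Assumed uniform prior Beta(1, 1), then 7 heads and 3 tails (made-up data).
a, b = update_beta(1, 1, heads=7, tails=3)
print(a, b, round(beta_mean(a, b), 3))
```

A stronger prior, such as Beta(50, 50), would pull the posterior mean closer to 0.5 for the same flips, which is how the prior adds information to the fit.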

Optimizing Your Company Right Out of Business

Professional athletes know the importance of developing opposing or complementary muscles (quadriceps and hamstrings, biceps and triceps).  These complementary muscles are sets of muscles that “work together” to move your body in the most efficient ways. If these muscles are strengthened together, it creates a balance that can lead to optimal performance.  However, if these muscles are not strengthened together, then one significantly increases the risk of…


Added by Bill Schmarzo on February 13, 2019 at 4:44am — No Comments

Maslow's Hierarchy of Data Science: Why Math and Science Still Matter

As an academic discipline, the rate of maturation for data science should be measured in light years. Although it's really only about 10 years old as a field of study, with the first Ph.D. program in the country emerging just four years ago, most major universities across the world have integrated data science into their portfolio of degree options. Universities…


Added by Jennifer Lewis Priestley on February 12, 2019 at 10:52am — No Comments

© 2019 Data Science Central ®