
All Blog Posts (5,510)

Weekly Digest, October 29

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

Featured Resources and Technical Contributions 


Added by Vincent Granville on October 28, 2018 at 9:00am — No Comments

Challenge of Confirming Program Efficacy

Something that has always troubled me with statistics is the pretense of certainty.  The conclusions – being closely associated with calculations – tend to be reached rapidly.  I might only be starting to give a problem some thought – although a statistician has already drawn conclusions.  Over time, this can make a person feel insecure about his intellectual capacity – and perhaps cause him to write a blog on the subject.  Consider the simulated data below:  a special program was…


Added by Don Philip Faithful on October 28, 2018 at 8:05am — No Comments

Top 25 Mistakes Corporates Make in their Advanced Analytics Program

Raise your hand if your company is making more than 15!


1. Day-dreaming that analytics is a plug-and-play magic wand that will bring very short-term ROI. Well…


Added by Pedro URIA RECIO on October 27, 2018 at 8:01pm — No Comments

One Trillion Random Digits

You will find here a few tables of random digits, used for simulation purposes and/or testing or integration in statistical, mathematical, and machine learning algorithms. These tables are particularly useful if you want to share your algorithms or simulations, and make them replicable. We also provide techniques to use in applications where secrecy is critical, such as cryptography, bitcoin or lotteries: in this case, you don't want to share your table of random numbers; to the contrary you…
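The two use cases in this excerpt (replicable simulations versus secrecy-critical applications) can be illustrated with a small sketch. It does not reproduce the article's tables: the function names `replicable_digits` and `secret_digits` and the seed value are hypothetical, and Python's standard `random` and `secrets` modules stand in for whatever generator the article actually uses.

```python
import random
import secrets


def replicable_digits(seed, count):
    """Return `count` decimal digits from a seeded generator.

    Anyone who knows the seed can regenerate the same table,
    which is what makes a shared simulation replicable.
    """
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.randrange(10) for _ in range(count)]


def secret_digits(count):
    """Return `count` decimal digits drawn from the OS entropy source.

    There is no seed to share, so the table cannot be regenerated;
    this is the property wanted when secrecy matters.
    """
    return [secrets.randbelow(10) for _ in range(count)]


# Hypothetical seed, chosen only for illustration.
table_a = replicable_digits(2018, 20)
table_b = replicable_digits(2018, 20)
private = secret_digits(20)

print(table_a == table_b)  # the seeded tables match
print(len(private))
```

The seeded generator is suitable for sharing simulations only; Python's documentation advises against using `random` for security purposes, which is why the second function uses `secrets`.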


Added by Vincent Granville on October 27, 2018 at 9:00am — No Comments

Essential Math for Data Science

This article was written by Tirthajyoti Sarkar. Below is a summary. The full article (accessible from link at the bottom) also features courses that you could attend to learn the topics listed below, as well as numerous comments. We also added a few topics that we think are important and missing in the original article.…


Added by Andrea Manero-Bastin on October 26, 2018 at 5:00pm — No Comments

Why Organisations Nowadays Want an Analytics Platform

With the nascent stage of the data revolution behind us, organisations are entering a new level of proficiency in handling data. Gone are the days when organisations…


Added by Ronald van Loon on October 26, 2018 at 3:35am — No Comments

5 Minute Analysis: Olympics, Rising Competition and Equality


Added by Benjamin Waxer on October 26, 2018 at 2:00am — No Comments

Why do people with no experience want to become data scientists?

Below is my contrarian answer to one question recently posted on Quora.

It depends on what you mean by “no experience”. A NASA scientist who has processed petabytes of data and found great insights, for example by discovering exoplanets, is de facto a data scientist and may have no interest in having his job title changed.

Then there is a bunch of people who call themselves “data science enthusiasts” and know nothing other than what they learned in a two-hour…


Added by Vincent Granville on October 25, 2018 at 6:00pm — 1 Comment

Thursday News: Stats, Math, ML, Neural Nets, K-means, AI, Deep Learning, Python, Anomaly Detection

Here is our selection of featured articles and technical resources posted since Monday:



Added by Vincent Granville on October 25, 2018 at 8:30am — No Comments

Advertising & Marketing Fundamentals For Data Scientists

I am an advertising and marketing veteran who is currently transitioning towards data science. The purpose of this write-up is to give you some baseline understanding of marketing, grounded in my professional experience. I am hoping that my write-up will help you gain a bigger share of voice when working with advertising & marketing teams. Eventually, you might ask bigger questions and thus move beyond just optimizing their work.

I will expand this post into a…


Added by Rafael Knuth on October 25, 2018 at 2:00am — 2 Comments

Facial Recognition and its Applications

Facial Recognition

Facial recognition technology was once a near-mythical concept: a tool that we thought could solve many of our problems but would never see the light of day. Today, facial recognition is everywhere and is part of the everyday technology that we use. The…


Added by Abhimanyu on October 25, 2018 at 12:50am — No Comments

Anomaly/Outlier Detection using Local Outlier Factors

Outliers are patterns in data that do not conform to the expected behavior. Detecting such patterns is of prime importance in credit card fraud, stock trading, etc. Detecting anomalous or outlying observations is also important when training any supervised machine learning model. This brings us to two very important questions: what is a local outlier, and why a local outlier?

In a multivariate dataset where the rows are generated independently from a probability…
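As a rough illustration of what a local outlier factor measures, here is a minimal pure-Python sketch of the standard LOF definition (reachability distance, local reachability density, then the ratio of densities). It is not taken from the article: the function name `local_outlier_factors`, the toy data, and the choice `k = 3` are assumptions, and for simplicity the sketch takes exactly `k` neighbours per point (ignoring distance ties) and assumes no duplicate points.

```python
import math


def local_outlier_factors(points, k):
    """Compute a LOF score for every point (simplified: exactly k neighbours)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbours = []  # indices of the k nearest neighbours of each point
    k_distance = []  # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbours.append(order[:k])
        k_distance.append(dist[i][order[k - 1]])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach(i, j) = max(k_distance[j], dist[i][j]).
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: average density of the neighbours relative to the point's own density.
    return [
        sum(lrd[j] for j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Toy data (hypothetical): a 3x3 grid of inliers plus one distant point.
grid = [(x, y) for x in range(3) for y in range(3)]
scores = local_outlier_factors(grid + [(10, 10)], k=3)

print(scores[-1] > max(scores[:-1]))  # the distant point has the largest score
```

Scores near 1 mean a point is about as densely surrounded as its neighbours; scores well above 1 mark points that sit in a much sparser region than their neighbours, which is the sense in which the outlier is "local".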


Added by Deepankar Arora on October 25, 2018 at 12:30am — No Comments

The Math Behind Machine Learning

Let’s look at several techniques in machine learning and the math topics that are used in the process.

In linear regression, we try to find the best fit line or hyperplane for a given set of data points. We model the output of our linear function by a linear combination of the input variables using a set of parameters as weights.

The parameters are found by minimizing the residual sum of squares. We find a critical point by setting the vector of derivatives of…
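For the single-input case, setting the derivatives of the residual sum of squares to zero gives a closed-form solution for the intercept and slope. The sketch below works through that case in plain Python; the function name `fit_line` and the sample data are hypothetical and are not taken from the article.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Setting d(RSS)/d(intercept) = 0 and d(RSS)/d(slope) = 0 yields
        slope     = S_xy / S_xx
        intercept = mean(y) - slope * mean(x)
    """
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical data lying exactly on y = 1 + 2x.
intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```

With several input variables the same idea leads to a system of linear equations in the weights, which is usually solved with a linear algebra library rather than by hand.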


Added by Richard Han on October 24, 2018 at 6:00pm — No Comments

Prediction at Scale with scikit-learn and PySpark Pandas UDFs

By Michael Heilman, Civis Analytics

scikit-learn is a wonderful tool for machine learning in Python, with great flexibility for implementing pipelines and running experiments (see,…


Added by Civis Analytics on October 24, 2018 at 10:44am — No Comments

Business Problems and Data Science Solutions Part 1

An important principle of data science is that data mining is a process. It includes the application of information technology, such as the automated discovery and evaluation of patterns from data. It also includes an analyst’s creativity, business knowledge, and common sense. Understanding the whole process helps to structure data mining projects.

Since the data mining process breaks up the overall task…


Added by Mehmet Gökce on October 24, 2018 at 8:58am — No Comments

Top Data Analysis Books and Videos to become an Expert in Data

Learn how to transform data into business insight with these Data Tutorials and eBooks.

Deep Reinforcement Learning Hands-On By Maxim Lapan

This practical guide will teach…


Added by Packt Publishing on October 24, 2018 at 3:19am — No Comments

29 Statistical Concepts Explained in Simple English - Part 1

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on…


Added by Vincent Granville on October 23, 2018 at 4:30pm — 4 Comments

K-means: A step towards Marketing Mix Modeling



Added by Ridhima Kumar on October 23, 2018 at 11:00am — No Comments

Using Semantic Segmentation to identify rooftops in low-resolution Satellite images: Use case of Machine Learning in Clean Energy sector

The work was done by Jatinder Singh (who also co-authored this article) and Iresh Mishra. Also thanks to Saurabh…

Added by Rudradeb Mitra on October 23, 2018 at 11:00am — No Comments

The Case for Just Getting Your Feet Wet with AI

Summary:  Even if you’re not big enough to have a full-blown data science group, that shouldn’t hold you back from benefiting from AI.  The market has evolved so that there are now industry- and process-specific vertical applications available from third-party AI vendors that you can implement.  There are just a few things to look out for.



Added by William Vorhies on October 23, 2018 at 7:30am — No Comments

