Featured Blog Posts (4,902)

An Introduction to Bayesian Reasoning

You might be using Bayesian techniques in your data science without knowing it! And if you're not, they could enhance the power of your analysis. This blog post, part 1 of 2, will demonstrate how Bayesians employ probability distributions to add information when fitting models and to reason about the uncertainty of the model's fit.

Grab a coin. How fair is the coin? What is the probability…
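
As a rough illustration of the coin question, here is a minimal sketch of a Bayesian update for a coin's bias. It assumes a Beta prior with a binomial likelihood, which is a common textbook choice and not necessarily the approach the post itself takes; the function names and the flip counts are hypothetical.

```python
def update_beta(alpha, beta, heads, tails):
    """Return the Beta posterior parameters after observing coin flips.

    With a Beta(alpha, beta) prior on the probability of heads and a
    binomial likelihood, the posterior is Beta(alpha + heads, beta + tails).
    """
    return alpha + heads, beta + tails


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


# Assumed prior: Beta(1, 1), i.e. every bias between 0 and 1 is equally plausible.
prior = (1.0, 1.0)

# Hypothetical data: 7 heads and 3 tails in 10 flips.
posterior = update_beta(*prior, heads=7, tails=3)

print(posterior)              # (8.0, 4.0)
print(beta_mean(*posterior))  # about 0.667
```

The posterior mean (about 0.667) sits between the prior mean (0.5) and the observed frequency (0.7), which is the sense in which the prior "adds information" to the fit.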


Added by Sean Owen on February 13, 2019 at 8:00am — 1 Comment

A Plethora of Original, Not Well-Known Statistical Tests

Many of the following statistical tests are rarely discussed in textbooks or in college classes, much less in data camps. Yet they help answer a lot of different and interesting questions. I used most of them without even computing the underlying distribution under the null hypothesis, instead using simulations to check whether my assumptions were plausible. In short, my approach to statistical testing is model-free and data-driven. Some are easy to implement even in Excel. Some of…
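
To make the idea of simulation-based testing concrete, here is a minimal sketch of a generic permutation test for a difference in means. It is not one of the tests from the post, only a standard example of checking a hypothesis by simulation without deriving the null distribution; the function name, parameters, and sample data are hypothetical.

```python
import random


def permutation_test(a, b, n_resamples=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Repeatedly shuffles the pooled observations, splits them into two
    groups of the original sizes, and counts how often the shuffled
    difference in means is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (extreme + 1) / (n_resamples + 1)


# Hypothetical measurements for two groups.
group_a = [12.1, 11.8, 12.5, 12.9, 12.3, 11.9]
group_b = [13.4, 13.1, 12.8, 13.9, 13.6, 13.2]
print(permutation_test(group_a, group_b))
```

A small p-value suggests the observed difference would be unusual if the group labels were interchangeable. The only assumption is exchangeability under the null hypothesis, which is what makes the approach model-free.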


Added by Vincent Granville on February 13, 2019 at 3:30pm — No Comments

Adversarial Attacks on Deep Neural Networks: an Overview


Deep Neural Networks are highly expressive machine learning networks that have been around for many decades. In 2012, with gains in computing power and improved tooling, a family of these machine learning models called ConvNets started achieving state of the art…


Added by Anant Jain on February 10, 2019 at 5:49pm — No Comments

Optimizing Your Company Right Out of Business

Professional athletes know the importance of developing opposing or complementary muscles (quadriceps and hamstrings, biceps and triceps).  These complementary muscles are sets of muscles that “work together” to move your body in the most efficient ways. If these muscles are strengthened together, it creates a balance that can lead to optimal performance.  However, if these muscles are not strengthened together, then one significantly increases the risk of…


Added by Bill Schmarzo on February 13, 2019 at 4:44am — No Comments

Learn #MachineLearning Coding Basics in a weekend - Glossary and Mindmap

For background to this post, please see Learn #MachineLearning Coding Basics in a weekend. Here, we present the glossary that we use for the coding and the mindmap attached to these classes and the upcoming book. …


Added by ajit jaokar on February 11, 2019 at 10:30am — 5 Comments

Text Encoding: A Review

The key to performing any text mining operation, such as topic detection or sentiment analysis, is to transform words into numbers and sequences of words into sequences of numbers. Once we have numbers, we are back in the well-known game of data analytics, where machine learning algorithms can help us with classifying and clustering.

We will focus here exactly on that part of the analysis that transforms words…
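
The word-to-number step described above can be sketched as follows. This is a minimal example that assumes two of the simplest encodings, integer indexing and one-hot vectors; the review itself may cover other schemes, and the function names and sample sentences are hypothetical.

```python
def build_vocabulary(documents):
    """Map each distinct lowercase word to an integer index, in order of first appearance."""
    vocab = {}
    for doc in documents:
        for word in doc.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def encode_as_indices(doc, vocab):
    """Turn a sequence of words into a sequence of numbers, skipping unknown words."""
    return [vocab[word] for word in doc.lower().split() if word in vocab]


def one_hot(index, size):
    """Return a vector of zeros with a single 1 at the given position."""
    vector = [0] * size
    vector[index] = 1
    return vector


# Hypothetical mini-corpus.
docs = ["the food was good", "the service was bad"]
vocab = build_vocabulary(docs)

indices = encode_as_indices("the service was good", vocab)
print(indices)                                # [0, 4, 2, 3]
print(one_hot(vocab["service"], len(vocab)))  # [0, 0, 0, 0, 1, 0]
```

Once text is in this numeric form, it can be passed to standard classification or clustering algorithms.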


Added by Rosaria Silipo on February 11, 2019 at 3:09pm — No Comments

Capturing the Value of ML/AI – the Challenge of Offensive versus Defensive Data Strategies

Summary:  A major pain point is standing in the way of many companies’ ability to maximize the value of their ML/AI initiatives.  The competing goals of data flexibility versus single version of the truth can only be solved with an effective data governance strategy.


There are many reasons why companies…


Added by William Vorhies on February 11, 2019 at 10:15am — No Comments

Probability Cheat Sheet - Harvard University

Below is an extract of a 10-page cheat sheet about probability, compiled by William Chen and Joe Blitzstein, with contributions from Sebastian Chiu, Yuan Jiang, Yuqi Hou, and Jessy Hwang. Material based on Joe Blitzstein’s introductory probability course at Harvard (@stat110) and Blitzstein / Hwang’s Introduction to Probability textbook…


Added by Capri Granville on February 3, 2019 at 8:00am — No Comments

TriaClick - Associative Semiotic Hypergraph technology on a Columnar DBMS

Screen Capture Demo of TriaClick, the latest release of TRIADB, a python console application that implements associative, semiotic, hypergraph technology on top of ClickHouse columnar DBMS and …


Added by Athanassios Hatzis on February 11, 2019 at 10:30am — No Comments

30 Ways to Tell If You (or a Loved One) Loves Spreadsheets Too Much

You know that person who has a spreadsheet for everything... or maybe that person is you. It’s nothing to be ashamed of. I myself have been guilty of loving spreadsheets too much. And it didn’t stop there. I led people to believe that spreadsheets were the best way to make a …


Added by Trevor Fox on February 11, 2019 at 4:00pm — No Comments

5G Revolution: Telcos Becoming the New App Store for Industrial Apps

5G technology is knocking on our doors and is said to be around the corner. Mobile service providers have upped their game recently and are holding talks, events and discussions to formulate a plan for the launch of this futuristic network. Not…


Added by Ronald van Loon on February 11, 2019 at 6:48pm — No Comments

Maslow's Hierarchy of Data Science: Why Math and Science Still Matter

As an academic discipline, the rate of maturation for data science should be measured in light years. Although it's really only about 10 years old as a field of study – with the first Ph.D. program in the country emerging just four years ago – most major universities across the world have integrated data science into their portfolio of degree options. Universities…


Added by Jennifer Lewis Priestley on February 12, 2019 at 10:52am — No Comments

31 Statistical Concepts Explained in Simple English - Part 9

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on…


Added by Vincent Granville on February 11, 2019 at 5:30pm — No Comments

New Books and Resources for DSC Members

We are in the process of writing and adding new material (compact eBooks) exclusively available to our members, written in simple English by world-leading experts in AI, data science, and machine learning. In the upcoming months, the following will be added:

  • The Machine Learning Coding Book
  • Off-the-beaten-path Statistics and Machine Learning Techniques 
  • Encyclopedia of Statistical Science
  • Original Math, Stat and Probability Problems - with…

Added by Vincent Granville on November 18, 2018 at 9:30am — 2 Comments

Weekly Digest, February 11

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this…


Added by Vincent Granville on February 10, 2019 at 9:30am — No Comments

Unexpected Use of AI: Solving Complex Mathematical Problems. Will Mathematicians Become Obsolete?

This is an unusual application of AI: it is not fiction, and it works! It probably started more than 30 years ago. Sometimes referred to as symbolic mathematics, and usually relying on high-performance, high-precision computing, it aims at automatically solving or computing, with an exact solution (not an approximation), complicated…


Added by Vincent Granville on February 10, 2019 at 8:30am — 1 Comment

Intro to Robotic Process Automation

The majority of modern companies deal with processes that they want to automate. This need can arise for various reasons, in particular the routine, repetitive, and boring nature of manual processes. Another shortcoming is that such processes often require a lot of time and human resources; additionally, office…


Added by Igor Bobriakov on February 5, 2019 at 2:32am — No Comments

Achieving over 30% net profit with Pivot Billions and Deep Learning enhanced trading models

Integrating Pivot Billions with Keras Deep Learning to enhance currency trading models with AI to achieve over 30% net profit in less than 7 months.

Deep Learning has revolutionized the fields of image…


Added by Benjamin Waxer on February 8, 2019 at 5:29am — No Comments

Infographic: The Typical Data Scientist Profile in 2019

This infographic was produced by 365DataScience. Last year they completed a study of 1,001 data scientists to get a profile of the ‘typical’ data scientist in 2018. They replicated the study with new data. Below are the findings.

Here are some of our key findings:…


Added by Capri Granville on February 9, 2019 at 10:30am — 1 Comment

20 Handbooks on Modern Statistical Methods

The two most recent ones in this CRC series were published in January 2019.

The objective of the series is to provide high-quality volumes covering the state of the art in the theory and applications of statistical methodology. The books in the series are thoroughly edited and present comprehensive, coherent, and unified summaries of specific methodological topics from statistics. The chapters are written by the leading researchers in the field, and present a good balance of theory…


Added by Vincent Granville on February 9, 2019 at 7:30am — No Comments
