Vincent Granville's Blog – June 2019 Archive (21)

Data Science Central Monday Digest, July 1

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this…

Added by Vincent Granville on June 29, 2019 at 3:30pm — No Comments

Online Encyclopedia of Statistical Science (Free)

This online book is intended for beginners, college students and professionals confronted with statistical analyses. It is also a refresher for professional statisticians. The book covers over 600 concepts, chosen out of more than 1,500 for their popularity. Entries are listed in alphabetical order, and broken down into 18 parts. In addition to numerous illustrations, we have added 100 topics not covered in our online series Statistical Concepts Explained in Simple English. We also…

Added by Vincent Granville on June 28, 2019 at 8:00am — 2 Comments

Data Science Central Thursday Digest, June 27

Here is our selection of featured articles and technical resources posted since Monday.

Resources

Added by Vincent Granville on June 27, 2019 at 7:30am — No Comments

28 Statistical Concepts Explained in Simple English - Part 18

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on…

Added by Vincent Granville on June 27, 2019 at 7:00am — No Comments

39 Statistical Concepts Explained in Simple English - Part 17

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on…

Added by Vincent Granville on June 27, 2019 at 7:00am — No Comments

Data Science Central Monday Digest, June 24

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this…

Added by Vincent Granville on June 23, 2019 at 6:00pm — No Comments

Free Book: Statistics -- New Foundations, Toolbox, and Machine Learning Recipes

This book is intended for busy professionals working with data of any kind: engineers, BI analysts, statisticians, operations research, AI and machine learning professionals, economists, data scientists, biologists, and quants, ranging from beginners to executives. In about 300 pages and 28 chapters it covers many new topics, offering a fresh perspective on the subject, including rules of thumb and recipes that are easy to automate or integrate in black-box systems, as well as new…

Added by Vincent Granville on June 23, 2019 at 1:00pm — No Comments

Data Science Central Thursday Digest, June 20

Here is our selection of featured articles and technical resources posted since Monday:

Resources:

Added by Vincent Granville on June 20, 2019 at 11:30am — No Comments

Data Science Central Monday Digest, June 17

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.  

Announcement

  • According…

Added by Vincent Granville on June 15, 2019 at 2:00pm — No Comments

Data Science Central Thursday News, June 13

This is our selection of featured articles and resources posted since Monday:

Technical Resources

Added by Vincent Granville on June 13, 2019 at 11:30am — No Comments

Simplified Logistic Regression

Logistic regression is typically used when the response Y is a probability or a binary value (0 or 1): for instance, the chance that an email message is spam, based on a number of features such as suspicious keywords or the IP address. In matrix notation, the model can be written as

where X is the observations matrix,…
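The formula itself is not reproduced in this excerpt. As a rough illustration only, here is a minimal sketch of the standard textbook logistic model, P(Y = 1) = 1 / (1 + exp(-Xb)), which may differ from the simplified version the post goes on to describe. The feature columns and the coefficient values in `b` are hypothetical, chosen only to echo the spam example.

```python
import math

def logistic_predict(X, b):
    """Return P(Y = 1) for each row of X under the standard logistic model."""
    probs = []
    for row in X:
        # Linear predictor: dot product of one observation with the coefficients.
        z = sum(x * w for x, w in zip(row, b))
        probs.append(1.0 / (1.0 + math.exp(-z)))
    return probs

# Hypothetical observations: intercept, suspicious-keyword flag, bad-IP flag.
X = [
    [1.0, 0.0, 0.0],  # no suspicious signals
    [1.0, 1.0, 1.0],  # both signals present
]
b = [-2.0, 1.5, 2.5]  # hypothetical coefficients, not fitted to any data

probs = logistic_predict(X, b)
for row, p in zip(X, probs):
    print(row, round(p, 3))
```

With these made-up coefficients, the first message gets a low spam probability and the second a high one. In practice the coefficients would be estimated from labeled data.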

Added by Vincent Granville on June 12, 2019 at 9:00am — No Comments

29 Statistical Concepts Explained in Simple English - Part 16

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on DSC. The…

Added by Vincent Granville on June 11, 2019 at 3:30pm — No Comments

30 Statistical Concepts Explained in Simple English - Part 15

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on…

Added by Vincent Granville on June 11, 2019 at 3:00pm — No Comments

33 Statistical Concepts Explained in Simple English - Part 14

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on…

Added by Vincent Granville on June 11, 2019 at 3:00pm — No Comments

How to Lie with P-values

P-values are used in statistics and scientific publications, much less so in machine learning applications where re-sampling techniques are favored and easy to implement today thanks to modern computing power. In some sense, p-values are a relic from old times, when computing power was limited and mathematical / theoretical formulas were favored and easier to deal with than lengthy computations.
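To give a sense of how easy re-sampling is to implement, here is a minimal sketch of a percentile bootstrap confidence interval for a mean. It is one common re-sampling technique, not necessarily the one discussed in the full post, and the sample values, the number of resamples, and the function names are all assumptions made for illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def bootstrap_ci(sample, stat, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(sample)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Recompute the statistic on many resamples drawn with replacement.
    estimates = sorted(
        stat([sample[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical measurements.
data = [2.1, 2.5, 1.9, 3.2, 2.8, 2.4, 3.0, 2.2, 2.7, 2.6]
lo, hi = bootstrap_ci(data, mean)
print("sample mean:", round(mean(data), 3))
print("95% bootstrap interval:", round(lo, 3), "to", round(hi, 3))
```

The interval comes entirely from recomputing the statistic on resampled data, with no distributional formula involved.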

Recently, p-values have been criticized and even banned by some…

Added by Vincent Granville on June 11, 2019 at 7:30am — No Comments

Data Science Central Monday Digest, June 10

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.  

Announcement…

Added by Vincent Granville on June 9, 2019 at 5:30am — No Comments

Growth Modeling for Business Managers and Executives

You don't need a sophisticated model or advanced machine learning techniques to quickly get a high-level picture of bottom-line business metrics and their trends. Not only are the concepts explained here easy to grasp, but the approach, while high level, nevertheless includes granular effects. The methodology presented here was used in business contexts in the past, when I was working with enterprise executives, particularly finance people, to assess the overall health of their business, and the…

Added by Vincent Granville on June 8, 2019 at 4:49pm — No Comments

Free Book: Azure Machine Learning in a Weekend

By Ajit Jaokar and Ayse Mutlu. 

Exclusively for Data Science Central members, with free access. You can download this book (PDF) here

This tutorial is the second book in the ‘in a weekend’ series – after Classification and Regression…

Added by Vincent Granville on June 8, 2019 at 5:22am — No Comments

Data Science Central Thursday Digest, June 6

Here is our selection of featured articles and technical resources posted since Monday:

Resources

Added by Vincent Granville on June 6, 2019 at 9:30am — No Comments

Simple Trick to Normalize Correlations, R-squared, and so on

Many statistics, such as correlations or R-squared, depend on the sample size, making it difficult to compare values computed on two data sets of different sizes. Here, we address this issue.

Below is an example with 20 observations. The last 10 observations (the second half of the data set) are a mirror of the first 10, and the two correlations, computed on each subset, are identical and equal to 0.30. The full correlation computed on the 20 observations is 0.85.…
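The post's data set is not included in this excerpt, so the sketch below uses hypothetical numbers to show the same kind of effect: two halves with identical correlations, and a much larger correlation on the combined data. Here the second half is simply the first half shifted by 11 on both axes, which is an assumption and may not be what the author means by "mirror". The normalization trick itself is not shown, since the excerpt does not describe it.

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical first half: 10 weakly correlated observations.
x1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y1 = [5, 1, 8, 3, 10, 2, 7, 4, 9, 6]

# Hypothetical second half: the same pattern shifted by 11 on both axes.
shift = 11
x2 = [x + shift for x in x1]
y2 = [y + shift for y in y1]

r_first = pearson(x1, y1)
r_second = pearson(x2, y2)
r_full = pearson(x1 + x2, y1 + y2)

print(round(r_first, 3), round(r_second, 3), round(r_full, 3))
# prints 0.297 0.297 0.849
```

Shifting a subset leaves its own correlation unchanged, but the gap between the two clusters dominates the correlation of the combined set. This is one way a full-sample value can sit far above the values computed on each half.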

Added by Vincent Granville on June 2, 2019 at 7:30am — 1 Comment
