Vincent Granville's Blog (1,133)

The First Things you Should Learn as a Data Scientist - Not what you Think

The list below is a (non-comprehensive) selection of what I believe should be taught first in data science classes, based on 30 years of business experience. This is a follow-up to my article Why logistic regression should be taught last.

I am not sure whether the topics below are even discussed in data camps or college…

Added by Vincent Granville on May 24, 2018 at 1:00pm — No Comments

Thursday News: Logistic Regression, AI, R, NLP, ML, Courses, Books

Here is our selection of featured articles and resources posted since Monday:

Featured Resources

Added by Vincent Granville on May 24, 2018 at 8:00am — No Comments

From Petabytes to Nanobits, with Application to Blockchain

It is hard to imagine that some data element could contain less information than a bit (a digit equal to either 0 or 1). Yet examples are abundant. Indeed, I am wondering if we should create a unit of information called microbit, or nanobit.

The first examples that come to my mind are some irrational numbers such as Pi: its digits are widely believed to be indistinguishable from pure noise, thus carrying essentially no information. While there is not enough data storage in the…

Added by Vincent Granville on May 21, 2018 at 8:00am — No Comments

Why Logistic Regression should be the last thing you learn when becoming a Data Scientist

I recently read a very popular article entitled 5 Reasons “Logistic Regression” should be the first thing you learn when becoming a Data Scientist. Here I provide my opinion on why this should not be the case.

It is nice to have logistic regression on your resume, as many jobs request it, especially in some fields such as biostatistics. And if you learned the details during your college classes, good for you. However, for a beginner, this is not the first thing you should…

Added by Vincent Granville on May 20, 2018 at 7:00pm — 5 Comments

Weekly Digest, May 21

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

Announcements
  • Join 4,000 members of the Apache Spark™ community in San Francisco June 4th-6th for Spark + AI Summit, the conference for data scientists…

Added by Vincent Granville on May 19, 2018 at 12:00pm — No Comments

Thursday News: AI, R, Pandas, Machine Unlearning, IoT, GDPR, Black-Box Models

Here is our selection of featured articles and resources posted since Monday:

Added by Vincent Granville on May 17, 2018 at 8:30am — No Comments

From Machine Learning to Machine Unlearning

After all, the term Machine Learning was coined based on the way the human (or animal) brain learns, meaning that somehow, machines could also benefit from a similar kind of learning. 

But human beings, successful ones for sure, know how to un-learn. In my case, while I have been fascinated by mathematics since my very early years, the school system's training (as in training an algorithm in ML) failed me. It failed not because I did not succeed at school (I…

Added by Vincent Granville on May 16, 2018 at 4:06pm — No Comments

19 Interesting Articles About Excel

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, Hadoop, decision trees, ensembles, correlation, outliers, regression, Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, time series, cross-validation, model fitting, dataviz, AI and many more. To keep receiving these articles, …

Added by Vincent Granville on May 15, 2018 at 6:00pm — No Comments

Bill Vorhies Retrospective: Part 1

Bill is the Editorial Director for Data Science Central, and President and Chief Data Scientist at Data-Magnum, providing predictive analytics and big data infrastructure projects as a service. Bill has been an active commercial predictive modeler since 2001.…

Added by Vincent Granville on May 15, 2018 at 12:00pm — No Comments

Understanding the limits of deep learning

This article has been moved here

Added by Vincent Granville on May 15, 2018 at 5:30am — No Comments

How Python fares as a data science language?

Originally posted by Vaishnavi Agrawal.

Did you know that Python’s usage in data science applications rose 51% in 2015? Did you know that YouTube is built heavily on Python, with over a million lines of code? Tech visionaries predict that Python might soon overtake R and may well become the most popular language in the data science industry. R is a language dedicated to statistics and data…

Added by Vincent Granville on May 13, 2018 at 2:59pm — No Comments

Weekly Digest, May 14

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

Announcements
  • SQL + Notebooks + Charts. All in one platform. …

Added by Vincent Granville on May 13, 2018 at 5:00am — No Comments

Selection of Great Data Science Articles still Worth Reading

These articles are between 3 and 5 years old, but are still valuable today. Their methodology is modern and remains state of the art. Some discuss immense data sets that are still publicly available and that led to the design of new machine learning techniques to handle them.

I am in the process of organizing these articles (written by myself) to eventually self-publish data science tutorials, in a few separate booklets, that are easy to understand for the…

Added by Vincent Granville on May 12, 2018 at 4:30pm — No Comments

Deep Dive into Polynomial Regression and Overfitting

In this article, we show that the issue with polynomial regression is not over-fitting, but numerical precision. Even when the regression is done right, numerical precision remains an insurmountable challenge. We focus here on step-wise polynomial regression, which is supposed to be more stable than the traditional model. In step-wise regression, we estimate one coefficient at a time, using the classic least squares technique. …
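
To make the "one coefficient at a time" idea concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the procedure from the full article: the function name `stepwise_poly_fit`, the choice of plain powers of x as the basis, and the sample data are all hypothetical. At each step, the current residual is regressed on a single power of x by ordinary least squares, and the fitted term is then subtracted from the residual.

```python
import numpy as np

def stepwise_poly_fit(x, y, degree):
    """Hypothetical helper: estimate polynomial coefficients one at a time.

    At step k, the current residual is regressed on x**k alone using
    one-variable least squares, then the fitted term is removed.
    """
    x = np.asarray(x, dtype=float)
    residual = np.asarray(y, dtype=float).copy()
    coeffs = []
    for k in range(degree + 1):
        basis = x ** k
        # Closed-form least squares for a single regressor:
        # c = <basis, residual> / <basis, basis>
        c = float(basis @ residual) / float(basis @ basis)
        coeffs.append(c)
        residual = residual - c * basis
    return np.array(coeffs), residual

# Made-up sample data: a quadratic observed without noise.
x = np.linspace(-1.0, 1.0, 50)
y = 1.0 + 2.0 * x - 0.5 * x ** 2
coeffs, residual = stepwise_poly_fit(x, y, degree=2)
print(coeffs)
print(float(np.sum(residual ** 2)))
```

Because the powers of x are not orthogonal to each other, a single pass like this generally does not reproduce the coefficients of a joint least squares fit. It only guarantees that the sum of squared residuals does not increase from one step to the next.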

Added by Vincent Granville on May 9, 2018 at 5:30pm — 5 Comments

Weekly Digest, May 7

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.

Announcements
  • Calculating ROI for Your Investment in Data: When it comes to ROI, it always seems to be easier said than done. The reality of measuring the return on…

Added by Vincent Granville on May 5, 2018 at 6:30am — No Comments

Statisticians, like artists, have the bad habit of falling in love with their models

This is a quote by George E. P. Box. Share if you like it.

In short, all models are approximations. All models are wrong, but some are useful. George E. P. Box (18 October 1919 – 28 March 2013) was a British statistician who worked in the areas of quality control, time-series analysis, design of experiments, and Bayesian inference. He has been called "one of the great statistical minds of the 20th…

Added by Vincent Granville on May 4, 2018 at 10:28am — No Comments

Showcase your Data Science Expertise, and Learn New Tricks from Pros

Share your knowledge with other professionals and be respected as an expert in the leading community for data science, stats, BI, operations research, and machine learning practitioners. Or find answers to your business, technical, or career questions. We have thousands of questions posted in our revamped forum section, covering all topics and usually related to applications. You can reply, contact the authors, post a comment, or ask a new question.

Lists of 160 popular questions (with…

Added by Vincent Granville on May 4, 2018 at 8:30am — No Comments

Thursday News: AI, ML, Neural Networks, Decision Trees, NLP, SVM, R, Python...

Here is our selection of featured articles and resources posted since Monday:

Articles

Added by Vincent Granville on May 3, 2018 at 8:30am — No Comments

What do mathematicians mean by good math and bad math?

This is a popular question recently posted on Quora, with my answer viewed more than 8,000 times so far. I am re-posting it here. This post is much more detailed than my initial answer.

-------

My answer may appear sarcastic; after all, I am a math PhD and have published in journals such as the Journal of Number Theory. But I left academia long ago, yet I am still doing what I think is ground-breaking research in…

Added by Vincent Granville on May 2, 2018 at 1:30pm — 2 Comments
