Vincent Granville's Blog (811)

Weekly Digest, February 20

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.

Upcoming DSC Webinar

Added by Vincent Granville on February 18, 2017 at 10:30am — No Comments

Twilight Zone Between True and False

Recently we have read a lot about fake news, alternative facts, and journalistic lies. Companies like Facebook develop data science algorithms to detect these postings, based among other things on crowdsourcing (collective intelligence).

But can the data scientist, with her inquisitive mind and strong sense of numbers and probabilities, use her brain to assess how true a piece…

Added by Vincent Granville on February 16, 2017 at 4:30pm — No Comments

Thursday News: ML, Data Engineering, Python, Model Selection, AI

Here is our selection of featured articles and resources posted since Monday:

Added by Vincent Granville on February 16, 2017 at 9:00am — No Comments

The Mathematics of Machine Learning

Guest blog post by Wale Akinfaderin, PhD Candidate in Physics. 

In the last few months, I have had several people contact me about their enthusiasm for venturing into the world of data science and using Machine Learning (ML) techniques to probe statistical regularities and build impeccable data-driven products. However, I've observed that some actually lack the necessary mathematical intuition and…

Added by Vincent Granville on February 15, 2017 at 8:00pm — 1 Comment

23 types of regression

This contribution is from David Corliss. David teaches a class on this subject, giving a (very brief) description of 23 regression methods in just an hour, with an example and the package and procedures used for each case. 

Here you can check the webcast done for Central Michigan University. The slide deck can be found…

Added by Vincent Granville on February 13, 2017 at 5:00pm — 1 Comment

Weekly Digest, February 13

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.

Announcement

  • Marketing Analytics and Data Science 2017, April 3-5, 2017, JW Marriott Union Square, San Francisco, CA -- Empower yourself to become more valuable in your…

Added by Vincent Granville on February 11, 2017 at 2:00pm — No Comments

What is Regression Analysis?

Guest blog by Kevin Gray. Kevin is president of Cannon Gray, a marketing science and analytics consultancy.

Regression is arguably the workhorse of statistics. Despite its popularity, however, it may also be the most misunderstood. Why? The answer might surprise you: There is no such thing as Regression. Rather, there are a large number of statistical methods that are called…

Added by Vincent Granville on February 10, 2017 at 1:00pm — No Comments

t-SNE algo in R and Python, made with same dataset

t-Distributed Stochastic Neighbor Embedding (t-SNE) is a (prize-winning) technique for dimensionality reduction that is particularly well suited for the visualization of high-dimensional datasets. The technique can be implemented via Barnes-Hut approximations, allowing it to be applied on large real-world datasets.…

Added by Vincent Granville on February 10, 2017 at 12:30pm — No Comments

State-of-the-Art Machine Learning Automation with HDT

In this article, we discuss a general machine learning technique to make predictions or score transactional data, applicable to very big, streaming data. This hybrid technique combines different algorithms to boost accuracy, outperforming each algorithm taken separately, yet it is simple enough to be reliably automated. It is illustrated in the context of predicting the performance of articles published in media outlets or blogs, and has been used by the author to build an AI (artificial…

Added by Vincent Granville on February 9, 2017 at 10:00pm — 1 Comment

Thursday News: AI, Data Cleaning, R, Outliers, Machine Learning, DataViZ

Here is our selection of featured articles and resources posted since Monday.

Added by Vincent Granville on February 9, 2017 at 10:00am — No Comments

JavaScript Library for Plotting Water Data for the Nation

A new JavaScript library, called GWIS (Graphing Water Information System), can create time-series plots of information measured at U.S. Geological Survey hydrologic data collection sites across the United States.

Developed by the USGS Texas Water Science Center, the user-friendly interface integrates the open-source dygraphs JavaScript charting library with hydrologic data…

Added by Vincent Granville on February 8, 2017 at 9:08am — No Comments

10 Articles and Tutorials about Outliers

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, Hadoop, decision trees, ensembles, correlation, outliers, Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, time series, cross-validation, model fitting, and many more. To keep receiving these articles, …

Added by Vincent Granville on February 7, 2017 at 10:00am — No Comments

Machine Learning Summarized in One Picture

Here is a nice summary of traditional machine learning methods, from Mathworks.

I also decided to add the picture below, as it illustrates a method that was very popular 30 years ago but that seems to have been forgotten…

Added by Vincent Granville on February 5, 2017 at 10:00pm — 5 Comments

Weekly Digest, February 6

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.

Announcement…

Added by Vincent Granville on February 4, 2017 at 11:30am — No Comments

Plotting Multiple Columns in D3

Guest blog post by Brian Back.

Of the wide range of things you can do with D3, one of the best to make is still the time series plot. In this post, I’ll walk through the basics of making a multi-column point plot/scatter plot. We’ll use a GISS dataset from NASA; the dataset can be found …

Added by Vincent Granville on February 4, 2017 at 11:00am — No Comments

Thursday News: Deep Learning, Python, Outliers, Regression, Data Sets

Here is our selection of featured articles and resources posted since Monday.

Added by Vincent Granville on February 2, 2017 at 10:00am — No Comments

Distribution of Arrival Times for Extreme Events

Most articles on extreme events focus on the extreme values. Very little has been written about the arrival times of these events. This article fills the gap.

We are interested here in the distribution of arrival times of successive records in a time series, with potential applications to global warming assessment, sports analytics, or high-frequency trading. The purpose here is to discover what the distribution of these arrival times is, in the absence of any…
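
To make the notion of "arrival times of successive records" concrete, here is a minimal Python sketch. It is not taken from the article: it assumes independent, identically distributed observations with no trend (an assumption of this sketch, since the excerpt is cut off), and the function names are hypothetical. Under that assumption, the n-th observation is a record with probability 1/n, so the expected number of records among n observations is the harmonic number 1 + 1/2 + ... + 1/n, which the simulation should approximately reproduce.

```python
import random


def record_times(series):
    """Return the 1-based positions at which a new maximum (a record) occurs."""
    times = []
    best = float("-inf")
    for position, value in enumerate(series, start=1):
        if value > best:
            best = value
            times.append(position)
    return times


def mean_record_count(n, trials, seed=0):
    """Average number of records over simulated i.i.d. uniform series of length n.

    The i.i.d. uniform model is an assumption of this sketch.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        series = [rng.random() for _ in range(n)]
        total += len(record_times(series))
    return total / trials


def harmonic(n):
    """Harmonic number H_n: expected record count for n i.i.d. observations."""
    return sum(1.0 / k for k in range(1, n + 1))


if __name__ == "__main__":
    # Records in a small hand-made series occur at positions 1, 3, 5 and 6.
    print(record_times([3, 1, 4, 1, 5, 9, 2, 6]))
    # The simulated average should be close to the harmonic number.
    print(mean_record_count(200, trials=2000), harmonic(200))
```

Because records become rarer as the series grows, the gaps between successive arrival times tend to lengthen. A series with a trend would be expected to depart from this baseline.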

Added by Vincent Granville on February 1, 2017 at 7:30pm — 1 Comment

Data Science in Python: Pandas Cheat Sheet

This cheat sheet, along with explanations, was first published on DataCamp. Click on the picture to zoom in. To view other cheat sheets (Python, R, Machine Learning, Probability, Visualizations, Deep Learning, Data Science, and so on) click here. …

Added by Vincent Granville on February 1, 2017 at 11:06am — No Comments

26 Great Articles and Tutorials about Regression Analysis

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, outliers, Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on…

Added by Vincent Granville on February 1, 2017 at 10:30am — No Comments

Will Trump Kill Statisticians' Jobs?

Today Trump met with leaders of pharmaceutical companies, to discuss “astronomical” drug prices and reduce regulations, so that drug companies can still make hefty profits while charging less for drugs. The motivation could be to keep the costs of healthcare down to facilitate the…

Added by Vincent Granville on January 31, 2017 at 8:00pm — 3 Comments

© 2017 Data Science Central