Vincent Granville's Blog – March 2016 Archive (21)

Spectral Clustering – How Math is Redefining Decision Making

Guest blog post by Gaurav Agrawal, COO at Soothsayer Analytics.

In today’s world of big data and the internet of things, it is common for a business to find itself sitting atop a mountain of data. Possessing it is one thing, but leveraging it for data-driven decision making is a very different ball game. Gut feelings and institutionalized heuristics have traditionally been used to guide development of…

Added by Vincent Granville on March 31, 2016 at 12:00pm — 1 Comment

Weekly Digest, April 4

Starred articles are new additions or updated content, posted between Thursday and Sunday. The weekly digest has six sections: (1) Featured Resources and Technical Contributions, (2) Featured Articles and Case Studies, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content.

The full version is always published Monday.…

Added by Vincent Granville on March 30, 2016 at 2:00pm — No Comments

Variance, Clustering, and Density Estimation Revisited

Introduction

We propose here a simple, robust and scalable technique to perform supervised clustering on numerical data. It can also be used for density estimation, and even to define a concept of variance that is scale-invariant. This is part of our general statistical framework for data science. Previous articles included in this series are:…

Added by Vincent Granville on March 29, 2016 at 9:00pm — 1 Comment

15 Most Controversial Data Science Articles

These articles were controversial in the sense that they highlighted the differences between data science and other disciplines, at a time when many believed that data science was just old stuff being re-branded, or being practiced by people knowing nothing about statistics. Ironically, some of the old stuff actually re-branded itself as data science, not the other way around.…

Added by Vincent Granville on March 24, 2016 at 6:30pm — No Comments

R, Python, Machine Learning, Dataviz: Most Popular Resources

A simple way to find great articles and resources on popular subjects such as data science, machine learning, deep learning, Python, R, data sets, dataviz, IoT, AI - or even Excel - is to use our data science search engine. This page, populated with pre-selected queries, is an excellent starting point. The search box can be found on DSC and all our channels, on all pages,…

Added by Vincent Granville on March 24, 2016 at 9:00am — No Comments

Weekly Digest, March 28

Starred articles are new additions or updated content, posted between Thursday and Sunday. The weekly digest has six sections: (1) Featured Resources and Technical Contributions, (2) Featured Articles and Case Studies, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content.

The full version is always published Monday.…

Added by Vincent Granville on March 23, 2016 at 10:00am — No Comments

Big Data and Data Science. Some reflections on compensation levels

Guest blog post by Harry Powell, Head of Advanced Data Analytics at Barclays.

I was at a meetup in Oxford recently and one of the speakers, the CEO of a tech start-up, brought up the subject of Data Scientists’ pay. Apparently they are paid too much. I am not sure whether the data supports this assertion, but it seems to be a common complaint amongst highly-paid CEOs. What…

Added by Vincent Granville on March 22, 2016 at 5:30pm — 2 Comments

Biased vs Unbiased: Debunking Statistical Myths

Anyone who attended statistical training at the college level has been taught the four rules that you should always abide by when developing statistical models and predictions:

  1. You should only use unbiased estimates
  2. You should use estimates that have minimum variance
  3. In any optimization problem (for instance to compute an estimate from a maximum likelihood function, or to detect the best, most predictive subset of variables), you should always shoot for a…

Added by Vincent Granville on March 19, 2016 at 4:30pm — 3 Comments

Mars Craters: An Interesting Stochastic Geometry Problem

Impact craters are distributed randomly on Mars and many other celestial bodies. Their radii most likely follow an exponential distribution. By estimating the mean of the exponential distribution in question, selecting 100 random locations, and determining how many lie in (at least) one crater, you can determine the age of the celestial body.

This…
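As a rough illustration of the idea in this excerpt (not the method from the full article), here is a minimal Python sketch. It assumes crater centers form a homogeneous Poisson process and that craters are disks with exponentially distributed radii, so the expected fraction of covered locations is p = 1 - exp(-intensity * pi * E[R^2]), with E[R^2] = 2 * mean_radius^2. The function names, the square window, and all parameter values are hypothetical. Turning the estimated crater intensity into an age would further require an assumed impact rate, which the excerpt does not give.

```python
import math
import random


def poisson_count(rng, mean):
    """Draw a Poisson(mean) count by counting unit-rate arrivals in [0, mean]."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def covered_fraction(intensity, mean_radius, n_points=100, size=100.0, seed=0):
    """Fraction of random locations lying inside at least one simulated crater."""
    rng = random.Random(seed)
    # Craters centered slightly outside the window can still cover points inside it.
    margin = 10.0 * mean_radius
    lo, hi = -margin, size + margin
    n_craters = poisson_count(rng, intensity * (hi - lo) ** 2)
    craters = [
        (rng.uniform(lo, hi), rng.uniform(lo, hi), rng.expovariate(1.0 / mean_radius))
        for _ in range(n_craters)
    ]
    hits = 0
    for _ in range(n_points):
        x, y = rng.uniform(0.0, size), rng.uniform(0.0, size)
        if any((x - cx) ** 2 + (y - cy) ** 2 <= r * r for cx, cy, r in craters):
            hits += 1
    return hits / n_points


def estimate_intensity(fraction, mean_radius):
    """Invert p = 1 - exp(-intensity * pi * E[R^2]), with E[R^2] = 2 * mean_radius^2."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")
    return -math.log(1.0 - fraction) / (2.0 * math.pi * mean_radius ** 2)


if __name__ == "__main__":
    # Hypothetical values: 0.05 craters per unit area, mean radius 1, 100 test locations.
    p = covered_fraction(intensity=0.05, mean_radius=1.0, n_points=100)
    print("covered fraction:", p)
    print("estimated intensity:", estimate_intensity(p, mean_radius=1.0))
```

With only 100 locations the estimate is noisy; increasing `n_points` tightens it at the cost of more distance checks.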

Added by Vincent Granville on March 17, 2016 at 5:00pm — 2 Comments

Weekly Digest, March 21

Starred articles are new additions or updated content, posted between Thursday and Sunday. The weekly digest has six sections: (1) Featured Resources and Technical Contributions, (2) Featured Articles and Case Studies, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content.

The full version is always published Monday.…

Added by Vincent Granville on March 16, 2016 at 3:00pm — No Comments

14 Timeless Reference Books

These books have been published and re-published in the last 10 years. Most of them are encyclopedias, yet they are extremely useful resources for the data science beginner or expert. Just like top restaurants, they come with a steep price. I did not include two encyclopedias (each with 10+ volumes) that sell for over $5,000 because I felt they were truly overpriced. Also, I have just published a new book, entitled…

Added by Vincent Granville on March 15, 2016 at 7:00pm — 1 Comment

43 New External Machine Learning Resources and Updated Articles

Starred articles are candidates for the picture of the week. A comprehensive list of all past resources is found here. We are in the process of automatically categorizing them using indexation and automated tagging…

Added by Vincent Granville on March 14, 2016 at 5:00pm — 1 Comment

How to Use Cohort Data to Analyze User Behavior

Guest blog post by Jacob Joseph, originally posted here

In the world of data analysis, one tool is often left unused. While being a very powerful analytics tool, cohorts are often…

Added by Vincent Granville on March 13, 2016 at 8:40pm — No Comments

What Types of Questions Can Data Science Answer?

Guest blog post, authored by Brandon Rohrer, Senior Data Scientist at Microsoft. Originally posted here

Machine learning (ML) is the motor that drives data science. Each ML method (also called an algorithm) takes in data, turns it over, and spits out an answer. ML…

Added by Vincent Granville on March 12, 2016 at 7:00pm — 1 Comment

Performance From Various Predictive Models

Guest blog post by Dalila Benachenhou, originally posted here. Dalila is a Professor at George Washington University. In this article, benchmarks were computed on a specific data set, for Geico Calls Prediction, comparing Random Forests, Neural Networks, SVM, FDA, K Nearest Neighbors, C5.0…

Added by Vincent Granville on March 11, 2016 at 10:30am — 1 Comment

Weekly Digest, March 14

Starred articles are new additions or updated content, posted between Thursday and Sunday. The weekly digest has six sections: (1) Featured Resources and Technical Contributions, (2) Featured Articles and Case Studies, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content.

The full version is always published Monday.…

Added by Vincent Granville on March 9, 2016 at 11:00am — No Comments

The Death of the Statistical Tests of Hypotheses

Some foundations of statistical science have been questioned recently, especially the use and abuse of p-values. See also this article published in FiveThirtyEight.com. Statistical tests of…

Added by Vincent Granville on March 8, 2016 at 1:00pm — 9 Comments

MIT Algorithm Predicts Rogue Waves in Real Time to Save Lives

Using AI and data science, an MIT team was able to accurately predict rogue waves coming out of the blue in the middle of the ocean, in near real time, to help sailors change their navigation path and avoid destruction and death. Rogue waves, while rare, are unpredictable, tall (up to 100 feet) and devastating. The physical mechanism producing these waves is well understood, and is typically modeled using rotating elements.…

Added by Vincent Granville on March 3, 2016 at 6:00pm — 3 Comments

Math or Engineers -- who will solve the big data problem?

Guest blog post by Charlie Silver, CEO of Algebraix Data. Originally entitled 'Data Algebra Does Big Data'.

Algebra is powerful. It enables people to solve for unknowns and frame problems in ways that are universally understandable. For the same reason, data algebra is powerful. Why? Because it can represent data – all data – mathematically.

What is Data Algebra (and when do I use…

Added by Vincent Granville on March 3, 2016 at 10:30am — 8 Comments

Weekly Digest, March 7

Starred articles are new additions or updated content, posted between Thursday and Sunday. The weekly digest has six sections: (1) Featured Resources and Technical Contributions, (2) Featured Articles and Case Studies, (3) From our Sponsors, (4) News, Events, Books, Training, Forum Questions, (5) Picture of the Week, and (6) Syndicated Content.

The full version is always published Monday.…

Added by Vincent Granville on March 2, 2016 at 2:00pm — No Comments

© 2019 Data Science Central ®