Unlike most other lists of top experts, this one is a hand-picked selection, not based on influence or Klout scores, or the number of Twitter followers and re-tweets, or other similar metrics. Each of these experts has his/her own Wikipedia page. Some might not even have a Twitter account. All of them have had a very strong academic and research career in the most prestigious…

The following 50 companies each received between $77 MM and $1,200 MM in funding, for a combined total of nearly $9 billion. We scraped Yahoo Finance, press releases, Wikipedia, and other sources to gather this data. If you include all big data companies with at least $10 MM in funding, the grand total is about $18 billion.

Of course, this is a typical example of a Zipf…
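The excerpt is cut off, but assuming it refers to a Zipf (rank-size) distribution, one rough way to check for it is to regress log(amount) on log(rank) and read off the slope. The sketch below is illustrative only: the `zipf_exponent` helper is hypothetical, and the amounts are synthetic, not the actual funding figures.

```python
import math

def zipf_exponent(values):
    """Estimate s in value ~ C / rank**s via least squares on log-log data."""
    ranked = sorted(values, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(v) for v in ranked]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    # The fitted slope is negative for a decaying rank-size curve; flip the sign.
    return -sxy / sxx

# Synthetic amounts (in $ MM), generated as 1200 / rank**0.7 for illustration only.
synthetic_funding = [1200 / rank ** 0.7 for rank in range(1, 51)]
print(round(zipf_exponent(synthetic_funding), 2))
```

On real data the points will not fall exactly on a line, so the fitted exponent is only a summary; plotting the log-log curve is a sensible sanity check.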

Added by L.V. on January 20, 2016 at 8:30pm — No Comments

This article has two parts:

- Listing the top 20 experts, along with their Twitter handle, rank in reverse order, number of Twitter followers, and Klout score. We hope to soon see a woman among the top 10. The top woman is currently #11.
- Discussing a robust methodology to score experts

*Source for…*

Added by L.V. on January 15, 2016 at 3:30pm — 7 Comments

This was the subject of a popular discussion recently posted on Quora: *20 questions to detect a fake data scientist*. We asked our own data scientist, and he came up with a very different set of questions: compare his answer (#1 below - 20 questions) with Quora replies (#2 and #3 below - 30 questions). Note that #2 focuses on statistics, and #3 on architecture. The link to the original Quora…

Added by L.V. on January 8, 2016 at 10:30am — 9 Comments

Added by L.V. on December 24, 2015 at 9:30am — 2 Comments

First, we list the predictions from our data scientist in residence. Then we provide predictions from leading data scientists. We invite you to add your own predictions in the comments section below.

**Predictions from our data scientist in…**

We asked our staff data scientist what motivates him, and here's what he said:

**My passions**:

- Data Science research but not in an academic or corporate environment.
- Developing new, synthetic metrics (to measure yield or for data reduction), and robust, simple, scalable techniques to handle big, unstructured, messy, flowing data -- avoiding the curse of big data.
- Offering awards to winners in our competitions.
- Delivering…

Added by L.V. on December 13, 2015 at 2:30pm — 5 Comments

Read the questions. At the bottom, you will find a link to the answers.

**The Questions**

**First Set**

- Explain what R is.
- List some of the functions that R provides.
- Explain how you can start the R commander…

Added by L.V. on December 6, 2015 at 9:00am — 2 Comments

Various outlets publish high-quality articles about data science, analytics, big data, machine learning and related fields. These outlets can be broken down into the following categories:

- Professional associations: ASA (Amstat News), IEEE/Spectrum, Informs
- Corporate blogs and magazines: IBM big data hub, Pivotal, Teradata, Tableau
- Niche publishers: Data Science Central (check our…

Added by L.V. on December 6, 2015 at 9:00am — No Comments

This was the subject of a question asked on Quora: What are the top 10 data mining or machine learning algorithms?

Some modern algorithms such as collaborative filtering, recommendation engine, segmentation, or attribution modeling, are missing from the lists below. Algorithms from graph theory (to find the shortest path in a graph, or to detect connected components),…

Added by L.V. on December 6, 2015 at 9:00am — 4 Comments

Tutorials, books, articles, data sets, certifications, you name it. All about data science, machine learning and related topics. You can find them with a simple keyword search: enter the keyword "free" in the DSC's search box, and here are the results.

Below is a screenshot of the DSC search results page, for the keyword "free".…

This was a question recently posted on Quora: What are the best data science podcasts? Users recommend the following ones:

- Talking Machines, 12 episodes, …

Added by L.V. on December 2, 2015 at 5:00pm — No Comments

Interesting infographics produced by DataCamp.com, an organisation offering R and data science training. Click here to see the original version. I would add that one of the core competencies of the data scientist is to automate the process of data analysis, as well as to create applications that run automatically in the background, sometimes in…

Added by L.V. on November 17, 2015 at 5:30pm — No Comments

Since *data scientist* is a senior job title that comes with significant experience, even expertise, how can you become a data scientist fresh out of college, or from Coursera classes, or from some data science boot camp?

I believe you indeed learn data science on the job. Employers have been fooled into hiring people with R, Python, SQL, NoSQL, some outdated statistical knowledge, and little work on a small academic data project - who claim to be data scientists. There's even…

Added by L.V. on November 8, 2015 at 9:30am — 2 Comments

Anyone interested in categorizing them? It could be an interesting data science project: scraping these websites, extracting keywords, and categorizing them with a simple indexation or tagging algorithm. For instance, some of these blogs focus on stats, or Bayesian stats, or R libraries, or R training, or visualization, or anything else. This indexation technique…
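As a rough illustration of the tagging idea, here is a minimal Python sketch. It assumes the blog text has already been scraped into plain strings; the category names, keyword lists, and the `tag_blog` helper are all hypothetical and chosen for illustration, not taken from the article.

```python
import re
from collections import Counter

# Hypothetical category -> keyword map (an assumption, not from the article).
CATEGORY_KEYWORDS = {
    "bayesian": {"bayesian", "prior", "posterior", "mcmc"},
    "r": {"cran", "ggplot2", "dplyr", "rstudio"},
    "visualization": {"chart", "plot", "dashboard", "visualization"},
}

def tag_blog(text, min_hits=2):
    """Return the categories whose keywords occur at least min_hits times."""
    # Lowercase the text and count alphanumeric tokens.
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {
        category: sum(words[kw] for kw in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    return sorted(c for c, s in scores.items() if s >= min_hits)

sample = "A ggplot2 plot tutorial: build a chart in RStudio, then a dashboard."
print(tag_blog(sample))
```

A fixed keyword map is the simplest possible tagger. A real project would more likely derive keywords from the scraped corpus itself, for example by term frequency, rather than hard-coding them.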

Added by L.V. on November 7, 2015 at 2:30pm — 2 Comments

From external sources. For more articles, click here. The chart below is from the article flagged with a +.

- What’s the probability that a significant p-value indicates a true effect? - If the *p*-value is < .05,…
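One common way to frame the question in the item above is the positive predictive value: the share of significant results that reflect a true effect. It depends on the prior probability that the tested effect is real and on the statistical power of the test. The sketch below is a standard Bayes-rule calculation, not necessarily the approach of the linked article; the `prob_true_effect` name and the default values for power and alpha are assumptions.

```python
def prob_true_effect(prior, power=0.8, alpha=0.05):
    """P(effect is real | p < alpha), by Bayes' rule.

    prior: assumed probability that the tested effect is real.
    power: assumed probability of a significant result when the effect is real.
    alpha: significance threshold, i.e. the false positive rate under the null.
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

# With an assumed 10% prior and 80% power, a significant result
# corresponds to a true effect about 64% of the time.
print(round(prob_true_effect(0.1), 3))
```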

Added by L.V. on November 7, 2015 at 2:00pm — No Comments

Books about the R programming language fall into different categories:

- Learning R
- Reference books for the professional R programmer
- Books about data science or visualization, using R to illustrate the concepts

Books are a great way to learn a new programming language. Code samples are another great tool to start learning R, especially if you already use a different…

Added by L.V. on November 6, 2015 at 8:30am — 3 Comments

Most data scientists spend some amount of time coding in R, Python, SQL or other languages. Because these programming languages (unlike Perl) are not flexible, your code will always contain syntax bugs, until someone improves these languages to the point that source code self-corrects before executing. Algorithm bugs, as opposed to syntax errors, are of course much more challenging.

Anyway, enjoy this funny T-Shirt available…

Added by L.V. on November 1, 2015 at 2:30pm — No Comments

*Another great article by Bernard Marr. Bernard Marr is a best-selling business author, keynote speaker and consultant in strategic performance, analytics, KPIs and big data. He is one of the world's most highly respected voices when it comes to data in business. His leading-edge work with major companies, organisations and governments across the globe makes him a globally acclaimed and…*

Added by L.V. on November 1, 2015 at 2:30pm — No Comments


