The first one is about the difference between Data Science, Data Analysis, Big Data, Data Analytics, and Data Mining:

According to a tweet, the source for this one is onthe.io. I could not find the article in question, though the website is very interesting. Anyway, I love the above picture,…

Added by L.V. on September 29, 2015 at 4:30pm — 4 Comments

Of course, each data scientist is different, so please take these criticisms with a grain of salt. They do not apply to all data scientists, not by a long stretch.

**Reasons**:

- Data scientists are creative and bring disruptive IP (intellectual property), and this can cause havoc for their company.
- They can steal and…

Interesting article by Regina Nuzzo, published on Nature.com. Indeed, it's not just *p*-values that are being questioned, but even the Fisher-Neyman-Pearson (FNP) paradigm and the concept of maximum likelihood estimates (MLE).

Here's an extract published on the American Statistical…

Added by L.V. on September 24, 2015 at 12:00pm — No Comments

We've already published the top big data presentations on SlideShare, as well as a great GitHub list of public data sets, or…

Added by L.V. on September 20, 2015 at 1:30pm — No Comments

Very interesting compilation published here, with a strong machine learning flavor (maybe machine learning book authors, usually academics, are more prone to making their books available for free). Many are O'Reilly books that are freely available. Here we display those most relevant to data science. I haven't checked all the sources, but they seem legit. If you find an issue, let us know in the…

Added by L.V. on September 19, 2015 at 9:00am — 4 Comments

Many products or published articles based on data science are heavily regulated, and illegal to perform, publish, or sell without a special license, especially in the US. You may be doing research and development on a topic considered classified by the US government. Here are a few examples:

- Steganography (the art and science of hiding secret messages in…

Added by L.V. on September 13, 2015 at 11:00am — 2 Comments

Here's one of the main differences between data engineering and data science: ETL (Extract / Transform / Load) is for data engineers, or sometimes data architects or DBAs.

DAD (Discover / Access / Distill) is for data scientists. Sometimes data engineers do DAD, sometimes data scientists do ETL, but it's rather rare, and when they do it, it's purely internal…

Added by L.V. on September 6, 2015 at 3:30pm — No Comments

Here we compare statistics about two well-known top data science websites, 2015 vs. 2013. The 2013 data can be found here. Below are the same stats for these two web properties, as of today. From a methodology point of view, comparing two (or more) websites over two different time periods is much better than comparing just one website on…

Added by L.V. on September 5, 2015 at 3:30pm — No Comments

Here's a selection from Udacity's website. Initially, I intended to post questions from Google or Microsoft hiring managers and recruiters, but you can find these questions by doing a Google search, or…

Added by L.V. on September 5, 2015 at 12:00pm — No Comments

And for software engineers or data analysts as well, in random order:

**The list**:

- Not being able to work well in a team
- Being elitist
- Using jargon that stakeholders don't understand
- Being a perfectionist: in the business world, perfection is always associated with negative ROI: 20% of your…

Added by L.V. on September 3, 2015 at 3:30pm — 2 Comments

