Here I list my most interesting contributions published on Data Science Central. My plan is to categorize and aggregate this content to produce a few self-published books. The material below will always be available for free from this webpage, but the books will not be; if they are offered for free, it will be to members only. So you might want to bookmark this page.

I have also written a number of academic papers; you can find some of them here. Older articles from here, also here, and from past weekly digests will be added over time. Some older content can be found in my Wiley book.

*My home office, where I write most of my DSC articles*

The articles below are listed in reverse chronological order. This is a work in progress: I am still adding older entries, so check back in a few weeks! Some of these articles are also featured on my Twitter profile at @GranvilleDSC.

**1. Core Articles**

Technical

- Variance, Attractors and Behavior of Chaotic Statistical Systems
- New Family of Generalized Gaussian Distributions
- Gentle Approach to Linear Algebra, with Machine Learning Applications
- Confidence Intervals Without Pain
- Re-sampling: Amazing Results and Applications
- How to Automatically Determine the Number of Clusters in your Data - and more
- New Perspectives on Statistical Distributions and Deep Learning
- A Plethora of Original, Not Well-Known Statistical Tests
- New Decimal Systems - Great Sandbox for Data Scientists and Mathema...
- Are the Digits of Pi Truly Random?
- Data Science and Machine Learning Without Mathematics
- Advanced Machine Learning with Basic Excel
- State-of-the-Art Machine Learning Automation with HDT
- Tutorial: Neutralizing Outliers in Any Dimension
- The Fundamental Statistics Theorem Revisited
- Variance, Clustering, and Density Estimation Revisited
- The Death of the Statistical Tests of Hypotheses
- 4 Easy Steps to Structure Highly Unstructured Big Data, via Automat...
- The best kept secret about linear and logistic regression
- Black-box Confidence Intervals: Excel and Perl Implementation
- Jackknife and linear regression in Excel: implementation and compar...
- Jackknife logistic and linear regression for clustering and predict...

Business

- New Stock Trading and Lottery Game Rooted in Deep Math
- Time series, Growth Modeling and Data Science Wizardy
- How to Stabilize Data Systems, to Avoid Decay in Model Performance
- 22 Differences Between Junior and Senior Data Scientists
- The First Things you Should Learn as a Data Scientist - Not what yo...
- Difference between Machine Learning, Data Science, AI, Deep Learnin...
- 21 data science systems used by Amazon to operate its business
- Life Cycle of Data Science Projects
- 40 Techniques Used by Data Scientists
- Designing better algorithms: 5 case studies
- Architecture of Data Science Projects
- 24 Uses of Statistical Modeling (Part II) | (Part I)
- The ABCD's of Business Optimization
- What you won't learn in stats classes
- Biased vs Unbiased: Debunking Statistical Myths

**2. Blog Posts About Data Science**

Technical

- Bernouilli Lattice Models - Connection to Poisson Processes
- Simulating Distributions with One-Line Formulas, even in Excel
- Simplified Logistic Regression
- Simple Trick to Normalize Correlations, R-squared, and so on
- Simple Trick to Remove Serial Correlation in Regression Models
- A Beautiful Result in Probability Theory
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- One Trillion Random Digits
- New Perspective on the Central Limit Theorem and Statistical Testing
- Simple Solution to Feature Selection Problems
- Scale-Invariant Clustering and Regression
- Deep Dive into Polynomial Regression and Overfitting
- Stochastic Processes and New Tests of Randomness - Application to Cool Number Theory Problem
- A Simple Introduction to Complex Stochastic Processes - Part 2
- A Simple Introduction to Complex Stochastic Processes
- High Precision Computing: Benchmark, Examples, and Tutorial
- Logistic Map, Chaos, Randomness and Quantum Algorithms
- Graph Theory: Six Degrees of Separation Problem
- Interesting Problem for Serious Geeks: Self-correcting Random Walks
- 9 Off-the-beaten-path Statistical Science Topics with Interesting A...
- Data Science Method to Discover Large Prime Numbers
- Nice Generalization of the K-NN Clustering Algorithm - Also Useful for Data Reduction
- How to Detect if Numbers are Random or Not
- Which classifier has the best performance?
- Introduction to Principal Component Analysis
- How and Why: Decorrelate Time Series
- Distribution of Arrival Times of Extreme Events
- Why Zipf's law explains so many big data and physics phenomenons

Problems

- Some Irresistible Integrals, Computed Using Statistical Concepts
- Curious Mathematical Problem
- Another Off-the-beaten-path Data Science Problem
- Two More Math Problems: Continued Fractions, Nested Square Roots, D...
- Mathematical Olympiads for Undergrad Students
- Difficult Probability Problem: Distribution of Digits in Rogue Systems
- Little Stochastic Geometry Problem: Random Circles
- Question: Correlation Coefficient in Flat Line Model
- Question about Some Statistical Distributions
- Coefficient of Correlation for Non-Linear Relationships
- Paradox Regarding Random (Normal) Numbers
- Curious Mathematical Object: Hyperlogarithms
- 88 percent of all integers have a factor under 100
- Math Challenge: Computing the Average Rotational Speed of Earth

Business and General

- Covid-19 Modeling: Impact of Missing Data and Ignoring Key Features
- Common Errors in Machine Learning due to Poor Statistics Knowledge
- How to Lie with P-values
- Growth Modeling for Business Managers and Executives
- Unexpected Use of AI: Solving Complex Mathematical Problems
- 8 Tips to Leverage Analytics: Advice for Small (and Big) Businesses
- Four Types of Data Scientist
- New Directions in Cryptography
- Black Hat Data Science
- From Petabytes to Nanobits, with Application to Blockchain
- Preventing Cambridge Analytica and Others to Hack into Facebook Data
- Interesting Application of the Zipf Distribution: Data Purging
- 22 tips for better data science
- Machine Learning Algorithm to Trade Bitcoin
- How Mathematical Discoveries are Made
- How to Solve the New $1 Million Kaggle Problem - Home Value Estimates
- Detecting Fake News, Fake Reviews, Fake Accounts, Fake Pictures
- 10 Data Science, Machine Learning and IoT Predictions for 2017
- Modern Computational Advertising on Social Networks: The Basics
- Massive Internet Attack Floods the World with Fake Data
- Building an Algorithm to Break Strong Encryption
- Why so many Machine Learning Implementations Fail?
- MIT Algorithm Predicts Rogue Waves in Real Time to Save Lives
- What statisticians think about data scientists

**3. Other Blog Posts**

Mathematics

- New Probabilistic Approach to Factoring Big Numbers
- Simple Trick to Dramatically Improve Speed of Convergence
- State-of-the-Art Statistical Science to Tackle Famous Number Theory...
- New Perspective on Fermat's Last Theorem
- Fun Math: Infinite Nested Radicals of Random Variables - Connection with Fractals and Brownian Motions
- Surprising Uses of Synthetic Random Data Sets
- Two New Deep Conjectures in Probabilistic Number Theory
- Extreme Events Modeling Using Continued Fractions
- A Strange Family of Statistical Distributions
- Some Fun with Gentle Chaos, the Golden Ratio, and Stochastic Number...
- Fascinating New Results in the Theory of Randomness
- From Infinite Matrices to New Integration Formula
- New Mathematical Conjecture?
- Cool Problems in Probabilistic Number Theory and Set Theory
- Fractional Exponentials - Dataset to Benchmark Statistical Tests
- Two Beautiful Mathematical Results - Part 2
- Two Beautiful Mathematical Results
- Four Interesting Math Problems
- Number Theory: Nice Generalization of the Waring Conjecture
- Fascinating Chaotic Sequences with Cool Applications
- Representation of Numbers with Incredibly Fast Converging Fractions
- Yet Another Interesting Math Problem - The Collatz Conjecture
- Simple Proof of the Prime Number Theorem
- Factoring Massive Numbers: Machine Learning Approach
- Representation of Numbers as Infinite Products
- A Beautiful Probability Theorem
- Fascinating Facts and Conjectures about Primes and Other Special Nu...
- Three Original Math and Proba Challenges, with Tutorial
- Challenges of the week

Opinion

- Debunking Forbes Article about the Death of the Data Scientist
- Why You Should be a Data Science Generalist - and How to Become One
- Becoming a Billionaire Data Scientist vs Struggling to Get a $100k Job - What is the difference?
- Is a PhD helpful for a data science career?
- If data science is in demand, why is it so hard to get a job?
- Why do people with no experience want to become data scientists?
- Why is Becoming a Data Scientist so Difficult?
- Full Stack Data Scientist: The Elusive Unicorn and Data Hacker
- Statistical Significance and p-Values Take Another Blow
- Are data science or stats curricula in US too specialized?
- How do you identify an actual data scientist?
- Is it still possible today to become a self-taught data scientist?
- Will the job outlook for data scientists severely decline after 2020?
- Why Logistic Regression should be the last thing you learn when bec...
- What do mathematicians mean by good math and bad math?
- Two Questions to Ask to a PhD Candidate for a Leadership Role
- 5 Myths About PhD Data Scientists
- The Future of Scientific Publishing
- The Slow Decline of Google Search
- Can you be sued for using the wrong data?

General

- Six Degrees of Separation Between Any Two Data Sets
- 7 Simple Tricks to Handle Complex Machine Learning Issues
- From Machine Learning to Machine Unlearning
- Should you Add your Coursera, Udacity, or DataCamp Training in your...
- First Doctorship in Data Science
- Python Overtakes R for Data Science and Machine Learning
- Mars Craters: An Interesting Stochastic Geometry Problem
- The art and (data) science of leveraging economic bubbles
- Data Scientist Breaks State Monopoly on Lotteries
- The Largest Number Ever Created
- Seasons in Binary Star Planetary Systems
- Predicting the next Eclipse

**4. Guides and References**

- Free Book: Statistics - New Foundations, Toolbox, and Machine Learn...
- Free Book: Applied Stochastic Processes
- Comprehensive Repository of Data Science and ML Resources
- Sample Projects for Data Scientists in Training
- Number Representation Systems Explained in One Picture
- Data Science Cheat Sheet
- Data Science 2.0. (About automated data science; in progress)
- Developing Analytic Talent - Becoming a Data Scientist.
- 8 Deep Data Science Articles
- Hitchhiker's Guide to Data Science, Machine Learning, R, Python
- Deep Learning: Definition, Resources, Comparison with Machine Learning
- What is Data Science? 24 Fundamental Articles Answering This Question
- Answers to dozens of data science job interview questions
- 15 Most Controversial Data Science Articles

© 2020 Data Science Central ®

