*This article was written by Graph Commons.*

A common task for a data scientist is to identify clusters in a given data set. The idea is to simply find groups of objects that have more connections or similarities to one another than they do to outsiders. In the study of networks, we use clustering to recognize…

Added by Andrea Manero-Bastin on March 9, 2020 at 3:30am — No Comments

*This article was written by Jim Frost.*

Regression is a very powerful statistical analysis. It allows you to isolate and understand the effects of individual variables, model curvature and interactions, and make predictions. Regression analysis offers high flexibility but presents a variety of potential pitfalls. Great power requires great…

Added by Andrea Manero-Bastin on March 9, 2020 at 3:00am — No Comments

*This article was written by Ray.*

I read an article in Quanta Magazine (New theory cracks open the black box of deep learning) about a talk (see 18: Information Theory of Deep Learning, YouTube video) given a month or so ago by Professor Naftali (Tali) Tishby on his theory that all deep learning convolutional neural networks (CNN) exhibit an “information bottleneck”…

Added by Andrea Manero-Bastin on March 6, 2020 at 3:00am — No Comments

*This article was written by Vasudev.*

Let’s get started quickly. NumPy is a math library for Python. It enables us to do numerical computation efficiently. It is better suited to numerical work than regular Python because of its array-based capabilities.

In this article I’m just going to introduce you to the basics of what is mostly required for machine learning and…
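As a rough illustration of the kind of computation the teaser above refers to (a sketch that is not taken from the article itself, with made-up values), the snippet below computes a sum of squares once with a plain Python loop and once with a NumPy array expression, then multiplies two small matrices.

```python
import numpy as np

# Plain Python: sum of squares with an explicit loop.
values = [1.0, 2.0, 3.0, 4.0]
loop_result = 0.0
for v in values:
    loop_result += v * v

# NumPy: the same computation expressed on the whole array at once.
arr = np.array(values)
vector_result = float(np.sum(arr * arr))

# A 2x2 matrix product, a typical building block in machine learning code.
a = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([[0.0, 1.0], [1.0, 0.0]])
product = a @ b

print(loop_result, vector_result)
print(product)
```

Both approaches give the same sum; the array version avoids the explicit loop, which is usually where NumPy's speed advantage on large inputs comes from.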

Added by Andrea Manero-Bastin on March 1, 2020 at 12:00pm — 1 Comment

*This article was written by Prashant Gupta.*

One of the major aspects of training your machine learning model is avoiding overfitting. An overfit model will have low accuracy on data it has not seen. This happens because your model is trying too hard to capture the noise in your training dataset. By noise…
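To make the idea of fitting noise concrete, here is a minimal sketch (not from the article; the data, the hand-picked noise values, and the polynomial degrees are all assumptions chosen for illustration). It fits a straight line and a degree 9 polynomial to ten noisy points and compares their errors on the training points and on held-out points.

```python
import numpy as np

# Ten training points on the line y = 2x, plus small fixed "noise"
# (hand-picked values, assumed here purely for illustration).
x_train = np.linspace(0.0, 1.0, 10)
noise = np.array([0.10, -0.08, 0.05, -0.12, 0.09,
                  -0.04, 0.11, -0.07, 0.06, -0.10])
y_train = 2.0 * x_train + noise

# Held-out points halfway between the training points, on the noise-free line.
x_test = (x_train[:-1] + x_train[1:]) / 2.0
y_test = 2.0 * x_test

def mse(coeffs, x, y):
    """Mean squared error of the polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

simple = np.polyfit(x_train, y_train, deg=1)    # same form as the true trend
flexible = np.polyfit(x_train, y_train, deg=9)  # enough terms to hit every noisy point

for name, coeffs in [("degree 1", simple), ("degree 9", flexible)]:
    print(name,
          "train MSE:", mse(coeffs, x_train, y_train),
          "test MSE:", mse(coeffs, x_test, y_test))
```

The degree 9 fit drives the training error toward zero because it has enough parameters to reproduce the noise. How much worse it does on the held-out points depends on the data, which is why the comparison is printed rather than stated here.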

Added by Andrea Manero-Bastin on February 20, 2020 at 6:00am — No Comments

*This article was written by gk_.*

Understanding how chatbots work is important. A fundamental piece of machinery inside a chatbot is the text classifier. Let’s look at the inner workings of an artificial neural network (ANN) for text classification.

We’ll use 2 layers of neurons (1 hidden…

Added by Andrea Manero-Bastin on February 9, 2020 at 12:30pm — No Comments

*This article was written by James Le.*

Neural networks are one type of model for machine learning; they have been around for at least 50 years. The fundamental unit of a neural network is a node, which is loosely based on the biological neuron in the mammalian brain. The connections between neurons are also modeled on…

Added by Andrea Manero-Bastin on February 9, 2020 at 12:00pm — No Comments

*This article was written by Matthew Mayo.*

Scikit-learn is the de facto official machine learning library in use in the Python ecosystem. As described on its official website, Scikit-learn is:

- Simple and efficient tools for data mining and data analysis
- Accessible to everybody, and reusable in various contexts
- Built on NumPy, SciPy, and matplotlib
- Open…

Added by Andrea Manero-Bastin on February 1, 2020 at 9:00am — No Comments

*This article is on the blog artificialintelligenceml.*

This article features the following applications; one of them, a recommendation engine, is pictured above.

- Google’s AI-Powered Predictions
- Ridesharing Apps Like Uber and Lyft
- Commercial Flights Use an AI…

Added by Andrea Manero-Bastin on January 3, 2020 at 7:30am — No Comments

*This article was written by Jeffry Thurana.*

Anybody who has tried Google Photos would agree that this free photo storage and management service from Google is smart. It packs in various smart features like advanced search, ability to categorize your pictures by locations and dates, automatically create albums and videos based on similarities, and walk you down the memory…

Added by Andrea Manero-Bastin on January 3, 2020 at 7:00am — No Comments

*This article was written by Pranjal Srivastava.*

Sequence prediction problems have been around for a long time. They are considered among the hardest problems to solve in the data science industry. They cover a wide range of problems: from predicting sales to finding patterns in stock market data, from understanding movie plots to recognizing your…

Added by Andrea Manero-Bastin on January 1, 2020 at 4:30am — No Comments

*This article was written by Bob Hayes.*

Data science requires the effective application of skills in a variety of machine learning areas and techniques. A recent survey by Kaggle, however, revealed that a limited number of data professionals possess competency in advanced machine learning skills. About half of data professionals said they were competent in…

Added by Andrea Manero-Bastin on December 23, 2019 at 1:00pm — 1 Comment

*This article was written by Stuart Reid.*

This tutorial covers regression analysis using the Python StatsModels package with Quandl integration. For motivational purposes, here is what we are working towards: a regression analysis program which receives multiple data-set names from Quandl.com, automatically downloads the data, analyses it, and plots the results in a new window.…

Added by Andrea Manero-Bastin on December 8, 2019 at 6:30am — No Comments

*This article was written by Laura Ellis.*

One of the reasons why I love R is that I feel like I’m constantly finding out about cool new packages through an ever-growing community of users and teachers.

To understand the current state of R packages on CRAN, I ran some code provided by Gergely Daróczi on GitHub. As of today there have been almost 14,000 R packages published on CRAN and the rate of…

Added by Andrea Manero-Bastin on December 1, 2019 at 6:30am — No Comments

*This article was written by Tinniam V Ganesh.*

This is the first in a series of posts I intend to write on Deep Learning. This post is inspired by the Deep Learning Specialization by Prof Andrew Ng on Coursera and Neural Networks for Machine Learning by Prof Geoffrey Hinton, also on Coursera.…

Added by Andrea Manero-Bastin on November 30, 2019 at 9:00am — No Comments

*This article was written by Montana Low.*

An open source framework for configuring, building, deploying and maintaining deep learning models in Python.

As Instacart has grown, we’ve learned a few things the hard way. We’re open sourcing Lore, a framework to make machine learning approachable for Engineers and maintainable for Machine Learning Researchers.…

Added by Andrea Manero-Bastin on November 30, 2019 at 9:00am — No Comments

*This article was written by Mohammad Sajid.*

Statistical cluster analysis is an exploratory data analysis technique that groups heterogeneous objects (M.D.) into homogeneous groups. We will learn the basics of cluster analysis in a mathematical way.

Cluster Analysis can be done by two…

Added by Andrea Manero-Bastin on November 30, 2019 at 8:30am — No Comments

*This article was written by Devin Soni.*

Markov chains are a fairly common, and relatively simple, way to statistically model random processes. They have been used in many different domains, ranging from text generation to financial modeling. A popular example is r/SubredditSimulator, which uses Markov chains to automate the creation of content for an entire subreddit. Overall, Markov Chains are conceptually quite intuitive,…
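For readers who want a concrete picture before following the link, here is a minimal sketch of a Markov chain simulation. It is not taken from the article: the two weather states and their transition probabilities are hypothetical values chosen only to show the mechanics.

```python
import random

# Hypothetical two-state weather model: each inner dict gives the
# probabilities of tomorrow's state given today's state (each sums to 1).
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

def simulate(start, steps, rng):
    """Return a list of steps + 1 states, beginning at start."""
    state = start
    path = [state]
    for _ in range(steps):
        options = TRANSITIONS[state]
        # The next state depends only on the current state (Markov property).
        state = rng.choices(list(options), weights=list(options.values()))[0]
        path.append(state)
    return path

# A seeded generator keeps the run reproducible.
path = simulate("sunny", 10, random.Random(0))
print(" -> ".join(path))
```

The same structure extends to text generation by treating words as states and estimating the transition table from observed word pairs.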

Added by Andrea Manero-Bastin on November 18, 2019 at 5:00am — 1 Comment

*This article was written by Natalie Wolchover.*

Even as machines known as “deep neural networks” have learned to converse, drive cars, beat video games and Go champions, dream, paint pictures and help make scientific discoveries, they have also confounded their human creators, who never expected so-called “deep-learning”…

Added by Andrea Manero-Bastin on November 1, 2019 at 5:30am — No Comments

*This article was written by Sarthak Jain.*

The real world poses challenges such as limited data and small hardware, like mobile phones and Raspberry Pis, that can’t run complex deep learning models. This post demonstrates how you can do object detection using a Raspberry Pi. Like cars on a road,…

Added by Andrea Manero-Bastin on October 31, 2019 at 1:30pm — No Comments

- Finding organic clusters in complex data-networks
- Five Regression Analysis Tips to Avoid Common Problems
- Compressing information through the information bottleneck during deep learning
- Introduction to Numpy - A Math Library for Python
- Regularization in Machine Learning
- Text Classification using Neural Networks
- The 10 Deep Learning Methods AI Practitioners Need to Apply

- Machine Learning’s Limits (Part 1): Why machine learning works in some cases and not in others.
- The simplest explanation of machine learning you’ll ever read
- Object Detection with 10 lines of code
- What is a Generative Adversarial Network?
- How to Visualize a Decision Tree from a Random Forest in Python using Scikit-Learn
- Regression analysis using Python
- Introduction to Markov Chains

© 2020 Data Science Central ®
