
February 2018 Blog Posts (98)

Difficult Probability Problem: Distribution of Digits in Rogue Systems

I recently posted a table summarizing probabilistic properties of digits in various number representation systems (see here).  The topic is already rather difficult for well-behaved systems (those listed in my table), but some systems are rogue and do not have these nice statistical properties. Here we focus on one of these lesser-known systems,…


Added by Vincent Granville on February 22, 2018 at 7:00pm — No Comments

Book: Artificial Intelligence with Python

Build real-world Artificial Intelligence applications with Python to intelligently interact with the world around you.

About This Book

  • Step into the amazing world of intelligent apps using this comprehensive guide
  • Enter the world of Artificial Intelligence, explore it, and create your own applications
  • Work through simple yet insightful examples that will get you up and running with Artificial Intelligence in no time

Who This Book Is…


Added by Capri Granville on February 22, 2018 at 7:00pm — No Comments

Topology Data Analysis (TDA)

Topology is the branch of pure mathematics that studies the notion of shape.  In the context of large, complex, and high-dimensional data sets, topology takes on two main tasks: the measurement of shape and the representation of shape.  One can measure shape-related properties within the data and create compressed representations of data sets that retain features reflecting the relationships among the points in the data set. The…


Added by Valentina Kibuyaga on February 22, 2018 at 5:30pm — 1 Comment

Thursday News: Correlation, Regression, R, AI, Books, Deep Learning, NLP

Here is our selection of featured articles and resources posted since Monday:

Forum Questions and Answers


Added by Vincent Granville on February 22, 2018 at 11:30am — No Comments

15 Great Articles About Decision Trees

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, Hadoop, decision trees, ensembles, correlation, outliers, Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, time series, cross-validation, model fitting, dataviz, AI and many more. To keep receiving these articles, …


Added by Vincent Granville on February 21, 2018 at 6:30pm — No Comments

Foolproof R package Install

The number of R packages and associated cool new tricks available continues to grow every month.  To understand the current state of R packages on…

Added by Laura Ellis on February 21, 2018 at 12:30pm — No Comments

Text Classification: Applications and Use Cases

Text analysis, as a whole, is an emerging field of study. Fields such as Marketing, Product Management, Academia, and Governance are already leveraging the process of analyzing and extracting information from textual data. We discussed the technology behind Text Classification, one of the essential parts of Text…


Added by Shashank Gupta on February 21, 2018 at 5:30am — 1 Comment

Building a Data Quality Strategy

In this article I explore some of the key concepts of data quality management and how to build a strategy for continuous improvement. I won’t be covering every possible scenario, process, method or problem; only those that are common across most industries and those that have proved useful on my own personal journey.

Hopefully, we already agree that good data quality is an essential part of business intelligence and a foundation on which you build your systems, processes and…


Added by Richard Cook on February 21, 2018 at 5:00am — No Comments

A new kind of pooling layer for faster and sharper convergence

This article was written by Sahil Singla.



Added by Amelia Matteson on February 20, 2018 at 5:30pm — No Comments

An Executive Primer to Deep Learning


Added by Pradeep Menon on February 20, 2018 at 4:30pm — No Comments

Double-Yolk "Bayesian Egg": Bayes, Frequentist and a 250 years-old puzzle

The Backdrop. Bayesians and Frequentists have long been ambivalent toward each other. The concept of “Prior” remains the center of this 250-year-old tug-of-war: frequentists view the prior as a weakness that can cloud the final inference, whereas Bayesians view it as a strength that incorporates expert knowledge into the data analysis. So the question naturally arises: how can we develop a Bayes-frequentist consolidated data analysis workflow that enjoys the best…


Added by Subhadeep (DEEP) Mukhopadhyay on February 20, 2018 at 3:30pm — No Comments

Selected Recent Articles from Top DSC Contributors - Part 6

This is a new series, featuring great content from our top contributors. Some of these articles are rather technical in nature, but many are business-oriented and written in simple English. The entire series consists of about 120 articles. We intend to publish a new set every two weeks or so. Click here to check out the…


Added by Vincent Granville on February 20, 2018 at 3:30pm — No Comments

New Marketing Insight from Unsupervised Bayesian Belief Networks


“Limited-Service Restaurants” (LSRs) is how the restaurant industry refers collectively to fast food and fast-casual dining establishments.  Marketers who specialize in LSRs often employ marketing research to evaluate hypotheses about their brands or to detect segments within their markets.  An important additional purpose of market research is to understand the total structure of a market, to find out what guests consider important…


Added by Charles Hammerslough on February 20, 2018 at 10:30am — 1 Comment

Off the Beaten Path - HTM-based Strong AI Beats RNNs and CNNs at Prediction and Anomaly Detection

Summary: This is the second in our “Off the Beaten Path” series looking at innovators in machine learning who have elected strategies and methods outside of the mainstream.  In this article we look at Numenta’s unique approach to scalar prediction and anomaly detection based on their own brain research.


Numenta, the…


Added by William Vorhies on February 20, 2018 at 8:30am — No Comments

Adding Program Evaluation to the Data Science Curriculum

“We tried to do XYZ. Did it make a difference?”

Whether you are in the for-profit world or the not-for-profit world, this is a very basic question that many people try to answer.

You could be working at a bank trying to figure out which offer is most appealing to customers, at an online retailer figuring out which ad display gets the most clicks, at the Department of Education trying to test the effect of smaller class sizes, at the city government office trying to see if the…


Added by Howard Friedman on February 20, 2018 at 5:30am — No Comments

Application of Image Processing and Convolution Networks in Intelligent Character Recognition for Digitized Forms Processing


Image processing is a rapidly evolving field with immense significance in science and engineering. One of the latest applications of image processing is in Intelligent Character Recognition (ICR), that is, the computer translation of handwritten text into machine-readable and machine-editable…


Added by Valiance Solutions on February 20, 2018 at 12:30am — 1 Comment

List of Free Must-Read Machine Learning Books

Machine learning is an application of artificial intelligence that gives a system the ability to automatically learn and improve from experience without being explicitly programmed. In this article, we have listed some of the best free machine learning books that you should consider going through (in no particular order).

Mining of Massive Datasets

Author: Jure Leskovec, Anand Rajaraman, Jeff…


Added by Shashank Gupta on February 19, 2018 at 11:00pm — No Comments

Top Trends in AI in 2018


Added by Pradeep Menon on February 19, 2018 at 10:00pm — 1 Comment

Data Science Simplified Part 11: Logistic Regression

In the last blog post of this series, we discussed classifiers: their categories and how they are evaluated. We have also discussed regression models in depth. In this post, we delve a little deeper into how regression models can be used for classification tasks.

Logistic Regression is a regression model widely used for classification tasks. As usual, we will proceed by example. No Money bank approaches us with a problem. The bank wants…


Added by Pradeep Menon on February 19, 2018 at 10:00pm — No Comments

How Millennials are driving the Digital Age?

Digital Transformation has brought several changes to our lives: changes in technology, processes, workflow, communication, and even overall services and products. But more changes will come in the near future as millennials become active in the workforce.

Millennials are those who were born after 1985 and grew up with the internet. They are the first to grow up surrounded by digital technologies…


Added by Sandeep Raut on February 18, 2018 at 6:00pm — 1 Comment

© 2019   Data Science Central ®