
All Blog Posts (4,745)

Off the Beaten path – Using Deep Forests to Outperform CNNs and RNNs

Summary:  How about a deep learning technique based on decision trees that outperforms CNNs and RNNs, runs on your ordinary desktop, and trains with relatively small datasets?  This could be a major disruptor for AI.


Suppose I told you that there is an algorithm…


Added by William Vorhies on February 12, 2018 at 5:00pm — 3 Comments

The Golden Record: Explained

Where ‘big data’ appears to be the skeleton key that will unlock everything and all you want to know about your business, there’s more than meets the eye when it comes to understanding your data. Yes, clean data will unlock incredible value for your enterprise; inaccurate records, on the other hand, are…


Added by Martin Doyle on February 12, 2018 at 4:00am — No Comments

Chicago Homicides 2017, a Follow-Up Look with R

Guest blog post by Steve Miller.

A year ago, I posted an article on the disturbing 57% increase in Chicago homicides for 2016. There's been no shortage of loaded commentary since, including strong statements by the…


Added by Vincent Granville on February 11, 2018 at 5:30pm — 1 Comment

Weekly Digest, February 12

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.

Featured Resources and Technical Contributions


Added by Vincent Granville on February 11, 2018 at 10:30am — No Comments

Analyzing Geographic Data with QGIS - Part 1

In this post I explain how to do geographic analysis and answer questions like: Which is the richest area in my city? How many people live in a given neighborhood?

You can do it by combining shape files with an Excel spreadsheet. Let's work through it together...

First of all, we need to install a Geographic Information System (GIS). I recommend QGIS, a free and open-source GIS.



Added by Thiago Buselato Maurício on February 11, 2018 at 9:30am — No Comments

Applied Ontology and the Drivers of Data Recognition

I shared my story in a few blogs about returning to university to do a graduate degree.  In my first class, I found myself being asked to define “ontology.”  It was a course on the Geography of Disability.  I returned to class the following week with some details.  I said that strangely enough, this is not a word that can be found in all of my dictionaries.  One dictionary listed “oncology,” which I believe is the study of cancerous tumours.  My Collins Cobuild dictionary says, “Ontology is…


Added by Don Philip Faithful on February 11, 2018 at 7:30am — No Comments

Facial recognition in Digital Age

Do you remember Hollywood movies such as Terminator 3: Rise of the Machines or Ex Machina, where facial recognition technologies are used in several ways?

Today, with advances in digital technology, face recognition has become very important for businesses that want to know who the customer is and…


Added by Sandeep Raut on February 10, 2018 at 9:30pm — No Comments

Number Representation Systems Explained in One Picture

Back to the basics. Here we are dealing with the oldest data set, created billions of years ago -- the set of integers -- and mostly the set consisting of two numbers: 0 and 1.  All of us learned how to write numbers even before attending primary school. Yet number representation is tied to some of the most challenging unsolved mathematical problems of all time, such as the distribution of the digits of Pi in the decimal system. The table below reflects this contrast, being a blend of rudimentary and deep…


Added by Vincent Granville on February 10, 2018 at 5:30pm — No Comments

How much rain can you count

On the fifth day of February 2017, I embarked upon this journey to create a platform to make data analysis easy for at least a million people. When I began, and even now, I am no expert in this field. Over the years, I have learned several concepts from my mentors and other…


Added by Naresh Devineni on February 10, 2018 at 2:00pm — No Comments

26 Great Articles and Tutorials about Regression Analysis

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, outliers, regression Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign…


Added by Vincent Granville on February 10, 2018 at 10:30am — No Comments

How to tailor your Academic CV for Data Science roles

Below is advice that a friend in astronomy asked me for while I was organizing our team's recruitment at Royal Mail; they were looking to tailor their academic CV for data science roles. It was originally posted on my LinkedIn here. I thought it might be useful for other astronomers or, more generally, academics looking to make a similar transition.

The main comment I have is that you're…


Added by Jason Byrne on February 10, 2018 at 4:30am — No Comments

The thin lines: business intelligence (BI), business analytics (BA), and data analytics

There is so much hype and interest in Business Analytics (BA) that it invites digging deeper. I was curious to shed a little light on the following questions:

  • What are the differences/similarities between BI and BA?
  • Are there differences/similarities between Analysis and Analytics?

So, I took a little detour to find out about these fascinating questions.

Key Differences:

Data Analysis vs Data…


Added by ATM SHIRAJUL HAQUE on February 9, 2018 at 10:30pm — No Comments

Cognitive computing: Moving From Hype to Deployment

Although cognitive computing, which is often referred to as AI (Artificial Intelligence), is not a new concept, the hype surrounding it and the level of interest in it are definitely new. The combination of talk about robot overlords, vendor marketing, and concerns about job losses has fueled the hype to where it stands now.

But behind the cloud of hype currently surrounding the technology lies a potential for increased…


Added by Ronald van Loon on February 8, 2018 at 8:30pm — No Comments

Best DSC Forum Questions - Part 8

This is a new series, featuring forum questions (new and old) that are still popular today. These questions were selected manually based on popularity, removing outdated material. The entire series consists of about 160 questions -- most with answers, sometimes several answers. We intend to publish a new set every two weeks or so. The previous edition was posted …


Added by Vincent Granville on February 8, 2018 at 11:00am — No Comments

Thursday News: ML, Marketing Bots, Time Series, AI, Predictive Modeling, Spark, Scala

Here is our selection of featured articles and resources posted since Monday:

Technical Resources


Added by Vincent Granville on February 8, 2018 at 10:00am — No Comments

Event Analytics: How to Define User Sessions with SQL

Quite recently we built event analytics for our team, and we thought we'd share the experience with you in this post.

Many “out-of-the-box” analytics solutions come with automatically defined user sessions. That is a good starting point, but as your company grows, you’ll want to have your own session definitions based on your event data. Analyzing user…


Added by Luba Belokon on February 8, 2018 at 7:30am — No Comments

Test Automation of Mobile Banking Apps

There is now constant pressure on technologies to adapt and align themselves with the growing requirements of the business environment. Modern-day engineering requires greater scalability, cross-platform abilities, and faster performance.

Therefore the requirement for a software architecture that is flexible and that helps in building systems that are more resilient, mo…


Added by Alisha Henderson on February 8, 2018 at 2:00am — No Comments

Using Analytics to Improve Customer Engagement

Here is a guest blog from MIT SMR by Sam Ransbotham. Organizations that turn data into insights are gaining competitive advantage through improved connections with consumers.

The 2018 Data & Analytics Global Executive Study and Research Report by MIT Sloan Management Review finds that innovative, analytically mature organizations make use of data from multiple sources: customers, vendors, regulators, and even competitors. The…


Added by Shay Pal on February 7, 2018 at 2:30pm — 1 Comment

Five Ways to Unleash the Power of a Marketing Bot

From advertising and promotion to selling and feedback, marketing chatbots are…


Added by Deena Zaidi on February 7, 2018 at 8:00am — No Comments

Data Scientists need designer labels too

Labels are how humans define and categorise different concepts. There’s lots of evolutionary psychology, neuroscience and linguistics behind this, but without going into that, without labels human (and other animal) intelligence would not be possible. Labels are the algebra of everyday life.

But what’s that got to do with Data Science? As it happens, quite a lot. When we want to understand what people believe or perceive, we do it by analysing their communication either written or…


Added by Dan Somers on February 7, 2018 at 4:30am — No Comments


© 2018 Data Science Central
