Featured Blog Posts – October 2016 Archive (76)

Migrating an Excel Spreadsheet Directly to HDFS and Spark 2.0.1 (Part 2)

In a recent post, we reviewed a path to leverage legacy Excel data and import CSV files through MySQL into Spark 2.0.1. This may apply frequently in businesses where data retention did not always take the database route…


Added by Marc Borowczak on October 23, 2016 at 5:30am — No Comments

Classifications in R: Response Modeling/Credit Scoring/Credit Rating using Machine Learning Techniques

This article was written by Ariful Mondal. Ariful is a senior manager, data science and big data analytics consultant at Tata Consultancy Services.

1. Introduction

This is an attempt to showcase some worked-out examples of Machine Learning (ML) using German Credit Data. Though we have selected credit…


Added by Emmanuelle Rieuf on October 22, 2016 at 9:30am — No Comments

Home Internet Data Usage - FAQS

Data transfer - what is it?

Whilst you are online, everything involves the transfer of data. Emails and web pages are essentially files: when you read or open one, you are downloading it, that is, transferring it to your screen so you can view it. If you watch a film or play a game online, these activities send data backward and forward in…


Added by Glen Johnson on October 22, 2016 at 9:30am — No Comments

Structural Accommodation

A theme in my blogs is how the "structure" of data - rather than just the "content" - affects what that data can say and is capable of doing. In particular, I suggest that certain structures tend to reinforce certain contents; this means that a structural imposition can have an effect similar to a contextual imposition. Structure is an interesting conversation…


Added by Don Philip Faithful on October 22, 2016 at 5:30am — No Comments

Data Integration Tools – Market Study

This post is a brief review of leading Data Integration tools in the market. It draws heavily on the Gartner 2016 report and on peer reviews from my circle.


The Market

The data integration tool market was worth approximately $2.8 billion at the end of 2015, an increase of 10.5% from the end of 2014 [2016 Gartner…


Added by Kashif Saiyed on October 21, 2016 at 7:30pm — No Comments

Weekly Digest, October 24

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week.


  • Use data to drive decisions—and your career. Advance your knowledge through…

Added by Vincent Granville on October 21, 2016 at 12:00pm — No Comments

Data Science for Internet of Things methodology - Evolving CRISP-DM - Part Two

This set of blog posts is part of the book/course on Data Science for the Internet of Things. We welcome your comments at jjb at cantab dot net. Jean-Jacques Bernard has been a founding member of the Data Science for Internet of Things Course.

Please email at ajit.jaokar at if you are interested in joining the course.

You can find the first…


Added by Jean-Jacques Bernard on October 19, 2016 at 11:30pm — No Comments

DSC Top Resources

Here is our updated list of top Data Science Central (DSC) resources, including reference articles and tutorials, top categories, tools and techniques, as well as several useful links (jobs, events, training, webinars, books and so on) and information about our popular newsletter. You will also find information about blogging with us, or where to find us on Facebook, LinkedIn or Twitter.

  • Article: …

Added by Vincent Granville on October 19, 2016 at 8:00pm — No Comments

What is DS-BuDAI?

Data science[1] (covering data mining and related practices) is a multidisciplinary field that requires knowledge of a number of different skills, practices, and technologies, including but not limited to machine learning, pattern recognition, mathematics, programming, algorithms, statistics, and databases. In the…

Added by Khosrow Hassibi on October 19, 2016 at 10:00am — No Comments

11 Great Hadoop, Spark and Map-Reduce Articles

This reference is a part of a new series of DSC articles, offering selected tutorials, references/resources, and interesting articles on subjects such as deep learning, machine learning, data science, deep data science, artificial intelligence, Internet of Things, algorithms, and related topics. It is designed for the busy reader who does not have a lot of time digging into long lists of advanced publications.…


Added by Vincent Granville on October 18, 2016 at 7:30pm — No Comments

How To Implement Machine Learning Algorithm Performance Metrics From Scratch With Python

This article was written by Jason Brownlee. Jason is the editor-in-chief at has a Masters and PhD in Artificial Intelligence, has published books on Machine Learning and has written operational code that is running in production.

After you make predictions, you need to know if they are any good.

There are standard measures that we can use to summarize how good a set of predictions actually is.

Knowing how good a set of predictions is,…


Added by Emmanuelle Rieuf on October 18, 2016 at 10:30am — No Comments
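
As an illustration of the kind of standard measure the excerpt above mentions, here is a minimal sketch of classification accuracy written from scratch in Python. It is not taken from the original article: the function name accuracy and the sample labels are assumptions chosen for illustration only.

```python
def accuracy(actual, predicted):
    """Return the percentage of predictions that match the actual labels.

    Hypothetical helper for illustration; not code from the original article.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("cannot score an empty set of predictions")
    # Count the positions where the prediction equals the true label.
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return 100.0 * correct / len(actual)


# Made-up labels: three of the five predictions are correct.
actual = [0, 0, 1, 1, 1]
predicted = [0, 1, 1, 1, 0]
print(accuracy(actual, predicted))  # prints 60.0
```

Accuracy is only one such measure, and it can mislead when classes are imbalanced, which is why several metrics are usually reported together.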

What is Data Science? 24 Fundamental Articles Answering This Question

Many people new to data science might believe that this field is just about R, Python, Hadoop, SQL, and traditional machine learning techniques or statistical modeling. Below you will find fundamental articles that show how modern, broad and deep the field is. Some data scientists are actually doing none of the above. In my case, I don't even code, but instead, I make various applications talk to each other, in a machine-to-machine communication framework. It is true though that most data…


Added by Vincent Granville on October 18, 2016 at 9:00am — 4 Comments

More on 3rd Generation Spiking Neural Nets

Summary:  Here’s some background on how 3rd generation Spiking Neural Nets are progressing and news about a first commercial rollout.


Recently we wrote about the development of AI and neural nets beyond the second generation Convolutional and Recurrent Neural Nets (CNNs / RNNs) which have come on so strong and dominate the…


Added by William Vorhies on October 18, 2016 at 8:04am — 1 Comment

An Introduction to Implementing Neural Networks using TensorFlow

This article, an introduction to implementing neural networks using TensorFlow, was posted by Faizan Shaikh. Faizan is a Data Science enthusiast and a Deep learning rookie. A recent Comp. Sc. undergrad, he aims to utilize his skills to push the boundaries of AI research.


If you have been following Data Science / Machine Learning, you just can’t miss the buzz around Deep Learning and…


Added by Emmanuelle Rieuf on October 18, 2016 at 8:00am — No Comments

#Blockchain derivatives will replace structured products #fintech

The era of structured products is coming to an end. These are financial instruments that have different payouts, like derivatives, based on different outcomes. They are different from derivatives in that they are built like bonds. One of their biggest shortcomings is the lack of a large secondary trading market before maturity. That is, if you try to sell one before its maturity date, you might get what the financial industry calls a “haircut”. In finance this is commonly referred…


Added by Eduardo Siman on October 17, 2016 at 10:30pm — No Comments

How do you search for a #sensor in Antarctica on your Internet Of Things (#IoT) browser? Aka: Does #Arduino dream of electric sheep?

If you have ever launched a web page on a Raspberry Pi or Arduino, you know that it feels a bit like magic. How is it possible that a device the size of a credit card can be a web server? It's awe-inspiring, to be sure. For me, it leads to one of the key questions I have about the Internet of Things and how it will affect our world. How are we going to organize and search for all of these billions of devices?

When the web began, there were millions of…


Added by Eduardo Siman on October 17, 2016 at 10:30pm — 1 Comment

Heart rate during conference presentation - with beta-blockers

This chart comes from Reddit. Since many data scientists occasionally have to make a presentation at a conference (or present their doctoral thesis), I thought you would relate to this chart.

To read the original article, …


Added by Emmanuelle Rieuf on October 17, 2016 at 5:00pm — No Comments

Ten Ways Big Data Is Revolutionizing Marketing And Sales

This article was written by Louis Columbus. Louis is currently serving as Director, Global Cloud Product Management at Ingram Cloud.

  • Customer Analytics (48%), Operational Analytics (21%), Fraud and Compliance (12%), New Product & Service Innovation (10%), and Enterprise Data Warehouse Optimization (10%) are among the most popular big data use cases in sales and marketing.
  • Customer Value Analytics (CVA) based on Big Data is making it possible for leading…

Added by Emmanuelle Rieuf on October 17, 2016 at 11:00am — No Comments

Building an Algorithm to Break Strong Encryption

Here I discuss breaking encryption keys that rely on the product of two very large prime numbers. In other words, the interest here is to factor a number (representing a key in some encryption system) that is the product of two very large primes. Once the number is factored, the key is compromised. Factoring such large numbers is believed to be computationally infeasible, thus the interest in discovering new algorithms to disprove this conjecture, and specifically to factor large numbers…


Added by Vincent Granville on October 16, 2016 at 6:30pm — 1 Comment
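
To make the factoring problem in the excerpt above concrete, here is a minimal sketch that recovers the two primes behind a small semiprime by plain trial division. This is the naive baseline, not the algorithm the post proposes (the excerpt does not describe it). The function name factor_semiprime and the sample primes 101 and 103 are assumptions for illustration.

```python
from math import isqrt


def factor_semiprime(n):
    """Return (p, q) with p <= q and p * q == n, found by trial division.

    Hypothetical helper for illustration. The running time grows with
    sqrt(n), so this is hopeless for the very large keys used in practice.
    """
    if n < 4:
        raise ValueError("n must be a product of two primes")
    if n % 2 == 0:
        return 2, n // 2
    # Try every odd candidate up to and including the integer square root.
    for p in range(3, isqrt(n) + 1, 2):
        if n % p == 0:
            return p, n // p
    raise ValueError("no nontrivial factor found; n is prime")


# Toy "key": the product of two small primes.
print(factor_semiprime(101 * 103))  # prints (101, 103)
```

For a product of two primes with hundreds of digits each, the loop above would need more iterations than is physically practical, which is the gap that faster factoring algorithms try to close.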
