
All Blog Posts (3,053)

Weekly Digest, August 29

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.

Featured Resources and Technical Contributions


Added by Vincent Granville on August 26, 2016 at 9:30am — No Comments

Associative Data Modeling Demystified - Part1

Relation, Relationship and Association

While most players in the IT sector adopted graph or document databases and Hadoop-based solutions (Hadoop being an enabler of the HBase column store), it went almost unnoticed that several new DBMS, AtomicDB,…


Added by Athanassios Hatzis on August 26, 2016 at 4:00am — No Comments

Data-Driven UX/UI Design

Ever wondered why people have an affinity for certain apps over others, even though the other apps provide the same functionality? Or why they are more attracted to certain features in a tab because they are just easier to use? If not, then as a UX/UI designer it’s time for you to find that out,…


Added by Siddhi Shroff on August 25, 2016 at 1:00pm — No Comments

Thursday News: R, Python, Machine Learning, IoT, Coding Comics

Here are our most recent featured articles and resources, including three interesting books, as well as tutorials about R Data Frames, and Stats with Google Sheets.


Added by Vincent Granville on August 25, 2016 at 8:02am — No Comments

Before starting with R Programming: Basic R without any package installed

Open source software has become so powerful that large corporations have started preparing their traditional business analysts to move to open source tools, particularly R. I have prepared a basic document to train some of my clients and local communities in Dallas. This article is not intended for people who have already been exposed to R, but for people who are new and want to learn the ABCs of R.


Downloading R is rather simple on …


Added by Meltem Ballan on August 25, 2016 at 7:00am — No Comments

R Basics (stats): Data Frames

Data frames are tables for storing data. If you recall the vectors from the first R notes, a data frame can be imagined as a collection of vectors of the same length. We have already created vectors, named them, and plotted them as histograms.

In this note we will create data frames, then aggregate and plot them.

Let’s start with baby steps and create a small data frame in a new script. You can open a new script by clicking File and then New script. You can copy and paste the following…
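The article's own R script is cut off in this excerpt and is not reproduced here. As a rough illustration of the idea above, a table viewed as a set of equal-length named columns, here is a minimal Python sketch (Python rather than R, and not the author's code). The column names `region` and `sales` and the helpers `check_rectangular` and `aggregate_sum` are hypothetical, chosen only for the example.

```python
from collections import defaultdict

# A hypothetical "data frame": named columns that all have the same length.
frame = {
    "region": ["north", "south", "north", "east"],
    "sales": [10, 20, 5, 7],
}

def check_rectangular(columns):
    """Raise ValueError unless every column has the same number of rows."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")

def aggregate_sum(columns, key, value):
    """Sum the `value` column within each distinct entry of the `key` column."""
    check_rectangular(columns)
    totals = defaultdict(int)
    for k, v in zip(columns[key], columns[value]):
        totals[k] += v
    return dict(totals)

totals = aggregate_sum(frame, "region", "sales")
print(totals)
```

The length check is what separates a table from a loose set of lists: a column with a different number of rows is rejected before any aggregation runs.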


Added by Meltem Ballan on August 25, 2016 at 6:30am — No Comments

Quick Review and intro for Budding Analysts: Reading and Aggregating CSV files

In this note I will quickly walk through CSV files in a basic scenario.

I have uploaded two CSV files of customer complaints to my GitHub account. Each complaint is unique, and a customer might complain more than once. The customer ID is encrypted in the form XX000; assume that missing values do not share that pattern or length.

After the basic preprocessing we want to know the number of complaints per customer and…
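The per-customer count described above could be sketched as follows. This is a Python illustration using made-up inline data, not the author's code or files. The column name `customer_id`, the reading of `XX000` as two capital letters followed by three digits, and the helper `count_complaints` are all assumptions.

```python
import csv
import io
import re
from collections import Counter

# Assumed reading of the "XX000" pattern: two capital letters, then three digits.
ID_PATTERN = re.compile(r"[A-Z]{2}\d{3}")

# Made-up stand-ins for the two CSV files; the real files are not shown here.
FILE_A = "customer_id,complaint\nAB123,late delivery\nCD456,wrong item\n,no id given\n"
FILE_B = "customer_id,complaint\nAB123,damaged box\nn/a,billing error\n"

def count_complaints(*csv_texts):
    """Count complaints per customer, skipping rows whose ID is missing or malformed."""
    counts = Counter()
    for text in csv_texts:
        for row in csv.DictReader(io.StringIO(text)):
            customer_id = (row.get("customer_id") or "").strip()
            if ID_PATTERN.fullmatch(customer_id):
                counts[customer_id] += 1
    return counts

counts = count_complaints(FILE_A, FILE_B)
for customer_id, n in counts.most_common():
    print(customer_id, n)
```

Filtering on the ID pattern before counting keeps blank or placeholder IDs from being lumped together as if they were one very unhappy customer.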


Added by Meltem Ballan on August 25, 2016 at 6:30am — No Comments

The rise of advanced analytics

Five years ago, we had the technology available to enable us to undertake advanced analytics, yet there was no real interest in analysing data in a fast and efficient manner. Whilst the benefits were undeniable, businesses did not think that it was necessary to analyse information in real-time. However, this way of thinking has become dated, and now all those who once deemed it unnecessary are rushing to adopt advanced analytics and harness the insights it can provide them…


Added by Aaron Auld on August 25, 2016 at 2:30am — No Comments

18 Great Blogs Posted in the last 12 Months

This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. These articles are at least 6 months old but no more than 12 months old. The first digest in this series was posted here two weeks ago. Below is our second edition.…


Added by Vincent Granville on August 24, 2016 at 11:30am — No Comments

Book: Python Machine Learning Blueprints

Key Features

  • Put machine learning principles into practice to solve real-world problems
  • Get to grips with Python's impressive range of Machine Learning libraries and frameworks
  • From retrieving data from APIs to cleaning and visualization, become more…

Added by Emmanuelle Rieuf on August 24, 2016 at 11:00am — No Comments

Statistical analysis in Google Sheets

This article was originally posted here. It was written by Steven Scott, a Bayesian statistician interested in data augmentation methods and Markov chain Monte Carlo. Steven has applied these methods to problems in educational testing, network security, biometrics, web browsing, e-commerce, and medical applications.…


Added by Emmanuelle Rieuf on August 24, 2016 at 10:30am — No Comments

3 reasons why you should care about clean, and measurable data, and an example of where it’s working

The term Big Data is no longer a buzzword; it’s become an institution, and businesses all over the world are hiring Data Scientists, Chief Data Officers and the like to help them make sense of it all. But Big Data shouldn’t be thought of as some scary, untouchable thing. We’ve been collecting data for decades, and Big Data is, well, just more of it.

Considering we now have a lot more data coming in, day on day, how best can we make it work for us? The first step is to ensure that the data…


Added by Gareth Forbes on August 24, 2016 at 12:30am — No Comments

The Benefits of Decentralizing Analytics Talent

In my experience, some of the most talented analytics professionals I’ve managed were ones that had intimate knowledge of the system limitations required to meet customer needs. These individuals came from a variety of roles, some from engineering, and others from customer service roles. Their strength was in forming specific hypotheses to pinpoint customer experience issues and then leveraging their curiosity to do whatever it took, including learning new statistical techniques and…


Added by Valiance Solutions on August 23, 2016 at 9:00pm — No Comments

Roadmap for data-driven organization: Known’s, unknowns and the elusive value from analytics

There are many who believe all the true transformational opportunities for quantum improvements in business are to be found only by exploring the unknown-unknowns through big-data analytics.

So what exactly are the unknown-unknowns and…


Added by Krishna Pera on August 23, 2016 at 12:30pm — No Comments

Book: Data Science Essentials in Python

Go from messy, unstructured artifacts stored in SQL and NoSQL databases to a neat, well-organized dataset with this quick reference for the busy data scientist. Understand text mining, machine learning, and network analysis; process numeric data with the…


Added by Emmanuelle Rieuf on August 23, 2016 at 9:00am — No Comments

These IoT Sensors Want to Know How You Feel – And Maybe Even Change Your Mood

Summary: Sensors that know how you feel? Sensors that want to change the way you feel? When did that happen and, better yet, how?


We’re getting used to sensors finding out what we’re doing. Apparently they are now sufficiently sophisticated that they can even tell…


Added by William Vorhies on August 23, 2016 at 3:00am — No Comments

Book: Statistics for Non-Statisticians

  • Aimed at practitioners
  • The presentation is as non-mathematical as possible
  • Includes many examples of the use of statistical functions in spreadsheets
  • Employs a realistic sample survey as an exemplar throughout the book
  • Fills a gap in…

Added by Emmanuelle Rieuf on August 22, 2016 at 4:00pm — No Comments

[Data Mining] Association Rules in R (diapers and beer)

[Introduction of Association Rules]

Sometimes an anecdote helps you understand a new concept. This story, though, is real. About 15 years ago, at Walmart, a salesman was trying to boost sales in his store. His idea was simple: he bundled products together and applied discounts to the bundles. (This has since become common practice in marketing.) For example, he bundled bread with jam so that customers could easily find them together. Moreover,…


Added by Gregory Choi on August 22, 2016 at 7:30am — No Comments

5 Big Data Myths Businesses Should Know

Big data is seeping into every facet of our lives. Smart home gadgets are becoming part of the nervous systems of new and remodeled homes, and many renters are demanding these interconnected gadgets from landlords.


But nowhere has Big Data created a bigger buzz than in business.…


Added by Larry Alton on August 22, 2016 at 6:00am — No Comments

Conditional Random Fields (CRF): Short Survey

Currently, many of us are overwhelmed by the mighty power of Deep Learning. We are starting to forget about humble graphical models. CRF is not as trendy as LSTM, but it is robust, reliable and worth noting.

In this post, you will find a short summary of CRF (aka Conditional Random Fields): what it is, what it is for, and some interesting facts. Enjoy!…


Added by Nikitinsky Nikita on August 22, 2016 at 5:00am — No Comments


© 2016 Data Science Central
