
August 2016 Blog Posts (87)

R Basics (stats): Data Frames

Data frames are tables for storing data. If you recall the vectors from the first R notes, a data frame can be thought of as a collection of vectors of the same length. We have already created vectors, named them, and plotted them as histograms.

In this note we will create data frames, aggregate them, and plot them.

Let’s start with baby steps and create a small data frame in a new script. You can open a new script by clicking File and then New Script. You can copy and paste the following…


Added by Meltem Ballan on August 25, 2016 at 6:30am — No Comments

Quick Review and intro for Budding Analysts: Reading and Aggregating CSV files

In this note I will briefly cover CSV files using a basic scenario.

I have uploaded two CSV files of customer complaints to my GitHub account. Each complaint is unique, and a customer might complain more than once. The customer ID is encrypted in the form XX000; assume that missing values do not share that pattern or length.

After basic preprocessing, we want to know the number of complaints per customer and…


Added by Meltem Ballan on August 25, 2016 at 6:30am — 1 Comment

18 Great Blogs Posted in the last 12 Months

This is part of a new series of articles: once or twice a month, we post previous articles that were very popular when first published. These articles are at least 6 months old but no more than 12 months old. The first digest in this series was posted here two weeks ago. Below is our second edition.…


Added by Vincent Granville on August 24, 2016 at 11:30am — No Comments

Book: Python Machine Learning Blueprints

Key Features

  • Put machine learning principles into practice to solve real-world problems
  • Get to grips with Python's impressive range of Machine Learning libraries and frameworks
  • From retrieving data from APIs to cleaning and visualization, become more confident at tackling every stage of the data pipeline



Added by Emmanuelle Rieuf on August 24, 2016 at 11:00am — No Comments

Statistical analysis in Google Sheets

This article was originally posted here. It was written by Steven Scott, a Bayesian statistician interested in data augmentation methods and Markov chain Monte Carlo. Steven has applied these methods to problems in educational testing, network security, biometrics, web browsing, e-commerce, and medical applications.…


Added by Emmanuelle Rieuf on August 24, 2016 at 10:30am — No Comments

3 reasons why you should care about clean, and measurable data, and an example of where it’s working

The term Big Data is no longer a buzzword; it has become an institution, and businesses all over the world are hiring Data Scientists, Chief Data Officers and the like to help them make sense of it all. But Big Data shouldn’t be thought of as some scary, untouchable thing. We’ve been collecting data for decades, and Big Data is, well, just more of it.

Considering we now have a lot more data coming in, day on day, how best can we make it work for us? The first step is to ensure that the data…


Added by Gareth Forbes on August 24, 2016 at 12:30am — 1 Comment

The Benefits of Decentralizing Analytics Talent

In my experience, some of the most talented analytics professionals I’ve managed were ones that had intimate knowledge of the system limitations required to meet customer needs. These individuals came from a variety of roles, some from engineering, and others from customer service roles. Their strength was in forming specific hypotheses to pinpoint customer experience issues and then leveraging their curiosity to do whatever it took, including learning new statistical techniques and…


Added by Valiance Solutions on August 23, 2016 at 9:00pm — No Comments

Roadmap for data-driven organization: Known’s, unknowns and the elusive value from analytics

There are many who believe all the true transformational opportunities for quantum improvements in business are to be found only by exploring the unknown-unknowns through big-data analytics.

So what exactly are the…


Added by Krishna Pera on August 23, 2016 at 12:30pm — 1 Comment

Book: Data Science Essentials in Python

Go from messy, unstructured artifacts stored in SQL and NoSQL databases to a neat, well-organized dataset with this quick reference for the busy data scientist. Understand text mining, machine learning, and network analysis; process numeric data with the NumPy and Pandas modules; describe and analyze data using statistical and…


Added by Emmanuelle Rieuf on August 23, 2016 at 9:00am — No Comments

Top 10: Data Science and Machine Learning Articles in July

This article was originally posted here by Mike Tamir. Mike is a seasoned data science leader who has built data science teams specializing in machine learning, data architecture, and predictive analytics solutions.

Top 10 most popular posts in…


Added by Emmanuelle Rieuf on August 23, 2016 at 8:30am — No Comments

These IoT Sensors Want to Know How You Feel – And Maybe Even Change Your Mood

Summary:  Sensors that know how you feel?  Sensors that want to change the way you feel?  When did that happen and better yet how?


We’re getting used to sensors finding out what we’re doing.  Apparently they are now sufficiently sophisticated that they can even tell if I’m sitting up straight (yes Mom – BTW using a camera is almost…


Added by William Vorhies on August 23, 2016 at 3:00am — 3 Comments

Book: Statistics for Non-Statisticians

  • Aimed at practitioners
  • The presentation is as non-mathematical as possible
  • Includes many examples of the use of statistical functions in spreadsheets
  • Employs a realistic sample survey as an exemplar throughout the book
  • Fills a gap in the existing literature on…

Added by Emmanuelle Rieuf on August 22, 2016 at 4:00pm — No Comments

[Data Mining] Association Rules in R (diapers and beer)

[Introduction of Association Rules]

Sometimes an anecdote helps you understand a new concept, and this one is real. About 15 years ago, a salesperson at Walmart set out to boost sales in his store. His idea was simple: bundle products together and discount the bundles. (This has since become common practice in marketing.) For example, he bundled bread with jam so that customers could easily find them together. Moreover,…


Added by Gregory Choi on August 22, 2016 at 7:30am — 5 Comments

5 Big Data Myths Businesses Should Know

Big data is seeping into every facet of our lives. Smart home gadgets are becoming part of the nervous systems of new and remodeled homes, and many renters are demanding these interconnected gadgets from landlords.


But nowhere has Big Data created a bigger buzz than in business.…


Added by Larry Alton on August 22, 2016 at 6:00am — No Comments

Conditional Random Fields (CRF): Short Survey

Currently, many of us are overwhelmed by the mighty power of deep learning and are starting to forget about humble graphical models. CRF is not as trendy as LSTM, but it is robust, reliable, and worth noting.

In this post, you will find a short summary of CRF (aka Conditional Random Fields): what it is, what it is for, and some interesting facts. Enjoy!…


Added by Nikitinsky Nikita on August 22, 2016 at 5:00am — No Comments

Future of IIoT - Intelligent Machines will Create a Positive Impact on Productivity

Industrial Internet of Things (IIoT) is a system that integrates complex machines with high-end software programs and sensors for analyzing data to increase productivity and reduce operational time and costs. IIoT systems differ from Internet of Things (IoT) systems: a failure in an IIoT system could lead to disastrous situations, whereas a failure in an IoT system would rarely lead to an emergency. IoT systems are designed for consumer-level devices such…


Added by Aman on August 21, 2016 at 11:00pm — No Comments

Weekly Digest, August 22

Monday newsletter published by Data Science Central. Previous editions can be found here.  The contribution flagged with a + is our selection for the picture of the week.



Added by Vincent Granville on August 21, 2016 at 8:30am — No Comments

User Analytics using Metabase and MongoDB

Metabase, an open source, easy-to-use database visualization tool, is built and maintained by a dedicated Metabase team and comes with a Crate driver. It is written in Clojure and offers multiple options such as Mac application, Docker image, cloud images, and a jar file, which are specifically designed for particular use cases.

Metabase is mainly used for analyzing your existing data on a daily basis by quickly fetching answers to your common queries without dealing with complex…


Added by Raghavan Madabusi on August 21, 2016 at 12:30am — No Comments

An absolute beginner’s guide to machine learning, deep learning, and AI

This article was posted by SmileJet on Dev Battles.

Meet Samantha. She’s your friendly assistant from 2025. She sorts your mail, sets up your meetings, and orders groceries. She paints and writes poetry. She’s your best friend. She’s also an artificial intelligence from the movie Her, which imagines how a juiced-up Siri will change our lives.

Now, tech companies large and small are racing to make this a reality. You’ve read the news. You’ve heard the jargon: AI,…


Added by Emmanuelle Rieuf on August 20, 2016 at 5:30pm — No Comments

Coding Explained in 25 Profound Comics

This article was published on Free Code Camp. Free Code Camp publishes stories about development, design, data science, and open source. They asked their open source community to share, via their news site (on Reddit), the comics they felt most profoundly described coding.

Here are their 25 most upvoted comics:



Added by Emmanuelle Rieuf on August 20, 2016 at 4:30pm — No Comments


© 2019   Data Science Central ®