# June 2015 Blog Posts (73)

### Naive Bayes for Dummies; A Simple Explanation

This blog post was originally published as part of an ongoing series, "Popular Algorithms Explained in Simple English" on the AYLIEN Text Analysis Blog.

Picture added by the…

Added by Mike Waldron on June 8, 2015 at 1:00am — 2 Comments

### Regression Prediction using AWS Machine Learning

We wanted to be able to predict median rent of a place given the median price of the home, median household income of the place and the percentage of homes vacant in that place. The data can be downloaded from…

Added by Vozag on June 7, 2015 at 9:30pm — 1 Comment

### Data Science: The numbers game Law almost lost.

On the face of it, Analytics and Law are manifestly divergent fields of practice. One need only consider the nature of Algorithms that require numerical attributes for their calculations and the textual rigidity of substantive law to realize this. The very first obstacle one will encounter in applying Analytics to Law is the absence of calculable numerical variables in raw legal data. No judicial precedent, statute or common law principle has ever been reduced to a mathematically sound…

Added by Mkhuseli Mthukwane on June 5, 2015 at 4:00am — 5 Comments

### Top Five Data Science Masters Programs

Which top master's courses should you consider for a great career in data science?

A frequently cited study by McKinsey predicts that by 2018, the United States could face a shortage of 140,000 to 190,000 "people with deep analytic skills" as well as 1.5…

Added by Sudhanshu Ahuja on June 4, 2015 at 4:30pm — 3 Comments

### Predicting records (highs or lows) - how to do it right (and without statistics)

While everyone talks about unusual extreme weather events (heat waves, cold spells, floods, droughts), very few, including scientists, have been able to make sound predictions for extreme events, be it weather or stock market extreme behavior, or any bubble. Here you will learn how to produce simple model-free confidence intervals for extreme events in Excel, how to generate (correlated) simulated stock market data and (uncorrelated) natural data such as an air pollution index, understand why…

Added by Vincent Granville on June 3, 2015 at 8:30pm — No Comments

### Weekly Digest, June 8

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday. The picture of the week is from the contribution marked with a +, where you will find the details.

Featured…

Added by Vincent Granville on June 3, 2015 at 9:30am — No Comments

### Data Scientists are Improving the World

MIT Sloan Management Review (MIT SMR), through an academic-industry collaborative partnership with SAS, has developed a long-term research initiative connected with analytics and management innovation.

Last month I posted a blog pointing to the findings from our latest data and analytics global executive study and report, "The Talent Dividend," highlighting the role of analytics talent in creating competitive advantage at data-oriented…

Added by Robert Holland on June 2, 2015 at 11:57am — No Comments

### Environmental Monitoring using Big Data

In this post, I will cover in-depth a Big Data use case: monitoring and forecasting air pollution.

A typical Big Data use case in the modern Enterprise includes the collection and storage of sensor data, executing data analytics at scale, generating forecasts, creating visualization portals, and automatically raising alerts in the case of abnormal deviations or threshold breaches.

This article will focus on an implemented use case: monitoring and analyzing air quality…

Added by Axibase Corp on June 2, 2015 at 6:00am — No Comments

### 7 Ingredients for Great Visualizations

Great article by Bernard Marr. Here we present a summary. A link to the full article is provided below.

1. Identify your target audience.

2. Customize the…

Added by Vincent Granville on June 1, 2015 at 9:30am — No Comments

### Data Science Summer Reading List 2015

Added by Michael Walker on June 1, 2015 at 7:30am — No Comments

### Optimizing your search functionality on your website

Your website’s search capabilities may be a potential customer’s first (or only) interaction with your website. Customers who can’t find relevant products based on how they search are likely to abandon the site and go to competitor websites. For many retailers, 30% to 40% of search queries are underperforming. Underperforming search queries are costing you sales and customers.

Some examples of underperforming search queries are:

• Queries that return zero…

Added by John Thuma on June 1, 2015 at 7:30am — 1 Comment

### R Functions for Exploratory Analysis, Data Frame Merging & Map Displays

Given below is a list of R functions for quickly exploring the key attributes of the data set. The data set is based on car prices & insurance…

Added by Vozag on June 1, 2015 at 5:53am — 3 Comments

### NFL Play by Play analysis using Cloudera Impala

We had the chance to use the NFL play-by-play dataset covering 2002 through 2013, and the best part is that the analysis was carried out within Hadoop using Cloudera Impala.

We wanted the analysis to be at the individual game level, but the data was of mixed grain and included the play-by-play rows. So we applied some SQL filters to restrict it to the first row of each play-by-play dataset.

Here are some interesting…

Added by Nilesh Jethwa on June 1, 2015 at 5:49am — 2 Comments

