In marketing, everybody talks about telling stories. Stories about audiences. Stories about ad concepts. Stories about brands … and about consumers who use them.
Analysts also see stories in data. Looking through lines of data or snippets of code, you can see the threads weaving themselves together to give you a clear understanding of what’s happening; you see offshoots that lead to questions about different segments or correlation/cause and effect. But if you were to show…
Added by Chris Atwood on May 31, 2016 at 12:30pm —
No Comments
Machine learning is being used in a variety of domains to restrict or prevent undesirable behaviors by hackers, fraudsters and even ordinary users. Algorithms deployed for fraud prevention, network security, and anti-money laundering belong to the broad area of adversarial machine learning, where ML, instead of learning benign patterns, is confronted with a malicious adversary that is looking for opportunities to exploit loopholes…
Added by Arshak Navruzyan on May 31, 2016 at 9:00am —
No Comments
You may be thinking that this title makes no sense at all. ML, AI, ANN and Deep learning have made it into the everyday lexicon and here I am, proclaiming that ML is dead. Well, here is what I mean…
The open sourcing of entire ML frameworks marks the end of a phase of rapid tool development, and thus marks the death of ML as we have known it so far. The next phase will be marked by the ubiquitous application of these tools in software applications. And that is how ML…
Added by Srividya Kannan Ramachandran on May 31, 2016 at 8:00am —
No Comments
Summary: Picking an analytic platform when first starting out in data science almost always means working with what we’re most comfortable with. But as organizations grow larger, there is a need for standardization and for selecting one, or a few, analytic tools.
Picking an analytic platform when first starting out in data science almost…
Added by William Vorhies on May 31, 2016 at 7:00am —
1 Comment
Building a Data-driven Organization requires identifying and prioritizing the opportunities where advanced analytics can make a material difference to the quality of decisions!…

Added by RADHA KRISHNA PERA on May 30, 2016 at 9:30am —
No Comments
Introduction
The City and County of San Francisco launched an official open data portal called SF OpenData in 2009 as a product of its official open data program, DataSF. The portal contains hundreds of city datasets for use by developers, analysts, residents and more. Under the category of Public Safety, the portal contains the list of SFPD incidents since Jan 1, 2003.
In this post I have done an exploratory time-series analysis on the crime incidents dataset to see…
Added by Vimal Natarajan on May 30, 2016 at 7:42am —
No Comments
The world moves faster every day. People, companies, and entrepreneurs try to react at an ever-increasing speed. As humans reach the limits of their ability to react, tools are built to process, analyze, and present the massive amounts of data available to decision makers. The processing of this data has a number of different application areas.…
Added by Priyanka Jain on May 30, 2016 at 1:00am —
No Comments
The importance of gesture recognition technology lies in developing efficient interaction between humans and machines. The increasing application of gesture recognition technology in consumer electronics is among the key factors boosting the gesture recognition market at the global level. 3D vision and gesture tracking solutions in gaming consoles, along with 2D gesture recognition technology in smart TVs, PCs, and tablets, are fueling the demand for touchless…
Added by Aman on May 29, 2016 at 11:13pm —
No Comments
Monday newsletter published by Data Science Central. Previous editions can be found here.
Featured Resources and Technical Contributions
Added by Vincent Granville on May 29, 2016 at 8:04am —
No Comments
In research, especially in medical research, we describe the characteristics of our study populations in Table 1. Table 1 contains the mean for each continuous (scale) variable and the proportion for each categorical variable. For example, we say that the mean systolic blood pressure in our study population is 145 mmHg, or that 30% of participants are smokers. As the name suggests, Table 1 is the first table in the manuscript.
To create Table 1…
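As an illustration of the kind of summary described above, here is a minimal Python sketch. It is not the author's method, which is cut off in this excerpt. The participant records, the variable names `sbp` and `smoker`, and the helper `table_one` are all hypothetical, and the values are invented purely for illustration.

```python
from statistics import mean, stdev

# Hypothetical study data (invented values, for illustration only).
participants = [
    {"sbp": 150, "smoker": True},
    {"sbp": 140, "smoker": False},
    {"sbp": 145, "smoker": False},
    {"sbp": 145, "smoker": False},
]


def table_one(rows, continuous, categorical):
    """Summarize rows as mean (SD) for continuous variables
    and count (percent) for yes/no categorical variables."""
    table = {}
    for name in continuous:
        values = [row[name] for row in rows]
        table[name] = f"{mean(values):.1f} ({stdev(values):.1f})"
    for name in categorical:
        count = sum(1 for row in rows if row[name])
        table[name] = f"{count} ({100 * count / len(rows):.0f}%)"
    return table


summary = table_one(participants, continuous=["sbp"], categorical=["smoker"])
for variable, value in summary.items():
    print(f"{variable:10s} {value}")
```

Each continuous variable is reported as mean (standard deviation) and each categorical variable as count (percent), which is one common layout for such a table. Real manuscripts often add a column per study group.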
Added by Klodian on May 29, 2016 at 6:46am —
No Comments
Most of these infographics are tutorials covering various topics in big data, machine learning, visualization, data science, Hadoop, R or Python, typically intended for beginners. Some are cheat sheets and can be nice summaries for professionals with years of experience. Some, popular a while back (you will find one example here), were designed as periodic tables.…

Added by Vincent Granville on May 28, 2016 at 8:09am —
No Comments
In May 2006, Larry Page, one of Google’s co-founders, said “The ultimate search engine would understand everything in the world. It would understand everything that you asked it and give you back the exact right thing instantly. You could ask ‘what should I ask Larry?’ and it would tell you.” Come 2016, it seems at least part of his vision has been achieved through the release of TensorFlow, Google’s artificial intelligence platform.
TensorFlow is a deep learning software…
Added by Tanmay Bhandari on May 27, 2016 at 5:00am —
1 Comment
Geospatial Analytics Market: Overview
Geospatial data is geographical data with locational information described in terms of coordinates (latitude and longitude), address, city or ZIP code. Geospatial data is gathered through satellites, the global positioning system (GPS), geotagging and remote sensing. A geographic information system (GIS) is used for mapping and analyzing geospatial data. Remote sensing tools are used to acquire geographical data without…
Added by Alina John on May 27, 2016 at 2:12am —
No Comments
I’d like to personally invite our global community of Data Scientists to participate in this week’s DSC challenge. You are invited to create your own data video: we provide simple instructions on how to do it. All submissions of acceptable quality will be featured on DSC, reaching our entire community of just over 1M members. Each participant will receive a free copy of my (Vincent…
Added by Vincent Granville on May 26, 2016 at 2:30pm —
No Comments
Originally published on the Aster Community. We invite you to register for our upcoming webinar “Bridging the Gap Between Data Scientists and Analyst with Analytic Solutions” with Brandon Purcell of Forrester…
Added by Ryan Garrett on May 26, 2016 at 6:47am —
No Comments
Data Analytics favorites include Apache Spark, which is becoming a reference standard for Big Data, as a “fast and general engine for large-scale data processing”. Its built-in PySpark interface can run as a Jupyter notebook, but recent posts didn’t quite allow me to do…
Added by Marc Borowczak on May 26, 2016 at 6:43am —
No Comments
Yesterday evening, I attended a Tableau user group meeting to preview the new features expected in the upcoming Tableau 10 release. This meeting was hosted at the Toronto Public Library by none other than Tableau Maestro, Michael Martin!
The turnout was great, with more than 50 people attending from various industries.
Here are some of the new…
Added by Salman Khan on May 25, 2016 at 7:03pm —
No Comments
Finally, a new version of DataMelt (http://jwork.org/dmelt/), a Java-based data-analysis framework based on open-source software, was released. This release features significantly improved graphics to display data and mathematical objects in 3D. The updated canvas (called HPlotXYZ) uses Jzy3d and JOGL 2 to deploy the native OpenGL library. A few examples of images with data in 3D…
Added by jwork.ORG on May 25, 2016 at 2:47pm —
No Comments
The Fallacies of Data Science
Adnan Masood, PhD. & David Lazar
- Correlation = Causation, and Big Data = Information and Insights because Data Context Doesn't Matter.
- The random nature of the event drives the distribution, therefore the likely distribution also drives the…
Added by Adnan Masood, PhD. on May 25, 2016 at 10:30am —
No Comments
Summary: Want to win a Kaggle competition or at least get a respectable place on the leaderboard? These days it’s all about ensembles, and for a lot of practitioners that means reaching for random forests. Random forests have indeed been very successful, but it’s worth remembering that there are three different categories of ensembles and some important hyperparameter tuning issues within each. Here’s a brief review.
…
Added by William Vorhies on May 25, 2016 at 7:30am —
1 Comment