
July 2015 Blog Posts (85)

Teradata Aster: Multi-Channel Churn Prediction in Banking

Watch this video to learn how we combine multiple channels of data (bank systems, IVR, call center notes, clickstream, and others) to perform behavioral churn prediction. We do this on the full data set rather than on samples.

Added by John Thuma on July 11, 2015 at 9:30am — No Comments

Taking R to the Next Level

R has become a massively popular language for data mining and predictive model building, with over two million users worldwide. The wide adoption of R has to do with the fact that it is available as open source, runs on most technology platforms and is commonly taught in academic institutions in courses with significant components of data science, machine learning and statistics. A recent study found that R is now cited in academic papers more often than SAS and SPSS, a change from previous…


Added by Mark Rabkin on July 11, 2015 at 4:41am — 1 Comment

How To Identify A Good/Bad Data Scientist In A Job Interview?

Data scientists are in notoriously high demand, so when your company is ready to make the leap into big data, it pays to understand how to tell if you’re getting a good one.



Because of the vast…


Added by Bernard Marr on July 10, 2015 at 2:30pm — 3 Comments

How Data Science is influencing Buyer Behavior

Modern consumers have changed. With the advent of the internet and easy access to it, finding what they are looking for has become easier. Customers now do a lot of research before they purchase a product. They also expect personalized service tailored to their preferences, which keep changing. Converting a customer successfully requires a business to know a lot about their buying behavior,…


Added by Sameer Dhanrajani on July 10, 2015 at 2:37am — No Comments

Analysis about marriage equality in US, based on 25,000 tweets

The data is in and it's conclusive: America loved the marriage equality ruling - by Justin Tenuto

Depending on the sites you frequent and the social media you consume, you might think the Supreme Court's marriage equality ruling went over rather poorly. Maybe you spend a lot of time in the comments section of the National Review or have…


Added by Leena Kamath on July 9, 2015 at 2:30pm — 2 Comments

Machine Learning at Scale with Spark

In my last post, I covered setting up the basic tools to start doing machine learning (Python, NumPy, Matplotlib and Scikit-Learn). Now you are probably wondering how to do this at very large scale, involving terabytes (maybe even petabytes) of data spread across several server nodes.

The best answer is Apache …


Added by Somnath Banerjee on July 9, 2015 at 8:30am — 4 Comments

What to do Once we Have Data in Our Hands?

Meeting the Data: What to do Once we Have Data in Our Hands? (a pathway for those starting with data science)

Those starting with data science often don't know what to do with data once they have a dataset at hand. Where to start, which analysis to do, what to consider in the analysis and which tool to use are common questions, posed not only by beginners. This article gives the author's…


Added by Flavio Bossolan on July 9, 2015 at 7:30am — 3 Comments

IoT projects analysis on Kickstarter Crowd Funding Platform

Given all the buzz in the market around IoT, we looked at related projects on the crowdfunding website to see how IoT projects are doing compared with all the others.

We chose projects that have “IoT” or “Internet of Things” in their title or description, and here are our findings.

The success rate of projects at Kickstarter is around 37.5%; for Technology projects it is 21%, which is a lot less than…


Added by Pansop on July 7, 2015 at 8:28pm — No Comments

3 Ways Marketers Use Data to Drive Higher Return on Investments

At the heart of good marketing is data. And when data drives your marketing, the average return on investment (ROI) is a whopping 224%! (VB Insight)

VB Insight surveyed over 3,000 marketers and looked at tools used on over 3 million websites. Even small improvements have the potential to drive higher ROI. According to…


Added by Larisa Bedgood on July 7, 2015 at 9:40am — No Comments

Weekly Digest - July 13

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday. The picture of the week is from the contribution marked with a +, where you will find the details.



Added by Vincent Granville on July 7, 2015 at 8:30am — No Comments

Brand Image, Sentiment Analysis and Social Media

Analyzing sentiments is a very subjective exercise. The other day, I was talking to a colleague of mine about how you could mine sentiments, and he made a very interesting comment based on his own experience with the concept:

He runs his own mid-sized software company, which was doing fairly well, and at one time he tried to analyze…


Added by Srinivasan Rakhunathan on July 6, 2015 at 7:29pm — 2 Comments

Identifying external data needs for driving conversion optimization

Marc Andreessen famously said ‘software is eating the world’. This appetite for software is fed by the fact that software driven businesses are not only way more effective than traditional businesses; they can also leverage the software to accomplish things unheard of before, such as new business models. Companies such as Google, eBay, and Amazon are well-known examples of…


Added by Martin Voorzanger on July 6, 2015 at 10:00am — No Comments

Which drink is more popular Tea, Coffee, Beer, Wine?

If the number of searches were an indicator of a drink's popularity, which would win the contest among Tea, Coffee, Beer and Wine?

Let us analyze who wins the popularity contest by comparing the number of searches for each of the terms.

Data collection method: Each of the keywords Tea, Coffee, Beer and Wine was entered into Google Trends to extract the weekly data, which was then plotted as below. Year 2015 is excluded due to partial…


Added by Nilesh Jethwa on July 6, 2015 at 6:35am — 4 Comments

Workshop: Big Data Could Spell Big Legal Troubles

Big Data is powering Big Business these days, but could also spell big legal troubles if not handled properly.

We invite you to a two-day workshop on Legal Issues and Big Data that Gary Rinkerman and I are conducting at New York University on July 30-31 in NYC. This two-day seminar covers the legal issues affecting big data, data analytics, and data-driven business…


Added by Andres Fortino on July 6, 2015 at 6:30am — No Comments

Great reports from scratch (even when you don't know what your stakeholder wants!)

Doing periodic reporting always feels like it's going to be easy: just make another copy of everything, change some dates, and run it again! But stakeholders have a way of looking at your hard work, thanking you for it (if you're lucky) and asking for exactly the new view that was the hardest to implement. How do you build out a report that won't become a tangled mess when all of these requests pile up?…


Added by Matt Ritter on July 6, 2015 at 2:01am — No Comments

High-Return Data Science: Modernizing / Automating Digital Publishing (Case Study)

Introduction and Purpose

Here I put together a number of new data science techniques to solve a real life problem: identifying good articles to write and publish (or to harvest and re-post) on a website, and re-tweet them with the optimum frequency, given a specific audience. The focus is on scoring articles based on selected features (keywords in subject line, author, channel and many more), feature selection, data generation and…


Added by Vincent Granville on July 5, 2015 at 11:30am — No Comments

5 Types of Data in Feedback

In this blog, I will be discussing some distinct types of data involved in feedback. The types that I will be covering are as follows: 1) structural; 2) event; 3) quantitative; 4) contextual; and 5) systemic. In 2014, I recall reading a number of blogs about three types of data: prescriptive, descriptive, and predictive. There was a data scientist apparently on tour lecturing extensively about these three types. I don't recall the individual's name. Well, prescription, description, and…


Added by Don Philip Faithful on July 5, 2015 at 4:56am — No Comments

See What You Can do With One Aster Command....

We would like to know if the customer discounts are having any effect on customer visits. We'll look to see if having a large discount (greater than .10 cents) leads to a greater number of additional purchases made at the store. Specifically, we want to know the date of the first large discount event ( > .10), the size of the discount, and the total number of unique products purchased after that discount. First, construct an nPath query that returns the total number of products…
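The analysis described above can be illustrated outside Aster with a small Python sketch. This is not the nPath query from the post: the tuple layout, the function name `first_large_discount_summary`, and the sample purchases are all hypothetical, and the threshold is simply the 0.10 value quoted in the excerpt, with its units assumed.

```python
from datetime import date

# Threshold quoted in the post (> .10); the units are an assumption here.
LARGE_DISCOUNT = 0.10


def first_large_discount_summary(purchases):
    """Summarize one customer's purchases around their first large discount.

    purchases: list of (purchase_date, product_id, discount) tuples
    (a hypothetical layout chosen for this sketch).
    Returns None if the customer never received a large discount.
    """
    ordered = sorted(purchases, key=lambda row: row[0])
    for i, (day, _product, discount) in enumerate(ordered):
        if discount > LARGE_DISCOUNT:
            # Distinct products bought in rows that follow the first large discount.
            later_products = {product for _d, product, _x in ordered[i + 1:]}
            return {
                "first_large_discount_date": day,
                "discount": discount,
                "unique_products_after": len(later_products),
            }
    return None


# Made-up sample data for a single customer.
purchases = [
    (date(2015, 6, 1), "milk", 0.05),
    (date(2015, 6, 3), "bread", 0.15),
    (date(2015, 6, 5), "eggs", 0.00),
    (date(2015, 6, 8), "milk", 0.20),
    (date(2015, 6, 9), "eggs", 0.00),
]
summary = first_large_discount_summary(purchases)
```

In a real data set the same function would be applied per customer after grouping rows by a customer identifier.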


Added by John Thuma on July 3, 2015 at 2:30pm — No Comments

Little Debate: Data Priorities for all Industries

The figure titled "Data Pipeline" is from an article by Jeffrey T. Leek & Roger D. Peng titled "Statistics: P values are just the tip of the iceberg." Both are well-known scientists in the field of statistics and data science, and for them, there is no need to debate the importance of data integrity; it is a fundamental concept. Current terminology uses the term "tidy data", a phrase coined by Hadley Wickham in an article of the same name. Whatever you…


Added by Randall Shane on July 3, 2015 at 12:00pm — 1 Comment

Aster: Video: Using Confusion Matrix in Machine Learning

Genre: Statistical Analysis (Machine Learning)

Background: Learn how easy it is to use Aster to implement confusion matrices. A confusion matrix provides a visual representation of the performance of a supervised machine learning algorithm, making it easy to see whether a model is confusing or mislabeling classes. We also go over some of the math involved and help you understand how confusion matrices are used in supervised machine learning.
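As a rough illustration of the idea (not the Aster implementation shown in the video), a confusion matrix can be built in plain Python by counting (actual, predicted) label pairs. The function name, the "churn"/"stay" labels, and the sample predictions below are made up for this sketch.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Count (actual, predicted) pairs.

    Rows correspond to actual classes and columns to predicted classes,
    both in the order given by `labels`.
    """
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical labels and predictions for six customers.
labels = ["churn", "stay"]
actual = ["churn", "churn", "stay", "stay", "stay", "churn"]
predicted = ["churn", "stay", "stay", "stay", "churn", "churn"]

matrix = confusion_matrix(actual, predicted, labels)

# Accuracy is the diagonal (correct predictions) divided by all predictions.
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(actual)
```

Off-diagonal cells show which classes the model mixes up: here one actual "churn" was predicted as "stay" and one actual "stay" was predicted as "churn".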

Use Cases:

-…

Added by John Thuma on July 3, 2015 at 8:30am — No Comments


© 2019 Data Science Central ®