
July 2015 Blog Posts (82)

How To Identify A Good/Bad Data Scientist In A Job Interview?

Data scientists are in notoriously high demand, so when your company is ready to make the leap into big data, it pays to understand how to tell if you’re getting a good one.



Because of the vast…


Added by Bernard Marr on July 10, 2015 at 2:30pm — 3 Comments

How Data Science is influencing Buyer Behavior

Modern consumers have changed. With the advent of the internet, and easy access to it, finding what they are looking for has become easier. Customers nowadays perform a lot of research before they purchase a product. They now expect to receive personalized service customized to fit their preferences, which also keep changing continuously. Converting a customer successfully requires that a business know a lot about their buying behavior,…


Added by Sameer Dhanrajani on July 10, 2015 at 2:37am — No Comments

Analysis about marriage equality in US, based on 25,000 tweets

The data is in and it's conclusive: America loved the marriage equality ruling - by Justin Tenuto

Depending on the sites you frequent and the social media you consume, you might think the Supreme Court's marriage equality ruling went over rather poorly. Maybe you spend a lot of time in the comments section of the National Review or have…


Added by Leena Kamath on July 9, 2015 at 2:30pm — 2 Comments

Machine Learning at Scale with Spark

In my last post, I covered setting up the basic tools to start doing machine learning (Python, NumPy, Matplotlib and Scikit-Learn). Now, you are probably wondering how to do this on a very large scale, involving terabytes (maybe even petabytes) of data and across several server nodes.

The best answer is Apache …


Added by Somnath Banerjee on July 9, 2015 at 8:30am — 4 Comments

IoT projects analysis on Kickstarter Crowd Funding Platform

Given all the buzz in the market around IoT, we looked at related projects on the crowdfunding website to see how IoT projects are doing compared with all the others.

We chose projects that have either “IoT” or “Internet of Things” in their title or description, and here are our findings.

The success rate of projects at Kickstarter is around 37.5%; for Technology projects it is 21%, which is a lot less than…


Added by Pansop on July 7, 2015 at 8:28pm — No Comments

3 Ways Marketers Use Data to Drive Higher Return on Investments

At the heart of good marketing is data. And when data drives your marketing, the average return on investment (ROI) is a whopping 224%! (VB Insight)

VB Insight surveyed over 3,000 marketers and looked at tools used on over 3 million websites. Even small improvements have the potential to drive higher ROI. According to…


Added by Larisa Bedgood on July 7, 2015 at 9:40am — No Comments

Weekly Digest - July 13

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday. The picture of the week is from the contribution marked with a +, where you will find the details.



Added by Vincent Granville on July 7, 2015 at 8:30am — No Comments

Brand Image, Sentiment Analysis and Social Media

Analyzing sentiments is a very subjective exercise. The other day, I was talking to a colleague of mine about how you could mine sentiments, and he made a very interesting comment based on his own experience with this concept:

He has his own mid-sized software company, which was doing fairly well, and at one time he tried to analyze…


Added by Srinivasan Rakhunathan on July 6, 2015 at 7:29pm — 2 Comments

Identifying external data needs for driving conversion optimization

Marc Andreessen famously said ‘software is eating the world’. This appetite for software is fed by the fact that software-driven businesses are not only far more effective than traditional businesses; they can also leverage the software to accomplish things unheard of before, such as new business models. Companies such as Google, eBay, and Amazon are well-known examples of…


Added by Martin Voorzanger on July 6, 2015 at 10:00am — No Comments

Which drink is more popular Tea, Coffee, Beer, Wine?

If the number of searches were an indicator of a drink's popularity, which of Tea, Coffee, Beer and Wine would win the contest?

Let us analyze who wins the popularity contest by comparing the number of searches for each of the terms.

Data collection method: Each of the keywords Tea, Coffee, Beer and Wine was used with Google Trends to extract the weekly data, which was then plotted as below. Year 2015 is excluded due to partial…


Added by Nilesh Jethwa on July 6, 2015 at 6:35am — 4 Comments

Workshop: Big Data Could Spell Big Legal Troubles

Big Data is powering Big Business these days, but it could also spell big legal troubles if not handled properly.

We invite you to a two-day workshop on Legal Issues and Big Data that Gary Rinkerman and I are conducting at New York University on July 30-31 in NYC. This two-day seminar covers the legal issues affecting big data, data analytics, and data-driven business…


Added by Andres Fortino on July 6, 2015 at 6:30am — No Comments

Great reports from scratch (even when you don't know what your stakeholder wants!)

Doing periodic reporting always feels like it's going to be easy: just make another copy of everything, change some dates, and run it again! But stakeholders have a way of looking at your hard work, thanking you for it (if you're lucky), and asking for exactly the new view that was the hardest to implement. How do you build out a report that won't become a tangled mess when all of these requests pile up?…


Added by Matt Ritter on July 6, 2015 at 2:01am — No Comments

High-Return Data Science: Modernizing / Automating Digital Publishing (Case Study)

Introduction and Purpose

Here I put together a number of new data science techniques to solve a real-life problem: identifying good articles to write and publish (or to harvest and re-post) on a website, and re-tweeting them with the optimum frequency, given a specific audience. The focus is on scoring articles based on selected features (keywords in subject line, author, channel and many more), feature selection, data generation and…


Added by Vincent Granville on July 5, 2015 at 11:30am — No Comments

5 Types of Data in Feedback

In this blog, I will be discussing some distinct types of data involved in feedback. The types that I will be covering are as follows: 1) structural; 2) event; 3) quantitative; 4) contextual; and 5) systemic. In 2014, I recall reading a number of blogs about three types of data: prescriptive, descriptive, and predictive. There was a data scientist apparently on tour lecturing extensively about these three types. I don't recall the individual's name. Well, prescription, description, and…


Added by Don Philip Faithful on July 5, 2015 at 4:56am — No Comments

See What You Can do With One Aster Command....

We would like to know if the customer discounts are having any effect on customer visits. We'll look to see if having a large discount (greater than .10 cents) leads to a greater number of additional purchases made at the store. Specifically, we want to know the date of the first large discount event ( > .10), the size of the discount, and the total number of unique products purchased after that discount. First, construct an nPath query that returns the total number of products…


Added by John Thuma on July 3, 2015 at 2:30pm — No Comments

Little Debate: Data Priorities for all Industries

The figure titled "Data Pipeline" is from an article by Jeffrey T. Leek & Roger D. Peng titled "Statistics: P values are just the tip of the iceberg." Both are well-known scientists in the field of statistics and data science, and for them, there is no need to debate the importance of data integrity; it is a fundamental concept. Current terminology uses the term "tidy data", a phrase coined by Hadley Wickham in an article of the same name. Whatever you…


Added by Randall Shane on July 3, 2015 at 12:00pm — 1 Comment

Aster: Video: Using Confusion Matrix in Machine Learning

Genre: Statistical Analysis (Machine Learning)

Background: Learn how easy it is to leverage Aster for implementing confusion matrices. A Confusion Matrix provides a visual representation of the performance of a supervised machine learning algorithm. It makes it easy to determine if a model is confusing or mislabeling classes. We also go over some of the math involved and help you understand how confusion matrices are used in supervised machine learning.
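To make the idea concrete, here is a minimal sketch of a confusion matrix in plain Python. It is not the Aster implementation discussed in the video; the function name `confusion_matrix`, the class labels, and the sample data are all made up for illustration. Rows are actual classes and columns are predicted classes, so off-diagonal cells show which classes the model confuses.

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Count (actual, predicted) pairs; rows = actual, columns = predicted."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

# Hypothetical labels and predictions, for illustration only.
labels = ["bird", "cat", "dog"]
actual = ["cat", "cat", "dog", "dog", "dog", "bird"]
predicted = ["cat", "dog", "dog", "dog", "cat", "bird"]

matrix = confusion_matrix(actual, predicted, labels)

# Accuracy is the sum of the diagonal divided by the total count.
correct = sum(matrix[i][i] for i in range(len(labels)))
accuracy = correct / len(actual)

for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", round(accuracy, 3))
```

In this toy data the "cat" row and the "dog" row each have one off-diagonal entry, which is how the matrix shows that the model sometimes mistakes one of those classes for the other.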

Use Cases:

-…

Added by John Thuma on July 3, 2015 at 8:30am — No Comments

Tell us: as a data scientist, what is your super power?

I read the question raised recently by the American Statistical Association (ASA), "As a statistician, what is your super power?", and after reading the numerous answers, all of them very geeky and not a single one mentioning producing yield, ROI, or added value, I could not resist posting this question on DSC, for data scientists. I could not believe…


Added by Vincent Granville on July 2, 2015 at 1:00pm — 4 Comments

3 Tips to Tame the Big Data Beast

It will come raw, naked and dirty. But before you clean, clothe and tame this beast, it would be a good idea to assess the value of domesticating it.

The Big Data Junk Yard strategy lets you play around with big data in its natural form. The approach allows enough runway for technology to ramp up infrastructure and for business to find the right use cases through data discovery. It’s an easy, economical and quicker way to get on the big data bandwagon. (…


Added by Ashu Kumar on July 2, 2015 at 5:02am — No Comments

The seven people you need on your Big Data team

Read the original version of this post on my blog here.

Congratulations! You just got the call – you’ve been asked to start a data team to extract valuable customer insights from your product usage, improve your company’s marketing effectiveness, or make your boss look all “data-savvy” (hopefully not just the last one of these). And even better, you’ve…


Added by Ian Thomas on July 1, 2015 at 8:53am — 7 Comments

© 2020   Data Science Central ®