All Blog Posts (1,767)

Weekly Digest - July 13

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday. The picture of the week is from the contribution marked with a +, where you will find the details.

Featured…

Continue

Added by Vincent Granville on July 7, 2015 at 8:38am — No Comments

Brand Image, Sentiment Analysis and Social Media

Analyzing sentiment is a very subjective exercise. The other day, I was talking with a colleague about how you could mine sentiment, and he made a very interesting comment based on his own experience with the concept:

He runs his own mid-sized software company, which was doing fairly well, and at one time he tried to analyze…

Continue

Added by Srinivasan Rakhunathan on July 6, 2015 at 7:29pm — No Comments

Identifying external data needs for driving conversion optimization

Marc Andreessen famously said ‘software is eating the world’. This appetite for software is fed by the fact that software-driven businesses are not only far more effective than traditional businesses; they can also leverage software to accomplish things unheard of before, such as new business models. Companies such as Google, eBay, and Amazon are well-known examples of…

Continue

Added by Martin Voorzanger on July 6, 2015 at 10:00am — No Comments

Which drink is more popular Tea, Coffee, Beer, Wine?

If the number of searches were an indicator of a drink's popularity, which of Tea, Coffee, Beer and Wine would win the contest?

Let us find out by comparing the number of searches for each of the terms.

Data collection method: Each of the keywords Tea, Coffee, Beer and Wine was used with Google Trends to extract the weekly data, which was then plotted as below. Year 2015 is excluded due to partial…

Continue

Added by Nilesh Jethwa on July 6, 2015 at 6:35am — No Comments

Workshop: Big Data Could Spell Big Legal Troubles

Big Data is powering Big Business these days, but could also spell big legal troubles if not handled properly.

We invite you to a two-day workshop on Legal Issues and Big Data that Gary Rinkerman and I are conducting at New York University on July 30-31 in NYC. This two-day seminar covers the legal issues…

Continue

Added by Andres Fortino on July 6, 2015 at 6:30am — No Comments

Great reports from scratch (even when you don't know what your stakeholder wants!)

Doing periodic reporting always feels like it's going to be easy: just make another copy of everything, change some dates, and run it again! But stakeholders have a way of looking at your hard work, thanking you for it (if you're lucky) and asking for exactly the new view that is hardest to implement. How do you build out a report that won't become a tangled mess when all of these requests pile up?…

Continue

Added by Matt Ritter on July 6, 2015 at 2:01am — No Comments

High-Return Data Science: Modernizing / Automating Digital Publishing (Case Study)

Introduction and Purpose

Here I put together a number of new data science techniques to solve a real life problem: identifying good articles to write and publish (or to harvest and re-post) on a website, and re-tweet them with the optimum frequency, given a specific audience. The focus is on scoring articles based on selected features (keywords in subject line, author, channel and many more), feature selection, data generation and…

Continue

Added by Vincent Granville on July 5, 2015 at 11:30am — No Comments

5 Types of Data in Feedback

In this blog, I will be discussing some distinct types of data involved in feedback. The types that I will be covering are as follows: 1) structural; 2) event; 3) quantitative; 4) contextual; and 5) systemic. In 2014, I recall reading a number of blogs about three types of data: prescriptive, descriptive, and predictive. There was a data scientist apparently on tour lecturing extensively about these three types. I don't recall the individual's name. Well, prescription, description, and…

Continue

Added by Don Philip Faithful on July 5, 2015 at 4:56am — No Comments

Little Debate: Data Priorities for all Industries

The figure titled "Data Pipeline" is from an article by Jeffrey T. Leek & Roger D. Peng titled "Statistics: P values are just the tip of the iceberg". Both are well-known scientists in the fields of statistics and data science, and for them there is no need to debate the importance of data integrity; it is a fundamental concept. Current terminology uses the term "tidy data", a phrase coined by Hadley Wickham in an article of the same name. Whatever you…

Continue

Added by Randall V Shane on July 3, 2015 at 12:00pm — No Comments

Aster: Video: Using Confusion Matrix in Machine Learning

Genre: Statistical Analysis (Machine Learning)



Background: Learn how easy it is to leverage Aster for implementing confusion matrices. A confusion matrix provides a visual representation of the performance of a supervised machine learning algorithm. It makes it easy to determine whether a model is confusing or mislabeling classes. We also go over some of the math involved and explain how confusion matrices are used in supervised machine learning.
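
As a rough illustration of the idea described above, here is a minimal sketch in plain Python. It is not the Aster implementation covered in the video; the function names `confusion_matrix` and `accuracy` and the class labels are made up for this example. Rows hold the actual classes, columns hold the predicted classes, and any count off the diagonal is a case where the model mislabeled one class as another.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs and lay them out as a square table."""
    labels = sorted(set(actual) | set(predicted))
    counts = Counter(zip(actual, predicted))
    # Rows are actual classes, columns are predicted classes.
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix


def accuracy(matrix):
    """Share of observations on the diagonal, i.e. labeled correctly."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Hypothetical labels, purely for illustration.
actual = ["cat", "cat", "cat", "dog", "dog", "bird"]
predicted = ["cat", "dog", "cat", "dog", "dog", "cat"]

labels, matrix = confusion_matrix(actual, predicted)
for label, row in zip(labels, matrix):
    print(label, row)
print("accuracy:", accuracy(matrix))
```

In this toy data the single off-diagonal count in the "cat" row shows one cat predicted as a dog, and the "bird" row shows the only bird predicted as a cat.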



Use Cases:

-… Continue

Added by John Thuma on July 3, 2015 at 8:30am — No Comments

Tell us: as a data scientist, what is your super power?

I read the question raised recently by the American Statistical Association (ASA), "As a statistician, what is your super power?", and after reading the numerous answers, all of them very geeky and not a single one mentioning producing yield, ROI, or added value, I could not resist posting this question on DSC, for data scientists. I could not believe…

Continue

Added by Vincent Granville on July 2, 2015 at 1:00pm — 4 Comments

Best Practices When Starting And Working On A Data Science Project

Several interesting questions were asked recently on Data Science Central by …

Continue

Added by Enda Ridge on July 2, 2015 at 7:29am — No Comments

3 Tips to Tame the Big Data Beast

It will come raw, naked and dirty. But before you clean, clothe and tame this beast, it would be a good idea to assess the value of domesticating it.

The Big Data Junk Yard strategy lets you play around with big data in its natural form. The approach allows enough runway for technology to ramp up infrastructure and for business to find the right use cases through data discovery. It’s an easy, economical and quicker way to get on the big data bandwagon. (…

Continue

Added by Ashu Kumar on July 2, 2015 at 5:02am — No Comments

The seven people you need on your Big Data team

Read the original version of this post on my blog here.

Congratulations! You just got the call – you’ve been asked to start a data team to extract valuable customer insights from your product usage, improve your company’s marketing effectiveness, or make your boss look all “data-savvy” (hopefully not just the last one of these). And even better, you’ve…

Continue

Added by Ian Thomas on July 1, 2015 at 8:53am — 4 Comments

How to assess quality and correctness of classification models? Part 4 - ROC Curve

In the previous parts of our tutorial we discussed:

Continue

Added by Algolytics on July 1, 2015 at 5:30am — No Comments

Machine learning is not better than Human learning.

Alan Turing was the first to present the idea of simulating thinking in a machine. It's been more than 60 years since Turing's groundbreaking paper, "Computing Machinery and Intelligence", introduced the imitation game. The world has changed rapidly since then.

Today's machines have become very powerful. They can actually think, which supports the idea Alan Turing presented in the 1950s. However, machine thinking may be different. Alan Turing argued that just because the thinking can be…

Continue

Added by Rana Usman on June 30, 2015 at 12:54pm — No Comments

Weekly Digest - July 6

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday. The picture of the week is from the contribution marked with a +, where you will find the details.

Featured

Continue

Added by Vincent Granville on June 30, 2015 at 10:30am — No Comments

Why Protecting Data Privacy Matters, and When

(A Wake-Up Call to Data Geeks Who Doubt)

by Anne Russell

Anne is the Managing Partner and Founder of World Data Insights, a data consulting company helping customers transform data into information that matters. Anne has spent the last decade immersed in multiple aspects of…

Continue

Added by Anne Russell on June 30, 2015 at 10:06am — No Comments

How to score data in Hadoop/Hive in a flash

Any reference to Hadoop implies a huge amount of data. The intent of the data is, of course, to derive insights that will help businesses stay competitive. "Scoring" the data is a common exercise in, for example, predicting customer churn, detecting fraud, and mitigating risk. It is one of the slowest analytics activities, especially when a very large data set is involved. There are various fast scoring products on the market, but they are very specialized and/or provided by a single vendor, usually…

Continue

Added by Eddie Soong on June 29, 2015 at 9:10pm — 1 Comment

© 2015   Data Science Central
