Featured Blog Posts – July 2013 Archive (22)

Posted Since Monday

Featured Articles…

Added by Vincent Granville on July 30, 2013 at 4:00pm — No Comments

Data Scientists Wanted to Create New Tech Tools for Our Cities

Are you a hacker with a huge heart? Code for America is looking for developers, data scientists, designers, researchers, and product managers for its 2014 Fellowship. We connect talented technologists with municipal governments to explore new ways of resolving local challenges and create new web apps.…

Added by Vincent Granville on July 30, 2013 at 2:03pm — No Comments

Hadoop Falcon and Data Lifecycle Management

Data management in the Hadoop ecosystem is still in the early stages of development. The goal…

Added by Michael Walker on July 30, 2013 at 12:13pm — No Comments

How to Avoid Political Blunders in Analytical Discussions

A while back I was running a data mining project for a customer and made a conversational blunder. In one of the meetings, I mentioned seeing one interesting relationship in the data: customers who purchased one particular product tended to buy and implement a second product at a later time. I did not realize that everyone in the room INTUITIVELY knew that there was absolutely no relationship between the two products. A big blunder. After the meeting, two friends told me that my standing in…

Added by Stephen Penn, DM, PMP on July 30, 2013 at 5:45am — 8 Comments

Choropleth in D3.js and Pandas (iPython Notebook)

There have been various attempts to integrate the D3.js visualization framework into iPython Notebook to provide more visualization options than the standard Matplotlib offers. In my blog post today, I take one of the better integration attempts out there, port it from Windows to the Mac, and demonstrate:

1. Passing a Pandas DataFrame from iPython Notebook into the D3.js JavaScript

2. Generating geo color maps in D3.js (not a built-in…

Added by Michael Malak on July 29, 2013 at 4:23am — No Comments

Turning visitors into sales: seduction vs. analytics

The context here is increasing the conversion rate: from website visitor to active, converting user, or from passive newsletter subscriber to lead (a user who opens the newsletter, clicks on the links, and converts). Here we will discuss the newsletter conversion problem, although the discussion applies to many different settings.…

Added by Mirko Krivanek on July 27, 2013 at 4:00pm — 3 Comments

Update about our data science competition

What seemed to be an intractable problem involving trillions of quadrillions of computations, far more than required to process all the data produced or collected on Earth since the beginning of time, has been reduced to something computationally feasible and possibly even quite simple. One applicant…

Added by Vincent Granville on July 22, 2013 at 2:00pm — 1 Comment

My first impression of the Microsoft Surface

I was given a Surface for Father's Day this year. I had an old iPad that I had used for several years, and I was curious to know whether you can use the Surface just like a Windows laptop. While it has great features, faster Internet, and much more, the answer is clearly no.…

Added by Vincent Granville on July 21, 2013 at 5:30pm — 5 Comments

DUI arrests decrease after state monopoly on liquor sales ends

This is another example where, if you lack analytic skills, you will jump to the wrong conclusions. This news article was published in MyNorthWest. It's about the new law that went into effect a year ago in WA, allowing grocery stores to sell hard liquor. Here we provide 16 reasons that…

Added by Vincent Granville on July 20, 2013 at 11:30am — 4 Comments

Botnets in the cloud: the new generation of spammers

Big data and data science are not just for the good guys. Properly leveraged, they also give criminals a competitive advantage over their competitors and help them avoid detection.…

Added by Vincent Granville on July 17, 2013 at 10:00am — 1 Comment

Demand for Data Scientists and the Datification of Business

Source: EMC2 Survey.

You cannot improve and manage what you cannot…

Added by Michael Walker on July 16, 2013 at 1:00pm — 2 Comments

Big Data on the Big Data Conversation: Tracking the NSA Story

By Nicholas Hartman, Director

Recent revelations regarding the National Security Agency's (NSA) extensive data interception and monitoring practices (aka PRISM) have brought a branch of "Big Data" research into the broader public light. The basic premise of such work is that computer algorithms can study…

Added by Nicholas Hartman on July 15, 2013 at 8:51am — No Comments

Weekly digest - July 15

Featured Articles

Added by Vincent Granville on July 11, 2013 at 6:30pm — No Comments

Rapid Hadoop development with progressive testing

Debugging Hadoop jobs can be a huge pain. The cycle time is slow, and error messages are often uninformative, especially if you're using Hadoop streaming or working on EMR.

I once found myself trying to debug a job that took a full six hours to fail. It took more than a week (a whole week!) to find and fix the problem. Of course, I was doing other things at the same time, but the need to constantly check up on the status of the job was a huge drain on my energy and…

Added by Abe Gong on July 10, 2013 at 10:47am — 1 Comment

Data Science Summer Reading List 2013

Machine Learning: A Probabilistic Perspective, by Kevin Murphy.

Boosting: Foundations and Algorithms, by Robert E. Schapire.

Models Behaving Badly: Why Confusing Illusion with Reality Can Lead to Disaster, by Emanuel Derman.

Doing Data Science, by Cathy O'Neil and Rachel…

Added by Michael Walker on July 9, 2013 at 3:00pm — 1 Comment

Information Singularity and the Principles of the Analytics Rock Star

Recently, Bernard Wehbe at StatSlice Systems wrote an intriguing and thought-provoking whitepaper on information singularity and the principles of the analytics rock star.

Successful analytics professionals should follow a set of guiding principles which are very important and often missed by traditional…

Added by Jared Decker on July 9, 2013 at 8:23am — No Comments

Interesting database questions

I am not an expert in database design, since for most of my career I have worked with alternative data storage / data access solutions. But one of the very first projects I had to do back in 1985, when I was a student, was to write the code for a fully functional database architecture, in Pascal, from scratch. You will probably find some of my questions naive, and others intriguing.…

Added by Vincent Granville on July 8, 2013 at 2:00pm — 11 Comments

Big Data and Meaningful Storage Metrics

Big data has the potential to alter the calculus by which data management groups buy, manage, and structure information storage. 

Article originally published on TDWI, by Stephen…

Added by Vincent Granville on July 8, 2013 at 1:00pm — No Comments

Data Scientists vs. Data Engineers

More and more frequently we see organizations make the mistake of mixing and confusing team roles on a data science or "big data" project, resulting in over-allocation of responsibilities assigned to …

Added by Michael Walker on July 2, 2013 at 12:01pm — 9 Comments
