Featured Blog Posts – July 2013 Archive (22)

Posted Since Monday

Featured Articles…


Added by Vincent Granville on July 30, 2013 at 4:00pm — No Comments

Data Scientists Wanted to Create New Tech Tools for Our Cities

Are you a hacker with a huge heart? Code for America is looking for developers, data scientists, designers, researchers, and product managers for its 2014 Fellowship. We connect talented technologists with municipal governments to explore new ways of resolving local challenges and create new web apps.…


Added by Vincent Granville on July 30, 2013 at 2:03pm — No Comments

Hadoop Falcon and Data Lifecycle Management

Data management in the Hadoop ecosystem is still in the early stages of development. The goal…


Added by Michael Walker on July 30, 2013 at 12:13pm — No Comments

How to Avoid Political Blunders in Analytical Discussions

A while back I was running a data mining project for a customer and made a conversational blunder. In one of the meetings, I mentioned seeing one interesting relationship in the data: customers who purchased one particular product tended to buy and implement a second product at a later time. I did not realize that everyone in the room INTUITIVELY knew that there was absolutely no relationship between the two products. A big blunder. After the meeting, two friends told me that my standing in…


Added by Stephen Penn, DM, PMP on July 30, 2013 at 5:45am — 7 Comments

Choropleth in D3.js and Pandas (iPython Notebook)

There have been various attempts to integrate the D3.js visualization framework into iPython Notebook, in order to provide more visualization options than are available with the standard Matplotlib. In my blog post today, I take one of the better integration attempts out there, port it from Windows to the Mac, and demonstrate:

1. Passing a Pandas DataFrame from iPython Notebook into the D3.js JavaScript

2. Generating geo color maps in D3.js (not a built-in…


Added by Michael Malak on July 29, 2013 at 4:23am — No Comments

Turning visitors into sales: seduction vs. analytics

The context here is increasing the conversion rate, from website visitor to active, converting user, or from passive newsletter subscriber to lead (a user who opens the newsletter, clicks on the links, and converts). Here we discuss the newsletter conversion problem, although the approach applies to many different settings.…


Added by Mirko Krivanek on July 27, 2013 at 4:00pm — 3 Comments

Update about our data science competition

What seemed to be an intractable problem involving trillions of quadrillions of computations - far more than required to process all the data produced or collected on Earth since the beginning of time - has been reduced to something computationally feasible and possibly even quite simple. One applicant…


Added by Vincent Granville on July 22, 2013 at 2:00pm — 1 Comment

My first impression about the Microsoft Surface

I was given a Surface for Father's Day this year. I had an old iPad that I've used for several years, and I was curious to know whether you can use the Surface just like a Windows laptop. While it has great features, faster Internet, and much more, the answer is clearly no.…


Added by Vincent Granville on July 21, 2013 at 5:30pm — 5 Comments

DUI arrests decrease after state monopoly on liquor sales ends

This is another example where, if you lack analytic skills, you will jump to the wrong conclusions. This news article was published in MyNorthWest. It's about the new law that went into effect a year ago in WA, allowing grocery stores to sell hard liquor. Here we provide 16 reasons that…


Added by Vincent Granville on July 20, 2013 at 11:30am — 4 Comments

Botnets in the cloud: the new generation of spammers

Big data and data science are not just for the good guys. If properly leveraged, they also provide competitive advantages for criminals, whether over their competitors or in avoiding detection.…


Added by Vincent Granville on July 17, 2013 at 10:00am — 1 Comment

Demand for Data Scientists and the Datification of Business

Source: EMC2 Survey.

You cannot improve and manage what you…

Added by Michael Walker on July 16, 2013 at 1:00pm — 1 Comment

Big Data on the Big Data Conversation: Tracking the NSA Story

By Nicholas Hartman, Director

Recent revelations regarding the National Security Agency's (NSA) extensive data interception and monitoring practices (aka PRISM) have brought a branch of "Big Data's" research into the broader public light. The basic premise of such work is that computer algorithms can study…


Added by Nicholas Hartman on July 15, 2013 at 8:51am — No Comments

Weekly digest - July 15

Featured Articles


Added by Vincent Granville on July 11, 2013 at 6:30pm — No Comments

Rapid Hadoop development with progressive testing

Debugging Hadoop jobs can be a huge pain. The cycle time is slow, and error messages are often uninformative, especially if you're using Hadoop streaming or working on EMR.

I once found myself trying to debug a job that took a full six hours to fail.  It took more than a week -- a whole week! -- to find and fix the problem.  Of course, I was doing other things at the same time, but the need to constantly check up on the status of the job was a huge drain on my energy and…


Added by Abe Gong on July 10, 2013 at 10:47am — 1 Comment

Data Science Summer Reading List 2013

Machine Learning: A Probabilistic Perspective, by Kevin Murphy.

Boosting: Foundations and Algorithms, by Robert E. Schapire.

Models Behaving Badly: Why Confusing Illusion with Reality Can Lead to Disaster, by Emanuel Derman.

Doing Data Science, by Cathy O'Neil and Rachel…


Added by Michael Walker on July 9, 2013 at 3:00pm — 1 Comment

Information Singularity and the Principles of the Analytics Rock Star

Recently, Bernard Wehbe at StatSlice Systems wrote an intriguing and thought-provoking whitepaper on information singularity and the principles of the analytics rock star. 

Successful analytics professionals should follow a set of guiding principles which are very important and often missed by traditional…


Added by Jared Decker on July 9, 2013 at 8:23am — No Comments

Interesting database questions

I am not an expert in database design, since most of my career I have worked with alternate data storage / data access solutions. But one of the very first projects I had to do back in 1985 when I was a student was to write the code for a fully functional database architecture, in Pascal, from scratch. You will probably find some of my questions naive, and some intriguing.…


Added by Vincent Granville on July 8, 2013 at 2:00pm — 11 Comments

Big Data and Meaningful Storage Metrics

Big data has the potential to alter the calculus by which data management groups buy, manage, and structure information storage. 

Article originally published on TDWI, by Stephen…


Added by Vincent Granville on July 8, 2013 at 1:00pm — No Comments

Data Scientists vs. Data Engineers

More and more frequently we see organizations make the mistake of mixing and confusing team roles on a data science or "big data" project - resulting in over-allocation of responsibilities assigned to …

Added by Michael Walker on July 2, 2013 at 12:01pm — 9 Comments
