Arguably no technology conference in history has grown faster. Somehow we’ve achieved that growth with no background in the conference industry and no resources to speak of, and all from a pretty peripheral location called Dublin. - Paddy Cosgrave
In a …Continue
Added by Philippe Van Impe on October 31, 2014 at 11:55pm — No Comments
The constant search for something bigger might be part of the American culture. However, big data is often critical: without real-time credit card fraud detection - a big data application - no store would accept credit cards.
There have been a few people questioning the value of big data recently, and predicting that big data is going to get smaller in the future. While most of these would-be oracles are traditional statisticians working on small data and worried about their…Continue
Added by Vincent Granville on October 31, 2014 at 9:00pm — No Comments
Summary: We’ve scoured the literature to bring you a complete listing of possible definitions of Big Data, with the goal of being able to determine what’s a Big Data opportunity and what’s not. Our conclusion is that Volume, Variety, and Velocity still make the best definitions, but none of them can distinguish Big Data from not-so-big-data on its own. Understanding these characteristics will help you analyze whether an opportunity calls for a Big Data solution but…Continue
Added by William Vorhies on October 31, 2014 at 2:00pm — No Comments
Last weekend, I was waiting in New York’s Penn Station, when the public announcer gave the familiar “See Something Say Something” message. It took a minute to sink in, but I had to laugh. Midtown Manhattan IS suspicious and unusual activity.
Speaking of outliers
In practice, data is dirty and big data is filthy. Analysts munge, wrangle and clean their…Continue
Added by Michael Bryan on October 31, 2014 at 11:33am — No Comments
To be fair, our intern …Continue
Added by Vincent Granville on October 31, 2014 at 10:30am — No Comments
The popularity of Big Data lies in its broad definition: the use of high-volume, high-velocity, and high-variety data sets that are difficult to manage and extract value from. Unsurprisingly, most businesses can identify themselves as facing Big Data challenges and opportunities, now or in the future. The issue is therefore not new, yet it has taken on a new quality as it has been exacerbated in recent years. Cheaper storage and ubiquitous data collection and availability of third party data outpaced the…Continue
Added by Christian Prokopp on October 31, 2014 at 2:30am — No Comments
In classification-type machine learning, the target variable is a categorical (nominal) variable that has a set of unique values or classes. It could be a simple two-class target variable like "approve application?" with classes (values) of "yes" or "no". Sometimes the classes indicate ranges like "Excellent", "Good" etc. for a target variable like satisfaction score. We might also convert continuous variables like test scores (1 - 100) into classes like grades (A, B, C…Continue
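As a minimal sketch of the score-to-grade conversion mentioned in the excerpt above: the cutoffs (60, 70, 80, 90) and the grades below C are assumptions chosen for illustration, since the excerpt does not state any boundaries.

```python
from bisect import bisect_right

# Hypothetical grade boundaries; the excerpt does not specify any cutoffs.
CUTOFFS = [60, 70, 80, 90]
GRADES = ["F", "D", "C", "B", "A"]

def score_to_grade(score: float) -> str:
    """Map a continuous test score to a categorical grade class."""
    # bisect_right counts how many cutoffs are <= score,
    # which is the index of the matching grade.
    return GRADES[bisect_right(CUTOFFS, score)]

scores = [55, 60, 79.5, 91, 100]
grades = [score_to_grade(s) for s in scores]
```

The resulting grades form a categorical target with a fixed set of classes, which is what a classifier expects.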
The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday.
Added by Vincent Granville on October 29, 2014 at 4:00pm — No Comments
It seems like more and more companies are very interested in either improving or setting up their analytical capabilities. All these companies are quite attracted to Hadoop, Spark or other similar solutions, not necessarily because they solve real problems they’re facing, but because they are shiny, trendy pieces of technology.
Hadoop, Spark and others are…Continue
Added by Anna Anisin on October 28, 2014 at 2:30pm — No Comments
If you haven't checked out our newsletter recently, I invite you to do so. The next weekly digest will announce our upcoming Data Science 2.0 book, and a complimentary copy (eBook) will be offered to our subscribers later on.
To make sure that you benefit from these exclusive advantages, check whether you are receiving our messages:
The sender (the name we use in the "From" field) is usually Data Science Central, and all messages have our physical address…Continue
Added by Vincent Granville on October 28, 2014 at 11:30am — No Comments
Data visualization is everywhere. Whether you check your online bank account, monitor your workouts, discover the energy consumption of your house, check your pipeline in your CRM system or view remaining vacation days on your HR application, visualizations are part of the large majority of web applications.
When data visualizations…Continue
Added by Michael Singer on October 28, 2014 at 8:34am — No Comments
As a long-term member of the Linked Data community, which has evolved from W3C's Semantic Web, I have found the latest developments around Data Science more and more attractive because of their complementary perspectives on similar challenges. Both disciplines work on questions like these:
Added by Andreas Blumauer on October 28, 2014 at 12:27am — No Comments
This is an announcement regarding my upcoming book: Data Science 2.0. The subtitle is Automation, survival kit, career resources.
Just like our first book, it will first be available as a free PDF document to members of our community. It will…Continue
In this article, I compare two approaches (with their advantages and drawbacks) to compute a simple metric: the number of unique visitors ("uniques") per year for a website. I use the words user and visitor interchangeably.
Source for picture: …Continue
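To make the "uniques" metric concrete, here is a minimal sketch of an exact count that keeps one set of visitor IDs per year. It is not necessarily either of the two approaches the article compares (the excerpt does not describe them), and the `(year, visitor_id)` log format is an assumption made for illustration.

```python
from collections import defaultdict

def uniques_per_year(visits):
    """Return {year: number of distinct visitor IDs} for (year, visitor_id) pairs."""
    seen = defaultdict(set)
    for year, visitor_id in visits:
        seen[year].add(visitor_id)  # repeat visits by the same ID are ignored
    return {year: len(ids) for year, ids in seen.items()}

# Hypothetical visit log: visitor "a" appears in both years.
log = [(2013, "a"), (2013, "b"), (2013, "a"),
       (2014, "a"), (2014, "c"), (2014, "c")]
counts = uniques_per_year(log)
```

Exact counting of this kind needs memory proportional to the number of distinct visitors, which is the usual reason to consider alternatives on large sites.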
I’ve been thinking a lot about data, where it comes from, and what it looks like. I can’t help it. I’ve been a data geek for almost 15 years. And I find data beautiful. Not necessarily in its raw form, mind you. Then it’s just messy and more often than not a pain to deal with, especially when it gets really, really big. But when smart, creative people start to clean it up and use it in different ways to find the hidden stories that make sense, it can help us learn things in ways that we…Continue
Added by Anne Russell on October 27, 2014 at 6:30am — No Comments
Any author would like to know if his/her article will be successful or not. Here is an attempt to deal with this task.
Data and tools
Given the nature of the community, presumably many visitors already have a strong understanding of the nature of quantitative data. Perhaps more mysterious is the idea of qualitative data, especially since it can sometimes be expressed in quantitative terms. For instance, "stress" as an internal response to an externality differs from person to person; yet it would be possible to canvass a large number of people and express stress levels as an aggregate based on a perceptual gradient: minimal,…Continue
Added by Don Philip Faithful on October 25, 2014 at 6:37am — No Comments
This happened tonight, shortly after Facebook made the same decision. Even Bit.ly itself is banned, see picture below. This happens only with Chrome, not with other browsers such as IE or Firefox. The ban will probably be lifted in several hours.
This raises interesting questions:
Added by Vincent Granville on October 24, 2014 at 11:00pm — No Comments
Summary: Is the addition of “Prescriptive” analytics to our nomenclature really worthwhile or are we just confusing our customers?
I admit to being annoyed when this or that industry wag tries to coin a new term to describe some portion of the discipline we are already practicing. Some of these folks I think are…Continue