October 2014 Blog Posts (61)

Why Paddy Cosgrave hired Data Scientists to grow websummit from 400 to 20k attendees in just 4 years

Arguably no technology conference in history has grown faster. Somehow we’ve achieved that growth with no background in the conference industry and no resources to speak of, and all from a pretty peripheral location called Dublin. - Paddy Cosgrave

In a …


Added by Philippe Van Impe on October 31, 2014 at 11:55pm — No Comments

Do we really have a data obesity problem?

The constant search for something bigger might be part of the American culture. However, big data is often critical: without real time credit card fraud detection - a big data application - no store would accept credit cards.

There have been a few people questioning the value of big data recently, and predicting that big data is going to get smaller in the future. While most of these would-be oracles are traditional statisticians working on small data and worried about their…


Added by Vincent Granville on October 31, 2014 at 9:00pm — No Comments

How Many "V's" in Big Data? The Characteristics that Define Big Data

Summary: We’ve scoured the literature to bring you a complete listing of possible definitions of Big Data with the goal of being able to determine what’s a Big Data opportunity and what’s not. Our conclusion is that Volume, Variety, and Velocity still make the best definitions but none of these stands on its own in distinguishing Big Data from not-so-big-data. Understanding these characteristics will help you analyze whether an opportunity calls for a Big Data solution but…


Added by William Vorhies on October 31, 2014 at 2:00pm — No Comments

Treasure in the defects

Last weekend, I was waiting in New York’s Penn Station, when the public announcer gave the familiar “See Something Say Something” message. It took a minute to sink in, but I had to laugh. Midtown Manhattan IS suspicious and unusual activity.

Speaking of outliers

In practice, data is dirty and big data is filthy.  Analysts munge, wrangle and clean their…


Added by Michael Bryan on October 31, 2014 at 11:33am — No Comments

Data Science Apprenticeship: Announcing our First Graduate

Nikitinsky Nikita is the first to complete our DSA, using NLP, web crawling, statistical techniques and Python to cluster our content in top categories: click here to check his project.

To be fair, our intern …


Added by Vincent Granville on October 31, 2014 at 10:30am — No Comments

4 types of Big Data as a Service

The popularity of Big Data lies within its broad definition of employing high volume, velocity, and variety data sets that are difficult to manage and extract value from. Unsurprisingly, most businesses can identify themselves as facing Big Data challenges and opportunities now or in the future. This is therefore not a new issue, yet it has taken on a new quality as it has been exacerbated in recent years. Cheaper storage and ubiquitous data collection and availability of third party data outpaced the…


Added by Christian Prokopp on October 31, 2014 at 2:30am — No Comments

Predictions - Effect of unique number of target classes on accuracy

When we perform machine learning of the classification type, the target variable is a categorical (nominal) variable that has a set of unique values or classes. It could be a simple two-class target variable like "approve application?" with classes (values) of "yes" or "no". Sometimes the classes indicate ranges like "Excellent", "Good" etc. for a target variable like satisfaction score. We might also convert continuous variables like test scores (1 - 100) into classes like grades (A, B, C…
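
The conversion of a continuous score into grade classes mentioned in this excerpt can be sketched as follows. This is only an illustration: the cutoffs, the five-letter grade scale, and the function name `score_to_grade` are assumptions and do not come from the post.

```python
from bisect import bisect_right

# Hypothetical grade boundaries; the post does not specify any cutoffs.
CUTOFFS = [60, 70, 80, 90]          # lower bounds for D, C, B, A
GRADES = ["F", "D", "C", "B", "A"]  # one more label than there are cutoffs


def score_to_grade(score: float) -> str:
    """Map a continuous test score (1-100) to a letter-grade class."""
    # bisect_right counts how many cutoffs are <= score,
    # which is exactly the index of the matching grade label.
    return GRADES[bisect_right(CUTOFFS, score)]


scores = [95, 83, 71, 64, 42]
grades = [score_to_grade(s) for s in scores]
print(grades)
```

With these assumed cutoffs the target variable has five classes; using fewer or more boundaries changes the number of unique target classes, which is the quantity the post's title refers to.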


Added by Kumaran Ponnambalam on October 30, 2014 at 7:00am — 2 Comments

Weekly Digest - November 3

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 



Added by Vincent Granville on October 29, 2014 at 4:00pm — No Comments

Don't Get Hadooped

It seems like more and more companies are very interested in either improving or setting up their analytical capabilities. All these companies are quite attracted to Hadoop, Spark or other similar solutions, not necessarily because they solve real problems they’re facing, but because they are shiny, trendy pieces of technology.

Hadoop, Spark and others are…


Added by Anna Anisin on October 28, 2014 at 2:30pm — No Comments

Announcement to Data Science Central subscribers

If you haven't checked out our newsletter recently, I invite you to do so. The next weekly digest will announce our upcoming Data Science 2.0. book, and a complimentary copy (eBook) will be offered to our subscribers later on.

To make sure that you benefit from these exclusive advantages, check whether you receive our messages:

The sender (the name we use in the "From" field) is usually Data Science Central, and all messages have our physical address…


Added by Vincent Granville on October 28, 2014 at 11:30am — No Comments

3 Trends in Embedded Analytics

Data visualization is everywhere. Whether you check your online bank account, monitor your workouts, discover the energy consumption of your house, check your pipeline in your CRM system or view remaining vacation days on your HR application, visualizations are part of the large majority of web applications.

When data visualizations…


Added by Michael Singer on October 28, 2014 at 8:34am — 1 Comment

Linked Data meets Data Science

As a long-term member of the Linked Data community, which has evolved from W3C's Semantic Web, the latest developments around Data Science have become more and more attractive to me due to their complementary perspectives on similar challenges. Both disciplines work on questions like these:

  • How to extract meaningful information from large amounts of data?
  • How to connect pieces of information to other pieces in…

Added by Andreas Blumauer on October 28, 2014 at 12:27am — No Comments

Data Science 2.0.

This is an announcement regarding my upcoming book: Data Science 2.0. The subtitle is Automation, survival kit, career resources.

Just like our first book, it will first be available as a free PDF document to members of our community. It will…


Added by Vincent Granville on October 27, 2014 at 1:30pm — 16 Comments

Data science versus statistics, to solve problems: case study

In this article, I compare two approaches (with their advantages and drawbacks) to compute a simple metric: the number of unique visitors ("uniques") per year for a website. I use the words user and visitor interchangeably.
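
For readers unfamiliar with the metric, here is a minimal sketch of a direct, exact count of uniques per year. The excerpt does not say what the two compared approaches are, so this is not necessarily either of them; the visit log, its field layout, and the function name `uniques_per_year` are all hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical visit log: (visitor_id, visit_date) pairs.
visits = [
    ("u1", date(2013, 3, 1)),
    ("u2", date(2013, 7, 9)),
    ("u1", date(2013, 11, 20)),  # repeat visit, same year: not counted twice
    ("u1", date(2014, 1, 5)),    # same visitor, new year: counted again
    ("u3", date(2014, 2, 14)),
]


def uniques_per_year(log):
    """Count distinct visitor IDs for each calendar year."""
    seen = defaultdict(set)
    for visitor_id, day in log:
        seen[day.year].add(visitor_id)
    return {year: len(ids) for year, ids in seen.items()}


yearly = uniques_per_year(visits)
print(yearly)
```

This exact count keeps every visitor ID in memory, which is one reason alternative approaches to the same metric are worth comparing.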

Source for picture: …


Added by Vincent Granville on October 27, 2014 at 9:30am — 7 Comments

The Richness and Reality of World Data

I’ve been thinking a lot about data, where it comes from, and what it looks like.  I can’t help it.  I’ve been a data geek for almost 15 years.  And I find data beautiful.  Not necessarily in its raw form, mind you. Then it’s just messy and more often than not a pain to deal with, especially when it gets really, really big.  But when smart, creative people start to clean it up and use it in different ways to find the hidden stories that make sense, it can help us learn things in ways that we…


Added by Anne Russell on October 27, 2014 at 6:30am — No Comments

My Data Science Apprenticeship Project


Any author would like to know if his/her article will be successful or not. Here is an attempt to deal with this task.

Data and tools

  1. We obtained the 5000 most significant articles (Analytic Bridge and Data Science Central) from here (…

Added by Nikitinsky Nikita on October 26, 2014 at 10:30am — 1 Comment

Qualitative Engine for Organizational Simulations

Given the nature of the community, presumably many visitors already have a strong understanding of the nature of quantitative data. Perhaps more mysterious is the idea of qualitative data, especially since it can sometimes be expressed in quantitative terms. For instance, "stress" as an internal response to an externality differs from person to person; yet it would be possible to canvass a large number of people and express stress levels as an aggregate based on a perceptual gradient: minimal,…


Added by Don Philip Faithful on October 25, 2014 at 6:37am — No Comments

Bit.ly banned on Google

This happened tonight, shortly after Facebook made the same decision. Even Bit.ly itself is banned, see picture below. This happens only with Chrome, not with other browsers such as IE or Firefox. The ban will probably be lifted in several hours.

This raises interesting questions:

  • Bit.ly is a widely used URL shortener and…

Added by Vincent Granville on October 24, 2014 at 11:00pm — No Comments

How Zipfian Academy Graduate Alex Mentch became a Data Scientist at Facebook

Zipfian Academy has graduated more than 50 alumni, placing graduates into data science roles at Facebook, Twitter, Airbnb, Tesla, Uber, Square,…


Added by Molly Larkin on October 24, 2014 at 10:22am — 3 Comments

Prescriptive versus Predictive Analytics - A Distinction without a Difference?

Summary:  Is the addition of “Prescriptive” analytics to our nomenclature really worthwhile or are we just confusing our customers?

I admit to being annoyed when this or that industry wag tries to coin a new term to describe some portion of the discipline we are already practicing.  Some of these folks I think are…


Added by William Vorhies on October 23, 2014 at 10:00am — 9 Comments
