All Blog Posts (7,236)

True Freedom - Algorithmic Man Unplugged

Thermometers and scales to measure weight appeared in retail outlets long ago. Blood pressure monitors perhaps came later. Pedometers and heart-rate monitors seem more recent - possibly closer to my time. While researching this blog, I saw several devices intended to record electronically, among other things, hours of sleep; these are designed to be worn on the body all the time. A couple of weeks ago, I bought a device that reports heart rate and blood oxygen saturation level. I consider it a real…


Added by Don Philip Faithful on November 8, 2014 at 9:18am — No Comments

Weekly Digest - November 10

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 

Sponsored Announcement

Just as information is the lifeblood of healthcare, data science is at the heart of the 100 percent online Master of Science degree in Health Informatics at the College…


Added by Vincent Granville on November 5, 2014 at 4:00pm — No Comments

Choosing a classifier for predictions

One of the biggest decisions that a data scientist needs to make during a predictive modeling exercise is choosing the right classifier. There is no best classifier for all problems. The accuracy of a classifier varies with the data set. Correlation between the predictor variables and the outcome is a key influencer. The choice needs to be made based on experimentation. There are two main selection criteria here.

Accuracy:  While accuracy of the…


Added by Kumaran Ponnambalam on November 4, 2014 at 6:08pm — No Comments

Start with Good Science on Good Data, Then we'll Talk 'Big Data'

We are currently witnessing a land rush of investment in Big Data architectures promising companies that they can turn their data into gold using the latest in distributed computing and advanced analytical methods. Although there is indeed much potential in applying machine learning and statistical analysis to large datasets, many companies…


Added by Sean McClure on November 3, 2014 at 6:46am — 3 Comments

Big Data Problem: Could Fake Reviews Kill Amazon?

Adversarial analytics and business hacking: Amazon case study.

Chances are that you have purchased a book, or visited a restaurant, as a result of reading fake reviews. The problem impacts companies such as Amazon and Yelp, while on Facebook, massive disinformation campaigns are funded by political money, hitting thousands of profiles and managed by public relations companies: they create fake profiles and try to become friends with influencers. Here the focus is…


Added by Vincent Granville on November 2, 2014 at 7:30pm — 7 Comments

Data Science for Improved Democracy

The current (November 2014) United States election reminds us that sophisticated data science techniques are employed on the public in an attempt to influence opinion and sway votes. The slick television advertising, debate prevarications, and policy position distortions and exaggerations have soured many citizens on the current state of modern democracy. Indeed, most feel we are not getting the straight scoop - the…


Added by Michael Walker on November 2, 2014 at 10:30am — 2 Comments

Data Science: Paving the Way For More Open, Interactive Data Analysis

Guest blog post by Gemma.
Data science is still, in many ways, the new kid on the block. In fact, there is often a fair amount of confusion as to what…

Added by Vincent Granville on November 1, 2014 at 8:00am — No Comments

Why Paddy Cosgrave hired Data Scientists to grow websummit from 400 to 20k attendees in just 4 years

Arguably no technology conference in history has grown faster. Somehow we’ve achieved that growth with no background in the conference industry and no resources to speak of, and all from a pretty peripheral location called Dublin. - Paddy Cosgrave

In a …


Added by Philippe Van Impe on October 31, 2014 at 11:55pm — No Comments

Do we really have a data obesity problem?

The constant search for something bigger might be part of the American culture. However, big data is often critical: without real-time credit card fraud detection - a big data application - no store would accept credit cards.

There have been a few people questioning the value of big data recently, and predicting that big data is going to get smaller in the future. While most of these would-be oracles are traditional statisticians working on small data and worried about their…


Added by Vincent Granville on October 31, 2014 at 9:00pm — No Comments

How Many "V's" in Big Data? The Characteristics that Define Big Data

Summary: We’ve scoured the literature to bring you a complete listing of possible definitions of Big Data, with the goal of being able to determine what’s a Big Data opportunity and what’s not. Our conclusion is that Volume, Variety, and Velocity still make the best definitions, but none of these stands on its own in distinguishing Big Data from not-so-big data. Understanding these characteristics will help you analyze whether an opportunity calls for a Big Data solution but…


Added by William Vorhies on October 31, 2014 at 2:00pm — No Comments

Treasure in the defects

Last weekend, I was waiting in New York’s Penn Station, when the public announcer gave the familiar “See Something Say Something” message. It took a minute to sink in, but I had to laugh. Midtown Manhattan IS suspicious and unusual activity.

Speaking of outliers

In practice, data is dirty and big data is filthy.  Analysts munge, wrangle and clean their…


Added by Michael Bryan on October 31, 2014 at 11:33am — No Comments

Data Science Apprenticeship: Announcing our First Graduate

Nikitinsky Nikita is the first to complete our DSA, using NLP, web crawling, statistical techniques and Python to cluster our content into top categories.

To be fair, our intern …


Added by Vincent Granville on October 31, 2014 at 10:30am — No Comments

4 types of Big Data as a Service

The popularity of Big Data lies in its broad definition: employing high volume, velocity, and variety data sets that are difficult to manage and extract value from. Unsurprisingly, most businesses can identify themselves as facing Big Data challenges and opportunities, now or in the future. This is therefore not a new issue, yet it has a new quality, as it has been exacerbated in recent years. Cheaper storage, ubiquitous data collection, and the availability of third-party data outpaced the…


Added by Christian Prokopp on October 31, 2014 at 2:30am — No Comments

Predictions - Effect of unique number of target classes on accuracy

When we perform classification, a type of machine learning, the target variable is a categorical (nominal) variable that has a set of unique values or classes. It could be a simple two-class target variable like "approve application?" with classes (values) of "yes" or "no". Sometimes the classes might indicate ranges like "Excellent", "Good" etc. for a target variable like satisfaction score. We might also convert continuous variables like test scores (1 - 100) into classes like grades (A, B, C…


Added by Kumaran Ponnambalam on October 30, 2014 at 7:00am — 2 Comments

Weekly Digest - November 3

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 



Added by Vincent Granville on October 29, 2014 at 4:00pm — No Comments

Don't Get Hadooped

It seems like more and more companies are very interested in either improving or setting up their analytical capabilities. All these companies are quite attracted to Hadoop, Spark or other similar solutions, not necessarily because they solve real problems they’re facing, but because they are shiny, trendy pieces of technology.

Hadoop, Spark and others are…


Added by Anna Anisin on October 28, 2014 at 2:30pm — No Comments

Announcement to Data Science Central subscribers

If you haven't checked out our newsletter recently, I invite you to do so. The next weekly digest will announce our upcoming Data Science 2.0. book, and a complimentary copy (eBook) will be offered to our subscribers later on.

To make sure that you benefit from these exclusive advantages, check whether you are receiving our messages:

The sender (the name we use in the "From" field) is usually Data Science Central, and all messages have our physical address…


Added by Vincent Granville on October 28, 2014 at 11:30am — No Comments

3 Trends in Embedded Analytics

Data visualization is everywhere. Whether you check your online bank account, monitor your workouts, discover the energy consumption of your house, check your pipeline in your CRM system or view remaining vacation days on your HR application, visualizations are part of the large majority of web applications.

When data visualizations…


Added by Michael Singer on October 28, 2014 at 8:34am — No Comments

Linked Data meets Data Science

As a long-term member of the Linked Data community, which has evolved from W3C's Semantic Web, I have found the latest developments around Data Science more and more attractive because of their complementary perspectives on similar challenges. Both disciplines work on questions like these:

  • How to extract meaningful information from large amounts of data?
  • How to connect pieces of information to other pieces in…

Added by Andreas Blumauer on October 28, 2014 at 12:27am — No Comments

Data Science 2.0.

This is an announcement regarding my upcoming book: Data Science 2.0. The subtitle is Automation, survival kit, career resources.

Just like our first book, it will first be available as a free PDF document to members of our community. It will…


Added by Vincent Granville on October 27, 2014 at 1:30pm — 16 Comments

© 2020 Data Science Central ®