
All Blog Posts (7,302)

13 New Trends in Big Data and Data Science

Based on requests from clients (vendors of data processing platforms and products), as well as trends in popular blogs, job postings, and my own reading, here are a few topics recently gaining strong traction (items beyond #13 were recently added):

  1. The rise of data plumbing, to make big data run smoothly, safely, reliably, and fast through all "data…

Added by Vincent Granville on November 11, 2014 at 10:30am — 4 Comments

Privacy, Personalization, and the IOT - Retail

Summary: Thanks to the IOT (Internet of Things), an internet-like experience of recommendations and awareness of your preferences is coming to the brick-and-mortar store near you.

You’ve probably noticed the huge difference in the tone of the conversation between data scientists and the general public over the issue of privacy and…


Added by William Vorhies on November 10, 2014 at 10:48am — No Comments

Why Data Scientists create poor Data Products? (And 5 things that can be done to change this)

As the consumer and industrial worlds get massively digitized, data products are being baked into critical processes at a very high rate. These data products distill signals from a massive torrent of human-generated and machine-generated data to drive a front-line action. At this point we want to distinguish between two types of data products that we have seen in the marketplace:

  1. Consumer Data…

Added by derick.jose on November 9, 2014 at 1:30pm — 2 Comments

My thoughts on data science and big data

In this article, I share some of my unusual views about big data and data science. My next article will be about new trends in data science.

Big data is necessary in more applications than people think

And it is used with success. Applications include:

  • Scoring transactions (credit card purchases) in real time
  • Automated piloting
  • Monitoring car traffic (sending alerts, finding optimum routes). Same with planes.
  • Digital…

Added by Vincent Granville on November 9, 2014 at 9:30am — 1 Comment

22 tips for better data science

These tips are provided by Dr Granville, who brings 20 years of varied data-intensive experience working with successful start-ups, small companies across various industries, and eBay, Visa, Microsoft, GE and Wells Fargo.

  1. Leverage external data sources:…

Added by Vincent Granville on November 8, 2014 at 8:00pm — 5 Comments

The growth of data science over the last two years: 300%

A few websites catering to analytics and data science professionals have experienced tremendous growth recently. Organizations such as INFORMS or AMSTAT have seen their traffic explode, targeting high school students to join the ranks of data scientists. Niche publishers providing high quality, actionable content - and run by true data scientists rather than journalists - have also seen spectacular growth.…


Added by Vincent Granville on November 8, 2014 at 3:00pm — 1 Comment

True Freedom - Algorithmic Man Unplugged

Thermometers and scales to measure weight appeared in retail outlets long ago. Blood pressure monitors perhaps came later. Pedometers and heart-rate monitors seem more recent - possibly closer to my time. While writing this blog, I saw several devices intended to electronically record, among other things, hours of sleep; these are designed to be worn on the body all the time. A couple of weeks ago, I bought something that gives the heart rate and blood oxygen saturation level. I consider it a real…


Added by Don Philip Faithful on November 8, 2014 at 9:18am — No Comments

Weekly Digest - November 10

The full version is always published Monday. Starred articles or sections are new additions or updated content, posted between Thursday and Sunday. 

Sponsored Announcement

Just as information is the lifeblood of healthcare, data science is at the heart of the 100 percent online Master of Science degree in Health Informatics at the College…


Added by Vincent Granville on November 5, 2014 at 4:00pm — No Comments

Choosing a classifier for predictions

One of the biggest decisions that a data scientist needs to make during a predictive modeling exercise is choosing the right classifier. There is no best classifier for all problems. The accuracy of a classifier varies with the data set. Correlation between the predictor variables and the outcome is a key influencer. The choice needs to be made based on experimentation. There are two main selection criteria here.

Accuracy:  While accuracy of the…


Added by Kumaran Ponnambalam on November 4, 2014 at 6:08pm — No Comments
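
The "choose by experimentation" idea in the excerpt above can be sketched as a small cross-validation comparison. This is only an illustration under stated assumptions, not the post author's method: the synthetic data set, the two toy classifiers (a majority-class baseline and 1-nearest-neighbor), and all function names are hypothetical and were picked so that the example runs with the Python standard library alone.

```python
import random
from collections import Counter

def make_data(n=200, seed=0):
    """Synthetic two-class data (assumption): one numeric feature per row."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        x = rng.gauss(3.0 * label, 1.0)  # class 0 centered at 0, class 1 at 3
        rows.append(([x], label))
    rng.shuffle(rows)
    return rows

def majority_classifier(train):
    """Baseline: always predict the most common training label."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common

def nearest_neighbor_classifier(train):
    """1-nearest-neighbor using squared Euclidean distance."""
    def predict(features):
        def dist(row):
            return sum((a - b) ** 2 for a, b in zip(row[0], features))
        return min(train, key=dist)[1]
    return predict

def cross_val_accuracy(build, rows, k=5):
    """Mean accuracy over k folds; fold i holds out rows i, i+k, i+2k, ..."""
    scores = []
    for i in range(k):
        test = rows[i::k]
        train = [r for j, r in enumerate(rows) if j % k != i]
        predict = build(train)
        correct = sum(predict(x) == y for x, y in test)
        scores.append(correct / len(test))
    return sum(scores) / k

data = make_data()
results = {
    "majority": cross_val_accuracy(majority_classifier, data),
    "1-NN": cross_val_accuracy(nearest_neighbor_classifier, data),
}
best = max(results, key=results.get)
for name, acc in results.items():
    print(f"{name}: {acc:.3f}")
print("best:", best)
```

On a real problem the candidates would be the classifiers actually under consideration, and the ranking can change from one data set to the next, which is the point the excerpt makes.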

Start with Good Science on Good Data, Then we'll Talk 'Big Data'

We are currently witnessing a land rush of investment in Big Data architectures promising companies that they can turn their data into gold using the latest in distributed computing and advanced analytical methods. Although there is indeed much potential in applying machine learning and statistical analysis to large datasets, many companies…


Added by Sean McClure on November 3, 2014 at 6:46am — 3 Comments

Big Data Problem: Could Fake Reviews Kill Amazon?

Adversarial analytics and business hacking: Amazon case study.

Chances are that you have purchased a book, or visited a restaurant, as a result of reading fake reviews. The problem impacts companies such as Amazon and Yelp, while on Facebook, massive disinformation campaigns are funded by political money, hitting thousands of profiles and managed by public relations companies: they create fake profiles and try to become friends with influencers. Here the focus is…


Added by Vincent Granville on November 2, 2014 at 7:30pm — 7 Comments

Data Science for Improved Democracy

The current (November 2014) United States election reminds us that sophisticated data science techniques are employed on the public in an attempt to influence opinion and sway votes. The slick television advertising, debate prevarications, and policy position distortions and exaggerations have soured many citizens on the current state of modern democracy. Indeed, most feel we are not getting the straight scoop - the…


Added by Michael Walker on November 2, 2014 at 10:30am — 2 Comments

Data Science: Paving the Way For More Open, Interactive Data Analysis

Guest blog post by Gemma.
Data science is still, in many ways, the new kid on the block. In fact, there is often a fair amount of confusion as to what…

Added by Vincent Granville on November 1, 2014 at 8:00am — No Comments

Why Paddy Cosgrave hired Data Scientists to grow websummit from 400 to 20k attendees in just 4 years

Arguably no technology conference in history has grown faster. Somehow we’ve achieved that growth with no background in the conference industry and no resources to speak of, and all from a pretty peripheral location called Dublin. - Paddy Cosgrave

In a …


Added by Philippe Van Impe on October 31, 2014 at 11:55pm — No Comments

Do we really have a data obesity problem?

The constant search for something bigger might be part of the American culture. However, big data is often critical: without real time credit card fraud detection - a big data application - no store would accept credit cards.

There have been a few people questioning the value of big data recently, and predicting that big data is going to get smaller in the future. While most of these would-be oracles are traditional statisticians working on small data and worried about their…


Added by Vincent Granville on October 31, 2014 at 9:00pm — No Comments

How Many "V's" in Big Data? The Characteristics that Define Big Data

Summary: We’ve scoured the literature to bring you a complete listing of possible definitions of Big Data, with the goal of being able to determine what’s a Big Data opportunity and what’s not. Our conclusion is that Volume, Variety, and Velocity still make the best definitions, but none of these stands on its own in distinguishing Big Data from not-so-big data. Understanding these characteristics will help you analyze whether an opportunity calls for a Big Data solution but…


Added by William Vorhies on October 31, 2014 at 2:00pm — No Comments

Treasure in the defects

Last weekend, I was waiting in New York’s Penn Station, when the public announcer gave the familiar “See Something Say Something” message. It took a minute to sink in, but I had to laugh. Midtown Manhattan IS suspicious and unusual activity.

Speaking of outliers

In practice, data is dirty and big data is filthy.  Analysts munge, wrangle and clean their…


Added by Michael Bryan on October 31, 2014 at 11:33am — No Comments

Data Science Apprenticeship: Announcing our First Graduate

Nikitinsky Nikita is the first to complete our DSA, using NLP, web crawling, statistical techniques and Python to cluster our content in top categories.

To be fair, our intern …


Added by Vincent Granville on October 31, 2014 at 10:30am — No Comments

4 types of Big Data as a Service

The popularity of Big Data lies in its broad definition: employing high-volume, high-velocity, and high-variety data sets that are difficult to manage and extract value from. Unsurprisingly, most businesses can identify themselves as facing, now or in the future, Big Data challenges and opportunities. This is therefore not a new issue, yet it has a new quality as it has been exacerbated in recent years. Cheaper storage and ubiquitous data collection and availability of third party data outpaced the…


Added by Christian Prokopp on October 31, 2014 at 2:30am — No Comments

Predictions - Effect of unique number of target classes on accuracy

When we perform classification in machine learning, the target variable is a categorical (nominal) variable that has a set of unique values, or classes. It could be a simple two-class target variable like "approve application?" with classes (values) of "yes" or "no". Sometimes the classes indicate ranges like "Excellent", "Good", etc. for a target variable like satisfaction score. We might also convert continuous variables like test scores (1 - 100) into classes like grades (A, B, C…


Added by Kumaran Ponnambalam on October 30, 2014 at 7:00am — 2 Comments
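
Converting a continuous variable into classes, as the excerpt above describes for test scores and grades, might look like the following minimal sketch. The cut-off values, the full grade list, and the pass/fail variant are assumptions made purely for illustration; the excerpt does not give them.

```python
from bisect import bisect_right

# Hypothetical cut-offs and labels, chosen only for illustration.
CUTOFFS = [60, 70, 80, 90]
GRADES = ["F", "D", "C", "B", "A"]

def score_to_class(score, cutoffs=CUTOFFS, labels=GRADES):
    """Map a numeric score to a class label.

    A score equal to a cut-off falls into the higher class.
    """
    return labels[bisect_right(cutoffs, score)]

scores = [35, 60, 69.5, 88, 90, 100]

# Five possible classes (letter grades).
grades = [score_to_class(s) for s in scores]

# Same scores, coarser target: two possible classes (pass/fail).
pass_fail = [score_to_class(s, cutoffs=[60], labels=["fail", "pass"]) for s in scores]

print(grades)
print(pass_fail)
print(len(set(grades)), "unique grade classes observed;",
      len(set(pass_fail)), "unique pass/fail classes observed")
```

The number of unique target classes is therefore a modeling choice: the same underlying scores can yield a two-class or a five-class problem depending on how the bins are drawn.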

© 2020   Data Science Central ®   Powered by
