Featured Blog Posts – February 2014 Archive (30)

Data Scientist, please meet the Data Artist

Jim Sterne | Anametrix Blog

I am delighted to bring you this guest post from Jim Sterne, an international consultant who focuses on measuring the value of the Web as a medium for creating and strengthening customer relationships. He has written eight books on using the Internet for marketing, is the founding president and current chairman of the Digital Analytics Association, produces the eMetrics Summit and sits on Anametrix’s Board of Advisors.…

Added by Ryan Montano on February 28, 2014 at 8:30am — 4 Comments

The reason Facebook paid US$19 bn for WhatsApp… decoded

On 19th…

Added by Aatash Shah on February 27, 2014 at 3:30am — No Comments

Is Your Customer Data For Your Customers’ Benefit?

We are all increasingly active in the digital space. More than 70% of people in the EU use the Internet*, and the share is growing, all contributing to more data generation. But public misperceptions about data and how it is used for marketing threaten to limit data’s potential, curtailing marketers’ ability to provide personalised services. Careful data usage greatly enhances our lives. Unfortunately, fear or irresponsible use (a thankfully rare occurrence), along with some sensationalist…

Added by Jed Mole on February 27, 2014 at 3:30am — No Comments

Read this tutorial before you use Proc Corr

All of us at some point in the process of examining…

Added by Aatash Shah on February 27, 2014 at 3:23am — No Comments

The Data Science Toolkit - The Future Web Toolkit

There's a lot of confusing jargon and buzzwords in this new field. It helps to know who some of the major players are and what services they offer. This list is a mild introduction and far from exhaustive.



Amazon Web Services: Infrastructure as a service (IaaS). EC2 virtual servers, S3 storage, Mechanical Turk, analytics, and more.

Yandex: Russian competitor to Google. Recently launched the Cocaine server, based on Docker.

Salesforce: Customer Relationship Management…

Added by Peter Higdon on February 25, 2014 at 7:51am — 1 Comment

Forecasting with the Baum-Welch Algorithm and Hidden Markov Models

Leonard Baum and Lloyd Welch designed a probabilistic modelling algorithm to detect patterns in Hidden Markov Processes. They built upon the theory of probabilistic functions of a …

Added by Michael Walker on February 24, 2014 at 10:02pm — 1 Comment

Big data: the door to co-operation and communication in telecommunications

Responsiveness and clarity are crucial to telecommunications, perhaps more than to any other industry.

Challenged over the last few years by the growing communications demands of a ‘smartphone generation’, communications service providers (CSPs) and the data they offer are increasingly valuable, owing to the sheer quantity and quality of the unstructured data they produce.

Think about it. From mobile network to Internet providers and more, CSPs are uniquely…

Added by Jed Mole on February 24, 2014 at 3:30am — No Comments

Risks Posed by Commodified Labour in Complex Fields

The commodification of labour coincides with technological advancements in production: it is perhaps most noticeable in relation to factories. Factory processes replaced the labour once done by skilled tradespeople. It might not be obvious how this trend has continued to this day and is now affecting professionals in complex fields, including those in the data sectors. I am talking about the "made to order" and "off the shelf" acquisition of labour commodities. What I describe as commodities…

Added by Don Philip Faithful on February 22, 2014 at 7:05am — No Comments

Big Data Vendor Revenue and Market Forecast 2013-2017

Originating Author: Jeff Kelly. Originally published on Wikibon.org.…

Added by Vincent Granville on February 19, 2014 at 9:00pm — 1 Comment

20 short tutorials all data scientists should read (and practice)

The new, completed version of this Data Science Cheat Sheet can be found here.

We are now at 20, up from 17. I hope I find the time to write a one-page survival guide for UNIX, Python and Perl.…

Added by Vincent Granville on February 15, 2014 at 7:00am — 12 Comments

Interview with Dr. Roy Marsten, the Man Shaping Big Data

By Vincent Granville

Dr. Roy Marsten, author of more than 30 papers on computational optimization in academic journals, was a professor at MIT, Northwestern, the University of Arizona, and the Georgia Institute of Technology before becoming a Big Data entrepreneur and founding several companies. Today, he has taken his…

Added by Vincent Granville on February 14, 2014 at 3:30pm — 3 Comments

Proposal for bulk email processing

Bulk email represents one of the largest portions of legitimate email (spam is not included in this category). Sending bulk email requires a lot of bandwidth and the technical expertise to obtain high delivery rates. Newsletters that you subscribe to are typically sent via newsletter management companies such as Vertical Response, MailChimp, Constant Contact or iContact. It is also expensive, with $10,000 per…

Added by Vincent Granville on February 14, 2014 at 12:30pm — No Comments

The top 1% data users consume 99% of all the data being produced. True or false?

True or false? What would be your numbers, in your opinion? And how do you define data user, even data? Is most of the data dormant and getting deleted even before being processed or summarized to feed some reports, actions or decisions?

Also, not all data is equal, comparing sensor data (very big) with…

Added by Vincent Granville on February 14, 2014 at 10:30am — No Comments

Interesting cartoons

Here are a few:…

Added by Vincent Granville on February 14, 2014 at 10:30am — No Comments

Weekly Digest - February 17

Sponsored Announcement

Predictive Analytics World, March 16-21, 2014 in San Francisco is the business event for predictive analytics professionals, managers and commercial practitioners, covering today's commercial deployment of predictive analytics, across industries and across software vendors. The…

Added by Vincent Granville on February 13, 2014 at 12:30pm — No Comments

A Method for Predicting Fishing Activity Based on Geospatial Motion Behaviors - Summarized from an Analyze Technical Report

Illegal fishing is a significant economic and environmental challenge for countries around the world. Up to 40% of fishing catch in certain parts of the world is unlawful or unregulated, resulting in approximately $10B to $20B in economic losses and significantly depleting…

Added by Analyze on February 13, 2014 at 5:08am — No Comments

Good paper on multidimensional outlier detection on time series

Abstract 

Market analysis is a representative data analysis process with many applications. In such an analysis, critical numerical measures, such as profit and sales, fluctuate over time and form time-series data. Moreover, the time series data correspond to market segments, which are described by a set of…

Added by Romeo Kienzler on February 12, 2014 at 10:30pm — 3 Comments

Big data is cheap and easy

Big data is not expensive. You can process 10 terabytes of data per year on collocated servers using open source tools (Python - I do it in Perl), using your own home-made Hadoop system if needed, to score 100 billion transactions, all for less than $1,000 per year. It requires a bit of…

Added by Vincent Granville on February 12, 2014 at 9:00am — 1 Comment
