All Blog Posts (8,167)

How Do I Become a Data Scientist? / Data Science Aspects

I asked myself this question a few months ago. Next I thought: what is the definition of Data Science? So the first thing I did was read as many posts on the topic as I could get my hands on and look up definitions of related topics such as Data Mining and Machine Learning. Looking at the discussions and posts around Data Science it …


Added by Michael Laux on May 20, 2015 at 5:30am — 1 Comment

Machine Learning Resources for Spam Detection

Spam is a kind of messaging where the cost of sending is usually negligible and the receiver and the ISP pay the cost in terms of bandwidth usage.

An example of a manual approach to detecting spam is knowledge engineering. When you know what is spam and what is not, you can usually filter it by creating a set of rules such as:

  • If the subject line of an email contains words ‘Buy viagra’ its…
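
A rule of this kind can be sketched in a few lines of Python. This is only an illustration of the knowledge-engineering idea, not the post's own code: the rule name, the `RULES` list, and the helper functions are hypothetical, and the single pattern is based on the one rule visible in the excerpt.

```python
import re

# Hypothetical hand-written rules: each entry is (rule name, compiled pattern).
# Only the 'Buy viagra' subject rule comes from the excerpt; the rest is assumed.
RULES = [
    ("buy_viagra_subject", re.compile(r"\bbuy\s+viagra\b", re.IGNORECASE)),
]


def matched_rules(subject):
    """Return the names of all rules whose pattern occurs in the subject line."""
    return [name for name, pattern in RULES if pattern.search(subject)]


def is_spam(subject):
    """Flag a message as spam if at least one hand-written rule matches."""
    return bool(matched_rules(subject))
```

Calling `is_spam("Buy Viagra now")` flags the message, while an ordinary subject such as `"Meeting agenda"` passes. Every new spam pattern needs a new hand-written rule, which is the usual drawback of the manual approach.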


Added by Pansop on May 19, 2015 at 1:00am — 1 Comment

Predictive Analytics Demystified

This 30 minute video aims to demystify predictive analytics and present the IBM SPSS predictive analytics portfolio. The contents of the video are as follows:

  • Evolution of Analytics 5:45
  • Why is Predictive Analytics Important? 11:35
  • Demystifying Predictive Analytics 21:30
  • IBM…

Added by Venky Rao on May 18, 2015 at 11:30am — No Comments

Welcome to Sparkling Land

Note: Opinions expressed are solely my own and do not express the views or opinions of my employer.

As a data scientist who has been munging data and building machine learning models in tools like R, Python and other software (open source and proprietary), I had always longed for a world without technical limitations. A world which would allow me to create data structures (data scientists usually call them vectors, matrices or dataframes) of virtually any…


Added by Fawad Alam on May 18, 2015 at 8:30am — No Comments

Data science to understand and fight cancer

For higher resolution, interactive Tableau charts, read the original article. In this version, only static screenshots are displayed, which does not do justice to Tableau.

Coming up with a topic for today's blog post was tough. My last blog about Wine got attention from wine entrepreneurs…


Added by Tatiana Sorokina on May 18, 2015 at 6:30am — No Comments

An Introduction to Deep Learning and its role for IoT/ future cities

By Ajit Jaokar @ajitjaokar. Please connect with me on LinkedIn if you want to stay in touch and for future updates.

Cross posted from my blog - I look forward to discussion/feedback here…


Added by ajit jaokar on May 18, 2015 at 6:30am — 1 Comment

Web Crawling & Analytics Case Study - Database Vs Self Hosted Message Queuing Vs Cloud Message Queuing

The Business Problem:


To build a repository of used car prices and identify trends based on data available from used car dealers. The solution to the problem necessarily involved building large scale crawlers to crawl & parse thousands of used…


Added by Pansop on May 17, 2015 at 6:50pm — 1 Comment

Experimenting with AWS Machine Learning for Classification

In this post, I'll explore the new AWS Machine Learning services.

The problem we are trying to solve is to classify auto accident severity given a set of features. I won't go into further detail on the data set, the classification algorithms, etc. here, since the goal of this post is to explore the new AWS Machine Learning service step by step.

In the next blog post, I'll explore another service: Microsoft Azure Machine…


Added by Peter Chen on May 17, 2015 at 6:00pm — 3 Comments

The Handbook of Data Science

“If you treat an individual as he is, he will stay as he is, but if you treat him as if he were what he ought to be and could be, he will become what he ought to be and could be." —JOHANN WOLFGANG VON GOETHE

For the last few years I have been trying to get a handle on the field which encompasses analytics, big data, modeling, prediction, machine learning, algorithms, data mining techniques, rules, computational complexity, latency, data products, data engineering, statistical…


Added by Vasanth Gopal on May 17, 2015 at 3:00am — 2 Comments

Self-learning Machines & Deep Convolutional Neural Networks Classify Scenes & Identify Objects

Recent research using deep convolutional neural networks and new system architectures has demonstrated the ability of smart machines to autonomously learn to classify image scenes and identify…


Added by Michael Walker on May 16, 2015 at 2:38pm — 1 Comment

The Institutional Response

When I talk about "the institutional response," I am referring to an increasingly common occurrence: a standardized or large-scale approach is supported, promoted, and applied by a particular institution - sometimes governmental in nature - premised on its apparent suitability or superiority to achieve desirable outcomes. I suspect that in recent years, there has been a push to get citizens to file their income tax returns electronically. I know that in Canada, it has become difficult…


Added by Don Philip Faithful on May 16, 2015 at 8:48am — No Comments

There is no analytics without data management - an imperative for digital marketers

In my experience at startups and large companies, good analytics often boils down to the availability of organized data to answer business questions. This is especially important for digital marketers, with the audience data from many channels pouring in and the need to stay on top of key metrics.

Seemingly simple questions can spin up the entire MarTech engineering team!

“If I increase my spend on display ads retargeting by 20%, for middle of the funnel prospects, what can I…


Added by Sri Desikan on May 15, 2015 at 1:02pm — No Comments

Data Science and its problems

A very warm welcome back to all here in Data Science Central. I decided to post today because a friend on a shared social network sent me a link that I thought would interest the community of good and responsible Data Scientists, as it were.

It concerns a blog post from Quantopian, which is an interesting new crowd-sourced investing platform vendor, a new…


Added by Nuno Fernandes on May 14, 2015 at 8:00am — No Comments

Were the Election Polls Marred by Poor Quality Data?

The UK’s general election took place last week, on Thursday 7 May, 2015. It was an election that had been hyped for being ‘too close to call’. According to the polls, the government was likely to be a coalition of one or more, with no party achieving a majority. It could have gone either way.

Imagine the shock when the BBC announced the exit poll results: a…


Added by Martin Doyle on May 14, 2015 at 3:30am — 2 Comments

Internet of Things (IoT) Employers, Job Titles & Locations

For those coming in late, IoT is…


Added by Pansop on May 14, 2015 at 1:30am — No Comments

Weekly Digest - May 18

The full version is always published Monday. Starred articles are new additions or updated content, posted between Thursday and Sunday.


  • Think Big - Think Big, a Teradata company, provides data science and engineering services that enable organizations to accelerate…

Added by Vincent Granville on May 13, 2015 at 3:30pm — No Comments

How to Cross Link Data and Why You Should Do So

The 3Vs model is the foundation of big data - Volume, Velocity, and Variety. It is used to express the key features of big data problems - for me, this is about to change. Big Data is not just about size, speed, or formats; contextual enrichment is the most critical factor in how we unmask the best value from data. How well you bring seemingly unrelated data together and identify the valuable connections determines how much power you unleash from your…


Added by Yuanjen Chen on May 13, 2015 at 1:00pm — No Comments

Columbia data science course, week 1: what is data science?

Cathy O'Neil (mathbabe) describes her experience attending this program.

I’m attending Rachel Schutt’s Columbia University Data Science course on Wednesdays this semester and I’m planning to blog the class. Here’s what happened yesterday at the first meeting.…


Added by Mirko Krivanek on May 13, 2015 at 10:42am — No Comments

Tutorial: How to determine the quality and correctness of classification models? Introduction

What is classification?

Classification is the process of assigning every object from a collection to exactly one class from a known set of classes.

Examples of classification tasks are:

  • assigning a patient (the object) to a group of healthy or ill (the classes) people on the basis of his or her medical record,…
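
To make the definition concrete, here is a minimal Python sketch of a classifier that assigns each object to exactly one class, together with two common ways of checking its correctness. It is not taken from the tutorial: the temperature-threshold rule, the function names, and the choice of accuracy and confusion counts as quality measures are all assumptions made for illustration.

```python
from collections import Counter


def classify(temperature_c, threshold=38.0):
    """Hypothetical rule: assign a patient to exactly one of two classes."""
    return "ill" if temperature_c >= threshold else "healthy"


def confusion_counts(actual, predicted):
    """Count how often each (actual class, predicted class) pair occurs."""
    return Counter(zip(actual, predicted))


def accuracy(actual, predicted):
    """Fraction of objects whose predicted class equals the actual class."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)
```

Given lists of actual and predicted labels of equal length, `accuracy` summarizes overall correctness in one number, while `confusion_counts` shows which classes are being mistaken for which, something a single accuracy figure hides.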

Added by Algolytics on May 12, 2015 at 9:00am — 4 Comments

Data Science Wars: R versus Python

Nice infographic by DataCamp. Click here to view the original and commented version.


Added by Vincent Granville on May 12, 2015 at 8:30am — 12 Comments

© 2021 TechTarget, Inc.