Featured Blog Posts – November 2013 Archive (27)

Big Data: An Understanding

We understand things in terms of data, or, better said, in terms of Big Data: the things, matters, issues, inventions, surroundings, maps and much more that we use throughout everyday life all have some data type that serves as input, processing and output for us. Sometimes, as humans, we understand these almost instantly: where the data originates, what we are targeting, and more. At other times, something might take longer…

Added by Atif Farid Mohammad on November 29, 2013 at 12:50am — No Comments

Don't reinvent the wheel!

You could take a look at some traditional statistical analyses, like Cluster Analysis, and for a visual representation, try Multidimensional Scaling. These would save you a lot of time. There is no reason to reinvent the wheel, but Big Data seems intent upon ignoring the vast historical knowledge regarding statistics, psychometrics, and Measurement Theory. Believe me: if you have a question on how to proceed with an analysis, multivariate statistics is likely to have several acceptable answers.…

Added by Mark Biernbaum on November 25, 2013 at 9:20am — No Comments

Java Coding Samples for Online Data-mining

In this post, I discuss the basic characteristics of code that I have personally used to extract online data, a process these days often called data-mining. I intend to cover some general features. Those who wish to do so can also compile the coding samples.

Over the years, I have programmed in a number of computer programming languages including Visual Basic, Perl, Python, and LISP (AutoLISP).  The coding samples on this blog are written in Java, my language of…

Added by Don Philip Faithful on November 24, 2013 at 7:00am — 3 Comments

My Data Science Book - Table of Contents

This book is also part of our apprenticeship. Part of the content as well as new content is in a separate document called Addendum. Click here to download the…

Added by Vincent Granville on November 23, 2013 at 12:30pm — 58 Comments

Hazards of Institutional Data

A prominent discrimination case in Canada involves a firefighter named Tawney Meiorin.  Meiorin had successfully performed her duties as a firefighter for many years.  She lost her job after the introduction of mandatory testing to determine her fitness for the position.  The testing measured aerobic capacity, and it was developed in a manner that many would regard as scientific; that is to say, it used a highly quantitative and analytic approach.  However,…

Added by Don Philip Faithful on November 23, 2013 at 4:43am — 1 Comment

SOA, Cloud Computing and Big Data Security

Let us start with some jargon and distill it into something meaningful to us, the only intelligent autonomous systems, in other words, humans.

SaaS – Applications focused on end users and delivered over the internet, such as e-mail and Salesforce.

PaaS – Sets of tools focused on developers, such as Ruby on Rails, Python, Eclipse, REST, SOAP, Oracle.

IaaS – Complete software and hardware solutions, such as VMware, Amazon EC2, Rackspace Cloud, Google Compute…

Added by Atif Farid Mohammad on November 22, 2013 at 9:22am — No Comments

Taxonomy of Data Scientists

This is a first attempt at classifying data scientists. I invite you to produce a more comprehensive, better solution.…

Added by Vincent Granville on November 20, 2013 at 8:00pm — 9 Comments

The Data Science Equation

I present here the results of a data science study about data science. Based on LinkedIn data (top people listed when you do a people search for data science, from a LinkedIn account with 8,000+ data science connections), we identified the fields most frequently associated with data science, as well as top data scientists on LinkedIn.…

Added by Vincent Granville on November 19, 2013 at 8:00pm — 9 Comments

Three Announcements

Data Visualization Contest

On December 4-5, DataBeat will host the 3rd annual Data Science Summit in Redwood, CA. The event brings together academics, organizations, media companies, and brands to explore the…

Added by Vincent Granville on November 19, 2013 at 10:00am — 2 Comments

ETL, ELT and Data Hub: Where is Hadoop the right fit?

A few days ago I attended a good webinar conducted by Metascale on the topic “Are You Still Moving Data? Is ETL Still Relevant in the Era of Hadoop?” This post discusses that webinar.

In summary, the webinar nicely explained how an enterprise can use Hadoop as a data hub alongside its existing data warehouse setup. The line “Hadoop as a Data Hub” itself raised a lot of questions in my…

Added by Manish Bhoge on November 17, 2013 at 8:16pm — 5 Comments

Models vs. Experiments

Added by Michael Walker on November 17, 2013 at 8:30am — No Comments

How to compare and rank data science programs?

I started an attempt at program comparison in my article Zipfian Academy versus Data Science Apprenticeship, comparing two apples, rather than comparing apples (stuff like our program, known as…

Added by Vincent Granville on November 16, 2013 at 10:00am — 10 Comments

Weekly digest - November 18

Sponsored Announcements

Added by Vincent Granville on November 14, 2013 at 4:30pm — No Comments

Sports Analytics: Is It Only For Developed Countries?

It is a known fact that the development of sports activities is not a top priority in the national budgets of most developing countries. It can also be established that sports activities are not an active part of the…

Added by Ashish Soni on November 13, 2013 at 10:17pm — No Comments

Hidden decision trees revisited

This is a revised version of an earlier article posted on AnalyticBridge. The most recent article on this topic can be found here. …

Added by Vincent Granville on November 13, 2013 at 8:30pm — 4 Comments

Fast Combinatorial Feature Selection with New Definition of Predictive Power

In this article, I propose a simple metric to measure predictive power. It is used for combinatorial feature selection, where a large number of feature combinations need to be ranked automatically and very fast, for instance in the context of transaction scoring, in order to optimize predictive models. This is about rather big data, and we would like to see a Hadoop methodology for…

Added by Vincent Granville on November 13, 2013 at 10:30am — 14 Comments

A visual solution to a problem of constraints.

Here is an interesting problem to play with in your down time.  I will post the solution soon, when I get a moment to update this blog.

There are five houses in a row, each of a different color, that are inhabited by five people of different nationalities, with different pets, favorite drinks, and favorite sports. Use the clues below to determine who owns the monkey…

Added by Pradyumna S. Upadrashta on November 11, 2013 at 7:30am — 2 Comments

Visualization - Trading Without Numbers

In this blog, I share some images from an application called Storm.  I wrote the program many years ago.  Storm has the ability to generate 3-dimensional plumes from a stream of data.  It also has an unusual feature that allows the user to trade based on the kinetics - effectively eliminating the need to know about pricing.  At this time, I would like to draw a clear distinction between trading and investing.  I should also point out that I used Storm for recreational…

Added by Don Philip Faithful on November 10, 2013 at 5:58pm — No Comments

© 2019 Data Science Central ®