October 2013 Blog Posts (47)

The assumptions on which the RDBMS is based have changed: data and code

In general, computer scientists treat code and data in two very different ways. Virtual memory was originally developed to run big programs (code) in small memory, while data are entities kept in external storage that must be retrieved into memory before computing. As a result, today’s application developers instinctively think in terms of a programming model based on storage and explicit data retrieval. This model, referred to as storage-based computing, plays an important role and has done a great job…

Added by Yuanjen Chen on October 31, 2013 at 7:24pm — No Comments

Weekly digest - November 4

Click here to check if you are missing some of our messages.

Sponsored Announcements…

Added by Vincent Granville on October 31, 2013 at 9:30am — No Comments

Critical Data and the Organizational Construct

The term "critical thinking" is often found in job postings. Some would argue that this essentially means, "Thinking outside the box." Karl Marx, who asserted that labourers represent a class of people, has been described as a critical thinker. Regardless of how a person feels about Marx, it goes without saying that the phenomenon of social classes is well established. Politicians for instance fight for the support of the "middle class." How precisely does such an observation by this…

Added by Don Philip Faithful on October 30, 2013 at 4:25pm — No Comments

Why the Business Gets Frustrated with IT: the Data Warehouse

Defining the Problem

I propose that business frustration with IT is generally not a communication problem.

I often see managers frustrated with IT, but seldom is the cause a breakdown of communications, as we like to tell ourselves. Good managers always demand clear, concise communication. When pressed, IT people, as well as other departmental folks, are able to deliver this easily enough.

The chief…

Added by Mitchell A. Sanders on October 29, 2013 at 1:30pm — No Comments

Sensors Here, There and Everywhere

Smart organizations are using the power of data science and data produced by embedded sensors and machine…

Added by Michael Walker on October 29, 2013 at 11:30am — 3 Comments

10 signs that you are a data scientist

http://ficolabsblog.fico.com/2013/10/top-10-ways-you-know-youre-a-data-scientist.html

I'd add an 11th one as well: you check data science sites before you check news sites in the morning!

10. You think … “So much data, so littl…”

9. You know what heteroscedasticity is.

8. Your best pick-up lines all include the word “moneyball.”

7. You look at your grocery…

Added by Dr. Z on October 29, 2013 at 7:30am — 1 Comment

What is the difference between the in-memory and in-place computing approaches?

In short, in-memory computing takes advantage of physical memory, which is expected to process data much faster than disk. In-place computing, on the other hand, fully utilizes the address space of the 64-bit architecture. Both are gifts of modern computer science, and both are essential to the BigObject.

In-place computing became possible only with the introduction of the 64-bit architecture, whose address space is big enough to hold the entire data set in most of the cases we deal with today.…
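
The contrast between explicit retrieval and working directly in the address space can be sketched with a memory-mapped file. This is only an illustration under stated assumptions, not the BigObject's implementation: the record layout and the function names `write_records`, `sum_storage_based`, and `sum_in_place` are hypothetical, and Python's standard `mmap` module stands in for whatever mechanism the engine actually uses.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical record layout: one native 8-byte signed integer per record.
RECORD = struct.Struct("q")


def write_records(path, values):
    """Write the values to a binary file, one fixed-size record each."""
    with open(path, "wb") as f:
        for v in values:
            f.write(RECORD.pack(v))


def sum_storage_based(path):
    """Storage-based style: explicitly read each record into a buffer first."""
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(RECORD.size):
            total += RECORD.unpack(chunk)[0]
    return total


def sum_in_place(path):
    """In-place style: map the file into the address space and index into it."""
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        offsets = range(0, len(m), RECORD.size)
        return sum(RECORD.unpack_from(m, off)[0] for off in offsets)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "records.bin")
        write_records(path, range(1000))
        print(sum_storage_based(path), sum_in_place(path))
```

In the second function the operating system pages data in on demand, so the program addresses the data set as if it were already in memory. On a 64-bit system the mapping can be far larger than physical RAM.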

Added by Yuanjen Chen on October 29, 2013 at 1:00am — 1 Comment

Big Data and Education Data Mining

Big Data is big by nature; education data, however, is not yet that big by comparison. The volume of quantitative analysis, audio-video recordings from education-related research, and general research work conducted by academic institutions is steadily increasing. The shift has already started within our education system, as students take exams and courses and conduct their research by computerized means. This means that the searching behaviors of students or whoever associated with…

Added by Atif Farid Mohammad on October 27, 2013 at 7:39pm — No Comments

Search for Embodiment in Data Mining

Above - During a Session at the Archives of Ontario in 2011

In this blog post, I describe the early experiences that led me to conclude that data as we know it tends to be "disembodied" - that is to say, often lacking any kind of connection to different types of bodies. When we talk about things…

Added by Don Philip Faithful on October 27, 2013 at 1:30pm — No Comments

STEM Education and Big Data

STEM is an acronym for the fields of science, technology, engineering and math, and it is being pushed within the United States. However, one big factor has not yet been taken into consideration by most people in the world of education: the best use of Big Data. This factor is lacking because we ourselves are still exploring Big Data and its utilization at a novice level. Education by itself is a vast field in which to conduct research, by having an exploration within the research…

Added by Atif Farid Mohammad on October 26, 2013 at 8:27am — 13 Comments

Creating Complex Encoded Objects from Qualitative Data

I have always found the task of converting qualitative data into something quantifiable a bit challenging.  A common route might be as follows:  assemble all of the resources containing qualitative information (e.g. questionnaires containing open-ended questions); seek out apparent themes in the responses; and quantify how frequently these themes are mentioned or raised.  This methodology leaves open the question of when something is or isn't a theme, and whether something must be a theme in…

Added by Don Philip Faithful on October 25, 2013 at 7:49pm — No Comments

Reductive Versus Expansive Data

I have so far encountered two general types of data . . .

Reductive Data

This is data that conforms to prescribed criteria. I sometimes describe it as the metrics of criteria or measurements of conformity. For instance, an organization might want to measure something potentially obscure like "efficiency." It therefore becomes necessary to establish under what conditions or criteria something is efficient. I describe…

Added by Don Philip Faithful on October 24, 2013 at 1:05pm — No Comments

R Tutorial for Beginners: A Quick Start-Up Kit

Learn R: A Statistical Programming Language

Here's my quick start-up kit for you.
  1. Install R
    1. Linux: "sudo apt-get install r-base" should do…

Added by Mitchell A. Sanders on October 24, 2013 at 9:00am — 5 Comments

A Tail of 3 Models - The Story of Goodness of Fit with Binary Classification

Before you select the best model based on your favorite goodness of fit statistic – Mean Squared Error, Gini, K-S, AUC, or misclassification rate – STOP! Model performance metrics are not a one-size-fits-all measure. As an analyst, selecting the right performance metric might mean the difference between having an exceptionally good result and having no result.

The classic example:  There is only a 3% prevalence of the event of interest in my…
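
The excerpt is cut off here, so the following is only an assumed illustration of why the choice of metric matters at 3% prevalence, not the author's own example. The data set, the always-negative "model", and the helper names `misclassification_rate` and `recall` are all hypothetical.

```python
def misclassification_rate(y_true, y_pred):
    """Share of cases where the prediction differs from the true label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def recall(y_true, y_pred):
    """Share of true events (label 1) that the model flags as events."""
    events = sum(1 for t in y_true if t == 1)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / events


# Hypothetical data set: 1,000 cases, 30 of them events (3% prevalence).
y_true = [1] * 30 + [0] * 970

# A "model" that never predicts the event.
y_never = [0] * 1000

print(misclassification_rate(y_true, y_never))  # 0.03
print(recall(y_true, y_never))                  # 0.0
```

Judged by misclassification rate alone, this model looks excellent, yet it finds none of the events. A metric that looks at the event class directly exposes the problem.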

Added by Laura E. Wood Squier on October 24, 2013 at 8:00am — No Comments

The BigObject - an Agile Analytic Engine for Big Data

Hi all,

This is my first post here. I'm glad to introduce this newly launched big data analytic engine, the BigObject. Over the past two years we have been working on an optimal approach to handling big data for analytic purposes and challenging the existing models, some assumptions of which are no longer valid. For example, as data size grows so rapidly, is it still practical to stick to relational models and neglect the time spent on data retrieval? What impact did…

Added by Yuanjen Chen on October 23, 2013 at 11:30pm — 2 Comments

Big Data Skills In Demand (Comprehensive List)

Here is a refresher from the last post:

“I went on a job board and searched for the number of job postings that listed Big Data tools as part of the requirements.”

I created a comprehensive list for Big Data skills in response to the…

Added by Fari Payandeh on October 23, 2013 at 7:20pm — 2 Comments

IBM Distinguished Engineer solves Big Data Conjecture

A mathematical problem related to big data was solved by Jean-Francois Puget, engineer in the Solutions Analytics and Optimization group at IBM France. The problem was first mentioned on Data Science Central, and an award was offered to the first data scientist to solve it.

Bryan Gorman, Principal Physicist, Chief Scientist at…

Added by Vincent Granville on October 23, 2013 at 3:00pm — 4 Comments

The Professionalization of Data Science

There has been much discussion and debate about the definition of data science and the new rare breed of sexy bird…

Added by Michael Walker on October 22, 2013 at 10:13pm — 2 Comments
