All Blog Posts Tagged 'data' (754)

Not just analyze the log, mine it.

Many more devices held, many more messages sent, much more data up in the air.

With the arrival of the IoT (Internet of Things) era, the number of connected devices is growing rapidly, and their signals pile up into mines of valuable, hidden insights.

We used to analyze log files for risk management, spotting anomalies and exceptions based on outlined records. With ever-richer metadata and context, we can now weave more of a story from those strings. By union or…

Added by Yuanjen Chen on January 25, 2014 at 8:00am — No Comments

In-place Computing Model: for Big and Complex Data

Having seen how in-place and in-memory computing work differently, today we are sharing more fundamentals of the in-place computing model. This model was designed to handle "Big and Complex Data," which is not just about size but more about complexity. We see many analytic cases today incorporate…

Added by Yuanjen Chen on January 13, 2014 at 12:30am — 2 Comments

Big Data from Small Devices?

Predictions are in our DNA. Millions of us live with them daily, from checking the weather to reading daily horoscopes. When it comes to Big Data, the industry has shown no shortage of predictions for 2014. In fact, you might have read about insights on women in data science, ambitions for Machine Learning or a vision for the consumerization of Advanced…

Added by Bruno Aziza on January 6, 2014 at 2:30pm — 1 Comment

Searching for Intangibles - Data Embodiment Using Protocols

I often find myself following missing-persons cases. I am interested in the reasoning behind the use of resources. There can be major deployments of capital during investigations. I tend to wonder which scenarios trigger more spending than others. I also recognize the visceral side of missing-persons cases. There are publicly accessible databases of missing persons in Canada, the United States, and I am certain many other countries. I imagine that it can be difficult to regard such…

Added by Don Philip Faithful on January 1, 2014 at 7:12am — No Comments

Smart Use of Small Data Can Still Have a Big Impact

By: Nicholas Hartman, Director at CKM Advisors

This is a reprint of a post from ckmadvisors.com.

Whenever I introduce the data analytics we do here at CKM, an ever increasing percentage of people will respond along the lines of “So like Hadoop / NoSQL / [insert generic ‘big data’ term]?”…

Added by Nicholas Hartman on December 13, 2013 at 6:44am — No Comments

Java Coding Samples for Online Data-mining

In this post, I discuss the basic characteristics of code that I have personally used to extract online data, a process these days often called data-mining. I intend to cover some general features. Those who wish to do so can also compile the coding samples.

Over the years, I have programmed in a number of languages including Visual Basic, Perl, Python, and LISP (AutoLISP). The coding samples on this blog are written in Java, my language of…

Added by Don Philip Faithful on November 24, 2013 at 7:00am — 3 Comments

Big Data getting bigger in UAE

With immense growth in technology, financial institutions, skyscrapers and malls, the number of consumers in the Middle East has increased exponentially over the past few years. Reports suggest that expats make up 91% of the UAE population, enjoying high salaries and tax-free benefits.

Moreover, the options for spending your money have gotten bigger and better, with the UAE working towards its initiative to build Smart Cities, setting strong infrastructure fundamentals…

Added by IPSITA on November 22, 2013 at 12:02pm — No Comments

Introduction to the BigObject® and In-place Computing Model

The BigObject® - A Computing Engine Designed for Big Data

BigObject® presents an in-place* computing approach, designed to solve the complexity of big data and compute on a real-time basis. The mission of the BigObject® is to deliver affordable computing power, enabling enterprises of all scales to interpret big data. With the advances in what a commodity machine can perform, it…

Added by Yuanjen Chen on November 20, 2013 at 5:29pm — No Comments

The assumptions on which the RDBMS is based has changed: the ideal data structure

We have been using tables in relational databases, mostly for transactional purposes, and that has proven effective. Considering the data size and analytic purpose, however, the data structure might need to be redesigned for better efficiency.

To determine how to decompose the complexity of big data, we have observed the way organisms function. In the physical world, the universe is organized into a hierarchy of…

Added by Yuanjen Chen on November 3, 2013 at 10:29pm — No Comments

The assumptions on which the RDBMS is based has changed: data and code

In general, computer scientists treat code and data in two very different ways. Virtual memory was originally developed to run big programs (code) in small memory, while data are entities kept in external storage that must be retrieved into memory before computing. As a result, today’s application developers instinctively think in terms of a programming model based on storage and explicit data retrieval. This model, referred to as storage-based computing, plays an important role and has done a great job…

Added by Yuanjen Chen on October 31, 2013 at 7:24pm — No Comments

Critical Data and the Organizational Construct

The term "critical thinking" is often found in job postings. Some would argue that this essentially means, "Thinking outside the box." Karl Marx, who asserted that labourers represent a class of people, has been described as a critical thinker. Regardless of how a person feels about Marx, it goes without saying that the phenomenon of social classes is well established. Politicians for instance fight for the support of the "middle class." How precisely does such an observation by this…

Added by Don Philip Faithful on October 30, 2013 at 4:25pm — No Comments

Why the Business Gets Frustrated with IT: the Data Warehouse

Defining the Problem

I propose that business frustration with IT is generally not a communication problem.

I often see managers frustrated with IT, but seldom is the cause a breakdown of communication - as we like to tell ourselves. Good managers always demand clear, concise communication. When pressed, IT people, as well as other departmental folks, are able to deliver this easily enough.

The chief…

Added by Mitchell A. Sanders on October 29, 2013 at 1:30pm — No Comments

10 signs that you are a data scientist

http://ficolabsblog.fico.com/2013/10/top-10-ways-you-know-youre-a-data-scientist.html

I'd add an 11th one as well: you check data science sites before you check news sites in the morning!

10. You think … “So much data, so littl…”

9. You know what heteroscedasticity is.

8. Your best pick-up lines all include the word “moneyball.”

7. You look at your grocery…

Added by Dr. Z on October 29, 2013 at 7:30am — 1 Comment

What is the difference between in-memory and in-place computing approach?

In short, in-memory computing takes advantage of physical memory, which is expected to serve data much faster than disk. In-place computing, on the other hand, fully utilizes the address space of the 64-bit architecture. Both are gifts of modern computer science; both are essential to the BigObject.

In-place computing only became possible with the introduction of the 64-bit architecture, whose address space is big enough to hold the entire data set in most of the cases we deal with today.…

Added by Yuanjen Chen on October 29, 2013 at 1:00am — No Comments

A Tail of 3 Models - The Story of Goodness of Fit with Binary Classification

Before you select the best model based on your favorite goodness of fit statistic – Mean Squared Error, Gini, K-S, AUC, or misclassification rate – STOP! Model performance metrics are not a one-size-fits-all measure. As an analyst, selecting the right performance metric might mean the difference between having an exceptionally good result and having no result.

The classic example: There is only a 3% prevalence of the event of interest in my…

Added by Laura E. Wood Squier on October 24, 2013 at 8:00am — No Comments
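
The excerpt above stops before its example is finished, so the following is only an illustrative sketch of the general point and not the author's own analysis. It assumes a made-up data set of 1,000 cases with a 3% event rate and a hypothetical "model" that never predicts the event. The function names are invented for this illustration.

```python
# Hypothetical data: 1,000 cases with a 3% event rate (30 positives).
labels = [1] * 30 + [0] * 970

# A trivial "model" that never predicts the event of interest.
predictions = [0] * len(labels)


def misclassification_rate(y_true, y_pred):
    """Share of cases where the prediction differs from the label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def recall(y_true, y_pred):
    """Share of actual events that the model flagged as events."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(y_true)
    return true_pos / actual_pos if actual_pos else 0.0


error = misclassification_rate(labels, predictions)
hit_rate = recall(labels, predictions)

print(f"misclassification rate: {error:.2%}")
print(f"recall on the event:    {hit_rate:.2%}")
```

Under these assumptions the misclassification rate is only 3%, which looks excellent, yet the model finds none of the events. A metric that rewards the majority class can hide exactly this kind of failure when the event is rare.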

The BigObject - an Agile Analytic Engine for Big Data

Hi all,

This is my first post here. I'm glad to introduce a newly launched big data analytic engine, the BigObject. For the past 2 years we have been working on an optimal approach to handling big data for analytic purposes and challenging the existing models, some assumptions of which are no longer valid. For example, as data size grows so rapidly, is it still practical to stick to relational models while neglecting the time spent on data retrieval? What impact did…

Added by Yuanjen Chen on October 23, 2013 at 11:30pm — 2 Comments

Warm-up exercise before data science.

Practicing data science is a long-term effort rather than a matter of learning a handful of skills. We ought to be academically good enough to take up this challenge. However, if you think you have come a long way since your academic training but still have the zeal and passion to extract the oil from the data and fill the data science skill gap, here are some warm-up tips. The points below must be exercised before jumping into…

Added by Manish Bhoge on October 18, 2013 at 9:26am — No Comments

Data Mining and Cookie Butter - Yes, they're related.

Data Mining and Cookie Butter

There is much buzz about the next revolution in technology and the coming innovations that will shape the 21st Century. Those welcoming this change espouse that our lives will be improved…

Added by piALGO on October 17, 2013 at 8:32am — No Comments

© 2020 Data Science Central ®
