
The ongoing pursuit of data solutions occupies the mindshare of consumers, vendors and service providers alike as they invest considerable time, money and effort. Past attempts to conquer data have produced solutions that combined databases, applications and tools, with limited success. We are still struggling with a few unresolved, persistent legacy challenges, such as:

Data Everywhere

Today, every enterprise has huge data repositories of varying sizes, subjects and authenticity simply as a result of conducting business. This data tsunami is aided by ever-improving storage technologies that keep boosting performance and capacity.

Access Hurdles

As we build up data around us with the intent to use it, the challenge is knowing how to access it. Many solutions try to address the retrieval challenge through the upstream design of the “data store”. This has taken a lot of work and funding while addressing the problem partially at best.

Redundancy & Disparity

We have many instances of disparate source systems recording the same subject data under dissimilar-looking business definitions. Competing user requirements for how each group wants to view and use data in its own world have complicated the issue further. Addressing disparity with healthy, targeted redundancy has not worked as desired.

Decay

Corporations have stored huge volumes of data with the intent to gain a competitive edge. They have built a multitude of data sets supporting different forms of the same data over long periods of time. As a result, they face the challenge of data formats changing as the data evolves through its lifecycle. How can we fulfill the demands of seamless data access? What cannot be accessed will decay due to lack of use.

This representative list covers only a few of the data-related challenges. As we try to unlock the “Big” data opportunity, let us understand whether and how it will help.

Can “Big” Data help?

At a high level, the above challenges result from two basic activities: storing and retrieving data. This simple outlook leads to two distinct lines of attack on the data challenge. Past attempts tried to resolve storage and retrieval as a unified problem, resulting in half-baked vehicles that resolved each facet partially at best.

In the “big” data paradigm we are offered a variety of choices for managing storage and retrieval as two separate and independent needs. Handling data storage independently of how we plan to retrieve it offers freedom in handling the challenges of data variety, capture speed, frequency and volume. We store all the data we get “as is”, with no regard for how it will be accessed and used.

Retrieval is handled as a separate, independent challenge, without regard to how the data is stored. In this space the focus is on devising use-centric access mechanisms for data stored in its “as is” state. This makes the design of data stores independent of access needs, and vice versa. This delineation offers speed and flexibility in solution architecture when resolving consumer demands.
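To make the separation concrete, here is a minimal sketch in Python of storing data “as is” and interpreting it only at read time. Everything in it is an assumption for illustration: `RawStore`, `customer_view`, `read` and the field names `customer`, `cust_name` and `amount` are hypothetical and do not come from any particular product.

```python
import json
from typing import Any, Callable, Dict, Iterator, List, Optional


class RawStore:
    """Hypothetical in-memory stand-in for an "as is" data store."""

    def __init__(self) -> None:
        self._records: List[str] = []

    def put(self, raw: str) -> None:
        # Storage side: keep the payload untouched, no schema enforced on write.
        self._records.append(raw)

    def scan(self) -> Iterator[str]:
        return iter(self._records)


def customer_view(raw: str) -> Optional[Dict[str, Any]]:
    """Hypothetical use-centric reader: gives raw text a shape at read time."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not JSON, so this particular view ignores it
    if not isinstance(doc, dict):
        return None
    # Two source systems name the same subject differently.
    name = doc.get("customer") or doc.get("cust_name")
    if name is None:
        return None
    return {"customer": name, "amount": float(doc.get("amount", 0))}


def read(
    store: RawStore,
    view: Callable[[str], Optional[Dict[str, Any]]],
) -> Iterator[Dict[str, Any]]:
    # Retrieval side: apply whichever view the consumer needs.
    for raw in store.scan():
        row = view(raw)
        if row is not None:
            yield row


store = RawStore()
store.put('{"customer": "Acme", "amount": 120.5}')
store.put('{"cust_name": "Globex", "amount": "75"}')
store.put("free-text log line, kept as is")

rows = list(read(store, customer_view))
```

The store never learns about `customer_view`, and a second consumer could add a different view over the same raw records without changing how anything was written.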

In subsequent posts we will explore more ... stay tuned!


Tags: Apache, Big, Data, Hadoop
