Guest blog by R. Bhargav
What does “Big Data” mean?
The term “big data” is largely self-explanatory: it refers to collections of data sets so large that conventional computing techniques cannot process them. The term covers not only the data itself, but also the frameworks, tools, and techniques used to handle it.
Technological advancement, new channels of communication (such as social networking), and newer, more powerful devices have presented a challenge to industry players, who have had to find new ways to handle the data.
From the beginning of time until 2003, the entire world had produced only five billion gigabytes of data. The same amount of data was generated over only two days in 2011. By 2013, this volume was generated every ten minutes. It is therefore not surprising that 90% of all the data in the world was generated in the past few years.
All this data is useful when processed, but much of it was badly neglected before the concept of big data came along.
The Major Sources of Big Data
Black Box Data: This is the data generated by aircraft, including jets and helicopters. Black box data includes flight crew voices, microphone recordings, and aircraft performance information.
Social Media Data: This is data generated on social media sites such as Twitter, Facebook, Instagram, Pinterest, and Google+.
Stock Exchange Data: This is data from stock exchanges about the share selling and buying decisions made by customers.
Power Grid Data: This is data from power grids. It holds information, such as usage, for particular nodes.
Transport Data: This includes possible capacity, vehicle model, availability, and distance covered by a vehicle.
Search Engine Data: This is one of the biggest sources of big data. Search engines draw their data from vast databases.
From these examples, it is clear that big data is not about volume alone. It also involves wide variety and high velocity. In 2001, industry analyst Doug Laney articulated the three Vs of big data as volume, velocity, and variety.
Velocity: Data is now streamed at unprecedented speed, which makes it difficult to deal with in a timely fashion. Smart metering, sensors, and RFID tags make it necessary to deal with data torrents in almost real time. Most organizations are finding it difficult to react to data quickly.
Volume: Not many years ago, having too much data was simply a storage issue. However, with increased storage capacities and reduced storage costs, industry players like Remote DBA Support are now focusing on how relevant data can create value.
Variety: There is a greater variety of data today than there was a few years ago. Data is broadly classified as structured data (relational data), semi-structured data (such as XML documents), and unstructured data (media, logs, and PDF, Word, and text files). Many companies have to grapple with governing, managing, and merging the different data varieties.
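To make the merging problem concrete, here is a minimal Python sketch that pulls the same kind of record out of a structured source (CSV), a semi-structured source (XML), and an unstructured one (free text). The sample data, field names, and helper functions are all made up for illustration, and the free-text pattern is deliberately naive; real unstructured data needs far more robust extraction.

```python
import csv
import io
import re
import xml.etree.ElementTree as ET

# Hypothetical samples of the three data varieties (invented for this sketch).
structured = "customer_id,amount\n42,19.99\n"
semi_structured = "<order><customer_id>42</customer_id><amount>5.00</amount></order>"
unstructured = "Customer 42 called to say the 12.50 refund never arrived."


def from_csv(text):
    """Structured: columns are known in advance, so parsing is direct."""
    return [
        {"customer_id": int(row["customer_id"]), "amount": float(row["amount"]), "source": "csv"}
        for row in csv.DictReader(io.StringIO(text))
    ]


def from_xml(text):
    """Semi-structured: tags describe the fields, but the layout can vary."""
    root = ET.fromstring(text)
    return [{
        "customer_id": int(root.findtext("customer_id")),
        "amount": float(root.findtext("amount")),
        "source": "xml",
    }]


def from_text(text):
    """Unstructured: fields must be guessed from patterns (naive on purpose)."""
    customer = re.search(r"Customer (\d+)", text)
    amount = re.search(r"\d+\.\d{2}", text)
    if not (customer and amount):
        return []
    return [{
        "customer_id": int(customer.group(1)),
        "amount": float(amount.group()),
        "source": "text",
    }]


# Merge all three varieties into one list of uniform records.
records = from_csv(structured) + from_xml(semi_structured) + from_text(unstructured)
for record in records:
    print(record)
```

The point of the sketch is that each variety needs its own parser before the records can be governed and analyzed together, and the less structure a source has, the more guesswork the parser involves.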
Veracity (the quality of the data), variability (the inconsistency which data sometimes displays), and complexity (when dealing with large volumes of data from different sources) are other important characteristics of data.
Nine of the Many Merits of Big Data
Today’s consumers are very demanding. They talk to past customers on social media and compare different options before buying. Customers want to be treated as individuals and to be thanked after buying a product. With big data, you get actionable data that you can use to engage with your customers one-on-one in real time.
One way big data allows you to do this is by letting you check a complaining customer’s profile in real time and get information on the products he or she is complaining about. You can then respond quickly and manage your reputation.
Big data allows you to redevelop the products and services you are selling. Information on what others think about your products, such as unstructured text from social networking sites, helps you in product development.
Big data allows you to test different variations of computer-aided design (CAD) images to determine how minor changes affect your process or product. This makes big data invaluable in the manufacturing process.
Predictive analysis will keep you ahead of your competitors. Big data can facilitate this by, for example, scanning and analyzing social media feeds and newspaper reports. Big data also helps you run health checks on your customers, suppliers, and other stakeholders to help you reduce risks such as default.
Big data is helpful in keeping data safe. Big data tools help you map the data landscape of your company, which helps in the analysis of internal threats. For example, you will know whether your sensitive information is protected. More specifically, you will be able to flag the emailing or storage of 16-digit numbers (which could, potentially, be credit card numbers).
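As a rough illustration of the 16-digit example, the Python sketch below scans a piece of text for 16-digit sequences and keeps only those that pass the Luhn checksum, which card numbers are designed to satisfy. The function names and the sample text are hypothetical, and the pattern is an assumption about how such numbers are typically written (optionally grouped in fours by spaces or hyphens); it is a sketch, not a complete data-loss-prevention rule.

```python
import re

# 16 digits, optionally grouped in fours by single spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def flag_card_like_numbers(text):
    """Return 16-digit sequences in the text that could be card numbers."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


# 4111 1111 1111 1111 is a widely used test number; the order ID is made up.
sample = "Card: 4111 1111 1111 1111, order 1234567890123456"
print(flag_card_like_numbers(sample))
```

The checksum step filters out many 16-digit numbers that merely look like card numbers, such as order or tracking IDs, which keeps the number of false alarms down.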
Big data allows you to diversify your revenue streams. Analyzing big data can give you trend-data that could help you come up with a completely new revenue stream.
Your website needs to be dynamic if it is to compete favourably in the crowded online space. Analysis of big data helps you personalize the look, content, and feel of your site to suit every visitor based on, for example, nationality and sex. An example of this is Amazon’s item-based collaborative filtering (IBCF), which drives its “Customers who bought this item also bought” and “Frequently bought together” features.
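The idea behind item-based collaborative filtering can be sketched in a few lines of Python. This is a toy version under stated assumptions, not Amazon’s implementation: the purchase data is invented, and similarity between two items is taken to be the cosine similarity of their buyer sets.

```python
from math import sqrt

# Hypothetical purchase history: user -> set of items bought.
purchases = {
    "ann": {"camera", "tripod", "sd_card"},
    "bob": {"camera", "sd_card"},
    "cat": {"camera", "tripod"},
    "dan": {"novel"},
}


def item_similarity(item_a, item_b, purchases):
    """Cosine similarity between two items over binary purchase vectors."""
    buyers_a = {user for user, items in purchases.items() if item_a in items}
    buyers_b = {user for user, items in purchases.items() if item_b in items}
    if not buyers_a or not buyers_b:
        return 0.0
    return len(buyers_a & buyers_b) / sqrt(len(buyers_a) * len(buyers_b))


def similar_items(target, purchases, top_n=2):
    """Return up to top_n items most often bought by the same people as target."""
    all_items = set().union(*purchases.values())
    scored = [
        (item_similarity(target, other, purchases), other)
        for other in all_items
        if other != target
    ]
    scored = [(score, other) for score, other in scored if score > 0]
    # Highest similarity first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:top_n]]


print(similar_items("camera", purchases))
```

Because the comparison is between items rather than between users, the similarity table can be computed ahead of time and looked up when a visitor opens a product page.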
If you are running a factory, big data is important because you no longer have to replace equipment based on the number of months or years it has been in use. Replacing on a fixed schedule is costly and impractical, since different parts wear at different rates. Big data allows you to spot failing devices and predict when you should replace them.
Big data is important in the healthcare industry, which is one of the last few industries still stuck with a generalized, conventional approach. For example, if you have cancer, you will go through one therapy and, if it does not work, your doctor will recommend another. Big data allows a cancer patient to get treatment that is tailored to his or her individual genes.
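A very simple way to spot a failing device from its sensor readings is to compare recent readings against an earlier baseline. The Python sketch below does that with a z-score; the function name, window sizes, threshold, and vibration figures are all assumptions chosen for illustration, and it only flags drift rather than predicting a replacement date.

```python
from statistics import mean, stdev


def is_degrading(readings, baseline_size=5, recent_size=3, threshold=3.0):
    """Flag a device whose recent readings drift well above its baseline.

    readings: sensor values in time order (e.g. vibration levels).
    """
    if len(readings) < baseline_size + recent_size:
        return False  # not enough history to judge
    baseline = readings[:baseline_size]
    recent = readings[-recent_size:]
    spread = stdev(baseline)
    if spread == 0:
        return mean(recent) > mean(baseline)
    # How many baseline standard deviations the recent average has risen.
    z_score = (mean(recent) - mean(baseline)) / spread
    return z_score > threshold


# Hypothetical vibration readings for two machines.
healthy = [1.0, 1.1, 0.9, 1.0, 1.0, 1.1, 1.0, 0.9]
failing = [1.0, 1.1, 0.9, 1.0, 1.0, 1.4, 1.6, 1.9]

print(is_degrading(healthy))
print(is_degrading(failing))
```

Run across every machine on the floor, a check like this lets parts be replaced when their own readings say so, rather than when the calendar does.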
About the author: R Bhargav
A seasoned engineering process & analysis enthusiast, Bhargav writes on Quality Management, Data Science, App Development, Programming, and other allied disciplines. An MS in MechEng, Bhargav has over six years of professional experience in various domains, ranging from game development to CFD D&A, and was previously associated with Paradox Interactive, The Creative Assembly, and Mott MacDonald LLC.
The original post can be seen here.