In-Memory Technology’s Petabyte-Sized Headache

In-memory technology for business intelligence and analytics first appeared in the early 2000s and has been gaining prominence ever since. Today, in-memory tools have matured into widely accepted enterprise-scale solutions for a diverse range of organizations. Enterprises have become enamoured with the software’s ability to deliver results much faster than earlier tools could.

However, as data continues to grow at astonishing rates, in-memory tools alone will no longer suffice: they will fail to deliver the performance they previously could. Accordingly, BI software developers will have to start thinking about new ways to provide the fast querying capabilities consumers have come to expect.


The Promise of In-Memory Databases

In-memory technology relies on a simple and accurate premise: processing data in RAM is much faster than doing so in disk storage.

This fact, combined with the drop in RAM hardware prices that occurred in the early 2000s, made a new type of analytics method viable and lucrative. Simply put, the software would copy data into RAM (with or without compressing it) and perform calculations in-memory, producing much faster results than could previously be achieved with technologies such as OLAP cubes and distributed data platforms (e.g. Hadoop).



This type of platform does indeed make good on its promise and allows for rapid data discovery and exploration. Queries that previously would have required days of processing could suddenly be answered in minutes. End-users and consumers could do much more with their business intelligence dashboards: asking new, on-the-fly questions and getting near-immediate answers.

However, the advantage of in-memory technology is also its Achilles’ heel: if performance depends entirely on RAM, what happens when you run out of RAM?


Rapidly Approaching the Glass Ceiling

Previously, the very possibility of a RAM glass ceiling seemed unlikely. After all, RAM is much more of a commodity than it was a few decades ago, and it seemed perfectly viable and sustainable for companies to keep upgrading their hardware to accommodate the growth in their data. But what this approach failed to take into account is the speed at which data is growing.

Though somewhat clichéd, it is nonetheless true that data is growing at an exponential rate. This is due to new methods of automated data collection, as well as the large datasets generated daily from sources such as worldwide web usage information, sensor data, machine logs and the Internet of Things. By all accounts, this trend is expected to continue, and according to one estimate, we can expect a 4,300% increase in annual data generation by 2020 (that is, roughly 44 times the current annual volume).

In this reality of truly big data, the hardware requirements of in-memory technology are beginning to cast doubt on its sustainability and soundness.

Once the system runs out of RAM, an inevitable occurrence when the data grows larger than the available memory, performance drops considerably. And while RAM has grown cheaper, it is still relatively expensive. Data-intensive organizations are beginning to discover that, to actually process all of their data using in-memory BI software, they are being forced to make investments so large that they often exceed the value they intend to derive from their data analytics project.

Companies with fewer resources, unwilling to make such a risky investment, will instead opt to reduce the granularity of the data they work with. That is, they will leave out certain sources or fields, and settle for higher-level analysis. But this approach seems to contradict the basic assumption of data analysis: true insights are reached when one can take all of the data into account and uncover unexpected connections.

Hence, working with partial data is also a highly unsatisfactory solution.


Looking in New Directions

It is for these reasons that in-memory business intelligence has reached a crucial point in its development. If it keeps trying to do the same thing (i.e., cramming all data into memory and demanding that consumers keep upgrading their hardware to retain the fast performance they have grown accustomed to), it will simply fail to provide a reasonable solution for companies that have large amounts of data and limited resources (needless to say, many companies fall into this category).

The solution is not to abandon in-memory processing altogether. Indeed, the unique advantages of performing queries in RAM should not be given up so quickly. On the other hand, storing bafflingly large amounts of business data in RAM on a permanent basis will not do either.

To balance the unique performance of in-memory processing against the limitations of hardware costs, we turn to smarter algorithms, caching and making full use of all available computational resources (disk storage, RAM and CPU). Identifying the parts of the data that can remain “at rest” on disk, and using the results of past queries to shorten the calculation times of new ones, can give BI software developers a new way to crunch Big Data without relying on endless RAM upgrades.
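To make the idea of reusing past queries concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of how any particular BI product works: the class `CachedQueryEngine`, the helper `build_demo_db` and the `sales` table are all hypothetical names, SQLite stands in for the disk-resident data store, and the data is assumed not to change between queries (a real system would also need cache invalidation). The full dataset stays at rest on disk, while only a small, bounded number of recent query results is held in RAM.

```python
import sqlite3
from collections import OrderedDict


class CachedQueryEngine:
    """Hypothetical sketch: data stays on disk, recent results stay in RAM."""

    def __init__(self, db_path, max_cached_results=2):
        self.conn = sqlite3.connect(db_path)       # disk-resident data "at rest"
        self.cache = OrderedDict()                  # small in-RAM result cache
        self.max_cached_results = max_cached_results
        self.disk_queries = 0                       # how often we had to hit disk

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        if key in self.cache:
            # Repeated question: answer from RAM and mark it as recently used.
            self.cache.move_to_end(key)
            return self.cache[key]
        # New question: compute it against the on-disk data.
        self.disk_queries += 1
        rows = self.conn.execute(sql, params).fetchall()
        self.cache[key] = rows
        if len(self.cache) > self.max_cached_results:
            # Evict the least recently used result to keep RAM use bounded.
            self.cache.popitem(last=False)
        return rows


def build_demo_db(db_path):
    """Create a tiny on-disk table of made-up sales figures for the demo."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [("east", 10.0), ("east", 5.0), ("west", 7.5)],
    )
    conn.commit()
    conn.close()
```

Asking the same question twice, for example `SELECT region, SUM(amount) FROM sales GROUP BY region`, touches the disk only once; the second answer comes straight from memory. The cache size is the knob that trades RAM for speed, which is the balance the paragraph above argues for.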

While the industry has already started moving in this direction with the introduction of In-Chip analytics, there is still much ground to cover. But if Business Intelligence expects to keep up with the needs of the business community, it will have to start looking for alternatives to a purely in-memory approach. It is only a matter of time before the major players in the industry realize this.

About the Author

Saar Bitner is the VP of Marketing at SiSense.
