Over the last decade or so, object storage use cases have evolved considerably as they replace traditional file and block use cases. Specifically, the need to work with small data objects is becoming commonplace. Yes, there are still plenty of large objects, but small objects are becoming more prevalent than large ones for specific workloads and application environments.
Traditional object storage systems were designed for large objects that were infrequently accessed. With today’s smaller objects, object storage systems need to be much more dynamic and active. For legacy systems, that transition is proving difficult - even impossible. The reason lies in the architectural design. But before we get to that let’s define what a small object is and look at what’s driving the need for them.
A small object can generally be defined as one that is smaller than 1MB. Objects of this size pose two challenges. First, at scale (10s of PB), the aggregate number of small objects quickly reaches the billions, even trillions. This is well beyond the capability of traditional data storage systems. Second, the sheer number of objects creates an additional challenge - that of chattiness (LIST, PUT, GET, PUT, LIST, DELETE, etc.). Metadata management at this scale is what breaks most systems. Doing this at scale without losing data or compromising performance is the grand challenge.
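To make the scale concrete, here is a back-of-the-envelope sketch of the object counts involved. It assumes decimal units (1 PB = 10^15 bytes), a hypothetical 10 PB deployment and a uniform average object size; the helper name `object_count` is ours, for illustration only.

```python
# Rough object-count arithmetic for a given capacity and average object size.
# Assumptions: decimal units, every object has the same (average) size,
# and no allowance for erasure-coding or replication overhead.

KB = 10**3
MB = 10**6
PB = 10**15


def object_count(capacity_bytes: int, avg_object_bytes: int) -> int:
    """Return how many objects of the given average size fit in the capacity."""
    return capacity_bytes // avg_object_bytes


capacity = 10 * PB  # hypothetical deployment size
for label, size in [("1 MB", MB), ("100 KB", 100 * KB), ("10 KB", 10 * KB)]:
    print(f"{label:>7} objects: {object_count(capacity, size):,}")
```

At 1 MB per object, 10 PB already holds 10 billion objects; at 10 KB per object, the same capacity holds 1 trillion.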
Let’s turn our attention to the drivers of the small object revolution:
These are just some examples of the new small object phenomenon. Additional ones include web, mobile and messaging applications. As a result, small objects are becoming the norm, at least for organizations using these new applications, workloads and systems.
With small objects proliferating, what does this mean for object storage systems:
It’s our view that these new applications, workloads and systems that make use of small objects represent the leading edge of what should be happening to every organization in the future. At some point, all organizations are going to be deploying streaming data-in-motion processing, AI-ML-DL applications and new document database systems. Perhaps IoT is less generalizable than these other activities, but it too will become much more pervasive in time.
The challenge is that some object storage systems handle small objects better than others. The only way to truly tell is to run a PoC that loads an object storage system with a large number of small objects, runs the applications that use them, and measures how well it stores and serves them. Only then will you have a good understanding of how well an object storage system handles the oncoming flow of small objects.
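As a starting point for such a PoC, a minimal timing harness might look like the sketch below. It is an illustration under stated assumptions, not a benchmark methodology: the function `run_small_object_poc` and its parameters are hypothetical, it is single-threaded, and it takes `put` and `get` callables so that any client can be plugged in. Here an in-memory dictionary stands in for the object store so the sketch runs on its own.

```python
import os
import time
from typing import Callable, Dict


def run_small_object_poc(
    put: Callable[[str, bytes], None],
    get: Callable[[str], bytes],
    num_objects: int = 1000,
    object_size: int = 4096,
) -> Dict[str, float]:
    """Write then read back num_objects small objects and report rates."""
    payload = os.urandom(object_size)
    keys = [f"poc/obj-{i:08d}" for i in range(num_objects)]

    # Time the write phase.
    start = time.perf_counter()
    for key in keys:
        put(key, payload)
    put_seconds = time.perf_counter() - start

    # Time the read phase and verify that every object comes back intact.
    mismatches = 0
    start = time.perf_counter()
    for key in keys:
        if get(key) != payload:
            mismatches += 1
    get_seconds = time.perf_counter() - start

    return {
        "objects": num_objects,
        "put_ops_per_sec": num_objects / put_seconds if put_seconds > 0 else float("inf"),
        "get_ops_per_sec": num_objects / get_seconds if get_seconds > 0 else float("inf"),
        "mismatches": mismatches,
    }


# In-memory stand-in for an object store (assumption: replace these two
# callables with wrappers around your real storage client).
store: Dict[str, bytes] = {}
result = run_small_object_poc(store.__setitem__, store.__getitem__)
print(result)
```

A real PoC would also exercise LIST and DELETE, use many concurrent clients, and run at object counts far beyond what fits in memory, since those are the conditions under which metadata handling becomes the bottleneck.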
MinIO is particularly well suited for small objects. This is a by-product of our design choices and not necessarily something that we set out to do. Because we are relentless in our pursuit of simplicity, we have fewer moving parts, most notably the absence of a database to manage metadata.
This is perhaps the biggest advantage MinIO possesses in the small objects realm.
Metadata databases are functionally incompatible with large numbers of small objects. You cannot list them at scale, you cannot delete them at scale. Small objects are corrosive to external metadata database architectures.
The world will continue to produce more small objects - this we know. As an architect, it may be possible to create a series of workarounds to address these issues, but ultimately, adopting an object store that will scale seamlessly with your small objects is a better path - and one that should be undertaken sooner rather than later.
We are going to follow up on this post with benchmark data and instructions on how to measure small object performance. This will include detail on the storage media as well as the bandwidth requirements.
In the interim, if you are keen to get started, MinIO can be downloaded here. We are ready to help and have a Slack channel that supports the more than 10K members of our ever-expanding community. If you want an SLA or 24/7 direct-to-engineer support, the answer can be found in the MinIO Subscription Network. Priced on capacity and billed monthly, it brings the tools, talent and technology that power MinIO to your deployments.
Feel free to shoot us a note at [email protected] if you have any questions.
Posted 1 March 2021