While the technologies and techniques for analyzing, aggregating, and modeling data have largely kept pace with the demands of the modern data organization, our ability to tackle broken data pipelines has lagged behind. So, how can we identify, remediate, and even prevent this all-too-common problem before it becomes a massive headache? The answer lies in the data industry’s next frontier: data observability.
When you were growing up, did you ever read a Choose Your Own Adventure novel? You, the protagonist, are responsible for making choices that will determine the outcome of your epic journey, whether that’s slaying a fire-breathing dragon or embarking on a voyage into the depths of Antarctica. If you’re in data, these “adventures” might look a little different:
It’s 3 a.m. You’ve spent the last four hours troubleshooting a data fire drill, and you’re exhausted. You need to figure out why your team’s Tableau dashboard isn’t pulling the freshest data from Snowflake so that Jane in Finance can generate that report… yesterday.
You’re migrating to a new data warehouse and there’s no way to know where important data lives. Redshift? Azure? A spreadsheet in Google Drive? It’s like a game of telephone trying to figure out where to look, what the data should look like, and who owns it.
It takes 9 months of onboarding before you know where any of your company’s “good data” lives. You’ve found so many “FINAL_FINAL_v3_I_PROMISE_ITS_FINAL” versions of a single data set that you no longer know which way is up, let alone which tables are in production and which should be deprecated.
Before we dive into how to fix this problem, let’s talk about the common cause of broken data pipelines: data downtime.
In the early days of the internet, if your site was down, it was no big deal: you’d get it back up and running in a few hours with little impact on your customers (because, frankly, there weren’t that many of them, and our expectations of software were much lower).
Fast forward to the era of Instagram, TikTok, and Slack. Now, if your app crashes, your business feels the impact immediately. To meet our need for five nines of uptime (99.999 percent availability), we built tools, frameworks, and even careers fully dedicated to solving this problem.
In 2020, data is the new software.
It’s no longer enough to simply have a great product. Every company serious about maintaining its competitive edge is leveraging data to make smarter decisions, optimize its solutions, and even improve the user experience. In many ways, the need to monitor when data is “down” and pipelines are broken is even more critical than achieving five nines. As one data leader at a 5,000-person e-commerce company recently told me: “It’s worse to have bad data on my company’s website than to not have a website at all.”
In homage to the concept of application downtime, we call this problem data downtime: periods when your data is missing, inaccurate, or otherwise erroneous. Data downtime affects data engineers, data scientists, and data analysts, among others at your company, leading to wasted time (north of 30 percent of a data team’s working hours!), sunk costs, low morale, and, perhaps worst of all, a lack of trust in your insights.
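To make the definition concrete, here is a minimal sketch of what flagging data downtime for a single table might look like. Everything in it is an assumption for illustration: the TableSnapshot type, the downtime_reasons function, and both thresholds are hypothetical, and a real check would read these values from your warehouse’s metadata rather than from an in-memory object.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional, Sequence


@dataclass
class TableSnapshot:
    """Hypothetical summary of one table at check time (not a real library type)."""
    name: str
    last_loaded_at: datetime            # when the table last received new rows
    values: Sequence[Optional[float]]   # a sampled column; None marks a missing value


def downtime_reasons(
    snap: TableSnapshot,
    now: datetime,
    max_staleness: timedelta = timedelta(hours=24),  # assumed freshness threshold
    max_null_rate: float = 0.05,                     # assumed tolerance for missing values
) -> List[str]:
    """Return the reasons a table counts as 'down'; an empty list means healthy."""
    reasons = []
    # Stale: the table has not been loaded within the allowed window.
    if now - snap.last_loaded_at > max_staleness:
        reasons.append("stale")
    # Missing: no rows at all, or too large a share of null values.
    if not snap.values:
        reasons.append("empty")
    else:
        null_rate = sum(v is None for v in snap.values) / len(snap.values)
        if null_rate > max_null_rate:
            reasons.append("too many missing values")
    return reasons


if __name__ == "__main__":
    now = datetime(2020, 6, 1, 3, 0)
    orders = TableSnapshot("orders", now - timedelta(hours=30), [19.99, None, None])
    print(orders.name, downtime_reasons(orders, now))
```

The sketch only covers staleness and missing values. Data that is inaccurate in subtler ways (for example, a revenue column that suddenly doubles) would need distribution or business-rule checks on top of this.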
Here are some common sources of data downtime — maybe they’ll resonate:
With the increased scrutiny around data collection, storage, and applications, it’s high time data downtime was treated with the diligence it deserves.
Data observability, a concept pulled from best practices in DevOps and software engineering, refers to an organization’s ability to fully understand the health of the data in its systems. By applying the same principles of software application observability and reliability to data, teams can identify, resolve, and even prevent these issues, which gives them the confidence in their data that they need to deliver valuable insights.
Data observability can be split into five key pillars:
Data observability provides end-to-end visibility into your data pipelines, letting you know which data is in production and which data assets can be deprecated, thereby identifying and preventing downtime.
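As a rough illustration of separating production data from deprecation candidates, the sketch below flags tables that nobody has read recently and that nothing downstream depends on. The function name, the two input dictionaries, and the 90-day cutoff are all hypothetical; in practice this information would come from query logs and lineage metadata, and a flagged table is a candidate for review, not something to drop automatically.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set


def deprecation_candidates(
    last_queried: Dict[str, datetime],  # hypothetical: table name -> last time it was read
    downstream: Dict[str, Set[str]],    # hypothetical lineage: table -> assets that read from it
    now: datetime,
    idle_after: timedelta = timedelta(days=90),  # assumed cutoff for "unused"
) -> List[str]:
    """Return tables that look unused: idle for too long and with no downstream dependents."""
    candidates = []
    for table, last_read in last_queried.items():
        idle = now - last_read > idle_after
        has_dependents = bool(downstream.get(table))
        if idle and not has_dependents:
            candidates.append(table)
    return sorted(candidates)


if __name__ == "__main__":
    now = datetime(2020, 6, 1)
    last_queried = {
        "orders": now - timedelta(days=1),
        "orders_FINAL_v3": now - timedelta(days=200),
        "legacy_dim": now - timedelta(days=120),
    }
    downstream = {
        "orders": {"finance_dashboard"},
        "legacy_dim": {"orders"},  # still feeds a production table, so it is kept
    }
    print(deprecation_candidates(last_queried, downstream, now))
```

Checking lineage as well as read activity matters here: a table can go months without a direct query and still feed something that Finance relies on every day.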
A robust and holistic approach to data observability incorporates:
With these guidelines in hand, data teams can manage data downtime more effectively and even prevent it from occurring in the first place.
So — where will your data adventure take you?
Interested in learning more about data observability for your organization? Reach out to Barr Moses.