The difference between data monitoring and data observability solutions
Meredith Stowe Christie
“There has got to be a better way”
My colleague and Sifflet co-founder, Wissem Fathallah, recently shared a story about his time as a data scientist at Amazon. It is a story you may be able to relate to. He remembers reviewing historical data to forecast customer behavior. During this review, he constantly uncovered data anomalies despite the detection controls he had in place, and he spent more time than he would like to admit trying to make sure the data was clean. He also remembers uncomfortable conversations with senior leadership in which the data was not presenting correctly, because monitoring solutions could only detect so much and provide so many answers.
Wissem also recalls a time, while he was head of analytics at Uber marketplace, when a costly data outage occurred due to a lack of visibility into data dependencies between data teams. Again, this happened despite proper data monitoring controls being in place.
These frustrating experiences convinced him that “there has got to be a better way.” That conviction led him, along with his co-founders Salma Bakouk and Wajdi Fathallah, to develop a better way to understand and manage data beyond your typical data monitoring solution.
The “better way” is data observability. You may be new to this term, as the field of data observability has only recently gained prominence. This is due to the increasing complexity and scale of data systems: cloud platforms, big data technologies, and so on.
So, how is data observability different from data monitoring?
One way to consider the difference between data monitoring and data observability is through a car analogy, in which we compare a car's dashboard to its driver. Think of data monitoring as the dashboard and data observability as the driver.
Data Monitoring as the Dashboard
Data monitoring refers to the practice of tracking and measuring various aspects of data systems, such as data pipelines, databases, and data processing workflows. This is similar to the dashboard of a car, which displays real-time information about the car's speed, fuel level, engine temperature, and other metrics. In the same way, data monitoring provides visibility into the operational aspects of the data infrastructure, alerting the data team when predefined thresholds or conditions are violated. It is a bit like the alert you receive when your gas is running low, or the speedometer showing that you are going over the speed limit. The purpose of a data monitoring tool is just like that of the car dashboard: you want to ensure that your data environments, like the car, are running smoothly, and to promptly address any issues that arise.
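To make the dashboard idea concrete, here is a minimal sketch of threshold-based monitoring in Python. It is an illustration only and not how any particular product works: the metric names (row_count, null_rate, freshness_hours), the limits, and the check_thresholds helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    """Allowed range for one metric (hypothetical example values)."""
    metric: str
    minimum: float
    maximum: float


def check_thresholds(metrics, thresholds):
    """Return one alert message for each metric outside its allowed range."""
    alerts = []
    for t in thresholds:
        value = metrics.get(t.metric)
        if value is None:
            # A missing reading is itself worth flagging.
            alerts.append(f"{t.metric}: no value reported")
        elif not (t.minimum <= value <= t.maximum):
            alerts.append(
                f"{t.metric}: {value} outside [{t.minimum}, {t.maximum}]"
            )
    return alerts


# Hypothetical readings for one table after a nightly load.
metrics = {"row_count": 120, "null_rate": 0.02, "freshness_hours": 30}

# Hypothetical limits, the equivalent of the red zones on a gauge.
thresholds = [
    Threshold("row_count", 1000, float("inf")),
    Threshold("null_rate", 0.0, 0.05),
    Threshold("freshness_hours", 0, 24),
]

alerts = check_thresholds(metrics, thresholds)
for alert in alerts:
    print(alert)
```

Like a warning light, each alert says that a limit was crossed, but not why it happened or what else is affected.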
Data Observability as the Driver
Data observability, on the other hand, is comparable to the experience of driving a car. Data observability solutions are built to provide insight into and understanding of the data; in this analogy, that role belongs to you, the driver. Take a moment to think about the complex thought processes you undergo and the sensory experiences you have while driving. The car dashboard is simply one part of the experience, an informant. Data observability goes beyond the information displayed and provides a holistic understanding of your data as it relates to the whole data ecosystem.
Specifically, a data observability tool provides deep insights into the behavior, quality, and lineage of data throughout your systems. Typically, such tools offer capabilities such as metadata management, data profiling, and data cataloging, giving you the ability to understand the impact of data issues on downstream processes and analytics. This is similar to the sensations experienced when driving, such as the response of the steering wheel, the feel of the brakes, and the impact of different driving conditions. It gives you control and oversight of what is actually happening during the journey.
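As a rough illustration of what understanding downstream impact can look like, the sketch below walks a small lineage graph to list everything that depends on a problematic table. The asset names, the LINEAGE mapping, and the downstream_assets helper are hypothetical, and real observability tools build and maintain lineage in their own ways.

```python
from collections import deque

# Hypothetical lineage: each asset maps to the assets that read from it.
LINEAGE = {
    "raw_orders": ["clean_orders"],
    "clean_orders": ["daily_revenue", "customer_ltv"],
    "daily_revenue": ["exec_dashboard"],
    "customer_ltv": ["exec_dashboard", "churn_model"],
    "exec_dashboard": [],
    "churn_model": [],
}


def downstream_assets(lineage, source):
    """Return every asset that depends on `source`, directly or indirectly.

    Uses a breadth-first walk, so closer dependencies are listed first.
    """
    seen = set()
    ordered = []
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                ordered.append(child)
                queue.append(child)
    return ordered


# If clean_orders has a quality issue, which assets could be affected?
impacted = downstream_assets(LINEAGE, "clean_orders")
print(impacted)
```

Combined with an alert, a list like this helps answer the question a dashboard alone cannot: which reports, models, and teams need to know about the issue.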
Data Monitoring: Subset of Data Observability
Like a car dashboard, a data monitoring solution is critical because it gives you real-time or near-real-time alerts when something goes wrong, ideally before it gets worse, similar to the “check engine” light in a car. But without additional information on what is happening as part of the overall driving experience, one can only get so far.
As such, data monitoring is increasingly becoming less of a standalone tool and is commonly offered as a feature of data observability solutions. The benefit of a complete data observability solution is that you not only get monitoring and alerts, but you also gain a better understanding of the quality, impact, and behavior of the data.
By developing a data observability platform, Wissem and Wajdi aim to give you more control over your data stack and give you back the time you would have spent figuring out what happened and sitting through those painful discussions with leadership about what went wrong.