Metric Infrastructure with Minerva @ Airbnb
As described in the first post of this series, Airbnb invested significantly in building Minerva, a single source of truth metric platform that standardizes the way business metrics are created, computed, served, and consumed. We spent years iterating toward the right metric infrastructure and designing the right user experience. Because of this multi-year investment, when Airbnb’s business was severely disrupted by COVID-19 last year, we were able to quickly turn data into actionable insights and strategies.
In this second post, we will deep dive into our compute infrastructure. Specifically, we will showcase how we standardize dataset definitions through declarative configurations, explain how data versioning enables us to ensure cross-dataset consistency, and illustrate how we backfill data efficiently with zero downtime. By the end of this post, readers will have a clear understanding of how we manage datasets at scale and create a strong foundation for metric computation.
As shared in Part I, Minerva was born from years of growing pains and related metric inconsistencies found within and across teams. Drawing on our experience managing a metric repository specific to experimentation, we aligned on six design principles for Minerva.
We built Minerva to be:
- Standardized: Data is defined unambiguously in a single place. Anyone can look up definitions without confusion.
- Declarative: Users define the “what” and not the “how”. The processes by which the metrics are calculated, stored, or served are entirely abstracted away from end users.
- Scalable: Minerva must be both computationally and operationally scalable.
- Consistent: Data is always consistent. If a definition or a piece of business logic changes, backfills occur automatically and data remains up-to-date.
- Highly available: Existing datasets are replaced by new datasets with zero downtime and minimal interruption to data consumption.
- Well tested: Users can prototype and validate their changes extensively before merging them into production.
In the following sections, we will expand on each of the principles described here and highlight some of the infrastructure components that enable these principles. Finally, we will walk through the user experience as we bring it all together.
You may recall from Part I that the immense popularity of the core_data schema at Airbnb was actually a double-edged sword. On the one hand, core_data standardized table consumption and allowed users to quickly identify which tables to build upon. On the other hand, it burdened the centralized data engineering team with the impossible task of gatekeeping and onboarding an endless stream of new datasets into new and existing core tables. Furthermore, pipelines built downstream of core_data created a proliferation of duplicative and diverging metrics. We learned from this experience that table standardization was not enough and that standardization at the metrics level is key to enabling trustworthy consumption. After all, users do not consume tables; they consume metrics, dimensions, and reports.
Minerva is focused around metrics and dimensions as opposed to tables and columns. When a metric is defined in Minerva, authors are required to provide important self-describing metadata. Information such as ownership, lineage, and metric description are all required in the configuration files. Prior to Minerva, all such metadata often existed only as undocumented institutional knowledge or in chart definitions scattered across various business intelligence tools. In Minerva, all definitions are treated as version-controlled code. Modification of these configuration files must go through a rigorous review process, just like any other code review.
At the heart of Minerva’s configuration system are event sources and dimension sources, which correspond to fact tables and dimension tables in a Star Schema design, respectively.
Event sources define the atomic events from which metrics are constructed, and dimension sources contain attributes and cuts that can be used in conjunction with the metrics. Together, event sources and dimension sources are used to define, track, and document metrics and dimensions at Airbnb.
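To make this concrete, the sketch below models what an event source and a dimension source definition might look like as declarative, version-controlled configuration. The field names, table names, and metric expressions here are illustrative assumptions, not Minerva's actual schema; the point is that ownership, lineage (the source table), and descriptions are required metadata rather than optional documentation.

```python
from dataclasses import dataclass

# Hypothetical sketch of Minerva-style declarative configs. Field names
# (source_table, owner, description, ...) are illustrative assumptions.

@dataclass
class EventSource:
    name: str
    source_table: str  # upstream fact table (lineage)
    owner: str         # required ownership metadata
    description: str   # required documentation
    metrics: list      # metrics built from these atomic events

@dataclass
class DimensionSource:
    name: str
    source_table: str  # upstream dimension table (lineage)
    owner: str
    description: str
    dimensions: list   # attributes usable as cuts on metrics

bookings = EventSource(
    name="bookings",
    source_table="fct_bookings",
    owner="core-data-team",
    description="One row per confirmed booking event.",
    metrics=[{"name": "bookings_count", "expr": "COUNT(1)"}],
)

listings = DimensionSource(
    name="listings",
    source_table="dim_listings",
    owner="core-data-team",
    description="Listing attributes keyed by listing_id.",
    dimensions=[{"name": "listing_market"}, {"name": "room_type"}],
)
```

Because definitions like these live in code, any change to `bookings` or `listings` would go through the same review process as any other code change.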
Prior to Minerva, the road to creating an insightful piece of analysis or a high-fidelity and responsive dashboard was a long one. Managing datasets to keep up with product changes, meet query performance requirements, and avoid metric divergence quickly became a significant operational burden for teams. One of Minerva’s key value propositions is to dramatically simplify this tedious and time-consuming workflow so that users can quickly turn data into actionable insights.
With Minerva, users can simply define a dimension set, an analysis-friendly dataset that is joined from Minerva metrics and dimensions. Unlike datasets created in an ad-hoc manner, dimension sets have several desirable properties:
- Users only define the “what” and need not be concerned with the “how”. All implementation details and complexity are abstracted away from users.
- Datasets created this way are guaranteed to follow our best data engineering practices: from data quality checks to joins to backfills, everything is done efficiently and cost-effectively.
- Data is stored efficiently and is optimized to reduce query times and improve the responsiveness of downstream dashboards.
- Because datasets are defined transparently in Minerva, we encourage metric reuse and reduce dataset duplication.
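The sketch below illustrates the core mechanics behind a dimension set: metric rows from an event source are denormalized with attributes from a dimension source via a left join on a shared key. The function and column names are hypothetical; in practice this materialization is handled for the user by the platform.

```python
# Hypothetical sketch of materializing a dimension set: a left join of
# event-source metric rows with dimension-source attributes. All names
# here are illustrative, not Minerva internals.

def build_dimension_set(event_rows, dim_rows, join_key):
    """Denormalize metric rows with dimension attributes (left join)."""
    dim_by_key = {row[join_key]: row for row in dim_rows}
    joined = []
    for ev in event_rows:
        attrs = dim_by_key.get(ev[join_key], {})
        # Merge dimension attributes onto the event row, keeping the key once.
        joined.append({**ev, **{k: v for k, v in attrs.items() if k != join_key}})
    return joined

events = [
    {"listing_id": 1, "bookings": 3},
    {"listing_id": 2, "bookings": 1},
]
dims = [
    {"listing_id": 1, "room_type": "entire_home"},
    {"listing_id": 2, "room_type": "private_room"},
]

dimension_set = build_dimension_set(events, dims, "listing_id")
# Each row now carries both the metric and its dimensional cuts.
```

Because the join is driven entirely by declared keys, the same dimension can be reused across many metrics without users hand-writing (and inevitably diverging on) the join logic.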
By focusing on the “what” and not on the “how”, Minerva improves user productivity and maximizes time spent on their primary objectives: studying trends, uncovering insights, and performing experiment deep dives. This value proposition has been the driving force behind Minerva’s steady and continual adoption.
Minerva was designed with scalability in mind from the outset. With Minerva now serving 5,000+ datasets across hundreds of users and 80+ teams, minimizing cost and maintenance overhead is a top priority.
At its core, Minerva’s computation is built with the DRY (“Don’t Repeat Yourself”) principle in mind. This means that we attempt to reuse materialized data as much as possible in order to reduce wasted compute and ensure consistency. The computational flow can be broken down into several distinct stages:
- Ingestion Stage: Partition sensors wait for upstream data and data is ingested into Minerva.
- Data Check Stage: Data quality checks are run to ensure that upstream data is not malformed.
- Join Stage: Data is joined programmatically based on join keys to generate dimension sets.
- Post-processing and Serving Stage: Joined outputs are further aggregated and derived data is virtualized for downstream use cases.
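The four stages above can be sketched as a simple composed pipeline. This is an illustrative toy, assuming hypothetical stage functions and row shapes; in production, the ingestion stage would block on partition sensors rather than reading data directly, and each stage would run as its own orchestrated task.

```python
# Illustrative sketch of Minerva's four computation stages. All function
# names, row shapes, and keys are assumptions for demonstration only.

def ingest(upstream):
    # Ingestion: in production, partition sensors would wait for upstream
    # partitions to land before this stage reads them.
    return list(upstream)

def check(rows):
    # Data checks: fail fast if upstream data is malformed.
    assert all("id" in r and r.get("value") is not None for r in rows), \
        "upstream data is malformed"
    return rows

def join(rows, dims):
    # Join: attach dimension attributes programmatically on the join key.
    by_id = {d["id"]: d for d in dims}
    return [
        {**r, **{k: v for k, v in by_id.get(r["id"], {}).items() if k != "id"}}
        for r in rows
    ]

def post_process(rows, group_key):
    # Post-processing: aggregate the joined output for serving.
    agg = {}
    for r in rows:
        agg[r[group_key]] = agg.get(r[group_key], 0) + r["value"]
    return agg

upstream = [{"id": 1, "value": 2}, {"id": 2, "value": 5}]
dims = [{"id": 1, "market": "SF"}, {"id": 2, "market": "NY"}]

result = post_process(join(check(ingest(upstream)), dims), "market")
# result holds the metric aggregated per market, ready for serving
```

Structuring computation as distinct stages like this is what allows intermediate, materialized outputs (e.g., the joined data) to be reused across many downstream datasets instead of being recomputed.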