How Airbnb Standardized Metric Computation at Scale | by Amit Pahwa | Airbnb Engineering & Data Science | Jun, 2021


Metric Infrastructure with Minerva @ Airbnb

Part II: The six design principles of Minerva compute infrastructure

By: Amit Pahwa, Cristian Figueroa, Donghan Zhang, Haim Grosman, John Bodley, Jonathan Parks, Maggie Zhu, Philip Weiss, Robert Chang, Shao Xie, Sylvia Tomiyama, Xiaohui Sun

As described in the first post of this series, Airbnb invested significantly in building Minerva, a single source of truth metric platform that standardizes the way business metrics are created, computed, served, and consumed. We spent years iterating toward the right metric infrastructure and designing the right user experience. Because of this multi-year investment, when Airbnb’s business was severely disrupted by COVID-19 last year, we were able to quickly turn data into actionable insights and strategies.

In this second post, we will deep dive into our compute infrastructure. Specifically, we will showcase how we standardize dataset definitions through declarative configurations, explain how data versioning enables us to ensure cross-dataset consistency, and illustrate how we backfill data efficiently with zero downtime. By the end of this post, readers will have a clear understanding of how we manage datasets at scale and create a strong foundation for metric computation.

As shared in Part I, Minerva was born from years of growing pains and related metric inconsistencies found within and across teams. Drawing on our experience managing a metric repository specific to experimentation, we aligned on six design principles for Minerva.

We built Minerva to be:

  • Standardized: Data is defined unambiguously in a single place. Anyone can look up definitions without confusion.

In the following sections, we will expand on each of the principles described here and highlight some of the infrastructure components that enable these principles. Finally, we will walk through the user experience as we bring it all together.

You may recall from Part I that the immense popularity of the core_data schema at Airbnb was actually a double-edged sword. On the one hand, core_data standardized table consumption and allowed users to quickly identify which tables to build upon. On the other hand, it burdened the centralized data engineering team with the impossible task of gatekeeping and onboarding an endless stream of new datasets into new and existing core tables. Furthermore, pipelines built downstream of core_data created a proliferation of duplicative and diverging metrics. We learned from this experience that table standardization was not enough and that standardization at the metrics level is key to enabling trustworthy consumption. After all, users do not consume tables; they consume metrics, dimensions, and reports.

Minerva is centered on metrics and dimensions as opposed to tables and columns. When a metric is defined in Minerva, authors are required to provide important self-describing metadata: ownership, lineage, and a metric description are all required in the configuration files. Prior to Minerva, such metadata often existed only as undocumented institutional knowledge or in chart definitions scattered across various business intelligence tools. In Minerva, all definitions are treated as version-controlled code. Modifications to these configuration files must go through a rigorous review process, just like any other code change.
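
To make the idea of required metadata concrete, here is a minimal sketch of the kind of check a review pipeline could run on a metric definition. The field names (`owner`, `description`, `lineage`), the helper functions, and the example metric are assumptions made for illustration; they are not Minerva's actual configuration schema.

```python
from typing import Dict, List

# Hypothetical list of metadata fields every metric definition must carry.
REQUIRED_FIELDS = ("owner", "description", "lineage")


def missing_metadata(metric_config: Dict[str, object]) -> List[str]:
    """Return the required fields that are absent or empty, in a fixed order."""
    return [field for field in REQUIRED_FIELDS if not metric_config.get(field)]


def validate_metric(metric_config: Dict[str, object]) -> None:
    """Raise ValueError if the definition lacks any required metadata."""
    missing = missing_metadata(metric_config)
    if missing:
        name = metric_config.get("name", "<unnamed>")
        raise ValueError(
            f"metric {name!r} is missing required metadata: {', '.join(missing)}"
        )


# Hypothetical metric definition, as it might look once loaded from a config file.
bookings_metric = {
    "name": "bookings",
    "owner": "data-team@example.com",
    "description": "Count of confirmed bookings.",
    "lineage": "example_schema.booking_events",
}

validate_metric(bookings_metric)  # passes silently: all metadata is present
```

Running a check like this as part of code review means an incomplete definition is rejected before it is merged, rather than discovered later by a confused consumer.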

At the heart of Minerva’s configuration system are event sources and dimension sources, which correspond to fact tables and dimension tables in a Star Schema design, respectively.

Figure 1: Event sources and dimension sources are the fundamental building blocks of Minerva.

Event sources define the atomic events from which metrics are constructed, and dimension sources contain attributes and cuts that can be used in conjunction with the metrics. Together, event sources and dimension sources are used to define, track, and document metrics and dimensions at Airbnb.
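
The relationship between the two building blocks can be illustrated with a small, self-contained sketch: an event source supplies atomic events, a dimension source supplies attributes, and a metric is an aggregate of the events cut by one of those attributes. All table contents, column names, and the `metric_by_dimension` helper below are hypothetical.

```python
from collections import defaultdict

# Hypothetical event source: one row per atomic booking event (the fact table).
booking_events = [
    {"listing_id": 1, "ds": "2021-06-01", "bookings": 1},
    {"listing_id": 2, "ds": "2021-06-01", "bookings": 1},
    {"listing_id": 1, "ds": "2021-06-01", "bookings": 1},
]

# Hypothetical dimension source: attributes keyed by listing (the dimension table).
listing_dimensions = {
    1: {"country": "US"},
    2: {"country": "FR"},
}


def metric_by_dimension(events, dimensions, key, measure, dimension):
    """Sum `measure` over the events, grouped by a dimension attribute.

    Each event is joined to its dimension row through `key`; events without
    a matching dimension row are grouped under "unknown".
    """
    totals = defaultdict(int)
    for event in events:
        attributes = dimensions.get(event[key], {})
        totals[attributes.get(dimension, "unknown")] += event[measure]
    return dict(totals)


bookings_by_country = metric_by_dimension(
    booking_events, listing_dimensions, "listing_id", "bookings", "country"
)
print(bookings_by_country)  # {'US': 2, 'FR': 1}
```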

Prior to Minerva, the road to creating an insightful piece of analysis or a high-fidelity and responsive dashboard was a long one. Managing datasets to keep up with product changes, meet query performance requirements, and avoid metric divergence quickly became a significant operational burden for teams. One of Minerva’s key value propositions is to dramatically simplify this tedious and time-consuming workflow so that users can quickly turn data into actionable insights.

Figure 2: Data Science workflow improvement.

With Minerva, users can simply define a dimension set, an analysis-friendly dataset that is joined from Minerva metrics and dimensions. Unlike datasets created in an ad-hoc manner, dimension sets have several desirable properties:

  • Users define only the what and need not concern themselves with the how. All the implementation details and complexity are abstracted away from users.

Figure 3: Programmatic denormalization generates dimension sets which users can easily configure.
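
As a rough illustration of "what, not how", the sketch below turns a declarative dimension set into a denormalizing query. The `DimensionSet` fields, the table names, and the shape of the generated SQL are assumptions chosen for this example; Minerva's real configuration format and generated queries are not shown in this post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DimensionSet:
    """Hypothetical declarative description of an analysis-friendly dataset."""
    name: str
    event_source: str       # fact table holding the atomic events
    dimension_source: str   # dimension table holding the attributes
    join_key: str           # column shared by both tables
    metrics: List[str]
    dimensions: List[str]


def render_sql(dimension_set: DimensionSet) -> str:
    """Generate the join and aggregation so the user never writes it by hand."""
    dims = ", ".join(f"d.{column}" for column in dimension_set.dimensions)
    mets = ", ".join(f"SUM(e.{m}) AS {m}" for m in dimension_set.metrics)
    key = dimension_set.join_key
    return (
        f"SELECT e.ds, {dims}, {mets}\n"
        f"FROM {dimension_set.event_source} AS e\n"
        f"LEFT JOIN {dimension_set.dimension_source} AS d ON e.{key} = d.{key}\n"
        f"GROUP BY e.ds, {dims}"
    )


# The user states only which metrics and dimensions they want.
bookings_by_country = DimensionSet(
    name="bookings_by_country",
    event_source="booking_events",
    dimension_source="listing_dimensions",
    join_key="listing_id",
    metrics=["bookings"],
    dimensions=["country"],
)

print(render_sql(bookings_by_country))
```

Because the query is derived from the definition, changing the list of metrics or dimensions is a one-line configuration edit rather than a rewrite of pipeline logic.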

By focusing on the “what” and not on the “how”, Minerva improves user productivity and maximizes time spent on their primary objectives: studying trends, uncovering insights, and performing experiment deep dives. This value proposition has been the driving force behind Minerva’s steady and continual adoption.

Minerva was designed with scalability in mind from the outset. With Minerva now serving 5,000+ datasets across hundreds of users and 80+ teams, keeping cost and maintenance overhead under control is a top priority.

At its core, Minerva’s computation is built with the DRY (Don’t Repeat Yourself) principle in mind. This means that we attempt to reuse materialized data as much as possible in order to reduce wasted compute and ensure consistency. The computational flow can be broken down into several distinct stages:

  • Ingestion Stage: Partition sensors wait for upstream data and data is ingested into Minerva.
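
The waiting behavior of a partition sensor in the ingestion stage can be sketched as a simple polling loop. This is a generic illustration, not Minerva's implementation: the `wait_for_partition` function, its parameters, and the partition naming are all hypothetical, and a production system would typically delegate this to its workflow scheduler.

```python
import time
from typing import Callable


def wait_for_partition(
    partition_exists: Callable[[str], bool],
    partition: str,
    poke_interval: float = 60.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the upstream partition has landed or the timeout elapses.

    `partition_exists` is a caller-supplied check (for example, a metastore
    lookup). Returns True once the partition is present, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if partition_exists(partition):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poke_interval)


# Usage with an in-memory stand-in for the metastore.
landed_partitions = {"ds=2021-06-01"}
ready = wait_for_partition(lambda p: p in landed_partitions, "ds=2021-06-01")
print(ready)  # True, because the partition is already present
```

Only once the sensor reports that upstream data is present does ingestion proceed, which keeps downstream stages from computing on incomplete inputs.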


