Building Airbnb’s Internationalization Platform | by Hua Zheng | Airbnb Engineering & Data Science | Oct, 2020


Airbnb’s vision is to allow people to “Belong Anywhere” by helping travelers feel at home anywhere in the world. This vision is inspiring, yet it presents challenges on numerous fronts, one of which is overcoming the communication barrier. The diversity and richness of our cultures make us unique as humans, but can sometimes get in the way of relating to one another, given the varying forms in which we express ourselves through language. Thus, bridging the language gap between people is fundamental in helping to create a world where we can all feel belonging no matter where we are.

In this blog post, we will discuss how we built Airbnb’s Internationalization* (I18n) Platform in support of that vision, by serving content and its translations across product lines to our global community in an efficient, robust, and scalable manner.

* Internationalization is the process of adapting software to accommodate different languages, cultures, and regions (i.e., locales), while minimizing the additional engineering changes required for localization.

Content in the Airbnb UI is aggregated from hundreds of microservices, and displayed in the language specified by user preference or locality. Content is composed of phrases (content units), which are uniquely identified, stored in a central data repository, and dispatched after creation (or modification) for translation to all supported production languages. Once the translations are ready, they are propagated to client apps to be displayed to users.

There were several main system requirements we wanted to uphold in our design:

  • Performant: translate calls are served with very low latency, without the need for further application-layer optimizations (ex: batching, deferred execution, parallelization).
  • Scalable: the system should efficiently scale with the increase in client apps, supported languages, content units, and traffic growth.
  • Available: the system is resilient to failures, and limits possible downtime for dependent clients.
  • Cross-Language: apps across multiple platforms and programming languages are supported.
  • Integration ease: onboarding clients is easy, seamless, and results in minimal churn to development.

Figure: High-level view of the overall architecture

A Content Management System (CMS) allows engineers and content strategists to create, access, and modify content. It also supports other management features such as submitting content for translation, adding relevant metadata (ex: description, screenshots) to improve translation quality, and tracking translation progress.

Content units (phrases) and associated metadata are stored and managed by a Content Service. Each phrase has a unique string key to identify it, along with a timestamp indicating when it was last updated. Phrases also belong to collections, which offer a logical grouping to provide context on the product or app domain in which the content is served.

Newly added or modified phrases that are marked as ready to translate are sent at regular intervals to an external Translation Vendor for translation into the set of target locales. Once done, the vendor notifies an External Callback Service of the new batch of translations, which is then packed and sent as translation events to our Event Bus. Following that, an event consumer listens for new translation events, parses the translation units, and writes them to the Translation Service, where they are persisted and sent for delivery to client apps.

The Translation Service stores all the translation versions for each phrase. Translations are keyed by [phrase key, locale, timestamp], and are immutable, in order to offer a historical audit trail. Only the latest translation for each phrase is served to clients.
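The versioning scheme above can be sketched as follows. This is a minimal in-memory illustration, not the Translation Service's actual implementation: the class names, the integer timestamp, and the `put`/`history`/`latest` methods are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a stored version can never be mutated
class TranslationVersion:
    phrase_key: str
    locale: str
    timestamp: int  # assumed to be an integer such as epoch seconds
    text: str


class TranslationStore:
    """Hypothetical in-memory stand-in for the service's storage layer."""

    def __init__(self) -> None:
        # Keyed by (phrase key, locale, timestamp), as described in the text.
        self._versions: Dict[Tuple[str, str, int], TranslationVersion] = {}

    def put(self, version: TranslationVersion) -> None:
        key = (version.phrase_key, version.locale, version.timestamp)
        if key in self._versions:
            # Versions are append-only; overwriting would break the audit trail.
            raise ValueError(f"version already exists: {key}")
        self._versions[key] = version

    def history(self, phrase_key: str, locale: str) -> List[TranslationVersion]:
        """All versions of one phrase in one locale, oldest first."""
        matches = [
            v for v in self._versions.values()
            if v.phrase_key == phrase_key and v.locale == locale
        ]
        return sorted(matches, key=lambda v: v.timestamp)

    def latest(self, phrase_key: str, locale: str) -> Optional[TranslationVersion]:
        """The only version that would be served to clients."""
        versions = self.history(phrase_key, locale)
        return versions[-1] if versions else None
```

Because every write creates a new key rather than updating an old one, the full history stays available for auditing while `latest` gives the single value that clients see.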

A Snapshotter component periodically loads the latest translations for each locale, creates a JSON blob snapshot, and stores it with the associated timestamp to an object store. The snapshots offer a deterministic view of phrase translations at a specific point-in-time, and help with populating the local client-side translation cache (as we will see later).
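As a rough sketch of what such a snapshot step could look like, the function below picks the latest translation per phrase for one locale as of a given time and serializes the result to JSON. The row shape, the field names in the blob, and the function name `build_snapshot` are assumptions for illustration; the post does not specify the actual snapshot schema.

```python
import json
from typing import Dict, Iterable, Tuple

# Hypothetical row shape: (phrase_key, locale, timestamp, text)
Row = Tuple[str, str, int, str]


def build_snapshot(rows: Iterable[Row], locale: str, as_of: int) -> str:
    """Return a JSON blob of the latest translation per phrase at time `as_of`."""
    latest: Dict[str, Tuple[int, str]] = {}
    for phrase_key, row_locale, timestamp, text in rows:
        # Ignore other locales and anything written after the snapshot time.
        if row_locale != locale or timestamp > as_of:
            continue
        current = latest.get(phrase_key)
        if current is None or timestamp > current[0]:
            latest[phrase_key] = (timestamp, text)

    snapshot = {
        "locale": locale,
        "timestamp": as_of,
        "translations": {key: text for key, (_, text) in latest.items()},
    }
    # sort_keys makes the output independent of input order, so the same
    # data always yields the same blob.
    return json.dumps(snapshot, sort_keys=True, ensure_ascii=False)
```

Filtering on `as_of` is what makes the snapshot a point-in-time view: translations that arrive later do not change a snapshot that has already been taken.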

On each client instance (which can range from a microservice to a web server), translation data is downloaded and stored in a Local Store, which acts as a persisted, in-node, on-disk key-value cache of all translations accessed by the app. This allows client translate requests to be resolved locally, avoiding network calls to the server. There are several benefits to this approach, mainly improved availability, reliability, and request latency. It also provides loose coupling between the Translation Service and clients, promoting resilience in case of service downtime.

I18n Agent

An I18n agent is deployed on each client app instance as a separate process, and is responsible for keeping the Local Store in sync with the server-side store. Its responsibilities include fetching the latest translations, performing pre- and post-processing, and managing on-disk storage. This helps encapsulate data access patterns and synchronization operations to the local cache, allowing easy integration with apps implemented in different languages (Java, JS, Ruby).

The main operations the agent performs are:

  • Initialize: bootstrap new client app instances with the latest translations snapshot.
  • Sync: continuously pull in new incoming translations and update the Local Store.
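The two operations above might be sketched as follows, for a single locale. This is an illustration under stated assumptions, not the agent's real design: the `I18nAgent` class, the two injected fetch functions, the snapshot field names, and the plain dict standing in for the on-disk Local Store are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple

# Hypothetical update shape: (phrase_key, timestamp, text)
Update = Tuple[str, int, str]


class I18nAgent:
    def __init__(
        self,
        fetch_snapshot: Callable[[], dict],
        fetch_updates_since: Callable[[int], Iterable[Update]],
    ) -> None:
        # Both callables are assumed stand-ins for calls to the object
        # store and to the server-side translation store.
        self._fetch_snapshot = fetch_snapshot
        self._fetch_updates_since = fetch_updates_since
        self.local_store: Dict[str, str] = {}  # stands in for the on-disk cache
        self.synced_up_to = 0

    def initialize(self) -> None:
        """Bootstrap the Local Store from the latest snapshot."""
        snapshot = self._fetch_snapshot()
        self.local_store = dict(snapshot["translations"])
        self.synced_up_to = snapshot["timestamp"]

    def sync_once(self) -> int:
        """Apply translations newer than the last sync; return how many."""
        updates = sorted(
            self._fetch_updates_since(self.synced_up_to), key=lambda u: u[1]
        )
        for phrase_key, timestamp, text in updates:
            self.local_store[phrase_key] = text
            self.synced_up_to = max(self.synced_up_to, timestamp)
        return len(updates)
```

A real agent would call something like `sync_once` repeatedly on a schedule; starting from a snapshot means a new instance only has to replay the updates that arrived after the snapshot was taken.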

Figure: Sequence diagram of translate calls to the Translator library.

The Translator library in the I18n client is used by the app to translate content, given a phrase key and locale. The relevant translations are fetched from the Local Store, with no fallback to the server if a translation does not exist. Missing content or translations are detected and remediated asynchronously via introspection (explained later). Serving translation requests locally allows us to achieve low latency (sub-millisecond), while ensuring deterministic load and scaling for our server fleet.
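A minimal sketch of a local-only lookup is shown below. The `Translator` class here is hypothetical, the dict stands in for the Local Store, and recording misses in a set is only an assumed placeholder for the asynchronous detection mechanism, which this section does not describe.

```python
from typing import Dict, Optional, Set, Tuple


class Translator:
    def __init__(self, local_store: Dict[Tuple[str, str], str]) -> None:
        # Maps (phrase_key, locale) to translated text; a stand-in for the
        # on-disk Local Store maintained by the agent.
        self._local_store = local_store
        # Assumption: misses are only recorded here, to be handled later
        # by some out-of-band process.
        self.missing: Set[Tuple[str, str]] = set()

    def translate(self, phrase_key: str, locale: str) -> Optional[str]:
        """Resolve a translation locally; never performs a network call."""
        text = self._local_store.get((phrase_key, locale))
        if text is None:
            self.missing.add((phrase_key, locale))
        return text
```

Since the lookup is a local read, the cost of a translate call does not depend on server health or on how many clients are running, which is what keeps server load predictable.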

The client library has other features as well to support client apps, such as:

  • Fallbacks: If a translation is not found in the requested locale, we fall back to a parent locale translation according to a predetermined fallback chain. If no translations are found in any fallback locale, we return the phrase in its original source language.
  • Pluralization: Languages (ex: Russian) may have different plural rules, which translations should accommodate when numerical qualifiers are present in a phrase.
  • Interpolation: Phrases can have embedded variables that are resolved at runtime. The client helps replace them with their associated value before returning the translation.
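The three features above can be illustrated with a small sketch. Everything named here is an assumption for the example: the fallback table, the `en` source locale, the `%{name}` placeholder syntax, and the function names are hypothetical, and the plural rules cover only whole numbers for Russian (following the CLDR categories) and English.

```python
import re
from typing import Dict, List, Mapping

# Hypothetical fallback chains and source locale.
FALLBACKS: Dict[str, List[str]] = {"fr-CA": ["fr"], "es-MX": ["es"]}
SOURCE_LOCALE = "en"


def resolve(phrase_key: str, locale: str,
            store: Mapping[str, Mapping[str, str]]) -> str:
    """Try the requested locale, then its fallbacks, then the source language."""
    for candidate in [locale, *FALLBACKS.get(locale, []), SOURCE_LOCALE]:
        text = store.get(candidate, {}).get(phrase_key)
        if text is not None:
            return text
    raise KeyError(phrase_key)


def plural_category(locale: str, n: int) -> str:
    """Plural category for a whole number n (Russian and English-style only)."""
    if locale.split("-")[0] == "ru":
        if n % 10 == 1 and n % 100 != 11:
            return "one"
        if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
            return "few"
        return "many"
    return "one" if n == 1 else "other"


_PLACEHOLDER = re.compile(r"%\{(\w+)\}")  # assumed syntax, e.g. %{count}


def interpolate(template: str, variables: Mapping[str, object]) -> str:
    """Replace each %{name} placeholder with its runtime value."""
    return _PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), template)


# Usage: pick the plural form for the count, then fill in the variable.
forms = {"one": "%{count} ночь", "few": "%{count} ночи", "many": "%{count} ночей"}
nights = interpolate(forms[plural_category("ru", 3)], {"count": 3})
```

Russian shows why a simple singular/plural switch is not enough: 1 and 21 share one form, 2 through 4 take another, and 5 through 20 take a third.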


