Linchpin DSL for Pinterest ranking

Angela Sheu | Pinterest engineer, Home feed Infrastructure

At Pinterest, we’re building a visual discovery engine with a growing dataset of 100B+ ideas. Our engineers are tasked with showing the right idea to the right user at the right time across home feed, search, Related Pins and more. Engineers use shared Pin features and user attributes to make more than 10B recommendations every day. Because multiple teams use the same data pipelines and frameworks, it’s important that models can be used consistently in both a development environment and in production.

Previously, each team built its own process for developing machine learning (ML) models. As these models became more complex and teams found they had similar needs for a model development workflow, we needed a common language to express, evaluate and deploy models across teams. The answer for us was the Linchpin DSL, and we’ll demonstrate why in this post.

Why the Linchpin DSL?

Only a tiny fraction of code in machine learning systems actually does “machine learning.” The rest is usually glue code that performs some version of: (1) injecting data into the scoring system, (2) transforming data into a list of numeric features and (3) interpreting a model to combine features into a score.

With a domain-specific language (DSL), developers can focus more on feature development and actual machine learning since the need to write glue code is eliminated.

Creating a model spec

A model spec written using Linchpin DSL syntax is a list of ordered statements of the form:

Linchpin has a small library of available computation nodes (i.e. “Transforms”) that are meant to act as primitive operations. The library also includes a few more complex Transforms, such as decision forest transforms for our Gradient Boosting Decision Tree (GBDT) models and multi-linear transforms for our neural net models. If the desired feature can’t be expressed with the existing library, a developer simply writes a new Linchpin Transform.

If we ranked a Pin for a user’s home feed, an example model spec might look like this:

Note: This is an extremely simplified model for illustration.
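To make the idea of an ordered list of statements concrete, here is a minimal Python sketch. It is not Linchpin's actual syntax or Transform library: the tuple layout (output name, transform name, input names), the transform names and the feature names such as `pin_repin_count` are all assumptions made up for illustration.

```python
import math

# Hypothetical transform library: each entry is a primitive operation.
TRANSFORMS = {
    "Log1p": lambda x: math.log1p(x),
    "Multiply": lambda a, b: a * b,
    "Sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
}

# Hypothetical model spec: an ordered list of statements.
# Each statement is (output_name, transform_name, input_names).
MODEL_SPEC = [
    ("log_repins", "Log1p", ["pin_repin_count"]),
    ("weighted", "Multiply", ["log_repins", "user_topic_affinity"]),
    ("score", "Sigmoid", ["weighted"]),
]


def score(spec, raw_inputs):
    """Evaluate the statements in order and return the last output."""
    values = dict(raw_inputs)
    for output_name, transform_name, input_names in spec:
        args = [values[name] for name in input_names]
        values[output_name] = TRANSFORMS[transform_name](*args)
    return values[spec[-1][0]]


result = score(MODEL_SPEC, {"pin_repin_count": 0, "user_topic_affinity": 0.5})
```

Because every statement only refers to names defined earlier, the spec can be evaluated top to bottom without any extra scheduling logic.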

A peek under the hood

There are two main components of Linchpin: (1) parsing a model spec into a computation graph and (2) evaluating the created graph. The former is specific to the Linchpin DSL, while the latter is a general computation system that is independent of the DSL format.

Phase 1: Linchpin parses a model spec into the Model DAG, a graph of Model Nodes. These can be Source Nodes, Interior Nodes (transform / computation nodes) or Sink Nodes. By the end of this phase, each node has determined the size and value type of its output, which lets us run preliminary validation checks before any data is scored.
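As a rough illustration of Phase 1, the Python sketch below builds a node graph from ordered statements and validates output sizes and types before any data is seen. The `Node` class, the `Concat` and `Dot` transforms and the way sinks are marked are assumptions for this example, not Linchpin's real data structures.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str    # "source", "interior" or "sink"
    size: int    # number of values this node outputs
    dtype: str   # value type of the output
    inputs: list = field(default_factory=list)


def build_dag(sources, statements, sinks):
    """sources: {name: (size, dtype)}; statements: ordered (output, op, inputs)."""
    nodes = {n: Node(n, "source", size, dtype)
             for n, (size, dtype) in sources.items()}
    for output, op, inputs in statements:
        missing = [i for i in inputs if i not in nodes]
        if missing:
            raise ValueError(f"{output}: undefined inputs {missing}")
        in_nodes = [nodes[i] for i in inputs]
        if any(n.dtype != "float" for n in in_nodes):
            raise ValueError(f"{output}: {op} expects float inputs")
        if op == "Concat":
            size = sum(n.size for n in in_nodes)
        elif op == "Dot":
            if len(in_nodes) != 2 or in_nodes[0].size != in_nodes[1].size:
                raise ValueError(f"{output}: Dot needs two inputs of equal size")
            size = 1
        else:
            raise ValueError(f"{output}: unknown transform {op}")
        kind = "sink" if output in sinks else "interior"
        nodes[output] = Node(output, kind, size, "float", list(inputs))
    return nodes


nodes = build_dag(
    sources={"user_vec": (3, "float"), "pin_vec": (3, "float"), "ctx": (2, "float")},
    statements=[
        ("joined", "Concat", ["user_vec", "ctx"]),
        ("score", "Dot", ["user_vec", "pin_vec"]),
    ],
    sinks={"score"},
)
```

Checking sizes and types at build time means a malformed spec fails when it is loaded rather than in the middle of scoring a request.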

Phase 2: The caller provides Linchpin with data for scoring and injects the given data into Source Nodes. Linchpin then evaluates all subsequent Interior Nodes and extracts output from Sink Nodes to return to the caller.

Putting everything together, we might create the following DAG in phase 1:

We maintain an array of values during the computation that might look like this:
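A minimal Python sketch of Phase 2 with a flat value array, assuming each node has been assigned one slot in the array during Phase 1. The slot layout, the `plan` structure and the feature names are hypothetical and only meant to show the inject, evaluate and extract steps.

```python
import operator


def evaluate(plan, num_slots, source_slots, sink_slots, inputs):
    """Inject sources, run interior nodes in order, then read the sinks."""
    values = [0.0] * num_slots
    # Inject caller-provided data into the Source Node slots.
    for name, slot in source_slots.items():
        values[slot] = inputs[name]
    # Evaluate Interior Nodes in dependency order.
    for out_slot, fn, in_slots in plan:
        values[out_slot] = fn(*(values[s] for s in in_slots))
    # Extract output from the Sink Node slots.
    outputs = {name: values[slot] for name, slot in sink_slots.items()}
    return outputs, values


# Slots: 0 = repin_rate, 1 = affinity, 2 = product, 3 = capped score.
plan = [
    (2, operator.mul, [0, 1]),
    (3, lambda x: min(x, 1.0), [2]),
]
outputs, values = evaluate(
    plan,
    num_slots=4,
    source_slots={"repin_rate": 0, "affinity": 1},
    sink_slots={"score": 3},
    inputs={"repin_rate": 4.0, "affinity": 0.5},
)
```

In this sketch the array ends up as `[4.0, 0.5, 2.0, 1.0]`: the first two slots hold injected inputs, and the last two hold computed values.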

How we use Linchpin

In 2014, the home feed team designed and implemented this DSL in Java to rank personalized content. Over the next two years, the home feed evolved significantly. As Pinterest grew quickly, the number of users doubled and the number of Pins scored went from ~5B per day to 70B per day. We used to pre-rank content offline; today our systems rank content in real time, so latency and performance have become paramount. In 2015, we implemented Pinnability DSL V2 in C++ to optimize performance, and the DSL was renamed “Linchpin.” (Pinnability is the collective name for our machine learning models that power recommendations across Pinterest.)

In home feed, we’ve seen the following benefits from adopting Linchpin:

  • Inputs decoupled from computation. Because Linchpin decouples inputs from computation, we can apply the same computation in different environments and contexts. This guarantees consistency between offline training and online scoring: the Hadoop computations used to generate training data are consistent with the serving-time scoring results from RealPin.
  • Ease of experimentation. In any given week, Pinnability engineers may run 20 new experiments while also deciding whether to shut down or ship existing experimental models. By using the Linchpin DSL, we’re able to treat models as data and can run multiple experiments with all feature changes in a single model spec. We ship a model simply by changing the model spec used at scoring time, and we can roll back a bad model or run older, retired models the same way.
  • Modeling changes separated from infrastructure changes. The Linchpin DSL also enables us to separate feature development from infrastructure code. In early 2017 the home feed infrastructure team migrated Pinnability’s data sources from Thrift to FlatBuffers format while the Pinnability team changed their models from using GBDT to Neural Networks in parallel.
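The first two benefits above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Linchpin, Hadoop or RealPin are wired together: the spec format, the `SPECS` registry and the two feature providers are invented names.

```python
def score(spec, features):
    """Pure computation: depends only on the spec and a feature mapping."""
    return sum(weight * features[name]
               for name, weight in spec["weights"].items())


# Models as data: shipping or rolling back is a change of key, not of code.
SPECS = {
    "control": {"weights": {"ctr": 1.0}},
    "experiment": {"weights": {"ctr": 0.5, "freshness": 0.5}},
}


def offline_rows():
    # Stands in for a batch job that generates training data.
    yield {"ctr": 0.2, "freshness": 0.8}


def online_request():
    # Stands in for a serving-time feature fetch.
    return {"ctr": 0.2, "freshness": 0.8}


active = "experiment"
offline_scores = [score(SPECS[active], row) for row in offline_rows()]
online_score = score(SPECS[active], online_request())

# Rolling back means pointing scoring at the previous spec.
active = "control"
rolled_back_score = score(SPECS[active], online_request())
```

Since `score` never knows where its features came from, the offline and online paths produce the same value for the same inputs by construction.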

In addition to the home feed team, the Related Pins, Search and Ads teams have also started using Linchpin. This means that home feed Pins, search results and Related Pins are all ranked using Linchpin.

The future of Linchpin

As models have become more complex, specifying them in the Linchpin DSL syntax can get unwieldy and is often error-prone. We’ve begun work on a mechanism to build and debug models through a Python interface, similar to the way models are built in TensorFlow.

In the meantime, Linchpin has already significantly increased the velocity of our ML engineers and given numerous teams across Pinterest a common way to develop and evaluate models. It continues to chug away as one of the key engines powering Pinterest’s ranking systems.
