Using PID controllers to diversify content types on home feed | by Pinterest Engineering | Pinterest Engineering Blog | Jul, 2020

Yaron Greif | Software Engineer, Homefeed Ranking

Every day, millions of Pinners visit the home feed to find inspiration on Pinterest. As a member of the home feed ranking team, it’s my job not only to figure out which relevant Pins to show Pinners but also to make sure that those Pins help maintain the health of the overall Pinterest ecosystem. For instance, relative to ranking just for relevance, we might display more newly created Pins to ensure our corpus doesn’t become stale, or more video Pins to surface actionable ideas from creators.

Traditional click-through prediction models are designed to maximize user engagement, but they don’t help achieve those other business objectives. To meet them, the home feed ranking team introduced controllable distribution, a flexible real-time system applied after the traditional ranking layer. It controls the tradeoff between areas like relevance, freshness, and creator goals by boosting and demoting the ranking scores of content types.

Before controllable distribution, we handled those business constraints through a large number of special-case solutions in the codebase. The two most common were to insert the content we wanted more of approximately every n slots, or to move that content up the feed until a minimum percentage of the content returned was of a particular type.
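To make the first of those legacy heuristics concrete, here is a minimal sketch of "insert every n slots". The function name `insert_every_n` and its arguments are hypothetical and are not taken from the Pinterest codebase. It assumes the organic ranking and the boosted content arrive as two separate, already ordered lists.

```python
from typing import List


def insert_every_n(ranked: List[str], boosted: List[str], n: int) -> List[str]:
    """Reserve every n-th slot (1-indexed) for boosted content.

    Hypothetical illustration of the legacy heuristic, not production code.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    feed: List[str] = []
    organic = iter(ranked)
    extra = iter(boosted)
    while True:
        if (len(feed) + 1) % n == 0:
            # Reserved slot: prefer boosted content, fall back to organic.
            item = next(extra, None)
            if item is None:
                item = next(organic, None)
        else:
            item = next(organic, None)
        if item is None:
            break
        feed.append(item)
    return feed
```

Note that the constant `n` is fixed per request and ignores the ranking scores entirely, which is what made this kind of rule both brittle and impersonal.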

Those types of solutions were painful for both practical and theoretical reasons.

In practice, these hand-tuned boosts quickly became unmanageable and interfered with each other. Worse, they often stopped working over time, especially when ranking models were updated. We regularly had to delay very promising new ranking models because they broke business constraints.

In theory, controlling content on a per-request basis is undesirable because it prevents personalization. If we show every user the same number of video Pins, we can’t show more videos to people who really like to watch videos, or fewer to people who don’t.

Controllable distribution replaces those hard-coded constants with a system where business owners can specify a global target for the percentage of impressions by content type. For example, if the target for video is set to 4% of feed impressions, controllable distribution automatically determines how to achieve that distribution while still respecting Pinner content preferences. Importantly, controllable distribution adjusts the system continuously in real time, so it does not grow stale.

Controllable distribution does this by tracking what percentage of the feed was video in the recent past and then boosting or demoting video content based on how far that percentage is from the target. The boost is implemented by increasing the ranking system’s score by a scalar that we call a “normalization constant.”

To motivate normalization constants, we can formulate the Pinterest ranking setting as an optimization problem subject to the constraints imposed by controllable distribution. The normalization constants are then the Lagrange multipliers of that optimization problem.

For every pair of user i and slot j, the system selects a Pin Xij to maximize the ranking scores. Controllable distribution adds the constraint that every Pin type t should make up Pt percent of the feed.

The optimization problem then becomes:
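One way to write this problem down from the definitions above is sketched here. The symbols $s(X_{ij})$ for the ranking score of the Pin shown to user $i$ in slot $j$, $N$ for the total number of user-slot pairs, $\operatorname{type}(\cdot)$ for a Pin's content type, and $\mathbb{1}[\cdot]$ for the indicator function are assumptions introduced for this sketch. It also treats $P_t$ as a fraction between 0 and 1, and names the constraint residual $g(t)$ on the assumption that this is the error the post refers to.

```latex
\begin{aligned}
\max_{X}\quad & \sum_{i}\sum_{j} s(X_{ij}) \\
\text{subject to}\quad & g(t) = \frac{1}{N}\sum_{i}\sum_{j}
  \mathbb{1}\!\left[\operatorname{type}(X_{ij}) = t\right] - P_t = 0
  \quad \text{for every type } t
\end{aligned}
```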

The Lagrangian form is then:
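A hedged sketch of that form follows, under assumed notation: $s(X_{ij})$ is the ranking score of the Pin shown to user $i$ in slot $j$, $N$ is the total number of user-slot pairs, $P_t$ is the target share of type $t$ written as a fraction, and $\lambda_t$ is the multiplier attached to the constraint for type $t$ (with the constraint multiplied through by $N$). The sign convention, in which a positive $\lambda_t$ boosts type $t$, is also an assumption.

```latex
\begin{aligned}
\mathcal{L}(X, \lambda)
  &= \sum_{i}\sum_{j} s(X_{ij})
   + \sum_{t} \lambda_t \left( \sum_{i}\sum_{j}
     \mathbb{1}\!\left[\operatorname{type}(X_{ij}) = t\right] - N P_t \right) \\
  &= \sum_{i}\sum_{j} \left( s(X_{ij}) + \lambda_{\operatorname{type}(X_{ij})} \right)
   - N \sum_{t} \lambda_t P_t
\end{aligned}
```

The second line uses the fact that each Pin has exactly one type. For fixed $\lambda$, the last term is a constant, so maximizing $\mathcal{L}$ reduces to choosing, for each slot, the Pin with the largest value of $s(X_{ij}) + \lambda_{\operatorname{type}(X_{ij})}$.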

The Lagrange multipliers λ are our normalization constants. From an economic perspective, the λ for type t is the shadow price, or acceptable opportunity cost, of selecting a Pin of type t: we are willing to give up λ of expected engagement to show a Pin of that type.

The above optimization problem cannot be solved exactly in practice because we don’t know in advance the set of Pins that will be ranked. Without controllable distribution, the solution is instead approximated by greedily selecting the Pins with the highest ranking score. Since λ for type t is independent of the user and slot, that greedy decision rule can be updated to select the Pin with the highest combined value of ranking score and the normalization constant for its type.
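The updated greedy rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the production ranker: the `Candidate` class, the `rank_with_constants` function, and the choice to combine score and constant by addition (as the Lagrangian view suggests) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Candidate:
    pin_id: str
    pin_type: str
    score: float  # ranking score from the relevance model


def rank_with_constants(
    candidates: List[Candidate], lambdas: Dict[str, float]
) -> List[Candidate]:
    """Order candidates by ranking score plus the constant for their type."""

    def adjusted(c: Candidate) -> float:
        # Assumption: types without a constant are left unboosted (lambda = 0).
        return c.score + lambdas.get(c.pin_type, 0.0)

    return sorted(candidates, key=adjusted, reverse=True)


candidates = [
    Candidate("p1", "image", 0.90),
    Candidate("p2", "video", 0.80),
    Candidate("p3", "image", 0.70),
]
boosted_order = [c.pin_id for c in rank_with_constants(candidates, {"video": 0.15})]
plain_order = [c.pin_id for c in rank_with_constants(candidates, {})]
```

With a constant of 0.15 for video, the video Pin overtakes the top image Pin. With no constants, the order is simply by ranking score. Because the constant is the same for every user, a Pinner whose video scores are high still sees more video than one whose video scores are low.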

λ for type t is approximated by observing the error g(t) in real time and adjusting λ accordingly.
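As the title suggests, this adjustment is the job of a PID controller. Below is a minimal textbook-style sketch, not Pinterest's implementation. The class name, the gains `kp`, `ki`, `kd`, the unit time step, and the sign convention (error = target share minus observed share, so a positive output means "boost") are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class PIDController:
    kp: float  # proportional gain (hypothetical value chosen by the caller)
    ki: float  # integral gain
    kd: float  # derivative gain
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, target_share: float, observed_share: float) -> float:
        """Return the new normalization constant for one content type."""
        error = target_share - observed_share
        self.integral += error                # accumulated past error
        derivative = error - self.prev_error  # change since the last observation
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


controller = PIDController(kp=1.0, ki=0.5, kd=0.0)
first = controller.update(target_share=0.155, observed_share=0.20)
second = controller.update(target_share=0.155, observed_share=0.10)
```

When the type is over-distributed, the first call returns a negative constant (a demotion). When it later falls below target, the second call returns a positive one (a boost). The integral term is what lets the constant settle at a nonzero value once the error reaches zero.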

For instance, in one experiment we wanted the actual percentage of Pins of a certain type to be 15.5%. It started high, at 20%. When the system saw that the content was being over-distributed, it reduced the constant, and the percentage eventually converged to about 15.5%.
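The shape of that convergence can be reproduced with a toy closed-loop simulation. Only the 20% starting point and the 15.5% target come from the experiment described above. Everything else is a made-up illustration: the linear response of the share to the constant (`sensitivity`), the integral-only update, and the gain `ki`. It is not a model of the real feed.

```python
from typing import List


def simulate(
    target: float, base_share: float, sensitivity: float, ki: float, steps: int
) -> List[float]:
    """Return the observed share at each step of a toy feedback loop."""
    lam = 0.0  # normalization constant, starting with no boost
    shares: List[float] = []
    for _ in range(steps):
        # Toy plant (assumption): share responds linearly to the constant,
        # clipped to the valid range [0, 1].
        share = min(1.0, max(0.0, base_share + sensitivity * lam))
        shares.append(share)
        # Integral-only update: lower the constant while over target.
        lam += ki * (target - share)
    return shares


shares = simulate(target=0.155, base_share=0.20, sensitivity=0.5, ki=1.0, steps=30)
```

With these particular toy parameters the gap to the target halves on every step, so the share falls from 20% toward 15.5% without overshooting. Different gains would converge faster, slower, or with oscillation.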
