PinnerSage: Multi-Modal User Embedding Framework for Recommendations at Pinterest | by Pinterest Engineering | Pinterest Engineering Blog | Aug, 2020

Aditya Pal | Applied Science, Chantat Eksombatchai | Applied Science, Yitong Zhou | User Understanding, Bo Zhao | User Understanding, Charles Rosenberg | Applied Science, Jure Leskovec | Applied Science

As we build a visual discovery engine that powers 2B+ Pins, it’s crucial to understand user interests and preferences in order to serve relevant content. One standard approach to encoding user preferences is an embedding-based representation in a high-dimensional space. Most prior methods tried at Pinterest infer a single high-dimensional embedding for each user, compatible with the content embeddings. This is a good starting point, but it falls short of delivering a full understanding of the user.

In this work, we postulate that a single embedding is not sufficient for encoding the multiple facets of a user’s interests, which might have no obvious linkage between them. These interests can also evolve, with some persisting long term while others span only a short time period. Recommended items are represented in the same embedding space as users. A good user embedding must encode a user’s multiple tastes, interests, styles, etc., whereas a recommended item (a video, an image, a news article, a house listing, a Pin, etc.) typically has only a single focus. Hence it becomes important to represent a user with multiple embeddings, with each embedding capturing a specific aspect of their interests.
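To make the argument concrete, here is a minimal sketch, assuming three hypothetical and mutually unrelated interests encoded as toy 3-dimensional vectors (real embeddings are much higher-dimensional, and all names below are illustrative). Averaging them into one user embedding yields a vector that is only moderately similar to each interest, while keeping one embedding per interest matches each of them exactly.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def mean_vector(vectors):
    """Element-wise average of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical item embeddings for three unrelated interests (toy values).
interest_a = [1.0, 0.0, 0.0]
interest_b = [0.0, 1.0, 0.0]
interest_c = [0.0, 0.0, 1.0]
interests = [interest_a, interest_b, interest_c]

# Single-embedding user: the average of everything the user engaged with.
single_user_embedding = mean_vector(interests)
sims_single = [cosine(single_user_embedding, v) for v in interests]

# Multi-embedding user: one embedding per interest; score an item by its
# best-matching user embedding.
multi_user_embeddings = interests
sims_multi = [max(cosine(u, v) for u in multi_user_embeddings) for v in interests]

print(sims_single)  # each value is 1/sqrt(3), roughly 0.577
print(sims_multi)   # each value is 1.0
```

In this toy setup the averaged embedding sits between the interests rather than on any of them, so its nearest neighbors would not necessarily be relevant to any single interest.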

Figure 1: Overview of PinnerSage model

Design Choice 1: Pin Embeddings are Fixed

Joint embedding inference models, where both user and Pin embeddings are inferred together, can be too complex and hard to scale. Moreover, we posit that in practice they compromise recommendation relevance, as spurious connections between Pins can be established via the users. To see why, consider the example in Figure 2.

Figure 2: Three interests of a given user.

In the example above, a user is interested in painting, shoes, and sci-fi. Jointly learned user and Pin embeddings would bring the Pin embeddings on these disparate topics closer together, which can compromise the relevance of a nearest-neighbor-based recommender. Pin embeddings should operate only on the underlying principle of bringing similar Pins closer while keeping the rest of the Pins as far apart as possible. For this reason, we use PinSage, which achieves precisely this objective without any dilution.

Design Choice 2: Unlimited User Embeddings

PinnerSage generates as many interest clusters as the underlying data supports. This is achieved by clustering users’ actions into conceptually coherent clusters via a hierarchical agglomerative clustering algorithm (Ward). A light user might get represented by 3–5 clusters, whereas a heavy user might get represented by 75–100 clusters.
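The idea of letting the data decide the number of clusters can be sketched as follows. This is a minimal, unoptimized illustration of agglomerative clustering with Ward's merge criterion, not the production algorithm: the function name `ward_clusters`, the `max_merge_cost` stopping threshold, and the toy 2-d points are all assumptions made for the example.

```python
def ward_clusters(points, max_merge_cost):
    """Greedy agglomerative clustering using Ward's merge criterion.

    Merging clusters A and B raises the within-cluster sum of squares by
    |A||B| / (|A| + |B|) * ||centroid_A - centroid_B||^2. Merging stops once
    the cheapest remaining merge exceeds max_merge_cost (a hypothetical
    threshold), so the cluster count is not fixed in advance.
    Returns a list of clusters, each a sorted list of point indices.
    """
    # Each cluster is a pair: (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    def merge_cost(a, b):
        (members_a, centroid_a), (members_b, centroid_b) = a, b
        sq_dist = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
        n_a, n_b = len(members_a), len(members_b)
        return n_a * n_b / (n_a + n_b) * sq_dist

    while len(clusters) > 1:
        # Find the pair of clusters that is cheapest to merge.
        cost, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if cost > max_merge_cost:
            break
        (members_a, centroid_a), (members_b, centroid_b) = clusters[i], clusters[j]
        n_a, n_b = len(members_a), len(members_b)
        merged_centroid = [
            (n_a * x + n_b * y) / (n_a + n_b)
            for x, y in zip(centroid_a, centroid_b)
        ]
        del clusters[j]  # j > i, so index i is unaffected
        clusters[i] = (members_a + members_b, merged_centroid)

    return [sorted(members) for members, _ in clusters]

# Toy embeddings of five actions: two tight groups and one isolated point.
actions = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [10.0, 0.0]]
print(ward_clusters(actions, max_merge_cost=1.0))
```

With the threshold held fixed, a user with few distinct interests ends up with a handful of clusters while a user with many distinct interests ends up with many more, which is the behavior described above. The naive pairwise search here is for readability only; a real system would use an optimized implementation.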

Design Choice 3: Medoid-based Cluster Representation

Representing each cluster by its medoid only requires storing the medoid’s Pin ID, and leads to cross-user and even cross-application cache sharing. It also allows our system to be compatible with other non-embedding-based recommendation systems such as Pixie.
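As a rough sketch of what a medoid-based representation looks like, the snippet below picks the medoid of one cluster. It assumes the medoid is the cluster member with the smallest total squared Euclidean distance to the other members, which is a common definition but is not spelled out in this post; the function name, Pin IDs, and embedding values are hypothetical.

```python
def medoid(pin_ids, embeddings):
    """Return the Pin ID of the cluster member that minimizes the sum of
    squared Euclidean distances to all members of the cluster.

    pin_ids: list of IDs in the cluster.
    embeddings: dict mapping each ID to its (fixed) embedding vector.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def total_distance(candidate):
        return sum(
            sq_dist(embeddings[candidate], embeddings[other])
            for other in pin_ids
        )

    return min(pin_ids, key=total_distance)

# Hypothetical cluster: three nearby Pins and one outlier.
cluster_embeddings = {
    "pin_a": [0.0, 0.0],
    "pin_b": [1.0, 0.0],
    "pin_c": [2.0, 0.0],
    "pin_d": [10.0, 0.0],  # outlier
}
cluster_pin_ids = list(cluster_embeddings)

representative = medoid(cluster_pin_ids, cluster_embeddings)
print(representative)  # "pin_c"
```

Because the result is always the ID of an existing Pin, only that ID needs to be stored per cluster, and the embedding can be looked up from the shared Pin embedding store when needed.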

Design Choice 4: Medoid Sampling for Candidate Retrieval

Design Choice 5: Two-Pronged Approach for Handling Real-Time Updates

Table 1 shows that PinnerSage provides significant engagement gains, increasing both overall engagement volume (repins and clicks) and engagement propensity (repins and clicks per user). These gains can be directly attributed to the increased quality and diversity of PinnerSage recommendations.

Table 1: A/B test of PinnerSage vs current production, which includes a single embedding model.
