Communities AI: Building Communities Around Interests on LinkedIn

Figure 1: Example follow recommendations for a member. This example shows hashtag and company recommendations.

The goal of the Follow Recommendations product is to present the member with follow recommendations that are both relevant (i.e., the member is likely to follow the recommended entity) and engaging (i.e., the recommended entity produces content that the member finds relevant). Estimating relevance and engagement is where the bulk of the machine learning work happens. To compute these estimates, we rely on information (features) from the viewer (member), the entity to be followed (e.g., influencer, company, hashtag, group), and the interaction between the viewer and the entity (e.g., the number of times the member viewed the feed of a specific hashtag).
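To make the scoring idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the production model: the `Candidate` class, the two logistic models, their weights, and the choice to multiply the follow probability by the engagement probability are all hypothetical. The only part taken from the text is the split of features into viewer, entity, and interaction groups.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """A followable entity paired with one viewer (hypothetical structure)."""
    entity_id: str
    entity_type: str             # e.g. "hashtag", "company", "influencer", "group"
    viewer_features: list        # features describing the member
    entity_features: list        # features describing the entity
    interaction_features: list   # e.g. how often the member viewed this hashtag's feed


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict(weights, bias, features):
    """Logistic model: probability in (0, 1) from a linear score."""
    z = bias + sum(w * f for w, f in zip(weights, features))
    return sigmoid(z)


def score(candidate, follow_model, engage_model):
    """Assumed utility: P(follow) * P(engage with the entity's content)."""
    x = (candidate.viewer_features
         + candidate.entity_features
         + candidate.interaction_features)
    p_follow = predict(*follow_model, x)
    p_engage = predict(*engage_model, x)
    return p_follow * p_engage


def recommend(candidates, follow_model, engage_model, k=10):
    """Return the top-k candidates by the assumed utility."""
    ranked = sorted(candidates,
                    key=lambda c: score(c, follow_model, engage_model),
                    reverse=True)
    return ranked[:k]


# Toy models as (weights, bias); the numbers are invented for illustration.
follow_model = ([1.0, 0.0, 1.0], 0.0)
engage_model = ([0.0, 1.0, 0.0], 0.0)

candidates = [
    Candidate("acme", "company", [1.0], [0.0], [0.0]),
    Candidate("ai", "hashtag", [1.0], [2.0], [3.0]),
]
top = recommend(candidates, follow_model, engage_model)
```

In this toy setup the hashtag outranks the company because the member has interacted with it more, which raises both estimated probabilities.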

There are over 630 million members on LinkedIn. This presents both a scaling challenge and a relevance challenge. The Follows Relevance data flow processes hundreds of terabytes of data and is the second-largest data flow at LinkedIn after the People You May Know flow. To understand how we managed this explosion of data, we refer the reader to the article Managing “Exploding” Big Data.


As soon as a member follows an entity, the content generated from that entity starts flowing to the member’s main feed and is ranked in conjunction with other content (e.g., posts from 1st-degree connections).

The member can also go to specialized feeds for each entity that they follow, be it a hashtag, group, event, etc. We personalize these feeds by ranking more relevant content higher. For example, a post in the #AI feed that is posted by an influencer the member follows is more likely to be relevant than a post that is generated by another member.

The goal of feed ranking at LinkedIn is to help members discover the most relevant conversations that will aid them in becoming more productive and successful. Relevance is determined by our objective function, which optimizes for three main components: the value to the member, the value to the member’s network (downstream effects), and the value to the creator of the post. A diverse set of machine learning and experimentation techniques is used to estimate these three components and the combined effect of the three (e.g., see Spreading the Love in the LinkedIn Feed with Creator-Side Optimization).
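One simple way to picture a three-component objective is a weighted sum of the per-post estimates. The sketch below assumes that form purely for illustration; the text does not specify how the components are combined, and the `PostEstimates` class, the weight names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PostEstimates:
    """Model estimates for one post shown to one member (hypothetical)."""
    post_id: str
    member_value: float    # estimated value to the viewing member
    network_value: float   # estimated downstream value to the member's network
    creator_value: float   # estimated value to the post's creator


def combined_score(post, w_member=1.0, w_network=0.5, w_creator=0.5):
    """Assumed objective: a weighted sum of the three components."""
    return (w_member * post.member_value
            + w_network * post.network_value
            + w_creator * post.creator_value)


def rank_feed(posts, **weights):
    """Order posts from highest to lowest combined score."""
    return sorted(posts,
                  key=lambda p: combined_score(p, **weights),
                  reverse=True)


posts = [
    PostEstimates("a", member_value=0.6, network_value=0.1, creator_value=0.1),
    PostEstimates("b", member_value=0.5, network_value=0.2, creator_value=0.4),
]
with_creator = [p.post_id for p in rank_feed(posts)]
without_creator = [p.post_id for p in rank_feed(posts, w_creator=0.0)]
```

With the default weights, post "b" ranks first because of its higher creator-side value; setting `w_creator=0.0` flips the order. In practice, weights like these would be chosen through experimentation rather than fixed by hand.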


Our Hashtag Suggestions and Typeahead (HST) product recommends hashtags that allow the member to effectively target their posts to the right communities. In addition to reducing the friction the member faces when trying to add hashtags to their post, the HST product allows us to consolidate content around areas of interest and prevent content fragmentation.

The objective of HST is to both increase the probability that a member will select relevant hashtags from the recommended list to add to their post and increase the relevant feedback the member will get on their post. Here we use a variety of natural language processing (NLP), deep learning, word embedding, and supervised learning techniques to recommend relevant hashtags. The HST product is shown in Figure 2 below.
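The two halves of HST can be sketched with toy data: suggestions by embedding similarity between the post text and each hashtag, and typeahead by prefix match ordered by popularity. This is a minimal illustration, not the HST implementation; the two-dimensional vectors, popularity counts, and function names are all invented, and a real system would use learned embeddings and a supervised ranker.

```python
import math

# Toy 2-d embeddings and popularity counts, invented for illustration only.
WORD_VECTORS = {
    "neural": (0.9, 0.1),
    "networks": (0.8, 0.2),
    "hiring": (0.1, 0.9),
    "recruiters": (0.2, 0.8),
}
HASHTAG_VECTORS = {
    "#ai": (1.0, 0.0),
    "#machinelearning": (0.9, 0.2),
    "#careers": (0.0, 1.0),
}
HASHTAG_POPULARITY = {"#ai": 100, "#machinelearning": 80, "#careers": 60}


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def embed(text):
    """Average the vectors of known words; None if no word is known."""
    vecs = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vecs:
        return None
    return tuple(sum(dim) / len(vecs) for dim in zip(*vecs))


def suggest(text, k=2):
    """Suggest the k hashtags closest to the post text in embedding space."""
    query = embed(text)
    if query is None:
        return []
    ranked = sorted(HASHTAG_VECTORS,
                    key=lambda h: cosine(query, HASHTAG_VECTORS[h]),
                    reverse=True)
    return ranked[:k]


def typeahead(prefix, k=5):
    """Complete a partially typed hashtag, most popular matches first."""
    p = prefix.lower()
    if not p.startswith("#"):
        p = "#" + p
    matches = [h for h in HASHTAG_POPULARITY if h.startswith(p)]
    return sorted(matches, key=lambda h: HASHTAG_POPULARITY[h], reverse=True)[:k]
```

For example, `suggest("Neural networks")` ranks `#machinelearning` and `#ai` ahead of `#careers`, while `typeahead("ma")` completes to `#machinelearning`.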
