Figure 2: Examples of the LinkedIn Feed and a Jobs page
People + machines to leverage data at scale
Many people think of AI as a completely automated process with no human input, but much of the data used by our AI systems and many of the ways we deploy those systems rely on human input. Take the example of profile data. At a fundamental level, almost all our member data is generated by members themselves. As a result, one company might have a job called “senior software engineer,” while at another company, the same role would have the title “lead developer.” Multiply this by millions of member profiles, and you begin to realize that providing a good search experience for recruiters, where all of these varying job titles show up, can be a very challenging task! Standardizing that data in a way that our AI systems can understand is an important first step of creating a good search experience, and that standardization involves both human and machine efforts. We have taxonomists who create taxonomies of titles and use machine learning models (LSTM models, other kinds of neural networks, etc.) that then suggest ways that titles are related. Understanding these relationships allows us to infer further skills for each member beyond what is listed on their profile; for instance, someone who has a set of “machine learning” skills also understands (at least a subset of) “AI.” This is just one example of the kinds of taxonomies and relationships that make up the LinkedIn Knowledge Graph.
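To make the idea concrete, here is a minimal sketch of title standardization and skill inference. The synonym table, skill hierarchy, and function names are all hypothetical illustrations, not LinkedIn's actual Knowledge Graph schema, which is far larger and machine-suggested rather than hand-coded.

```python
# Map raw member-entered titles to a canonical title (illustrative data only).
TITLE_SYNONYMS = {
    "senior software engineer": "senior software engineer",
    "lead developer": "senior software engineer",
}

# A tiny skill hierarchy: each skill implies a broader parent skill.
SKILL_PARENTS = {
    "machine learning": "ai",
    "deep learning": "machine learning",
}

def standardize_title(raw_title: str) -> str:
    """Normalize a free-form title to its canonical form (identity if unknown)."""
    key = raw_title.strip().lower()
    return TITLE_SYNONYMS.get(key, key)

def infer_skills(listed_skills):
    """Expand a member's listed skills with the broader skills they imply."""
    inferred = {s.lower() for s in listed_skills}
    frontier = list(inferred)
    while frontier:
        parent = SKILL_PARENTS.get(frontier.pop())
        if parent and parent not in inferred:
            inferred.add(parent)
            frontier.append(parent)
    return inferred
```

With these toy tables, "Lead Developer" standardizes to "senior software engineer", and a member listing only "deep learning" is inferred to understand "machine learning" and "ai" as well.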
As you can see, our approach to AI is neither completely machine-driven nor completely human-driven; it’s a combination of the two. We believe that both elements working together in harmony is the best solution.
Deep learning for personalization and content understanding
To perform personalization at the member level, we need machine learning algorithms that can understand content in a comprehensive fashion. Combining machine learning with member intent signals, profile data, and information about a member’s network, we can extensively personalize the recommendations and search results for our members.
We heavily leverage deep learning, a branch of machine learning that automatically learns complex hierarchical structures present in data using neural networks with multiple layers, to understand content of all types. We have developed new classes of machine learning models based on generalized mixed effects models (GLMix) to combine disparate sources of data for personalization at the member level.
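The core idea of a GLMix-style model can be sketched as a shared (fixed-effect) coefficient vector plus a member-specific offset that personalizes the prediction. The feature values and coefficients below are invented for illustration; in a real system each set of coefficients is fitted from interaction data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def glmix_score(x, w_global, w_member):
    """Predicted probability = sigmoid(global effect + member-specific effect)."""
    return sigmoid(x @ w_global + x @ w_member)

x = np.array([1.0, 0.5, 0.0])          # features for one (member, item) pair
w_global = np.array([0.2, -0.1, 0.4])  # coefficients shared by all members
w_member = np.array([0.3, 0.0, -0.2])  # personalization offset for this member
p = glmix_score(x, w_global, w_member)
```

Because the member-specific term is an additive offset, members with little data fall back toward the global model, while members with rich histories get strongly personalized scores.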
Deep learning methods can also capture nonlinear patterns in temporal, sequential, and spatial data in an effective fashion. We employ three broad classes of deep learning methods for most of our natural language processing and computer vision tasks: the aforementioned LSTMs, CNNs, and sequence-to-sequence models. We also employ canonical multi-layered perceptrons wherever necessary for some supervised learning tasks.
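As a reference point for the simplest of these model classes, here is a canonical multi-layer perceptron forward pass sketched in NumPy. The weights are random placeholders; a real model would be trained (for example, in TensorFlow, mentioned later in this post).

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0.0, z)

def mlp_forward(x, layers):
    """Apply each (W, b) layer with a ReLU, then a sigmoid on the final layer."""
    h = x
    for W, b in layers[:-1]:
        h = relu(h @ W + b)
    W, b = layers[-1]
    return 1.0 / (1.0 + np.exp(-(h @ W + b)))  # sigmoid for a binary output

layers = [
    (rng.normal(size=(4, 8)), np.zeros(8)),  # input -> hidden
    (rng.normal(size=(8, 1)), np.zeros(1)),  # hidden -> output
]
y = mlp_forward(np.ones(4), layers)
```

Stacking more layers is what gives the model the "hierarchical structure" learning described above: each layer builds features out of the previous layer's outputs.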
Putting AI into production at scale
Getting an AI system up-and-running can be a daunting challenge. When I started working on the AI team at LinkedIn several years ago, we already had a rich collection of data from many different sources. This benefited us for one aspect of AI creation, but our remaining challenge was twofold: scaling our people (due to a worldwide AI talent shortage) and scaling our infrastructure for deploying sophisticated models that are compute-hungry and built by processing very large datasets. These challenges still face many in the tech industry today.
Scaling our people
To scale our AI engineers, statisticians, and data scientists, we’ve adopted a centralized organizational model that embeds our experts with product teams but maintains the reporting relationship within a centralized AI organization. This allows us to find unique opportunities to collaborate and solve problems across the entire member experience, while still applying more localized optimizations for machine learning problems at the product level. Our engineers often find ways to collaborate on disparate projects and share knowledge more easily because of our centralized organization.
LinkedIn AI Academy is another program that helps equip our employees across the company—in areas like engineering, product management, etc.—with the knowledge they need to optimally deliver impactful AI experiences to our members. As part of this program, engineers, for example, take a course that consists of five deep-dive classes (one day per week) and a subsequent one-month apprenticeship with the core AI team. It takes participants from understanding how to incorporate and maintain an AI system to the step of actually shipping one for their team. For product managers and company executives, there is a single day-long deep-dive session that focuses on the specific domain knowledge that they’ll need to manage AI products.
A platform to train and deploy any AI model
Each AI system can only utilize certain types of data, a restriction that’s dictated by “features” that are built into a model. These features describe different kinds of information that we think might be useful to make better recommendations. For example, your job title may be a feature that can be used to match you to new job opportunities down the road. Our experts and A/B testing framework then teach the AI system how to use these features to make better recommendations based on previously available data (e.g., someone with the job title “intern” might be more interested in junior developer listings than senior developer listings).
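A minimal sketch of what "features built into a model" means in practice: extract a few descriptive values for a (member, job) pair, then combine them with learned weights into a score. The feature names and weights here are hypothetical illustrations, not LinkedIn's production feature set.

```python
import math

def extract_features(member, job):
    """Simple binary features describing a (member, job) pair."""
    return {
        "title_match": 1.0 if member["title"] == job["title"] else 0.0,
        "is_intern": 1.0 if member["title"] == "intern" else 0.0,
        "job_is_junior": 1.0 if job["level"] == "junior" else 0.0,
    }

# Weights a model might learn, e.g., that interns prefer junior listings.
WEIGHTS = {"title_match": 2.0, "is_intern": 0.1, "job_is_junior": 0.8}

def score(member, job):
    """Probability-of-interest estimate via a logistic combination of features."""
    z = sum(WEIGHTS[name] * val for name, val in extract_features(member, job).items())
    return 1.0 / (1.0 + math.exp(-z))

member = {"title": "intern"}
junior = {"title": "junior developer", "level": "junior"}
senior = {"title": "senior developer", "level": "senior"}
```

With these toy weights, the intern scores higher against the junior listing than the senior one, which is the kind of relationship the A/B-tested training process would learn from real data.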
Doing this work can be a time-consuming process. Just at LinkedIn, we have hundreds of models in production across our various products, and hundreds of thousands of features. So we built an “AI automation” platform called Pro-ML that allows us to centrally manage the features and machine learning models for every engineering team at the company from one system. This system provides a single platform for the entire lifecycle of developing, training, deploying, and testing machine learning models. It has already massively accelerated the speed at which we can build and deploy new products at LinkedIn.
Scaling our infrastructure
On the data infrastructure side, we have a long history at LinkedIn of innovation in this space.
For example, we use our now-famous data messaging system, Kafka, as the “central nervous system” of everything at LinkedIn. We have our own stream processing framework, called Samza, which is also open sourced and used by other companies around the world. In addition to these streaming data systems, we have contributed to the Hadoop ecosystem and a variety of other projects, such as Ambry. We’ve also contributed new open source projects to help accelerate machine learning use cases for Spark.
We consume a wide variety of open source software for our projects as well. For example, our deep learning workflows extensively use TensorFlow, which is a project that originated at Google. We use Spark with Scala extensively for data processing, and use Pig and Hive for data analytics.
In addition to these open source innovations, our recent collaborations with Microsoft have allowed us to take advantage of some of the artificial intelligence services offered on Azure. For example, as detailed in previous blog posts, we use the Microsoft Text Analytics API for dynamic translation of content in the feed.
Making magic happen
AI is like oxygen at LinkedIn—it powers everything that we do. But why do we think that everything we do can benefit from AI? Here are a few reasons.
Our AI systems have had a huge impact for members who are trying to find a job. We saw a 30% increase in job applications from deploying just one AI system that improved the personalization of Jobs You May Be Interested In (JYMBII).
Job applications overall have grown more than 40% year-over-year, based on a variety of AI-driven optimizations that have been made to both sides of the member-recruiter ecosystem.
AI-driven improvements to our recruiter products have helped increase InMail response rates by 45%, while at the same time cutting down on the notifications that we send to our members.
AI has improved article recommendations in the feed by 10-20% (based on click-through rate).
If you’ve read this article and are interested in learning more about AI at LinkedIn, be sure to watch the free course videos for “AI the LinkedIn Way: A Conversation with Deepak Agarwal,” now on LinkedIn Learning.