Figure 4: Illustration of querying Pinot servers before/after replica group assignment strategy
Another observation we made when analyzing query performance was that Pinot requests were often dropped under the bursty traffic that powers the metrics on Talent Insights. We noticed that this occurred when all of a Pinot broker’s connections to the Pinot servers were occupied. When a Pinot broker received a query, it would check out connections from the thread pool for each server it communicated with and hold them until the servers returned their responses. While a query was in flight, those connections were unusable by subsequent queries, resulting in an inefficient use of threads. To address this, the Pinot team implemented asynchronous broker routing, in which broker connections to the servers are no longer held while waiting for a response. After this change was rolled out, Pinot no longer dropped requests, even under high-QPS, bursty traffic.
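To make the difference between the two connection-handling models concrete, here is a minimal simulation in Python. It is a toy model under stated assumptions, not Pinot's broker code: the function `run_burst`, the pool size of 5, and the burst of 20 queries are all hypothetical, and a short sleep stands in for the time a server spends processing a query.

```python
import asyncio


async def run_burst(num_queries, pool_size, hold_while_waiting, server_delay=0.01):
    """Simulate one burst of queries from a broker to a single server.

    Returns the number of queries dropped because no connection was free.
    All names and numbers here are illustrative, not taken from Pinot.
    """
    free_connections = pool_size
    dropped = 0

    async def one_query():
        nonlocal free_connections, dropped
        if free_connections == 0:
            dropped += 1           # pool exhausted: the request is dropped
            return
        free_connections -= 1      # check out a connection to send the request
        if hold_while_waiting:
            # Blocking model: the connection stays checked out until the response arrives.
            await asyncio.sleep(server_delay)
            free_connections += 1
        else:
            # Asynchronous model: release right after sending, then wait for the response.
            free_connections += 1
            await asyncio.sleep(server_delay)

    await asyncio.gather(*(one_query() for _ in range(num_queries)))
    return dropped


blocking_drops = asyncio.run(run_burst(20, 5, hold_while_waiting=True))
async_drops = asyncio.run(run_burst(20, 5, hold_while_waiting=False))
print(blocking_drops, async_drops)  # 15 0
```

In the blocking model, only the first five queries of the burst find a free connection, so the other fifteen are dropped. In the asynchronous model, the same pool absorbs the whole burst because each connection is busy only for the send.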
Several other changes were made to the Pinot infrastructure that ultimately helped support the high read throughput use case for Talent Insights, such as the addition of the VALUEIN transformation function, upgrading the cluster hardware to SSD machines, and supporting tenant-level isolation. By planning and coordinating with the Pinot team, we were able to parallelize these optimization efforts with the ones made at the query and data layer.
It was imperative to be responsible Pinot clients and optimize queries wherever possible. We inspected all the different query patterns for powering the metrics on Talent Insights and identified the ones that were performing particularly slowly. The common trait between the slow queries was that they scanned a high number of records to process the query. In database terms, this is related to selectivity, or the ability of a query to narrow the results using the index. To ensure selectivity, we had to write queries that limited the number of possible documents when using the column index and which would, as a result, lead to faster aggregation in Pinot.
With this in mind, we made significant efforts to make our queries as selective as possible. For example, if we wanted to get the companies that employ the most software engineers in the United States, we added a filter to exclude the long tail of small companies that don’t need to be returned to the frontend. Furthermore, when querying top skills of a talent pool, we removed skills commonly held across many job roles like “leadership” and “Microsoft Word” from the dataset, which significantly reduced the amount of data to be aggregated.
Here’s an example of a less selective query, that is, one with a higher selectivity value (a larger share of the rows match). ~72M members on LinkedIn have “management” as a skill.
SELECT COUNT(member) FROM myTable WHERE skills = 'management'
Here’s an example of a more selective query, that is, one with a lower selectivity value. ~3,000 members on LinkedIn have “quantum field theory” as a skill.
SELECT COUNT(member) FROM myTable WHERE skills = 'quantum field theory'
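As a rough illustration of the selectivity value, the Python sketch below computes the fraction of rows that a skill filter matches. The ten-row `members` list is made-up toy data rather than LinkedIn data, and the `selectivity` helper is hypothetical; it only mirrors the idea that a filter matching fewer rows leaves less data to aggregate.

```python
def selectivity(rows, skill):
    """Fraction of rows whose multi-valued skills field contains `skill`.

    A lower value means the filter is more selective (fewer rows to aggregate).
    """
    matched = sum(1 for row in rows if skill in row["skills"])
    return matched / len(rows)


# Toy data (hypothetical): 8 of 10 members list "management", 1 lists "quantum field theory".
members = (
    [{"skills": ["management"]} for _ in range(7)]
    + [{"skills": ["management", "quantum field theory"]}]
    + [{"skills": ["python"]} for _ in range(2)]
)

common = selectivity(members, "management")
rare = selectivity(members, "quantum field theory")
print(common, rare)  # 0.8 0.1
```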
We also realized there were many duplicated queries when generating the reports on Talent Insights. Queries would sometimes be repeated for metrics that appear in multiple locations within the application. Adding a query result cache in front of Pinot, backed by Couchbase, improved performance considerably and reduced overall Pinot QPS.
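A query result cache of this kind can be sketched in a few lines of Python. This is a simplified illustration, not the production design: an in-memory dictionary stands in for Couchbase, `fake_pinot` stands in for a real Pinot client, and the five-minute expiry is an assumption, since the expiry policy actually used is not described here.

```python
import time


class QueryResultCache:
    """Caches query results so repeated queries do not hit the backend again."""

    def __init__(self, run_query, ttl_seconds=300.0, clock=time.monotonic):
        self._run_query = run_query   # callable that actually executes the query
        self._ttl = ttl_seconds       # assumed expiry; tune to data freshness needs
        self._clock = clock
        self._entries = {}            # normalized query -> (expires_at, result)

    @staticmethod
    def _key(query):
        # Naive normalization: collapse whitespace so trivially different strings share a key.
        # (It would also collapse whitespace inside string literals, which is fine for a sketch.)
        return " ".join(query.split())

    def get(self, query):
        key = self._key(query)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]                       # cache hit: no backend query
        result = self._run_query(query)           # cache miss: query the backend once
        self._entries[key] = (now + self._ttl, result)
        return result


backend_calls = []


def fake_pinot(query):
    """Hypothetical stand-in for a Pinot client; records each call it receives."""
    backend_calls.append(query)
    return 42


cache = QueryResultCache(fake_pinot)
q = "SELECT COUNT(member) FROM myTable WHERE skills = 'management'"
results = [cache.get(q), cache.get(q), cache.get(q.replace("SELECT ", "SELECT  "))]
print(results, len(backend_calls))  # [42, 42, 42] 1
```

Three requests for the same metric reach the backend only once, which is where the QPS reduction comes from.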
User interface optimizations
Our optimization strategy of minimizing the number of Pinot queries influenced the design of the Talent Insights user interface. For instance, rather than firing queries with every change to the search filter, we waited for the user to signal completion of their search facets by hitting an “Apply” button before making any query to Pinot. This helped cut down on unnecessary queries for intermediate results that a user might not care about. To further reduce the application’s load on Pinot, we made sure that the rows in the tables in the product were “lazily loaded”. Given that the tables typically showed 10 results on each page, we made sure our APIs implemented pagination so that we did not fire queries for subsequent results until the user clicked on the next page, which greatly reduced the number of queries needed to load the table initially.
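The lazy-loading behavior described above can be sketched as follows. The `LazyTable` class and the `fake_fetch` function are hypothetical stand-ins for the product's paginated API; keeping already visited pages in memory is an added assumption rather than something stated about the actual implementation.

```python
class LazyTable:
    """Fetches a page of rows only when the user actually navigates to it."""

    def __init__(self, fetch_page, page_size=10):
        self._fetch_page = fetch_page   # callable(offset, limit) -> list of rows
        self._page_size = page_size
        self._pages = {}                # page number -> rows already fetched

    def page(self, number):
        if number not in self._pages:
            offset = number * self._page_size
            self._pages[number] = self._fetch_page(offset, self._page_size)
        return self._pages[number]


ALL_ROWS = [f"company-{i}" for i in range(35)]  # toy result set
queries_fired = []


def fake_fetch(offset, limit):
    """Hypothetical backend call; records each (offset, limit) it is asked for."""
    queries_fired.append((offset, limit))
    return ALL_ROWS[offset:offset + limit]


table = LazyTable(fake_fetch)
first = table.page(0)        # initial load: one query for 10 rows
first_again = table.page(0)  # revisiting the page fires no new query
second = table.page(1)       # the user clicks "next": one more query
print(queries_fired)         # [(0, 10), (10, 10)]
```

Loading the table initially costs one query for ten rows instead of one query per page of the full result set.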
Collaboration: “Relationships matter”
The LinkedIn mantra of “relationships matter” once again proved true in building Talent Insights. Many different teams had to come together to bring this tool to life. It was key to establish strong working relationships, especially with teams across LinkedIn’s different campuses. We held frequent syncs with partner teams to set expectations and to keep everyone accountable for completing their tasks.
Getting engineering teams to support a product that has not yet been built can be tough. Why should I spend my resources building something that currently has no users and generates no revenue? Those times are the most crucial for leveraging and building relationships, which are the true foundation of your tech stack. They are also the foundation for your future collaboration.
Although our optimization work took us into uncharted territory, we knew we were paving the way for other teams, and we hoped they would reap the benefits of our work. There were times of frustration and doubt, but we pushed through. Looking back, we can confidently say our work paid off, as other teams have benefited from our learnings and success. The Pinot optimizations that were motivated by Talent Insights have helped other teams with their own use cases.
Dream big, but appreciate the journey
It is a rare opportunity to build a brand new enterprise product from scratch. With all the chaos that comes with tight deadlines, it is easy to get so caught up in work that you forget to “stop and smell the roses.” It is important to take a step back and remember that you’ll get there one step at a time.
We knew that we weren’t going to immediately support the QPS of our predicted customer usage with one magical optimization. But it is important to dream big and set S.M.A.R.T. goals (specific, measurable, attainable, realistic, and time-bound) along the way. And with each goal or milestone attained, it is important to give kudos and celebrate. Take some time to appreciate your team and all who have supported you.
Talent Insights now
Long gone are the days when we were scrambling for more capacity in our Pinot cluster. Talent Insights has since scaled to serve over 2,000 companies across the world. Not only does Talent Insights serve requests for its own users, but it also now serves company-related metrics across various LinkedIn products, including Recruiter, Sales Navigator, Company Pages, and Premium Insights.
Building out Talent Insights was truly a collaborative effort that required many cross-functional partners. We would like to extend our special thanks to our partners on the Pinot and UMP teams who helped launch this product: Kishore Gopalakrishna, Ravi Aringunram, Prasanna Ravi, Jackie Jiang, Sunitha Beeram, Seunghyun Lee, John Gutmann, Dino Occhialini, Mayank Shrivastava, Shraddha Sahay, and Ameya Kanitkar, under the leadership of Kapil Surlaker. Finally, we would not be where we are today without the hard work of the entire Talent Insights Engineering team who brought this product to life.