Lessons from 1.5 Years of our AI & Data Reading Group


Group photo of the reading group for Data Standardization, Knowledge Graphs, Natural Language Understanding, and Conversational AI at LinkedIn, celebrating the group’s 17-month anniversary

 

Co-authors: Qi He and Jaewon Yang
 

For the past 17 months, a group of roughly 20 people of different ages, genders, and backgrounds has been meeting every Friday at LinkedIn’s Silicon Valley HQ. At these meetings, they discuss cutting-edge research in specialized areas of machine learning and artificial intelligence (AI). Seeing this group in action, you’d be forgiven for assuming that they were reviewing literature for the editorial board of a scientific journal, or working on a new top-secret project. However, the group is neither of these things: these gatherings are actually the regular meetings of the LinkedIn Data & AI Reading Group.

Scaling research knowledge, expertise, and curiosity

Encouraging knowledge-sharing and learning within any engineering or research organization is important. This is especially true at a company like LinkedIn, where the valuable data contained in our Economic Graph can be combined with the latest data science and AI techniques to create new solutions to problems affecting our members and customers. However, researchers and practitioners in fields like AI, data science, and machine learning face a unique challenge when it comes to reviewing, assimilating, and discussing the newest developments: according to the 2017 AI Index Report, the number of new AI papers published each year has increased by more than 9x since 1996.

One way to tackle this problem of scale, and to encourage intellectual enrichment and collaboration, is to hold regular meetings of like-minded peer researchers, just like in college or university programs. The reading group idea was initiated by me and Bee-Chung Chen, and the group has been led independently by Staff Software Engineer Jaewon Yang since its launch in July 2017.

Creating the group

At first, this new group at LinkedIn focused specifically on the review of literature related to chatbots. We would sit together every Friday and discuss a cutting-edge paper—just like we used to do when reviewing our homework back in school. Over time, the focus of the group broadened to include research in fields such as conversational AI systems, natural language understanding, data standardization, and knowledge graphs.

Here is the format we’ve been following. Each week, one person volunteers to present a paper. On Monday, the presenter announces which paper they will cover. To help the presenter pick a paper, we maintain a wiki page listing interesting papers that could be good options. In the meeting, the presenter gives the audience a deep dive on the paper. After we cover the content of the paper, we discuss high-level takeaways, such as its strengths and weaknesses and what we could leverage for our own work. In some cases, we decide to try the main ideas of a paper in our own projects, such as using cross-lingual word embeddings, using multiple rules to annotate our data sets, and dividing a compound word into subwords.

Below are just a few of the interesting AI topics that we have covered in the past 17 months:

  • Knowledge graph based question answering
  • Semantic parsing
  • Dialogue management systems
  • Translating natural language to database queries
  • Entity identification and resolution
  • Machine reading comprehension
  • Sequence to sequence models
  • Natural language understanding
  • Machine translation
  • Cross-lingual word embeddings
  • Machine learning for taxonomy generation
  • Generating training data from weak supervision

Among the above AI topics, we’ve also focused on state-of-the-art deep learning methods that can tremendously improve the way an AI agent understands the semantic meanings of natural language sentences, much like a human brain. Below are three example papers that were featured in our reading group:

Research outcomes

We’ve applied our learnings to develop deep learning methods for many LinkedIn applications. For example, we applied neural networks to find the right articles to answer member questions in the LinkedIn Help Center, to segment job posting text into different sections, and to build an internal Analytics chatbot (Anabot) that handles questions about LinkedIn product metrics.

In a KDD 2018 tutorial titled “End-to-end goal oriented question answering,” we presented a deep dive on more than 100 recent papers about question answering, discussing their common technical components along with our own practical experiences. On Jan. 28, we will also be presenting a tutorial on this topic at the Thirty-Third AAAI Conference on Artificial Intelligence (AAAI-19).

In the tutorial, we also covered the engineering designs that enable end-to-end systems in two LinkedIn use cases:

  • Analytics Bot (Anabot) is an internal Q&A bot that can answer questions about company metrics. The data analytics team uses the bot to address questions they get from business partners.

  • Help Center Search is an AI model we developed that can understand members’ questions in the Help Center article search.


