Impact Optimization in Trust & Safety – Airbnb Engineering & Data Science – Medium

There have been 500 million Airbnb guest arrivals, with listings in 191 countries around the world. Trust and Safety is key to the success and continued growth of Airbnb’s community.

We employ a complex system of defenses to identify and prevent malicious activities like identity theft, account takeover, harmful content, credit card fraud, and more, while preserving a seamless experience for our good users.

Our system is equipped with machine learning, experimentation, and analytics, as well as tools that empower our top-notch operations teams. Our metrics tell us we’re doing well at keeping our community safe, but we must always ask –

“Is this the best we can do? How can we do even better?”

These questions have many layers. You can say there is no absolute “best” in the highly iterative world of fraud. But at any given point in time, are we using our resources in the best possible way to maximize “return” or impact to the business? This can unfold into numerous questions, such as:

  • Are we choosing correct thresholds for our front-line models?
  • When do we rely on human reviews vs. machine learning models?
  • How do we choose the best set of frictions, and where in the product do we show them?
  • How much agent time should be spent on manual investigation?
  • How do we quantify the impact on good users?
  • What is the Return On Investment (ROI) of our defense system?

Some of the above questions are answered by well-designed solutions like targeted friction, others through in-depth forecasting and analytics. We are now ready to take an extra step, connect the dots, and tackle the ultimate question:

“Net net, what is the $ impact of the Trust and Safety system on our business, and how can we maximize it?”

Answering the above question requires a few steps:

  1. Understand what components are needed to solve the puzzle
  2. Separate Input (driver) Metrics from Output (result) Metrics
  3. Understand how the pieces connect and fit together to provide a holistic view of the state of the world
  4. Identify opportunities and prioritize actions accordingly

Input Metrics vs. Output Metrics

We adopted an Input/Output Metrics framework to surface focus areas of the business.

  • Output metrics (OMs) are top-line metrics that measure the success of the products and/or operations. OMs need to be straightforward for people to understand.
  • Input metrics (IMs) are key drivers of OMs. Most IMs are in our control (e.g., model precision) and thus actionable, while some are not (e.g., fraud attack volume). Many IMs are leading indicators of the effectiveness of our defense system and are thus key to enabling a rapid reaction cycle.

Impact Optimization Framework

We then put our metrics into two contrasting buckets: “benefit” vs. “cost”. Below is a simplified version to solidify the idea. One key way in which we measure benefit is by the amount of financial loss prevented. Cost depends on how we spend our money, as well as how much potential revenue we are losing from good users. The difference between the two is the “net benefit”, or “return” of having each component of the defense system in place.
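As a rough illustration of the benefit-minus-cost arithmetic described above, here is a minimal Python sketch. The `Checkpoint` class, its field names, the ROI definition (net benefit divided by cost), and all dollar figures are assumptions made for this example; they are not Airbnb's actual metrics or code.

```python
from dataclasses import dataclass


@dataclass
class Checkpoint:
    """One component of the defense system (hypothetical structure)."""
    name: str
    loss_prevented: float          # benefit: financial loss prevented, in $
    spend: float                   # cost: money spent (agent time, tooling), in $
    good_user_revenue_lost: float  # cost: potential revenue lost from good users, in $

    @property
    def cost(self) -> float:
        # Total cost is direct spend plus revenue lost from good users.
        return self.spend + self.good_user_revenue_lost

    @property
    def net_benefit(self) -> float:
        # "Return" of having this component in place.
        return self.loss_prevented - self.cost

    @property
    def roi(self) -> float:
        # One common ROI definition (an assumption here): net benefit per $ of cost.
        # Undefined when cost is zero.
        return self.net_benefit / self.cost


# Made-up numbers, for illustration only.
review = Checkpoint(
    name="manual_review",
    loss_prevented=100_000.0,
    spend=30_000.0,
    good_user_revenue_lost=20_000.0,
)
print(review.name, review.net_benefit, review.roi)
```

Computing the same three quantities per component makes it possible to compare components on a common dollar scale.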

Sounds simple, right?

It’s not that simple because of a few nuances –

  • In reality, fraudsters have to execute multiple steps to realize their financial gains: for example, take over a good user account, then post a fake listing, then convince a good guest to send them money outside of Airbnb’s platform. As a result, not all fraud activities directly result in financial losses. In the example above, if the fraudster had taken over the good user account but was not able to post a fake listing, the loss would not happen. Therefore, we need to bake in the propensity of different types of downstream losses when evaluating the loss amount prevented by our various frictions and human reviews.
  • If we had only a few types of “checkpoints” (either a user friction or a human review), addressing each metric listed above and consolidating them would be straightforward. In reality, we have more than a few checkpoints, many of which might be interdependent in the funnel. Therefore, demystifying the logic can get tricky.
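To make the first nuance concrete, the sketch below weights the loss prevented at an early checkpoint by the propensity of each remaining downstream step. It assumes the steps can be treated as a simple chain of conditional probabilities, which is a simplification. The function name and all numbers are hypothetical.

```python
from math import prod  # available in Python 3.8+


def expected_loss_prevented(blocked_events, downstream_propensities, loss_per_completed_fraud):
    """Expected $ loss prevented by blocking fraud at an early step.

    downstream_propensities: for each remaining step the fraudster would still
    need to complete, the (assumed) probability of completing it given that
    the previous step succeeded.
    """
    completion_probability = prod(downstream_propensities)
    return blocked_events * completion_probability * loss_per_completed_fraud


# Made-up example: 1,000 blocked account takeovers. Assume 50% would have led
# to a fake listing, 20% of those to an off-platform payment, and an average
# loss of $400 per completed fraud.
prevented = expected_loss_prevented(
    blocked_events=1000,
    downstream_propensities=[0.5, 0.2],
    loss_per_completed_fraud=400.0,
)
print(f"Expected loss prevented: ${prevented:,.0f}")
```

Without the propensity weighting, the same 1,000 blocked takeovers would be credited with the full loss amount each, overstating the benefit of that checkpoint.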

Use Cases

One immediate use case is to evaluate the business impact and ROI of our Operations. In many organizations, operations is perceived as a “cost center.” This perception can downplay the business value we are creating, or the opportunity cost of not having proper operations. Quantifying the value of Operations supports more effective decision making in areas such as staffing and workflow optimization.

Funnel optimization is another big use case. The optimization framework helps to answer questions like “when do we verify a user’s identity vs. when do we verify ownership of payment instruments”, and “when do we show in-product defenses vs. when do we send to human review”.

Data scientists, engineers and many others on Trust are working hard to tackle many more use cases. If you are interested in joining us to solve challenging optimization problems, we’re always looking for talented people to join our Data Science team!
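One way to picture the "in-product defense vs. human review" question is to pick, for each event, the action with the highest expected net benefit given a model's fraud probability. The sketch below is only an illustration of that idea. The action names, stop rates, costs, and loss amount are invented, and a real funnel would also need to account for the interdependence between checkpoints.

```python
# All values below are hypothetical, for illustration only.
LOSS_PER_FRAUD = 500.0  # assumed average $ loss if a fraudulent event goes through

ACTIONS = {
    # stop_rate: share of fraud stopped by the action
    # unit_cost: $ spent per event (e.g. agent time)
    # good_user_cost: expected $ revenue lost when applied to a good user
    "allow":        {"stop_rate": 0.00, "unit_cost": 0.0, "good_user_cost": 0.0},
    "friction":     {"stop_rate": 0.60, "unit_cost": 0.1, "good_user_cost": 5.0},
    "human_review": {"stop_rate": 0.95, "unit_cost": 8.0, "good_user_cost": 2.0},
}


def expected_net_benefit(action, fraud_probability):
    """Expected benefit minus expected cost of applying `action` to one event."""
    params = ACTIONS[action]
    benefit = fraud_probability * params["stop_rate"] * LOSS_PER_FRAUD
    cost = params["unit_cost"] + (1.0 - fraud_probability) * params["good_user_cost"]
    return benefit - cost


def best_action(fraud_probability):
    """Return the action with the highest expected net benefit."""
    return max(ACTIONS, key=lambda name: expected_net_benefit(name, fraud_probability))


for p in (0.001, 0.02, 0.5):
    print(p, best_action(p))
```

With these made-up parameters, very low-risk events are allowed through, moderately risky ones get a friction, and high-risk ones go to human review. Changing the cost assumptions shifts where those boundaries fall.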


Special thanks to Josh McMullen for being my editor and reviewer. I also want to thank Nilesh Dalvi, Dave Hafferty, Dasha Cherepennikova, and many other members of our amazing Trust team for ideating, collaborating, and coordinating on impact optimization. Also many thanks to Benjamin Breit and Sophie Sung for reviewing this post.
