Which ads should we buy? How much should we pay for each ad? How should we measure the performance of each ad? We built a marketing system to help answer these questions.
By Tao Cui, Ye Wang and Bassel Namih
Belonging is at the core of Airbnb’s mission. One essential step to make people feel like they can “belong anywhere” is to have homes where people can stay around the world. Airbnb is a community-driven super-brand, and acquiring hosts through the community’s word of mouth has been an important channel for growth. However, as we have continued to grow our business, we need to find new ways to acquire hosts and homes, and to find ways to truly help people “belong anywhere” — which means not only in big cities like San Francisco, London and Paris, but also in less populous countries such as Bolivia and Nepal, where only very few people may have heard about Airbnb.
Online advertising is one of the most effective ways to reach potential new hosts and to increase awareness about Airbnb. It includes email marketing, search engine marketing (SEM), social media marketing, and display advertising. After extensive research, we developed an internal online advertising system to expand our host base, and the number of hosts acquired through this system has grown exponentially. While developing the system, we found abundant information about how advertising platforms such as Google and Facebook monetize with advertising, but very little about how large marketplaces like Airbnb can fully utilize these platforms.
This post will provide an architectural overview of the marketing system we built and discuss the main issues that were tackled in the system.
What is the Problem?
We want to maximize the return on a given budget by acquiring new hosts through online advertising. This can be partitioned into two sub-problems:
- Which ads should we buy?
- How much should we pay for each ad to be shown on an advertising platform?
In short, when you visit a supermarket you should know what you need to buy and what to pay for each item.
The ultimate goal of our marketing system is to automatically generate ads, set and adjust bids, and allocate budgets in order to hit key business objectives. The system should also support performance reporting as well as the analysis and visualization of ad experiments.
The Life of an Ad
At a high level, here is a diagram describing the life of an ad:
Ads are created on online marketing platforms and are displayed to users when certain criteria are satisfied. Users click on the ads and land on Airbnb’s host landing page, and Airbnb logs the click landing event. Users who intend to list their spaces on Airbnb go through the new host onboarding experience, the List Your Space (LYS) flow. By the time a listing gets its first booking on Airbnb, we estimate the Lifetime Value (LTV) that this listing can bring to Airbnb. We then look at the previous ad click events that led to this listing conversion. The conversion is attributed to the relevant ads, which share the listing LTV.
After collecting data from many conversions, we can infer the value of each ad. All of this information will be used to compute bids and allocate budget for each ad. Finally, bids and budget are set on online marketing platforms (e.g., Facebook or Google). This completes the life cycle of an ad. By looking at the life cycle, we can clearly see several main components of a marketing system: click tracking, attribution, LTV estimation and bid and budget optimization.
Becoming a Host
We first look at the experience of becoming a new host on Airbnb and see why it is challenging to build a marketing system specifically for host growth.
Persuading people to list their spaces on Airbnb is very challenging because it is a big commitment. At first, hosts may learn about Airbnb from word of mouth or online marketing campaigns. After doing some research, which may take days or weeks, they may decide to create a listing on Airbnb. They then need to go through the LYS flow to provide information about themselves and their listings. To create a safe and trusted marketplace, Airbnb requires hosts to follow its terms of service and conducts background checks on hosts in the US.
The LYS flow ensures Airbnb is safe and weeds out low-intent audiences to provide the most valuable experiences. Even though Airbnb has been working on improving the experience of creating a listing, new potential hosts still drop off throughout the flow. After finishing LYS and publishing the listing on Airbnb, hosts need to manage their listing, respond to prospective guest inquiries, and prepare to receive guests from confirmed reservations.
We can see that choosing to list on Airbnb is a considered decision. This creates at least two challenges when building the marketing system:
- The conversion time between initial ad click and a listing getting a booking is very long — it may take several days or weeks.
- The conversion event is rare. As potential hosts are cautious, we may only see a few conversions from each ad.
Both challenges make our host marketing optimization very difficult.
The Architecture of a Marketing System
So how do we go about this?
Data Logging and Tracking
As the life-of-an-ad diagram shows, a reliable source of truth for marketing events is at the core of the system. These events are not only essential to downstream operations such as bidding and budget allocation but also critical to strategic and tactical performance marketing decisions. However, accurate logging and tracking is very challenging. The landscape is vast and requires Marketing Technology, Engineering, and Data Science skill sets and domain expertise.
There are two data sources: internal and external. Our internal user landing logging relies on Jitney logging, a Kafka-based messaging framework with a standardized schema. We built data pipelines to obtain a complete user journey on Airbnb. We fetch external data through third-party APIs. To ensure data integrity, we created data pipelines to cross-validate the data used in key metrics across multiple data sources. We have built both an online real-time Datadog dashboard and an offline daily dashboard to track the gap between internal and external data. Anomaly detection algorithms detect abnormalities caused by production code changes or API connection errors. For host growth, data integrity is extremely important because host conversions are rare. Any discrepancy in host data may have a significant impact on all downstream operations.
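The cross-validation idea can be illustrated with a minimal sketch. It assumes daily click counts per campaign are already available from both sources; the function name, campaign ids, and the 5% tolerance are hypothetical and are not taken from our pipelines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapReport:
    campaign_id: str
    internal_clicks: int
    external_clicks: int
    relative_gap: float


def find_click_gaps(internal, external, max_relative_gap=0.05):
    """Return campaigns whose internal and external click counts disagree.

    internal / external: dict mapping campaign id -> click count for one day.
    A campaign missing from one source is treated as having zero clicks there.
    The 5% default tolerance is an illustrative assumption.
    """
    reports = []
    for campaign_id in sorted(set(internal) | set(external)):
        ours = internal.get(campaign_id, 0)
        theirs = external.get(campaign_id, 0)
        baseline = max(ours, theirs)
        if baseline == 0:
            continue  # no clicks on either side, nothing to compare
        gap = abs(ours - theirs) / baseline
        if gap > max_relative_gap:
            reports.append(GapReport(campaign_id, ours, theirs, gap))
    return reports


# Hypothetical daily counts: internal logging vs. the ad platform's API.
internal_counts = {"bolivia_sem": 980, "nepal_social": 410, "paris_display": 1500}
external_counts = {"bolivia_sem": 1000, "nepal_social": 520, "tokyo_sem": 30}

flagged = find_click_gaps(internal_counts, external_counts)
for report in flagged:
    print(report)
```

In this example the first campaign is within tolerance, while the other three are flagged, including the two that appear in only one source. A fixed threshold is the simplest possible check; an anomaly detection model would replace it in practice.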
With data validation and tracking in place, we can have an accurate understanding of the clicks from various ads. To compute return on investment (ROI) and the price we want to pay for each ad, we first need to know how many conversions are generated from each ad. However, attributing conversions to the right channels is difficult because we need to leverage multiple channels to reach different audiences. Suppose a user first sees an ad on platform 1, then sees and clicks on another ad on platform 2, then creates a listing on Airbnb, and finally receives a booking. It is not fair to give all the credit to platform 2: doing so underestimates the value of the ad on platform 1 and overestimates that of the ad on platform 2.
Multi-touch attribution assigns the right share of credit for each conversion to the different channels. Accurate attribution enables precise bidding strategies and budget allocation. In the past, we used a SQL-based last-touch attribution model, which was hard to implement and validate. We moved away from SQL to a user-defined function (UDF) based approach. UDFs are written in Java, so unit tests can be added to test individual attribution components and ensure that the attribution logic works as intended. To ensure a smooth switch from the SQL approach to the UDF approach, we built a validation pipeline to make sure the metrics from the UDF pipeline match those from the SQL logic, after which the SQL-based logic can be retired. More complicated rule-based or model-based attribution logic can then be built on top of the UDF.
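To make the difference between last-touch and multi-touch attribution concrete, here is a small sketch. It is written in Python for brevity rather than as a Java UDF, and the equal-split rule is only one simple rule-based choice used for illustration, not a description of our production logic. The function names and ad ids are hypothetical.

```python
from collections import defaultdict


def last_touch(touches, ltv):
    """Give all of the conversion's LTV to the most recent touch.

    touches: list of (timestamp, ad_id) tuples that preceded the conversion.
    """
    if not touches:
        return {}
    _, latest_ad = max(touches, key=lambda touch: touch[0])
    return {latest_ad: ltv}


def linear_multi_touch(touches, ltv):
    """Split the conversion's LTV equally across every preceding touch."""
    if not touches:
        return {}
    credit = defaultdict(float)
    share = ltv / len(touches)
    for _, ad_id in touches:
        credit[ad_id] += share
    return dict(credit)


# One converting user with a touch on each of two platforms.
touches = [(1, "platform1_ad"), (5, "platform2_ad")]
listing_ltv = 100.0

last_touch_credit = last_touch(touches, listing_ltv)
multi_touch_credit = linear_multi_touch(touches, listing_ltv)
print(last_touch_credit)   # all credit goes to platform2_ad
print(multi_touch_credit)  # credit is shared between both ads
```

Because each rule is a small pure function, it can be unit tested in isolation, which is the same property that motivated moving the attribution logic out of SQL.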
The LTV Model
After attributing conversions to different ads, we need to know the LTV of these conversions — the incremental revenue that each conversion can bring to Airbnb. For host growth, LTV is a prediction of the net profit active hosts bring to Airbnb. Hosts on Airbnb offer a wide variety of spaces, ranging from shared rooms to private islands. Such diverse room types provide unique experiences to guests but at the same time pose challenges to LTV estimation.
At Airbnb, we have developed a machine learning model to predict the listing LTV. More details can be found in this post. However, host conversion time is very long because listing on Airbnb is a big commitment. It may take days or even weeks to see a conversion from the initial ad click to the listing being published. As we accumulate more data, we can improve the accuracy of the original LTV model and address the long conversion time challenge. We will discuss these new findings in a separate post.
Bidding and Budget Optimization
With ad tracking, attribution and LTV in place, we are ready to optimize bids for each ad. Airbnb has listings in over 10,000 cities across 191 countries. For any given city, Airbnb may only have a small number of conversions, and it is very challenging to estimate the value of an ad from only a couple of conversions. We have developed a mathematical model to solve this issue. More details can be found in a forthcoming post. Knowing the value of each ad, we can then set bids and budgets for different ads based on the ROI we would like to have in different markets. We work closely with marketers to integrate our system with advertising platforms such as Google AdWords. Marketers can easily set efficiency targets and budgets through the UI. This is an area we are actively working on, including the development of advanced models and tools to optimize campaigns. Ultimately, we want to make our bidding and budget optimization adaptable to granular traffic changes in real time.
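The model itself is left for the forthcoming post. Purely as an illustration of the problem, the sketch below shows one generic way to stabilize a sparse estimate and turn it into a bid: it blends an ad's observed value per click with a market-level prior and then divides by a target ROI. The shrinkage formula, the definition of ROI as attributed value divided by spend, and every name and number here are assumptions made for this example, not our model.

```python
def shrunk_value_per_click(ad_value, ad_clicks, prior_value_per_click, prior_strength):
    """Blend an ad's observed value per click with a market-level prior.

    prior_strength is expressed in pseudo-clicks: the more real clicks an ad
    has, the less the prior matters. This is a generic shrinkage estimate
    chosen for illustration only.
    """
    if ad_clicks < 0 or prior_strength <= 0:
        raise ValueError("ad_clicks must be >= 0 and prior_strength must be > 0")
    total_value = ad_value + prior_value_per_click * prior_strength
    return total_value / (ad_clicks + prior_strength)


def bid_for_target_roi(value_per_click, target_roi):
    """Highest cost per click that still meets the target ROI.

    ROI is assumed here to mean attributed value divided by spend.
    """
    if target_roi <= 0:
        raise ValueError("target_roi must be positive")
    return value_per_click / target_roi


# Hypothetical ad: 100 clicks and 400.0 of attributed LTV, in a market where
# ads are typically worth 1.0 per click.
value = shrunk_value_per_click(
    ad_value=400.0, ad_clicks=100, prior_value_per_click=1.0, prior_strength=100
)
bid = bid_for_target_roi(value, target_roi=2.0)
print(value, bid)
```

With these numbers the raw estimate of 4.0 per click is pulled down to 2.5, and a target ROI of 2.0 gives a maximum bid of 1.25 per click. An ad with no clicks at all simply inherits the market-level value.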
Ads and Campaign Management
We have covered the life cycle of an ad so far. It is also important to manage ads and campaigns in a systematic way. As host campaigns have scaled, we have launched ads with dynamic content, which requires generating information tailored to each market. Doing this by hand not only demands a massive time investment in creative generation for a large number of campaigns but is also slow and error-prone. In the past, we have seen several postmortems across Airbnb caused by manual changes. We developed creative automation tools to generate information feeds that marketers can use, reducing the chance of human error and making the process more efficient.
The other main question the marketing system tries to answer is “Which ads should we buy?” Traditionally, keywords and ad copy are generated by marketers based on business sense, experience, intuition, or third-party tools such as Google Keyword Planner and Google Trends. However, users are always searching for new things, and predicting which keywords to bid on is hard for even the best marketers. As such, we are developing pipelines that can:
- Discover new keywords automatically
- Create new campaigns using these new keywords
- Evaluate the quality of new keywords and remove inefficient keywords automatically
All of the above are very challenging. We will report on our progress once these pipelines are mature.
It is very important to be able to measure the incremental impact of the new ideas outlined above. We can think of our incrementality tests as fitting into one of two buckets:
- Relative Incrementality: How many more listings did a treatment variant bring over the control (for example, changes to a bidding model, ad copy, or ad creatives)?
- Absolute Incrementality: If we did not spend money on a given marketing channel, how much would the number of listings on Airbnb differ?
Airbnb has developed various tools and pipelines to measure the relative incrementality of different marketing channels. We have a testing environment and analytics tools to conduct A/B and multivariate tests on search, display and mobile ads. Our tools can also track revenue, conduct ad copy tests, and create statistics based on spending and performance data to help direct future campaigns.
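As a rough sketch of what a relative incrementality readout involves, the example below computes the relative lift of a treatment over a control and a standard pooled two-proportion z statistic. The counts are made up, and the choice of test is an assumption for illustration rather than a description of our tooling.

```python
import math


def relative_lift(control_conv, control_n, treat_conv, treat_n):
    """Relative change in conversion rate of treatment over control."""
    control_rate = control_conv / control_n
    treat_rate = treat_conv / treat_n
    return (treat_rate - control_rate) / control_rate


def two_proportion_z(control_conv, control_n, treat_conv, treat_n):
    """z statistic for the difference of two proportions (pooled variance)."""
    pooled = (control_conv + treat_conv) / (control_n + treat_n)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / treat_n))
    return (treat_conv / treat_n - control_conv / control_n) / std_err


# Hypothetical experiment: 10,000 users per arm, with rare conversions.
lift = relative_lift(40, 10_000, 52, 10_000)
z_score = two_proportion_z(40, 10_000, 52, 10_000)
print(f"lift={lift:.1%}, z={z_score:.2f}")
```

Even a 30% relative lift gives a z statistic of only about 1.25 here, below the usual 1.96 threshold for significance at the 5% level. This is the rare-conversion challenge described earlier showing up in experiment analysis.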
We use marketing-channel specific experimental methodologies to measure absolute incrementality, allowing us to measure the causal impact of our advertising spend. Outputs of these experiments are used to calibrate our multi-touch attribution model above. Methodologies include running campaigns with ghost ads, geography based experimentation, and longitudinal tests.
The information from the marketing system is ultimately consumed by members of the cross-functional team: Marketers, Product Managers, Finance, Engineers, and Data Scientists. It is important to build robust and easily interpretable dashboards to track the performance of different channels and campaigns. Crucial performance marketing and business decisions are made by looking at key performance metrics, such as listings acquired and ROI of spend. We build dashboards using Apache Superset, which was open sourced by Airbnb and provides an interface to explore and visualize datasets and create interactive dashboards.
Thank you for staying with us on the journey of an ad. We hope you now have a better understanding of the marketing system. Our ultimate goal is to build a fast, robust and scalable marketing system that can empower host growth at Airbnb. Airbnb has revolutionized the way people travel and even live. As the Internet continues to change everyday life, online marketing gives us an opportunity to bring Airbnb to more and more people all over the world so that they can belong anywhere.