A seven-part series on the Android test frameworks we’ve built at Airbnb. This first part looks at our philosophy on testing and the mocking system that lays the foundation for all our tests.
Automated testing for Android has historically been challenging at Airbnb, for both technical and organizational reasons. We have instead relied on manual testing to catch regressions before each release. While we do have thousands of unit tests, those generally cover our infrastructure and libraries; our UI features had been impractical to test automatically.
We’ve experimented with Espresso, but found it too flaky and time-intensive to be worthwhile. There were several reasons for this:
- It was difficult to mock out network requests. We experimented with recording requests with OkReplay, but found the overhead of maintaining up-to-date JSON recordings to be cumbersome.
- Our Fragments had business logic and UI code combined, so mocking or isolating either was difficult.
- Flakiness made tests unreliable and slowed developers down.
- Maintenance overhead was too high with our constantly changing app, and engineering resources were limited.
- Much of our app complexity comes from UI designs, which are difficult to test with Espresso. For example, assertions on proper padding, text colors, and other styling are impractical.
All of this created an environment where it made more sense to use a QA team to catch issues instead of spending engineering time to maintain tests.
This worked well for years, but started to break down as we continued to grow. Our app became increasingly complex, with dozens of contributors, many separate teams, hundreds of screens, and increased business requirements.
It became difficult to reliably catch regressions in each release. We also weren’t in a position to start using Espresso because all of the previous concerns still applied, and would be magnified with the growth of the app. It was clear we needed to seriously invest in testing, but we required a step change evolution to our product architecture to enable it.
Traditionally we had built our UI with Fragments, with network requests, data storage, and UI management all lumped together in the same class. The problems with this are numerous, and have been a focus of discussion in the Android community for years. For testing in particular it is problematic because we don’t have a reliable way to set the field values of each Fragment, so mocking them out for tests is tedious. Additionally, we can’t prevent or detect side effects if something in the Fragment changes that data.
These problems have led to architectures such as MVP, MVI, and MVVM, which separate view logic from business logic. We were initially slow to adopt one of them, wary of introducing overhead and of the difficulty of aligning dozens of teams on a forced standard, but when the Android Jetpack ViewModel was released we decided it was a good fit for us.
Since we needed to give our product teams a standardized architecture we built structure on top of Jetpack’s ViewModel, which we released in 2018 as the open sourced MvRx library.
There’s already a lot written on MvRx (as well as several great talks by Gabriel Peal), so I won’t cover it in depth here. However, for the sake of this article it’s necessary to understand that MvRx relies on the principle of a single State class which contains all of the data needed to render a screen. This State is implemented with a Kotlin data class that is forced to be immutable — state can only be changed by submitting a new state instance. MvRx provides utilities for easily setting state and listening to state changes.
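To make the single-State idea concrete, here is a minimal sketch in plain Kotlin. It does not use the MvRx API: `JokesState` and `StateStore` are hypothetical names invented for illustration, and the store only imitates the "submit a new state instance" pattern described above.

```kotlin
// Hypothetical state for a jokes screen; the names are illustrative, not from MvRx.
data class JokesState(
    val jokes: List<String> = emptyList(),
    val isLoading: Boolean = false
)

// Minimal stand-in for a ViewModel's state handling. The setter is private,
// so callers can only change state by supplying a reducer that returns a new instance.
class StateStore<S>(initial: S) {
    var state: S = initial
        private set

    private val listeners = mutableListOf<(S) -> Unit>()

    // Delivers the current state immediately, then every later state.
    fun subscribe(listener: (S) -> Unit) {
        listeners += listener
        listener(state)
    }

    fun setState(reducer: S.() -> S) {
        state = state.reducer()
        listeners.forEach { it(state) }
    }
}

fun main() {
    val store = StateStore(JokesState())
    val seen = mutableListOf<JokesState>()
    store.subscribe { seen += it }

    // Each update builds a new immutable instance via the data class copy function.
    store.setState { copy(isLoading = true) }
    store.setState { copy(jokes = listOf("Knock knock"), isLoading = false) }

    println(seen.size) // prints 3: the initial state plus two updates
}
```

Because every value in `seen` is an immutable snapshot, any one of them could later be replayed onto the screen, which is the property the mocking system relies on.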
MvRx also integrates well with Epoxy for constructing UI programmatically; this is the pattern we use at Airbnb. With Epoxy, our MvRx Fragment provides a “controller” that takes the current State as an input, and outputs which views to show and the data to bind to them.
Here’s the complete code needed for a simple screen.
Note: MvRx also works well with Android Data Binding and other UI frameworks, and does not depend on Epoxy.
The important point for testing is that we can force any State instance we want on the screen and test it in isolation, without worrying about where the data originally came from or whether it will be changed by side effects. Additionally, we can be sure that the UI in the Fragment is solely a function of the State, without uncontrollable side effects that could cause flakiness in tests.
The UI is rebuilt every time the State changes, and is purely based on State. Epoxy helps us to do this efficiently and with minimal boilerplate, but any UI framework you like can be used with MvRx, such as Android Data Binding.
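The idea that the UI is purely a function of State can be sketched without any UI framework. In the example below, `Row`, `TextRow`, `LoadingRow`, and `buildRows` are hypothetical stand-ins for what an Epoxy controller would produce; they are assumptions for illustration, not Epoxy or MvRx types.

```kotlin
// Hypothetical description of what should appear on screen.
sealed class Row
data class TextRow(val text: String) : Row()
object LoadingRow : Row()

data class JokesState(
    val jokes: List<String> = emptyList(),
    val isLoading: Boolean = false
)

// A pure function: the rows depend only on the state passed in.
fun buildRows(state: JokesState): List<Row> {
    val rows = mutableListOf<Row>()
    state.jokes.forEach { rows += TextRow(it) }
    if (state.isLoading) rows += LoadingRow
    return rows
}

fun main() {
    val state = JokesState(jokes = listOf("A", "B"), isLoading = true)
    // The same state always produces the same rows, so a test can force
    // any state and assert on the result without flakiness.
    println(buildRows(state) == buildRows(state)) // prints true
    println(buildRows(state).size)                // prints 3
}
```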
MvRx has now been adopted by all of our product teams, and has been a huge success for us at Airbnb. With MvRx, we can build new features in less time, while making them more maintainable, more robust, more performant, and yes, more testable.
Once teams at Airbnb adopted MvRx, we began to look at how we might better support testing for it. We could have let our engineers loose to write Espresso tests for their features, but the flakiness we discussed earlier would likely have disrupted CI, and engineers would have had to spend significant time maintaining their tests.
Instead, we hoped to build a testing framework that would provide high test coverage with minimal work. We investigated possible testing solutions, and came up with some guiding principles for what we wanted in a test framework. In our ideal world:
- Tests should be stable, with no flakiness
- It should be easy to add tests to a screen, and updating tests should be simple
- Tests should run fast on CI
- The test framework should enable 100% code coverage
- Tests should enable faster development by making it quick to validate changes
These ideals are great in theory, but there are often compromises, since it is difficult to have good test coverage without spending excessive time writing and maintaining tests. However, we were hopeful we could provide an improvement over vanilla Espresso, so we started to think about what we could build to enable our vision.
We aligned on a MvRx mocking system, where Fragments define mock implementations of state for any ViewModels they use. There is one canonical “default” state representation of each screen, and “variants” to the default describe other UI variations we want to test. All of our tests build on this mocking framework.
Here’s an example of a mocked version of our MvRx sample app. This page shows a list of jokes that are loaded asynchronously with pagination.
We register mocks directly in the Fragment by overriding the provideMocks function.
This declares which state to force the ViewModel to, and can also specify mock arguments if the fragment needs them. It doesn’t matter how the ViewModel normally populates the State, whether by network request, database query, or anything else. With our mock framework, the ViewModel is frozen to a single state that can’t be changed. The fragment receives this state and renders a consistent UI for easy testing. Since MvRx uses Kotlin data classes for State implementations, the only work required to mock a Fragment is implementing a complete version of its state.
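A rough sketch of what "frozen to a single state" means in practice. This is not the MvRx implementation: `JokesViewModel`, its `frozen` flag, and `fetchJokes` are hypothetical, and the real library decides internally how to ignore state changes while mocking.

```kotlin
data class JokesState(
    val jokes: List<String> = emptyList(),
    val isLoading: Boolean = false
)

// Hypothetical ViewModel stand-in. When `frozen` is true (the mocked case),
// every attempted state change is dropped, so the screen keeps rendering the mock.
class JokesViewModel(initial: JokesState, private val frozen: Boolean = false) {
    var state: JokesState = initial
        private set

    fun setState(reducer: JokesState.() -> JokesState) {
        if (frozen) return
        state = state.reducer()
    }

    // Stand-in for a network request that would normally mutate state.
    fun fetchJokes() {
        setState { copy(isLoading = true) }
    }
}

fun main() {
    val mockState = JokesState(jokes = listOf("A", "B"))
    val viewModel = JokesViewModel(mockState, frozen = true)

    viewModel.fetchJokes() // has no effect on a frozen ViewModel

    println(viewModel.state == mockState) // prints true
}
```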
We define the mocked state in a separate file so it is easy to manage.
For complex screens this might be a large data class, and would be tedious to create manually. Instead, we created a way to easily generate it:
- In debug builds at runtime, MvRx fragments register a broadcast receiver that listens for a certain Intent action.
- Developers run a Kotlin script from the command line that emits a broadcast with that action via adb.
- The broadcast receiver activates and reflectively collects all ViewModels on the Fragment.
- The State of each ViewModel is reflectively analyzed, retrieving the values of primary constructor parameters and generating the code needed to reconstruct them. This continues recursively for nested objects, resulting in code that can completely reconstruct the State class.
- The code for each ViewModel is saved to a .kt file on the device, and the script pulls the files from the device into a mocks package in the app.
To generate a mock, a developer simply opens a screen on their device and runs a single command. The mock is generated and copied to their source files, then the few lines of Fragment code shown above are added and we’re done. It takes just a few minutes to set up a complete mock for even complex screens, and this adds comprehensive testing coverage for the whole screen (in future articles we’ll cover what tests are run using this mock data).
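The reflective step in the list above can be sketched roughly as follows. This is an illustrative approximation, not our generator: it reads backing fields through Java reflection rather than primary constructor parameters, handles only a few value types, and `toSource`, `Host`, and `ListingState` are hypothetical names.

```kotlin
import java.lang.reflect.Modifier

// Hypothetical state classes used only for this example.
data class Host(val name: String, val superhost: Boolean)
data class ListingState(
    val title: String,
    val nights: Int,
    val host: Host,
    val tags: List<String>
)

// Recursively turns a value into Kotlin source that would rebuild it.
// Assumption: every non-static field of a class matches a constructor parameter
// with the same name, which holds for simple data classes like the ones above.
// Only null, String, Int, Long, Boolean, Double, List, and nested classes are handled,
// and string escaping covers backslashes and quotes only.
fun toSource(value: Any?): String {
    if (value == null) return "null"
    return when (value) {
        is String -> "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
        is Long -> "${value}L"
        is Int, is Boolean, is Double -> value.toString()
        is List<*> -> value.joinToString(prefix = "listOf(", postfix = ")") { toSource(it) }
        else -> value.javaClass.declaredFields
            .filterNot { Modifier.isStatic(it.modifiers) }
            .joinToString(prefix = "${value.javaClass.simpleName}(", postfix = ")") { field ->
                field.isAccessible = true
                // Named arguments keep the output valid regardless of field order.
                "${field.name} = ${toSource(field.get(value))}"
            }
    }
}

fun main() {
    val state = ListingState("Loft", 3, Host("Ana", true), listOf("wifi"))
    // Prints a constructor call with named arguments that rebuilds `state`.
    println("val mockListingState = " + toSource(state))
}
```

A production generator would also need imports, enums, maps, and type arguments for empty collections, all of which this sketch leaves out.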
Since we use immutable Kotlin data classes, the main default mock can be reused amongst all Fragments that share a ViewModel, reducing the amount of time needed to set up mocks for a flow.
If a State class changes, its mock needs to be modified in only one place to update the tests for an entire flow. We also have compile-time guarantees that our mock data stays up to date if the State schema is modified.
If Fragments sharing a mock need different default states, they can use the data class’s copy function. Additionally, it’s common for a screen to have variations of State that need to be tested. These state variations can also be derived from the default state with the copy function.
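As a small illustration of deriving per-Fragment defaults and variations from one shared mock, here is a sketch using only the standard data class `copy` function. `SearchState` and the property names are hypothetical.

```kotlin
// Hypothetical State shared by several Fragments in one flow.
data class SearchState(
    val query: String = "",
    val results: List<String> = emptyList(),
    val isLoading: Boolean = false
)

// The single shared default mock. If a property is added to SearchState without
// a default value, this line stops compiling, which is the compile-time guarantee.
val defaultSearchState = SearchState(query = "Paris", results = listOf("Loft", "Studio"))

// Variations are derived from the default instead of being rebuilt from scratch.
val loadingSearchState = defaultSearchState.copy(results = emptyList(), isLoading = true)
val emptySearchState = defaultSearchState.copy(results = emptyList())

fun main() {
    println(loadingSearchState.query)       // prints Paris
    println(emptySearchState.isLoading)     // prints false
    println(defaultSearchState.results.size) // prints 2: the default is untouched
}
```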
To streamline the process of modifying the default State we created a DSL for defining the State variants. It looks like this:
With this DSL, each variant is given a name and a mock State. This code creates three mocks:
- A main mock called “Default state” using the default state object given in the main arguments
- A “Loading” variant where the initial request is still loading data
- An “Empty results” mock where the returned data is empty
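To show how a DSL of this shape can be put together, here is a simplified sketch. It is not the MvRx mocking API: `mocks`, `MockBuilder`, `MockVariant`, and `state` are hypothetical names, and the real DSL also handles Fragment arguments and multiple ViewModels.

```kotlin
data class JokesState(
    val jokes: List<String> = emptyList(),
    val isLoading: Boolean = false
)

// A named mock state. Hypothetical type for this sketch.
data class MockVariant<S>(val name: String, val state: S)

class MockBuilder<S>(private val defaultState: S) {
    // The default state is always registered first.
    private val variants = mutableListOf(MockVariant("Default state", defaultState))

    // The lambda receives the default state and returns the variant's state.
    fun state(name: String, block: S.() -> S) {
        variants += MockVariant(name, defaultState.block())
    }

    fun build(): List<MockVariant<S>> = variants.toList()
}

fun <S> mocks(defaultState: S, block: MockBuilder<S>.() -> Unit): List<MockVariant<S>> =
    MockBuilder(defaultState).apply(block).build()

fun main() {
    val default = JokesState(jokes = listOf("A", "B"))
    val all = mocks(default) {
        state("Loading") { copy(jokes = emptyList(), isLoading = true) }
        state("Empty results") { copy(jokes = emptyList()) }
    }
    println(all.map { it.name }) // prints [Default state, Loading, Empty results]
}
```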
This DSL is particularly convenient for changing data on nested objects. Normally when using the copy function of a data class it is messy to copy a single value nested several layers deep.
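For a sense of the problem, here is what changing one deeply nested value looks like with plain `copy` calls. The classes below are hypothetical and exist only for this example.

```kotlin
// Hypothetical nested State for a booking screen.
data class Price(val amount: Int, val currency: String)
data class Listing(val title: String, val price: Price)
data class Reservation(val listing: Listing, val guests: Int)
data class BookingState(val reservation: Reservation)

fun main() {
    val default = BookingState(
        Reservation(Listing("Loft", Price(120, "USD")), guests = 2)
    )

    // Changing only `amount` requires re-copying every enclosing layer.
    val freeStay = default.copy(
        reservation = default.reservation.copy(
            listing = default.reservation.listing.copy(
                price = default.reservation.listing.price.copy(amount = 0)
            )
        )
    )

    println(freeStay.reservation.listing.price) // prints Price(amount=0, currency=USD)
    println(default.reservation.listing.price)  // prints Price(amount=120, currency=USD)
}
```

Each additional level of nesting adds another `copy` call and another repetition of the access path, which is the boilerplate the DSL is meant to remove.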
This new syntax starts with the default state, and uses reflection to specify increasingly nested layers. The final object specified is the one that is changed. This is especially helpful the more nested the property is.
While the value can be set to anything via this syntax, helpers make it easy to change common primitive values. Let’s look at a more complex state mock for a booking page:
Shortcuts also exist for other common values like empty, zero, and false.
If multiple values need to be changed, these setters can be chained. However, we recommend changing only one value per mock to limit the scope of what it tests.
The return value of the lambda is used as the variant’s final mock state.
If the Fragment has arguments then those can be changed as well. This lets you test different arguments passed to the Fragment for initialization.
In total, this syntax makes the process of testing edge cases extremely simple. Each variant represents a different test case, allowing for complete coverage of the screen.
The basic pattern for feature development is to write some code, run it, check that it does what you want, and iterate ad nauseam. With an app, you are generally relaunched to the entry activity and may need to spend some time clicking through nested screens to get to the feature that you actually changed, which can be tedious.
We built a launcher system to avoid this pain, and allow our developers to jump straight to any screen in the app, with any starting state they need.
This launcher screen is the entry activity for our “dev” apps. The activity automatically detects Fragments in the app’s dex files, retrieves their mocks, and lists them on screen.
Clicking into a Fragment shows all of the mock variants defined for that Fragment, and selecting a mock loads it instantly.
This makes it simple to jump to any screen (or edge case of that screen) in the app, even if it is normally deep in a flow.
These dev apps are technically an Android product flavor, which includes only the feature modules a developer wants to test. The flavors leverage our app’s modularization architecture and allow for faster build times.
To further improve developer iteration speed, the launcher remembers which mock you last selected, and automatically reopens it on the next launch. It also supports deep links, so any screen or mock state can be opened by name from the command line. The mock selection allows you to easily test every state and edge case your screen may be exposed to, making for easy local testing before committing a change.
This mocking system and launcher were designed to integrate cleanly into the MvRx library. We initially built them in our internal app, but are in the process of migrating them to our open source MvRx repository.
We will continue to develop this open source work to ensure MvRx is a great architecture with which to build apps.
In this article we’ve shown the mocking framework we’ve built on top of MvRx, which is enabling comprehensive testing of Android at Airbnb. We also walked you through the MvRx Launcher, which is one of the tools we’ve built on top of the mocking framework to help developers test their code.
Next week we’ll share Part 2 of this testing series, where we’ll discuss how this mocking system is used to automate screenshot testing and catch UI regressions on CI.
This is a seven part article series on testing at Airbnb. A new part will be released each week.
Part 1 (This article) — Testing Philosophy and a Mocking System
Part 2 — Screenshot Testing with MvRx and Happo (Coming soon!)
Part 3 — Automated Interaction Testing
Part 4 — A Framework for Unit Testing ViewModels
Part 5 — Architecture of our Automated Testing Framework
Part 6 — Obstacles to Consistent Mocking
Part 7 — Test Generation and CI Configuration