The distributed, multi-machine nature of the tasks and the I/O-centric character of the coordinating software make Python a very good fit for implementing this design. MTO leverages Python’s built-in multiprocessing module to ease development in a distributed environment. For example, the multiprocessing module supports distributed queues, as well as proxy objects that let a local client interact transparently with remote objects. MTO also leverages Python’s deep support for asynchronous I/O through its asyncio module. This aligns well with the I/O-centric operations conducted by each host, which are asynchronous by nature. Lastly, the choice of Python fits well with our internal tooling infrastructure at LinkedIn, which is primarily Python-based.
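As a minimal sketch of how the multiprocessing module's proxy objects can serve a shared object to remote clients, consider a manager hosting a queue of tests. The class, function, and test names below are illustrative assumptions, not MTO's actual API:

```python
from multiprocessing.managers import BaseManager
from queue import Queue

# Hypothetical illustration (not MTO's API): the server process holds the
# real queue; clients receive transparent proxy objects to it.
_test_queue: Queue = Queue()

def _get_test_queue() -> Queue:
    return _test_queue

class TestQueueManager(BaseManager):
    """Manager serving a queue of tests to local or remote clients."""

TestQueueManager.register("get_test_queue", callable=_get_test_queue)

def demo() -> str:
    # Port 0 lets the OS pick a free port; workers on other hosts would
    # connect using the server's real address and the shared authkey.
    server = TestQueueManager(address=("127.0.0.1", 0), authkey=b"mto-demo")
    server.start()
    try:
        client = TestQueueManager(address=server.address, authkey=b"mto-demo")
        client.connect()
        queue_proxy = client.get_test_queue()  # a proxy, not the queue itself
        queue_proxy.put("com.example.LoginTest")
        return queue_proxy.get()
    finally:
        server.shutdown()

if __name__ == "__main__":
    print(demo())  # -> com.example.LoginTest
```

In a real deployment the client would run on a different machine and connect to the root node's published address, but the proxy semantics are the same.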
The test execution model
Another aspect we had to settle in the design of MTO is which execution model to use for distributing tests. There are a number of choices, but here I will focus on two. The first is a “push” model, where tests are divided into subsets up front and each subset is then distributed to the workers. The second is a “pull” model, where the root node holds a queue of tests to run, and each worker pulls the next test from that queue as it becomes available.
MTO implements the “pull” model as the means of test distribution and execution, for a number of reasons. First, the order of test distribution and execution is important, and it is easier to manage in a pull model than in a push model. In general, you want the longer-running tests to execute sooner; if instead you were to run the longest test as the very last test, you would extend the overall test time, because that test would be the only one running, making poor use of the available resources and parallelism. In a push model, if you want to strategically order the tests, the way in which tests are initially divided has to be re-tuned and re-balanced as you add more tests or scale to more emulators. In a pull model, on the other hand, there is a single queue of tests to be ordered. The ordering is independent of how the distribution occurs and is simpler to perform.
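The pull model with longest-first ordering can be sketched as a toy example; here threads stand in for worker processes, and the test names and duration estimates are hypothetical, not drawn from MTO itself:

```python
import threading
from queue import Queue, Empty

# Hypothetical (test_name, estimated duration in seconds) pairs.
tests = [("testLogin", 5), ("testFeedScroll", 90), ("testShare", 20)]

def build_queue(tests):
    """Fill a single queue with tests ordered longest-first, so the
    longest-running tests start as early as possible."""
    q = Queue()
    for name, duration in sorted(tests, key=lambda t: t[1], reverse=True):
        q.put(name)
    return q

def worker(q, executed, lock):
    # Each worker pulls the next test as soon as it becomes free.
    while True:
        try:
            test = q.get_nowait()
        except Empty:
            return  # no tests left; this worker is done
        # ... run the test on an emulator here ...
        with lock:
            executed.append(test)

def run(tests, num_workers=3):
    q = build_queue(tests)
    executed, lock = [], threading.Lock()
    threads = [threading.Thread(target=worker, args=(q, executed, lock))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return executed

print(run(tests))
```

Note that the ordering logic lives entirely in `build_queue`: adding more tests or more workers requires no re-balancing, which is the key advantage over the push model described above.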
The pull model is also more robust to failures. If an emulator misbehaves or a worker node goes offline, the remaining nodes of the system can pick up the slack, since the remaining workers will simply pull the remaining tests from the queue one by one. Tracking has to be put in place to know which test (if an emulator goes down) or tests (if a full worker node goes down) must be re-queued, of course, but this is fairly simple to do. In a push model, if an emulator goes down, the full set of its remaining tests has to be re-distributed according to some sort of potentially awkward logic.
Emulator cloud options
So far, we have focused solely on test distribution and execution. Mobile Test Orchestrator also provides a framework for launching emulators and creating device pools, in order to easily scale testing operations. These two concerns are kept separate and independent within MTO. Indeed, you have the choice of managing the device/emulator pool within the same process as test execution or in an independent process. Emulator pools use an underlying queue to manage the emulators on a worker host. Clients reserve a device from the pool, making it unavailable to other clients, and relinquish the emulator back to the queue (pool) when finished. This prevents multiple workers from conflicting over the same emulator.
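A queue-backed pool with reserve/relinquish semantics can be sketched as follows; the class and method names are hypothetical illustrations, not MTO's actual classes:

```python
from contextlib import contextmanager
from queue import Queue

class EmulatorPool:
    """Hypothetical sketch of a device pool backed by a queue."""

    def __init__(self, emulators):
        self._q = Queue()
        for emulator in emulators:
            self._q.put(emulator)

    @contextmanager
    def reserve(self):
        # Blocks until an emulator is free; while reserved, no other
        # client can pull the same device from the pool.
        emulator = self._q.get()
        try:
            yield emulator
        finally:
            self._q.put(emulator)  # relinquish back to the pool

pool = EmulatorPool(["emulator-5554", "emulator-5556"])
with pool.reserve() as device:
    print(device)  # -> emulator-5554
```

Because reservation is just a blocking dequeue, mutual exclusion falls out of the queue itself: a device that has been handed to one client simply is not in the queue for anyone else to take.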
MTO provides two means for the workers to get a reference to a device/emulator pool. The first is through discovery: the API provides a means to discover emulators already present on the host and instantiate a pool from those. The second is through launching emulators explicitly to create the pool. A third alternative is to use custom code outside of MTO to launch the emulators and provide them through a shared multiprocessing.Queue.
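The discovery approach might look roughly like the following sketch, which parses the output of `adb devices` to find emulators already running on the host; the function name and sample data are assumptions for illustration, not MTO's API:

```python
# Illustrative sketch (not MTO's API): discover emulators already running
# on the host by parsing the output of `adb devices`, e.g. obtained via
#   subprocess.run(["adb", "devices"], capture_output=True, text=True).stdout
def discover_emulators(adb_output: str) -> list[str]:
    serials = []
    for line in adb_output.splitlines()[1:]:  # skip the "List of devices" header
        parts = line.split()
        # Keep only healthy devices whose serials mark them as emulators.
        if len(parts) == 2 and parts[1] == "device" and parts[0].startswith("emulator-"):
            serials.append(parts[0])
    return serials

sample = """List of devices attached
emulator-5554\tdevice
emulator-5556\toffline
"""
print(discover_emulators(sample))  # -> ['emulator-5554']
```

The resulting serials could then seed a pool such as the queue-backed one described above, while the explicit-launch alternative would start fresh emulators and collect their serials the same way.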
Regardless of how the emulator pools are created, several choices exist for operating an emulator cloud. Although MTO does not provide code for bootstrapping and managing the host machines that it will run on, the framework does not preclude options for how the overall emulator cloud is managed. One option is a persistent, potentially elastic cloud. Here, a pool of emulators across multiple hosts is kept active, growing or shrinking to meet the demand. In this model, resource allocation software would have to be implemented to manage a broad pool of emulators. Emulators would be allocated based on each client’s needs and that subset of emulators would be handed off to the MTO software to execute the client’s test needs. This model has the advantage of having an always-ready pool of emulators. It also carries the additional burden of monitoring the health of each emulator to ensure reliability and stability of the system on an ongoing basis.
The second option is a “dynamic cloud.” Here, each worker host is responsible for launching the emulators, creating the workers to run the tests, and shutting the emulators down at the end of execution. The cloud of emulators exists only for the purpose of satisfying the client’s test needs and only during the time span needed to execute the client’s test plan. This solution can be ideal in the context of an existing continuous integration pipeline that already provides the ability to allocate host machines, but not the more specific allocation of Android emulators. One disadvantage is the overhead of emulator startup time. However, when run as part of a full build-and-test pipeline, the process can be kicked off at the same time as the build to mitigate or even eliminate that overhead. Here, since the cloud of emulators is a dedicated pool with a fixed lifetime, existing only during a single test run, there is no need to continually monitor the health of the emulator pool. At LinkedIn, we utilize our CI/CD infrastructure as a means to provision resources to run MTO tests. MTO is responsible for orchestrating the software on the resources that run our emulators and mobile tests.
As the number of tests for the LinkedIn flagship Android app was approaching 10,000, the execution time for tests alone was approaching 80 minutes on a single machine, albeit with 16 emulators. Through use of the MTO framework and distribution across 10 hosts each running 16 emulators, we have brought our test time down significantly, nearly reaching our target of 10 minutes. More importantly, the implementation is easy to scale to a larger number of hosts to support future growth in testing.
In testing of the MTO framework itself, the CPU utilization of the orchestration Python code was also measured to be low, as expected. For tests involving emulator interaction (installation of apps, running apps, pushing and pulling files, etc.), the total CPU utilization of the orchestration software was between 1 and 2%, versus 200%+ utilization by the emulators. As expected and desired, the lion’s share of resource utilization resides in the emulators. Of course, these numbers were obtained from a test application that was neither overly simplistic nor overly taxing of the emulator. Real utilization of the orchestration software may be lower for longer-running tests and higher for shorter-running ones.
Mobile Test Orchestrator provides a framework for distributing and executing Android tests across multiple emulators running across multiple machines. MTO helps contain test execution times and makes it easy to scale to a larger number of machines and emulators as needed.
We expect to fully open source the MTO code on GitHub in the coming weeks. As your tests start to grow in number, consider using MTO to keep test execution times in check—quicker feedback to developers is key to higher productivity.