by Tom Gianos
The big data space continues to change at a rapid pace. Data scientists and analysts have more tools than ever at their disposal whether it be Spark, R, Presto, or traditional engines like Hive and Pig.
At Netflix, the Big Data Platform team is responsible for making these tools available, reliable, and as simple as possible for our users at massive scale. For more information on our overall architecture, see our talks at QCon 2016 and at re:Invent 2016, 2015 and 2014, among others.
Genie is one of the core services in the Netflix data platform. It provides APIs for users to access computational resources without worrying about configuration or system state. In the past, we’ve written about the motivations to develop Genie and why we moved to Genie 2. This post is going to talk about new features in the next generation of Genie (i.e., Genie 3) which enable us to keep up with Netflix scale, evolution of tools, and expanding use cases. We will also explore some of our plans for Genie going forward.
Our Current Scale and Use Cases
Genie 3 has been running in production at Netflix since October 2016 serving about 150k jobs per day (~700 running at any given time generating ~200 requests per second on average) across 40 I2.4XL AWS EC2 instances.
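As a rough reading of these figures, the arithmetic below derives the implied submission rate, average job duration and per-node load. It assumes a steady state (so Little's law applies) and an even spread of jobs across nodes; neither assumption is stated in this post, and the derived numbers are estimates rather than measurements.

```python
# Figures quoted in the post.
jobs_per_day = 150_000
running_jobs = 700
nodes = 40

SECONDS_PER_DAY = 24 * 60 * 60

# Average submission rate implied by the daily total.
jobs_per_second = jobs_per_day / SECONDS_PER_DAY

# Little's law (steady-state assumption): L = lambda * W, so W = L / lambda.
avg_job_seconds = running_jobs / jobs_per_second

# Per-node averages, assuming jobs are spread evenly across nodes.
jobs_per_node_per_day = jobs_per_day / nodes
running_per_node = running_jobs / nodes

print(f"{jobs_per_second:.2f} jobs submitted per second")
print(f"{avg_job_seconds:.1f} s average job duration")
print(f"{jobs_per_node_per_day:.0f} jobs per node per day")
print(f"{running_per_node:.1f} jobs running per node")
```

Under those assumptions the average job runs for a little under seven minutes, and each node hosts fewer than twenty job clients at a time.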
Within Netflix we use Genie in two main ways. The most common use case is for users to submit job requests to the jobs API and have the job clients run on the Genie nodes themselves. This allows various systems at Netflix (schedulers, microservices, Python libraries, etc.) to submit jobs and access the data in the data warehouse without knowing anything about the data warehouse or the clusters themselves.
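To make the idea of submitting a job without knowing about the clusters concrete, here is a minimal sketch of what a client-side job request might look like. The helper `build_job_request`, the field names and the tag values are all assumptions made for illustration; they are not taken from this post, and the Genie REST API documentation is the place to find the actual schema.

```python
import json


def build_job_request(name, user, command_args, cluster_tags, command_tags):
    """Build a JSON-serializable job request (hypothetical field names)."""
    return {
        "name": name,
        "user": user,
        "commandArgs": command_args,
        # The caller describes what it needs with tags. Resolving the tags to
        # a concrete cluster and command is left to the service.
        "clusterCriteria": sorted(cluster_tags),
        "commandCriteria": sorted(command_tags),
    }


request = build_job_request(
    name="daily-report",
    user="analyst",
    command_args="-f report.sql",
    cluster_tags={"type:presto", "sched:adhoc"},
    command_tags={"type:presto"},
)
payload = json.dumps(request, indent=2)
print(payload)
```

Note that the request carries no host names, ports or configuration paths. That is the property the paragraph above describes: the caller states intent, and the service decides where and how the job runs.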
A second use case, which has evolved over time, is to leverage Genie’s configuration repository to set up local working directories for local mode execution. After Genie sets up the working directory, it returns control to the user, who can then invoke the run script as needed. We use this method to run REPLs for various engines like Hive and Spark, which need to capture stdout.
While Genie 3 has many new features, we’re going to focus on a few of the bigger ones in this post including:
- A redesigned job execution engine
- Cluster leadership
- Dependency caching
Execution Engine Redesign
In Genie 2, we spent a lot of time reworking the data model, system architecture and API tier. What this left out was the execution engine, which is responsible for configuring and launching jobs after a request is received by the jobs API. The reasoning was that the execution piece worked well enough for the use cases that existed at the time. The execution engine revolved around configuring a job directory for each job in a rigid manner. There was a single job execution script which would be invoked once setup was complete, whatever the type of job. This model became limiting as the set of tools we needed to support grew and a single script could no longer cover every case. The script and the code around it became increasingly complex to maintain.
In Genie 3, we’ve rewritten the entire execution engine from the ground up as a pluggable set of tasks which generate a run script customized for each individual job. This allows the run script to differ based on which cluster, command and application(s) the Genie system chooses at runtime. Additionally, since the script is now built up dynamically within the application code, the entire job flow is easier to test and maintain.
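The pluggable-task idea can be sketched as a list of small functions that each contribute lines to the run script. This is an illustrative Python analogue and not Genie's actual code: the task names, the `JobContext` fields and the `setup.sh` paths are all hypothetical.

```python
import shlex
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class JobContext:
    """Everything the tasks need to know about one job (hypothetical fields)."""
    cluster: str
    command: str
    applications: List[str]
    command_args: List[str]
    script_lines: List[str] = field(default_factory=list)


Task = Callable[[JobContext], None]


def init_task(ctx: JobContext) -> None:
    ctx.script_lines += ["#!/usr/bin/env bash", "set -euo pipefail"]


def cluster_task(ctx: JobContext) -> None:
    ctx.script_lines.append(f"source cluster/{ctx.cluster}/setup.sh")


def application_task(ctx: JobContext) -> None:
    # One setup line per application chosen for this job.
    for app in ctx.applications:
        ctx.script_lines.append(f"source application/{app}/setup.sh")


def launch_task(ctx: JobContext) -> None:
    ctx.script_lines.append(f"exec {ctx.command} {shlex.join(ctx.command_args)}")


def build_run_script(ctx: JobContext, tasks: List[Task]) -> str:
    """Run each task in order and join the lines they contributed."""
    for task in tasks:
        task(ctx)
    return "\n".join(ctx.script_lines) + "\n"


ctx = JobContext(
    cluster="prod-yarn",
    command="spark-submit",
    applications=["hadoop", "spark"],
    command_args=["--master", "yarn", "job.py"],
)
script = build_run_script(ctx, [init_task, cluster_task, application_task, launch_task])
print(script)
```

Because each task is an ordinary function, it can be unit tested in isolation, and supporting a new tool means adding or swapping a task instead of editing one shared script.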
These changes let our team respond to customer requests more quickly, since we can change individual application or command configurations without fear of breaking the entire run script.
Cluster Leadership
In Genie 2, every node was treated equally: each one ran the same set of tasks intended for system-wide administration and stability. These tasks included database cleanup, zombie job detection, disk cleanup and job monitoring. This approach was simple but had downsides and inefficiencies. For example, all nodes would repeatedly perform the same database cleanup operations unnecessarily. To address this, cluster-wide administration tasks are better handled by a single node within the cluster.
Genie 3 implements leadership election, currently supported either via ZooKeeper or by statically designating a single node as the leader via a property. When a node is elected leader, a set of leader-only tasks is scheduled to run on it. A task need only implement the LeadershipTask interface to be registered and scheduled by the system at runtime. Tasks can be scheduled at times and frequencies independent of each other, using either cron-based or time-delay-based scheduling.
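The leader-only scheduling described above might look roughly like the following Python analogue. Only the `LeadershipTask` name comes from this post; the methods, the `LeaderCoordinator` class and the explicit `now` parameter are assumptions made to keep the sketch small and deterministic. It covers time-delay scheduling only and leaves out cron expressions and the election mechanism itself.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class LeadershipTask(ABC):
    """A task that should run on the leader node only (hypothetical shape)."""

    interval_seconds: float

    @abstractmethod
    def run(self) -> None:
        ...


class DatabaseCleanup(LeadershipTask):
    interval_seconds = 3600

    def __init__(self) -> None:
        self.runs = 0

    def run(self) -> None:
        self.runs += 1  # stand-in for the real cleanup work


class LeaderCoordinator:
    """Runs registered tasks on their own schedules, but only while leader."""

    def __init__(self, tasks: List[LeadershipTask]) -> None:
        self._tasks = list(tasks)
        self._next_run: Dict[int, float] = {}
        self.is_leader = False

    def on_elected(self, now: float) -> None:
        self.is_leader = True
        self._next_run = {i: now for i in range(len(self._tasks))}

    def on_revoked(self) -> None:
        self.is_leader = False
        self._next_run.clear()

    def tick(self, now: float) -> None:
        if not self.is_leader:
            return
        for i, task in enumerate(self._tasks):
            if now >= self._next_run[i]:
                task.run()
                self._next_run[i] = now + task.interval_seconds


cleanup = DatabaseCleanup()
node = LeaderCoordinator([cleanup])
node.tick(now=0)        # not leader yet: nothing runs
node.on_elected(now=0)
node.tick(now=0)        # first run
node.tick(now=10)       # not due yet
node.tick(now=3600)     # second run
node.on_revoked()
node.tick(now=7200)     # leadership lost: nothing runs
print(cleanup.runs)
```

In this shape, the statically configured leader is simply a node whose `on_elected` is called once at startup, while an election-based setup would call `on_elected` and `on_revoked` as leadership changes.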
Security
Genie allows users to run arbitrary code via job attachments and dependencies, and to access data in the data warehouse and transport it back to the Genie node. It has become increasingly important to ensure that only authorized people can perform these actions. We don’t want users who aren’t administrators changing configurations in ways that could break the system for everyone else. We also don’t want unauthenticated users to be able to access the Genie UI and job results, since the output directories could contain sensitive data.
Therefore, a long-requested set of features has been added in Genie 3 to support application and system security. First, authentication and authorization (authn/authz) have been implemented via Spring Security. This allows us to plug in backend mechanisms for determining who a user is and to decouple authorization decisions from Genie code. Out of the box, Genie currently supports SAML-based authentication for access to the user interface and OAuth2 JSON Web Tokens (JWT) for API access. Other mechanisms could be plugged in if desired.
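The separation between knowing who a user is and deciding what they may do can be illustrated with a small role check. This is not Spring Security and not Genie's configuration: the `Principal` type, the role names and the action names are hypothetical, and in a real deployment the principal would be produced by the SAML or JWT layer.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Principal:
    """An authenticated identity and the roles attached to it."""
    name: str
    roles: FrozenSet[str]


# Hypothetical policy: which role each action requires.
REQUIRED_ROLE = {
    "submit_job": "USER",
    "read_job_output": "USER",
    "update_cluster_config": "ADMIN",
}


def is_authorized(principal: Optional[Principal], action: str) -> bool:
    if principal is None:
        return False  # unauthenticated callers get nothing
    required = REQUIRED_ROLE.get(action)
    if required is None:
        return False  # deny actions the policy does not know about
    return required in principal.roles


analyst = Principal("analyst", frozenset({"USER"}))
admin = Principal("ops", frozenset({"USER", "ADMIN"}))

print(is_authorized(analyst, "submit_job"))
print(is_authorized(analyst, "update_cluster_config"))
print(is_authorized(admin, "update_cluster_config"))
print(is_authorized(None, "read_job_output"))
```

The policy table can change without touching the code that authenticates users, which is the decoupling the paragraph above refers to.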
Additionally, Genie 3 supports the ability to launch job processes on the system host as the user who made the request via sudo. Running as users helps prevent a job from modifying another job’s working directory or data since it won’t have system level access.
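As a minimal sketch of the run-as-user idea, the function below builds the argument list for launching a job's run script, optionally prefixed with `sudo -u <user>`. The function name, the flag and the example path are hypothetical, and the command is only constructed here, not executed.

```python
from typing import List


def build_launch_command(run_script: str, user: str, run_as_user: bool) -> List[str]:
    """Return the argv used to launch a job's run script."""
    base = ["bash", run_script]
    if run_as_user:
        # `sudo -u <user> <command>` runs the command as the given user.
        return ["sudo", "-u", user] + base
    return base


command = build_launch_command("/tmp/genie/jobs/123/run", "analyst", run_as_user=True)
print(command)
```

Building the command as a list rather than a single string means it can be handed to `subprocess.Popen` without going through a shell, so the user name and path are never interpreted as shell syntax.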
Dependency Caching
As Genie has become more flexible, the data platform team has moved from installing many of the application binaries directly on the Genie node to downloading them on demand at runtime. While this lets us update the application binaries without redeploying Genie itself, it adds latency, since installing the applications takes time before a job can run. Genie 3 adds a dependency file cache to address this issue. Now, when a file system (local, S3, HDFS, etc.) is added to Genie, it needs to implement a method which returns the last updated time of the requested file. The cache uses this to determine whether a new copy of the file needs to be downloaded or whether the existing cached version can be used. This has dramatically sped up job startup while preserving the agility to update application binaries independently.
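The cache check described above can be sketched as follows. This is a Python analogue under stated assumptions: the `FileSystem` protocol, its method names and the `DependencyCache` class are hypothetical, and a local file system stands in for S3 or HDFS so the example runs anywhere.

```python
import hashlib
import os
import shutil
import tempfile
from typing import Dict, Protocol


class FileSystem(Protocol):
    def last_modified(self, path: str) -> float: ...
    def fetch(self, path: str, dest: str) -> None: ...


class LocalFileSystem:
    def __init__(self) -> None:
        self.fetch_count = 0

    def last_modified(self, path: str) -> float:
        return os.path.getmtime(path)

    def fetch(self, path: str, dest: str) -> None:
        shutil.copyfile(path, dest)
        self.fetch_count += 1


class DependencyCache:
    def __init__(self, fs: FileSystem, cache_dir: str) -> None:
        self._fs = fs
        self._cache_dir = cache_dir
        self._cached_mtime: Dict[str, float] = {}

    def get(self, path: str) -> str:
        """Return a local copy of path, downloading only if it is stale."""
        remote_mtime = self._fs.last_modified(path)
        key = hashlib.sha256(path.encode("utf-8")).hexdigest()
        local = os.path.join(self._cache_dir, key)
        cached = self._cached_mtime.get(path)
        if cached is None or remote_mtime > cached or not os.path.exists(local):
            self._fs.fetch(path, local)
            self._cached_mtime[path] = remote_mtime
        return local


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "app.tgz")
    cache_dir = os.path.join(tmp, "cache")
    os.mkdir(cache_dir)
    with open(source, "w") as f:
        f.write("v1")
    os.utime(source, (1000, 1000))  # pin the modification time

    fs = LocalFileSystem()
    cache = DependencyCache(fs, cache_dir)
    cache.get(source)  # miss: downloads
    cache.get(source)  # hit: reuses the cached copy

    with open(source, "w") as f:
        f.write("v2")
    os.utime(source, (2000, 2000))  # a newer version is published
    with open(cache.get(source)) as f:  # stale: downloads again
        contents = f.read()
    fetches = fs.fetch_count

print(fetches, contents)
```

The only thing the cache asks of a file system is a last-modified lookup, which is typically much cheaper than a download. That is why the check can run on every job without giving back the startup time the cache saves.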
And more …
There are many other changes in Genie 3, including a whole new UI (pictured above), data model improvements, client resource management, additional metrics collection, hypermedia support for the REST APIs, a port of the project to Spring Boot, and much more. For more information, visit the all-new website or check out the GitHub release milestones. If you want to see Genie 3 in action, try out the demo, which uses Docker Compose, so no additional installation or setup is necessary beyond Docker itself.
While a lot of work went into the Genie 3 release, there are still many features we’re looking to add to make Genie even better. Here are a few:
- A notification service to allow users to asynchronously receive updates about the lifecycle of their jobs (starting, finished, failed, etc.) without the need for polling. This will allow a workflow scheduler to build and execute a dependency graph based on completion of jobs.
- More flexibility in where Genie jobs can run. Currently they still run on a given Genie node, but we can envision a future where we offload the client processes to Titus or a similar service. This would follow good microservice design principles and free Genie of any resource management responsibility for jobs.
- An open source API for serving a configured job directory back to a user, enabling them to run it wherever they want.
- Full text search for jobs.
Genie continues to be an integral part of our data ecosystem here at Netflix. As we continue to develop features to support our use cases going forward, we’re always open to feedback and contributions from the community. You can reach out to us via GitHub or post a message to our Google Group. We hope to share more of what our teams are working on later this year!