Rolling Back an Airflow Upgrade – Engineering Blog


At Wealthfront, we have many important processes coordinated with Airflow. These include hourly and daily ETL jobs for ingestion, Spark pipelines for computing derived information, and other batch processes like moving money to our partner banks. With an actively evolving open-source project like Airflow, we’d like to be able to upgrade versions to take advantage of new features. However, with all the critical jobs running on this service, we can’t afford much downtime if an upgrade were to go wrong.

While we can do our best to understand changelogs and try the new version outside of our production environment, there’s always the risk that something is incompatible with how we’re currently using Airflow. As a result, we want to make sure we have a path to downgrade versions when necessary.

Changing Airflow Versions

Behind the scenes, Airflow uses a database (we use MySQL) to store its internal state. When changing Airflow versions, this database is the biggest potential source of issues, as version changes frequently come paired with schema changes that are not always backwards compatible. For example, upgrading Airflow might drop a table or a column that is no longer needed. If we were to upgrade Airflow and then discover some issue with the new version, reverting to the previous stable version would not work immediately, as the old code wouldn’t be compatible with the upgraded database schema.

Thankfully, Airflow uses Alembic as a database migration tool. Alembic generates change management scripts using SQLAlchemy to make schema changes to the underlying database, and it creates scripts to go in both directions: upgrading and downgrading. Airflow itself only provides wrappers for upgrading the database and resetting it (by dropping all the tables and rebuilding them). While the latter is something that we could do in a pinch, we’d like to keep the history of task instances, DAG runs, etc. when reverting to an earlier version if possible. Finally, in a worst-case scenario we can always use our database backups to restore the previous state.
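To make the downgrade target concrete: by default, Alembic records the revision a database is currently at in a table named alembic_version, in a column named version_num. The sketch below shows one way that value might be read before an upgrade. It is an illustration under assumptions, not our production tooling: the function name is hypothetical, and an in-memory SQLite database stands in for MySQL so the example runs anywhere.

```python
import sqlite3


def current_alembic_revisions(conn):
    """Return the revision id(s) Alembic has recorded for this database.

    Assumes the default version table name (alembic_version). There can be
    more than one row if the migration history has multiple heads.
    """
    cur = conn.cursor()
    cur.execute("SELECT version_num FROM alembic_version")
    return [row[0] for row in cur.fetchall()]


# Stand-in database for illustration only; in production this would be a
# connection to the Airflow metadata database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE alembic_version (version_num VARCHAR(32) NOT NULL)")
conn.execute("INSERT INTO alembic_version VALUES ('41f5b83752f8')")

revisions = current_alembic_revisions(conn)
print(revisions)
```

Saving this output somewhere outside the database before upgrading means the downgrade target is known even if the new version misbehaves.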

Database Downgrade Procedure

Let’s say we’ve just upgraded Airflow to the newest available version, and the upgrade seems to have broken a critical Airflow plugin. This plugin is used in a critical workflow with a tight SLA so we have to remediate immediately. What steps do we need to take?

To respond to this kind of situation, we’ve developed the following procedure to downgrade database versions, allowing us to preserve as much history as possible:

  1. Identify the Alembic version number(s) corresponding to the Airflow version you’d like to downgrade to. This is done most easily by querying Alembic’s alembic_version table in the database before upgrading.
  2. Bring down any of the Airflow services that are running (e.g. the scheduler, web server, etc.)
  3. With the new version of Airflow still installed, modify or create a copy of the Alembic configuration file and update the following:
    1. Update script_location to point to the full path of the Airflow migrations directory (e.g. /usr/lib/python3.6/site-packages/airflow/migrations/)
    2. Update sqlalchemy.url to match your Airflow SQLAlchemy connection string in the airflow.cfg file (e.g. mysql+mysqldb://airflow:password@airflow-db-host.example.com:3306/airflow)
  4. Use the Alembic CLI to downgrade versions using the modified configuration file from step 3 and the version(s) noted in step 1, e.g.
    alembic -c /path/to/configuration/alembic.ini downgrade 41f5b83752f8
  5. If there were multiple versions identified in step 1, run the downgrade command for each version number
  6. Uninstall the new Airflow version
  7. Reinstall the old Airflow version

At this point, you should be able to bring your Airflow services back up.
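The configuration and downgrade steps above could be scripted roughly as follows. This is a hedged sketch rather than the exact tooling we use: the function names and the temporary file names are hypothetical, and the paths and connection string are the placeholder examples from the procedure. It copies an existing Alembic ini file, overrides script_location and sqlalchemy.url, and builds one alembic downgrade command per revision.

```python
import configparser
import os
import tempfile


def write_downgrade_config(source_ini, target_ini, script_location, sqlalchemy_url):
    """Copy an Alembic ini file, overriding the two settings needed to downgrade."""
    # interpolation=None keeps existing values such as %(here)s untouched.
    # Note that ConfigParser drops comments when it rewrites the file.
    parser = configparser.ConfigParser(interpolation=None)
    with open(source_ini) as f:
        parser.read_file(f)
    if not parser.has_section("alembic"):
        parser.add_section("alembic")
    parser["alembic"]["script_location"] = script_location
    # Assumption: Alembic reads this file with ConfigParser interpolation,
    # so a literal "%" in the URL is written as "%%".
    parser["alembic"]["sqlalchemy.url"] = sqlalchemy_url.replace("%", "%%")
    with open(target_ini, "w") as f:
        parser.write(f)


def downgrade_commands(config_path, revisions):
    """Build one `alembic downgrade` argument list per target revision."""
    return [["alembic", "-c", config_path, "downgrade", rev] for rev in revisions]


with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "alembic.ini")
    target = os.path.join(tmp, "alembic_downgrade.ini")
    # A tiny stand-in for the real configuration file.
    with open(source, "w") as f:
        f.write("[alembic]\nscript_location = migrations\n")
    write_downgrade_config(
        source,
        target,
        "/usr/lib/python3.6/site-packages/airflow/migrations/",
        "mysql+mysqldb://airflow:password@airflow-db-host.example.com:3306/airflow",
    )
    for cmd in downgrade_commands(target, ["41f5b83752f8"]):
        # To actually run it: subprocess.run(cmd, check=True)
        print(" ".join(cmd))
```

Building the commands as argument lists, rather than shell strings, avoids quoting problems if they are later passed to subprocess.run.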

Limitations

The steps above outline how to downgrade Airflow database versions while retaining as much data as possible. The biggest limitation is that some data is likely to be lost during this process. As mentioned, upgrades often include dropping some columns, so data in those columns cannot be recovered during the downgrade. Additionally, some changes may require manual fixes to the database (e.g. if a column was made nullable in the upgrade and null data was added while running on the new version, those rows might need to be dropped before downgrading). Regardless, we’ve found this to work quite smoothly for the last couple of Airflow version upgrades.
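The dropped-column case can be illustrated with a toy example. The table and column names below are hypothetical and are not Airflow’s real schema, and SQLite stands in for MySQL. The upgrade removes a column, the downgrade adds it back, and the original values are gone because the downgrade can only restore the column’s definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical "old version" schema with a column the upgrade will drop.
conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, state TEXT, legacy_note TEXT)")
conn.execute("INSERT INTO task VALUES (1, 'success', 'kept by the old version')")


def upgrade(conn):
    """Drop legacy_note by rebuilding the table (works on any SQLite version)."""
    conn.execute("CREATE TABLE task_new (id INTEGER PRIMARY KEY, state TEXT)")
    conn.execute("INSERT INTO task_new SELECT id, state FROM task")
    conn.execute("DROP TABLE task")
    conn.execute("ALTER TABLE task_new RENAME TO task")


def downgrade(conn):
    """Restore the column definition; the old values cannot be restored."""
    conn.execute("ALTER TABLE task ADD COLUMN legacy_note TEXT")


upgrade(conn)
downgrade(conn)
row = conn.execute("SELECT id, state, legacy_note FROM task").fetchone()
print(row)  # (1, 'success', None)
```

The row and its other columns survive the round trip, which is the history we care about keeping, but anything stored only in the dropped column would have to come from a backup.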



© 2021 Wealthfront Corporation. All rights reserved.


