An efficient system for self-contained executables – Facebook Code

Distributing large pieces of software to thousands of machines with a wide variety of configurations can pose a significant operational challenge, requiring a process to identify and copy precisely the right combination of dependent libraries and data files for each device. To make this faster, more robust, and more efficient, we have developed and deployed XARs, or eXecutable ARchives, a system for distributing self-contained executables that encapsulate both data and code dependencies. Our research has demonstrated that XARs can deliver as intended when deployed across large networks. We are pleased to share XAR with the open source community via GitHub and PyPI.

XARs are single, highly compressed files containing all necessary executable dependencies. They execute at nearly the same speed as natively installed applications and are designed to be the fastest way to distribute and execute large Python applications while maintaining maximum compatibility with the existing open source Python ecosystem. XARs can be run from anywhere on the filesystem, and they remove the need for virtual environments as well as worries about modules installed as part of the operating system. Executables simply work, and dependencies are isolated from the machine they run on. The result is a performant, hermetic, compressed executable for Python packages. In sum, XARs are designed to serve a wide variety of use cases and to perform faster than other self-contained executable distribution approaches.

XARs can be used to deploy Python virtual environments, bundle Node.js applications, and even package Lua tools. This can yield efficiency wins from lower overhead for many types of Python tools, reduce the size of deployed binaries, and provide a more reliable production environment for Python tooling.

The road to XARs

XARs represent an evolution of Facebook’s work to create an optimal system to distribute a single executable that is independent of the operating system’s libraries. Statically linked binaries that minimize dependency management difficulties work well for C++ executables, but languages like Python, JavaScript, and even Lua present a different challenge: How do you place source code and data (such as SSL certificates or shared libraries) inside a single executable? What do you do about the dependencies the tool might have on modules installed on the host operating system?

Initially, we used PARs (Python archives) that were similar to SHARs (shell archives). Every time an executable was run, it would decompress itself into a temporary directory, execute, and finish. However, this approach had inefficiencies, such as repeatedly decompressing files, and other shortcomings, such as leaving potentially hundreds of megabytes of now-unused files to clean up.

Over time, we evolved that solution to be more efficient, through an approach similar to that of PEX files: decompressing once, reusing the decompressed files, and then sharing common files between multiple PARs. After years of making this more efficient through various optimizations, we decided on a new approach and created XARs.
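The "decompress once, reuse the decompressed files" idea can be illustrated with a short Python sketch. This is not the PAR or PEX implementation; the function name `extract_once`, the cache layout, and the choice of SHA-256 as the cache key are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile
import zipfile


def extract_once(archive_path, cache_root):
    """Extract a zip archive into a cache directory keyed by its content hash.

    Returns (directory, extracted); `extracted` is False on a cache hit.
    Hypothetical helper, not part of PAR, PEX, or XAR.
    """
    with open(archive_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()
    target = os.path.join(cache_root, digest)
    if os.path.isdir(target):
        return target, False  # already unpacked by an earlier run
    # Extract into a scratch directory first, then rename it into place, so
    # an interrupted run never leaves a half-populated cache entry behind.
    os.makedirs(cache_root, exist_ok=True)
    scratch = tempfile.mkdtemp(dir=cache_root)
    with zipfile.ZipFile(archive_path) as zf:
        zf.extractall(scratch)
    os.rename(scratch, target)
    return target, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "tool.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("main.py", "print('hello')\n")
        cache = os.path.join(tmp, "cache")
        print(extract_once(archive, cache))  # first call extracts
        print(extract_once(archive, cache))  # second call reuses the cache
```

Even with a cache like this, the first run still pays for a full extraction and the unpacked files stay on disk, which is the overhead the mount-based approach described next avoids.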

How XARs differ from other self-contained executables

XARs are self-contained executables that carry data and code (both native and interpreted), much like PARs and PEX files (which are self-contained Python virtual environments). Unlike PARs, however, XARs do not require explicit decompression. Instead, XARs are slightly modified squashfs files (see below for technical details) that mount themselves when executed and unmount after an idle timeout. They could almost be thought of as a self-executing container without the virtualization. By using the squashfs format, we not only distribute data in a far more compressed format than with a PAR (zip) file, but we also decompress on demand only the portions we need. Thanks to this architecture, XARs have nearly zero overhead in production and can be used just as native scripts or executables would be.

XARs, like PARs, also have advantages for interpreted languages like Python. By collecting a Python script, associated data, and all native and Python dependencies, we achieve a hermetic binary that can run anywhere in our infrastructure, regardless of operating system or packages already installed. In fact, this works for many Python tools as well as for JavaScript (Node.js), Lua tooling, and bundling multiple C++ executables and data files together, yielding a single archive that is smaller and can be moved as a single unit.

This approach grants XARs many advantages over PARs, PEXs, and other similar options:

  • More modules “just work” because they carry data and handle imports like normal directories. (Even with zipimport, some modules fail, and not every library uses pkg_resources properly.)
  • Tools and services see faster start times, since they don’t need to write the contents of the PAR file to disk.
  • Different invocations reuse the same mount point with data cached efficiently by the kernel.
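To see why some modules fail under zipimport but work from a real directory, consider a module that locates a data file relative to `__file__`. The sketch below uses only the Python standard library; the module names `demo_dir` and `demo_zip` and the file `greeting.txt` are made up for the demonstration. The same source loads its data from a directory but fails from a zip archive, because the path it builds points inside the archive file.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# A module that reads a data file sitting next to it on disk.
MODULE_SOURCE = (
    "import os\n"
    "def load():\n"
    "    path = os.path.join(os.path.dirname(__file__), 'greeting.txt')\n"
    "    with open(path) as f:\n"
    "        return f.read()\n"
)


def build_fixtures(root):
    """Create a plain directory and a zip archive holding the same module."""
    plain = os.path.join(root, "plain")
    os.mkdir(plain)
    with open(os.path.join(plain, "demo_dir.py"), "w") as f:
        f.write(MODULE_SOURCE)
    with open(os.path.join(plain, "greeting.txt"), "w") as f:
        f.write("hello")
    bundle = os.path.join(root, "bundle.zip")
    with zipfile.ZipFile(bundle, "w") as zf:
        zf.writestr("demo_zip.py", MODULE_SOURCE)
        zf.writestr("greeting.txt", "hello")
    return plain, bundle


def try_load(search_path, module_name):
    """Import module_name from search_path and call its load() function."""
    importlib.invalidate_caches()
    sys.path.insert(0, search_path)
    try:
        return importlib.import_module(module_name).load()
    except OSError as exc:
        # open() cannot treat the zip archive as a directory.
        return "failed: " + type(exc).__name__
    finally:
        sys.path.remove(search_path)
        sys.modules.pop(module_name, None)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        plain, bundle = build_fixtures(tmp)
        print(try_load(plain, "demo_dir"))   # data file found next to module
        print(try_load(bundle, "demo_zip"))  # import works, open() fails
```

Because a mounted XAR presents ordinary files and directories, code written like `load()` needs no changes to run from it.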

XARs achieve these performance gains by using a novel on-demand FUSE-based filesystem.

Measuring performance benefits

Optimizing performance (both space and execution time) was a key design goal for XARs. We ran benchmark tests with open source tools to compare PEX, XAR, and native installs on the following metrics:

  • Size: file size, in bytes, of the executable
  • Cold start time: time taken when we have nothing mounted or extracted
  • Hot start time: time taken when we have extracted cache or mounted XAR squashfs


The results show that both file size (with zstd compression) and start times improve with XARs. This matters when shipping to a large number of servers, especially for short-running executables such as small data collection scripts on web servers or interactive command line tools.
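For readers who want to take similar measurements, here is a minimal timing harness in Python. It is not the benchmark used for the results above; the function names and the choice of the median are assumptions. The first run only counts as a cold start if nothing was mounted or extracted beforehand, and clearing that state is specific to each tool, so it is left out of the sketch.

```python
import os
import statistics
import subprocess
import sys
import time


def file_size(path):
    """Size metric: size of the executable in bytes."""
    return os.path.getsize(path)


def time_run(command):
    """Run a command to completion and return the wall-clock seconds it took."""
    start = time.perf_counter()
    subprocess.run(
        command,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


def measure(command, warm_runs=5):
    """Return (first_run_seconds, median_warm_seconds) for a command.

    The caller is responsible for making the first run a true cold start,
    for example by unmounting the XAR or deleting the extraction cache.
    """
    first = time_run(command)
    warm = [time_run(command) for _ in range(warm_runs)]
    return first, statistics.median(warm)


if __name__ == "__main__":
    # Stand-in command; replace with e.g. ["dist/black.xar", "--help"].
    command = [sys.executable, "-c", "pass"]
    first, hot = measure(command)
    print("size (bytes):", file_size(command[0]))
    print("first run (s):", round(first, 4))
    print("hot start (s):", round(hot, 4))
```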

Sample usage

Facebook has created a bdist_xar plugin, much like the wheel module.
To create a XAR, install the PyPI xar module and run:

pip install xar 
python3 path/to/ bdist_xar

Ubuntu 18.04

Ubuntu comes with Python 3.6.5 today and is an excellent distribution to run Python 3 applications. Here is a quick-start example for using a XAR to run the black PyPI module on Ubuntu 18.04.

1. Install XAR deps.

sudo apt install cmake g++ git libfuse-dev libz-dev python3-pip python3-venv squashfs-tools

2. Today, we have to build squashfuse from source, as Ubuntu’s version does not contain squashfuse_ll.

tar xvzf squashfuse-0.1.103.tar.gz && cd squashfuse-0.1.103 
./configure --prefix=/usr && make
sudo make install

3. Clone the XAR repo.

git clone && cd xar

4. Build.

mkdir build && cd build && cmake .. && make && sudo make install

5. Make the base XAR mountpoint.

sudo mkdir /mnt/xarfuse && sudo chmod 01777 /mnt/xarfuse

6. Create a virtualenv + install xar plugin.

cd ..  # root dir of xar repo
python3 -m venv /tmp/xar
/tmp/xar/bin/python3 -m pip install --upgrade pip
/tmp/xar/bin/pip install .
Successfully installed wheel-0.31.1 xar-18.6.11

7. Clone + install black to the virtualenv.

cd ..
git clone && cd black
/tmp/xar/bin/pip install .

8. Build a XAR of black.

/tmp/xar/bin/python3 bdist_xar

9. Test XAR executable.

dist/black.xar --help

macOS 10.13+

macOS does not ship with a native FUSE filesystem. Instead, we use FUSE for macOS to mount user-space filesystems. As above, let’s set up a vanilla macOS 10.13.4 install to run the black PyPI module as a XAR.

1. Ensure that you have FUSE for macOS installed.

brew cask install osxfuse

2. Install XAR.

brew tap facebook/homebrew-fb
brew install xar

3. Make the base XAR mountpoint.

sudo mkdir /mnt/xarfuse && sudo chmod 01777 /mnt/xarfuse

4. Create a virtualenv + install xar plugin.

git clone && cd xar
python3 -m venv /tmp/xar
/tmp/xar/bin/python3 -m pip install --upgrade pip
/tmp/xar/bin/pip install .
Successfully installed wheel-0.31.1 xar-18.6.11

5. Clone + install black to the virtualenv.

cd ..
git clone && cd black
/tmp/xar/bin/pip install .

6. Build a XAR of black.

/tmp/xar/bin/python3 bdist_xar

7. Test XAR executable.

dist/black.xar --help

Additional technical details

How exactly do XARs work? A XAR is a simple combination of a few primitives:

  • A four-kilobyte preamble that is a shebang pointing to a helper executable (#!/usr/bin/env xarexec_fuse)
  • A helper, xarexec_fuse, that knows how to read the XAR file, mount it if necessary, and execute the Python (or Lua or …) script inside
  • A FUSE filesystem, squashfuse_ll, that is responsible for making the squashfs file look like a normal directory of files (and, by using FUSE, XARs don’t require root to run and can run on OS X).
  • (optional) Since squashfs supports Zstandard compression (another Facebook open source offering), we can achieve far better compression ratios and faster decompression speed than zlib-based zip files can.

Combining those primitives gives us XAR. In addition to being responsible for the mounted filesystem, squashfuse_ll will also unmount and exit once the contents of the filesystem have gone unaccessed for an idle timeout, allowing XARs to clean themselves up without intervention.
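The layout described above (a four-kilobyte preamble beginning with a shebang, followed by squashfs data) can be checked with a few lines of Python. This is a sketch under stated assumptions, not the logic of xarexec_fuse: it assumes the squashfs image starts immediately after the 4096-byte preamble, and the sample file it builds is a fake whose newline padding is invented for the demonstration. The only external fact it relies on is that a little-endian squashfs image begins with the bytes `hsqs`.

```python
import io

HEADER_SIZE = 4096        # the "four-kilobyte preamble" from the text
SQUASHFS_MAGIC = b"hsqs"  # first four bytes of a little-endian squashfs image


def inspect_xar(stream):
    """Read the preamble of a XAR-like stream and report what it contains.

    Hypothetical helper: assumes the squashfs data begins at HEADER_SIZE.
    """
    header = stream.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        raise ValueError("file is shorter than the preamble")
    first_line = header.split(b"\n", 1)[0].decode("ascii", "replace")
    if not first_line.startswith("#!"):
        raise ValueError("preamble does not start with a shebang")
    magic = stream.read(len(SQUASHFS_MAGIC))
    return {
        "interpreter": first_line[2:].strip(),
        "has_squashfs": magic == SQUASHFS_MAGIC,
    }


def fake_xar():
    """Build an in-memory stand-in: padded shebang plus the squashfs magic."""
    preamble = b"#!/usr/bin/env xarexec_fuse\n".ljust(HEADER_SIZE, b"\n")
    return io.BytesIO(preamble + SQUASHFS_MAGIC + b"\0" * 16)


if __name__ == "__main__":
    print(inspect_xar(fake_xar()))
```

To try it on a real file, open the file in binary mode and pass the file object to `inspect_xar`. Because the kernel only looks at the shebang line, the remainder of the preamble is free to carry whatever the helper needs.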

It is worth noting that nothing in XAR is Python specific. In fact, it’s even possible to create a XAR file containing a native C or C++ executable, and the result is a smaller file on disk. It is also possible to use any scripting language, even bash, and deliver data alongside code.

What comes next

The path that brought Facebook to XARs was spread over years of iteration, experimentation, and optimization. We’re not done yet; we’re always working to make XARs even more efficient and have plans for improving the efficiency of squashfuse_ll. Additionally, while focused initially on Python, as mentioned above, XARs have found other use cases for other languages, and we are excited to continue expanding the languages and use cases.

We are excited to share XARs with the community and look forward to seeing how you use and help us improve them. PRs and suggestions are welcome!
