Many Python versions, one Bazel build


Matt Mackay
·Jan 31, 2022·


During a migration from one version of the Python runtime to another, or during the migration to Bazel itself, it can be useful to have more than one version of the Python interpreter in the build. With the current Python rule set and Bazel's built-in py_runtime_pair, this is tricky to achieve. This post describes a recipe that allows many Python interpreters in one build graph.

This works well as long as there are no edges between targets that require different versions. For example, a library designed to run under Python 3.6 must not be depended on by an application whose interpreter version is set to Python 3.9.
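
The constraint above can be checked mechanically. The following is a minimal sketch in plain Python, not part of the original recipe: it assumes you can export each target's interpreter version and direct dependencies into two dictionaries. The function name cross_version_edges and all labels and versions are hypothetical.

```python
def cross_version_edges(versions, deps):
    """Return (target, dep) pairs whose interpreter versions differ.

    versions: dict mapping target label -> interpreter version string
    deps: dict mapping target label -> list of direct dependency labels
    """
    bad = []
    for target, target_deps in sorted(deps.items()):
        for dep in target_deps:
            if versions[target] != versions[dep]:
                bad.append((target, dep))
    return bad


# Hypothetical labels and versions, for illustration only.
versions = {"//app:server": "3.9", "//lib:legacy": "3.6", "//lib:utils": "3.9"}
deps = {"//app:server": ["//lib:utils", "//lib:legacy"], "//lib:utils": []}

print(cross_version_edges(versions, deps))
```

Any pair reported here is an edge the recipe in this post does not handle, because both targets would end up running under a single interpreter.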

To support multiple interpreters, we intercept executions of the default interpreter and, where required, swap the interpreter used for the py_binary or py_test with the one that target needs.

To get started, we generate a stub script that performs the swapping and execution of the interpreter. For this, we use the expand_template rule from the aspect_bazel_lib rule set, which is loaded in our WORKSPACE file.

We also need a stub template, which can be found in this gist. It attempts to find the requested Python interpreter at the path set in the environment variable WHICH_PYTHON. If that can't be found, it falls back to the default version. At this stage we can also apply default flags and arguments.
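
The selection logic described above might look roughly like the sketch below. This is not the template from the gist, which may differ (for example, by resolving paths through runfiles). The names choose_interpreter and main are hypothetical, and the only assumption is that the stub reads WHICH_PYTHON and falls back to the templated default path.

```python
import os
import sys

# In the recipe this placeholder is substituted by expand_template.
DEFAULT_PYTHON_INTERPRETER_PATH = "%DEFAULT_PYTHON_INTERPRETER_PATH%"


def choose_interpreter(env, default_path, exists=os.path.exists):
    """Return the interpreter requested via WHICH_PYTHON, else the default."""
    requested = env.get("WHICH_PYTHON")
    if requested and exists(requested):
        return requested
    return default_path


def main(argv):
    """Hypothetical entry point: hand control to the chosen interpreter."""
    interpreter = choose_interpreter(os.environ, DEFAULT_PYTHON_INTERPRETER_PATH)
    # Replace the current process, forwarding all original arguments.
    # Default flags could be inserted into this argument list.
    os.execv(interpreter, [interpreter] + argv[1:])


# A real stub would finish by calling main(sys.argv).
```

Keeping the selection in a small pure function makes the fallback behaviour easy to test without spawning an interpreter.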

In our BUILD file, we fill in the template:

load("@aspect_bazel_lib//lib:expand_make_vars.bzl", "expand_template")

expand_template(
    name = "interpreter_stub",
    out = "stub.py",
    data = [
        # Reference to the binary generally found in bin/python3
        "@python38//:bin/python3",
    ],
    is_executable = True,
    substitutions = {
        # Template the path to the default interpreter
        "%DEFAULT_PYTHON_INTERPRETER_PATH%": "$(execpath @python38//:bin/python3)",
    },
    template = "//python:stub.py.tpl",
    visibility = ["//visibility:public"],
)

Next, we'll define a py_runtime that uses our generated interpreter stub. We also need to add the default interpreter version's files to the files attribute of the py_runtime.

The interpreter's files can either come from a static build of Python, be built from source as part of the build, or live on the system's PATH.

# This py_runtime defines the files needed to run our "default" version, however the interpreter may swap out and delegate to another
py_runtime(
    name = "python_stub_runtime",
    files = [
        # Need the default python interpreter here as
        # a fallback version for external py_binary targets that don't set the 'WHICH_PYTHON' env var.
        "@python38//:files",
        # Need the runfiles helper for when looking up the files needed for other Python versions defined
        "@bazel_tools//tools/python/runfiles",
    ],
    interpreter = "//bazel/python/interpreter:stub.py",
    python_version = "PY3",
    visibility = ["//visibility:public"],
)

Now, define a py_runtime_pair and the final toolchain that we will register.

Note that py_runtime_pair was designed only for the Python 2 to 3 migration, and isn't useful for defining a pair of different Python 3 interpreters.

load("@bazel_tools//tools/python:toolchain.bzl", "py_runtime_pair")

py_runtime_pair(
    name = "py_stub_runtime_pair",
    py2_runtime = None,
    py3_runtime = ":python_stub_runtime",
)

# Used to register a default toolchain in /WORKSPACE.bazel
toolchain(
    name = "py_stub_toolchain",
    toolchain = ":py_stub_runtime_pair",
    toolchain_type = "@bazel_tools//tools/python:toolchain_type",
)

Finally, we need to intercept all calls to py_binary and py_test. To do this, define macros named py_binary and py_test that users load instead of the originals. You'll have to update every load() of these rules from rules_python in your codebase to load from the file containing these macros.

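load("@rules_python//python:defs.bzl", _py_test = "py_test")
load("//bazel:python.bzl", "env_and_runfiles_for_python")

def py_test(name, py3_version, data = [], env = {}, **kwargs):
    # Look up the env vars and runfiles for the requested interpreter version
    (pyenv, runfiles) = env_and_runfiles_for_python(py3_version)

    _py_test(
        name = name,
        data = data + runfiles,
        # Interpreter variables take precedence over caller-supplied env entries
        env = dict(env, **pyenv),
        **kwargs
    )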

The env_and_runfiles_for_python function returns a tuple containing the environment variables and the runfiles needed for the requested interpreter version.

# Keyed by version name so that entries can be looked up with .get() and listed with .keys()
PYTHON_VERSION_INFO = {
    "PY37": struct(
        workspace_name = "python37",
        interpreter = struct(
            major_version = "3",
            version = "3.7.8",
        ),
    ),
    # ....
}

# convenience export of a struct containing the version keys
PY3 = struct(**{
    key: key
    for key in PYTHON_VERSION_INFO.keys()
})

def env_and_runfiles_for_python(version):
    info = PYTHON_VERSION_INFO.get(version)
    if info == None:
        fail("Unknown Python version: %s" % version)

    env = {
        "WHICH_PYTHON": "$(execpath @%s//:bin/python3)" % info.workspace_name,
        "PYTHON_VERSION": info.interpreter.version,
    }

    runfiles = [
        "@%s//:bin/python3" % info.workspace_name,
        "@%s//:files" % info.workspace_name,
    ]

    return (env, runfiles)

Now, when users call our py_test macro, they can use the new py3_version attribute to choose the Python interpreter version for that target. More interpreter versions can be added to PYTHON_VERSION_INFO, and each entry can hold other information too, for example the base image labels needed for container_image.
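
To make the data flow concrete, here is a plain-Python model of what the macro computes for a given py3_version. It is an illustration rather than Bazel code: namedtuple stands in for Starlark's struct, the repository names python37 and python39 and the 3.9.10 entry are assumptions, and resolved_test_attrs is a hypothetical helper that mirrors the data and env arguments passed to the underlying rule.

```python
from collections import namedtuple

# Stand-ins for Starlark's struct(); field names mirror the article's.
Interpreter = namedtuple("Interpreter", ["major_version", "version"])
VersionInfo = namedtuple("VersionInfo", ["workspace_name", "interpreter"])

# "python37" and "python39" are assumed repository names.
PYTHON_VERSION_INFO = {
    "PY37": VersionInfo("python37", Interpreter("3", "3.7.8")),
    "PY39": VersionInfo("python39", Interpreter("3", "3.9.10")),
}


def env_and_runfiles_for_python(version):
    """Return (env, runfiles) for the requested version key."""
    info = PYTHON_VERSION_INFO[version]
    env = {
        "WHICH_PYTHON": "$(execpath @%s//:bin/python3)" % info.workspace_name,
        "PYTHON_VERSION": info.interpreter.version,
    }
    runfiles = [
        "@%s//:bin/python3" % info.workspace_name,
        "@%s//:files" % info.workspace_name,
    ]
    return (env, runfiles)


def resolved_test_attrs(py3_version, data=(), env=None):
    """Model the data and env attributes the py_test macro forwards."""
    pyenv, runfiles = env_and_runfiles_for_python(py3_version)
    return {
        "data": list(data) + runfiles,
        # Interpreter variables override same-named caller entries.
        "env": dict(env or {}, **pyenv),
    }


attrs = resolved_test_attrs(
    "PY39",
    data=["//testdata:fixtures"],
    env={"LOG_LEVEL": "debug"},
)
print(attrs["data"])
print(attrs["env"])
```

Supporting another interpreter then amounts to adding one more entry to the dictionary; the macro itself does not change.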

For users who wish to run their tests under a different interpreter version from the default, the py_test load and usage now look like this:

# load from the bzl file containing our macro
load("//bazel:defaults.bzl", "py_test")
# we can also load in the PY3 symbol from where we defined our versions
load("//bazel:python.bzl", "PY3")

# run the test under the 3.7 interpreter
py_test(
    name = "lib_test",
    srcs = [
        ...
    ],
    deps = [
        ...
    ],
    py3_version = PY3.PY37,
)
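
Because the macro also exports PYTHON_VERSION into the test environment, a test can confirm that it is running under the interpreter it asked for. The helper below is a hypothetical sketch of such a check and is not part of the original recipe. It assumes PYTHON_VERSION holds a full version string such as "3.7.8" and that the interpreter is a release build, for which platform.python_version() reports the same form.

```python
import platform


def interpreter_matches(expected, actual=None):
    """Return True when the running interpreter version equals `expected`.

    An empty or missing `expected` means no version was requested,
    which is treated as a match.
    """
    if not expected:
        return True
    if actual is None:
        actual = platform.python_version()
    return actual == expected


# Inside a test this could be called as:
#   interpreter_matches(os.environ.get("PYTHON_VERSION"))
print(interpreter_matches("3.7.8", actual="3.7.8"))
print(interpreter_matches("3.7.8", actual="3.9.10"))
```

A failing check here usually means the stub fell back to the default interpreter, for example because the path in WHICH_PYTHON could not be found.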

Fetching external dependencies

When fetching external Python dependencies for a project, we must ensure that the pip_install rule is called with the right interpreter version; otherwise we may end up with the wrong dependencies. For this, we set python_interpreter_target using the data stored in the version struct above.

info = PYTHON_VERSION_INFO["PY37"]

pip_install(
    name = ...,
    python_interpreter_target = "@%s//:bin/python%s" % (info.workspace_name, info.interpreter.major_version),
)

Also, if the dependencies are locked with something like compile_pip_requirements, ensure that the interpreter used to lock them is the correct version for the project, since a different interpreter can again resolve to different versions of the external dependencies.

 