Bazel: Avoiding eager fetches

Bazel manages your dependencies, and fetches them to a user's machine when they are needed for a build. That's great, as it ensures all developers on the project have the same dependencies installed without having to think about it. When working well, these fetches are lazy and fine-grained: users only download what's needed for the specific targets they requested to build or test.

However, it's easy to de-optimize by introducing an "eager fetch": Bazel downloads dependencies that aren't actually needed for the current build, just because they are referenced while the BUILD files are loaded and analyzed. These fetches only cost time on the first build, since Bazel reuses the resulting "external repositories" for subsequent builds. They are still annoying that first time, and if the repository gets invalidated (maybe because the user switches to a branch based on a commit from before some change to the dependency listing), the user has to wait again. They are extra annoying when the "fetch" includes slow install steps, like compiling a program that was just downloaded.

Bazel has a "repository cache", but this is often misunderstood. It does not cache the external repository that Bazel installed on disk. Rather, it only caches network fetches which had a sha256 sum or integrity hash and were performed by Bazel's built-in downloader. Tools like npm and pip do their own fetches, so those aren't cached by Bazel (though they might be cached somewhere else on disk by those tools). Even if you avoid a network fetch, any computation performed to "install" those dependencies is never cached by Bazel and has to be re-done if the external repository is invalidated. If you change branches back to the original one, the invalidation is just as expensive; there's no re-use of the prior state in an X -> Y -> X sequence.

This article explains how these eager fetches get triggered, how to remediate them and how to prevent regressions.

WORKSPACE eager fetches

These are the worst kind, because they happen for every single build regardless of the dependency graph or which targets the user requests. Bazel must evaluate the complete WORKSPACE file to understand what third-party dependencies exist for the build. Let's say the WORKSPACE file contains this content:

load("@rules_python//python:pip.bzl", "pip_parse")

pip_parse(
    name = "my_deps",
    requirements_lock = "//path/to:requirements_lock.txt",
)

load("@my_deps//:requirements.bzl", "install_deps")
install_deps()

The penultimate line loads from the @my_deps repository, which means that repository must be eagerly fetched. Whatever work happens in pip_parse will happen for every single build, even for developers who aren't doing anything Python-related. In this case, pip_parse does need to fetch metadata about Python dependencies, so this isn't free. Use the Bazel profile to help you determine whether an eager fetch is a problem in your builds.
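To see what a fetch costs, one option is to record a profile for a build that should not need the repository in question. The following is a minimal sketch rather than a prescribed workflow: profile_build is a hypothetical helper name, the profile path and targets are placeholders, and it relies only on Bazel's --profile flag and analyze-profile command.

```shell
# Allow the Bazel binary to be overridden, e.g. BAZEL=bazelisk.
BAZEL="${BAZEL:-bazel}"

# profile_build is a hypothetical helper, not part of Bazel.
# Usage: profile_build <profile-file> <targets...>
profile_build() {
  local profile="$1"
  shift
  # Build the requested targets and write a profile for this invocation.
  "$BAZEL" build --profile="$profile" "$@" || return 1
  # Print Bazel's summary of the recorded profile.
  "$BAZEL" analyze-profile "$profile"
}
```

For example, profile_build /tmp/profile.gz //some:target. A fetch only shows up if the external repository has not been fetched already, so measure against a fresh output base. The profile file can also be opened in a trace viewer such as chrome://tracing.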

Mitigating

In some cases, you can refactor the WORKSPACE to remove the fetch. One approach is to "vendor" - check in the result of the expensive computation rather than perform it on-the-fly, and add a test to the repo ensuring it stays up-to-date. That test will still need to fetch the external repository, but other builds won't.

Continuing the example above, I added documentation for pip_parse showing how you could load the requirements.bzl file from within your repo: github.com/bazelbuild/rules_python/blob/mai.. If you do this, there will no longer be a load statement from @my_deps in the WORKSPACE, which should fix the eager fetch.

Another approach is to defer the work from a repository rule to an action that runs later in the BUILD graph. This generally requires changes to the rules you're using, so I won't try to give an example for end-users to follow.

Again, you should first profile your build to understand which eager fetches are really a problem in practice. External repositories are locally cached by Bazel and shouldn't be invalidated often.

BUILD eager fetches

BUILD files also contain load statements, and each one causes the external repository it loads from to be fetched. Unlike the WORKSPACE case above, this only happens when Bazel needs to load the BUILD file, which is the case if it is transitively referenced from the targets the user requests to build or test.

For example, in this BUILD file, we load from under the @npm repository:

# Content of //pkg1:BUILD
load("@npm//@bazel/typescript:index.bzl", "ts_project")

package(default_visibility = ["//visibility:public"])

ts_project(
    name = "a",
    srcs = glob(["*.ts"]),
    declaration = True,
    tsconfig = "//:tsconfig.json",
    deps = [
        "@npm//@types/node",
        "@npm//tslib",
    ],
)

filegroup(name = "b")

As a result, if the user asks to build a, or any other target in the pkg1 package such as b, then @npm is fetched eagerly in full. This also happens if the user asks for a target which directly or indirectly depends on any target in this package, such as //pkg2:c shown here:

# Content of //pkg2:BUILD
filegroup(name = "c", srcs = ["//pkg1:b"])

Another example comes from using the requirements helper provided by rules_python. If you use the suggested pattern

load("@pip//:requirements.bzl", "requirement")
py_library(
    name = "foo",
    ...
    deps = [
        requirement("requests"),
    ],
)

this also causes an eager fetch of whatever is in @pip, which might cause all Python dependencies to be downloaded!

I added this warning to the documentation for pip_install:

Note that this convenience comes with a cost. Analysis of any BUILD file which loads the requirements helper in this way will cause an eager-fetch of all the pip dependencies, even if no python targets are requested to be built. In a multi-language repo, this may cause developers to fetch dependencies they don't need, so consider using the long form for dependencies if this happens.

Mitigating

For BUILD fetches, the shape of the BUILD file graph matters. As with any programming language, it's a design smell when your imports come from many different unrelated places. Try to avoid BUILD files that load from external repositories and also contain other targets which don't use those repositories.
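One way to find candidates for this cleanup is to list the BUILD files that load from a given external repository, then review each one for targets that don't use it. This is a rough sketch under some assumptions: BUILD files are named BUILD or BUILD.bazel, each load statement starts at the beginning of a line, and list_build_files_loading is a made-up helper name.

```shell
# list_build_files_loading is a hypothetical helper, not a Bazel command.
# Usage: list_build_files_loading <repo-name> [<root-dir>]
# Prints the BUILD files under <root-dir> that load from @<repo-name>.
list_build_files_loading() {
  local repo="$1"
  local root="${2:-.}"
  # Match lines such as: load("@npm//@bazel/typescript:index.bzl", ...)
  grep -rlE --include=BUILD --include=BUILD.bazel \
    "^load\(\"@${repo}//" "$root" | sort
}
```

Running list_build_files_loading npm from the workspace root prints one path per line. This only finds direct load statements; deciding whether a listed file also holds unrelated targets is still a manual review.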

Another mitigation is to reduce the size of the external repository being fetched. In the npm example, we fetched @npm, which might contain a large number of packages. If it is created by npm_install or yarn_install from build_bazel_rules_nodejs, then all of those packages have to be installed just to get the one @bazel/typescript package this build actually needs. You could instead keep a separate package.json listing only the small number of dependencies loaded by BUILD files, with another npm_install repository rule fetching it into a repo like @npm_bazel_deps. This would still be eager-fetched, but it would be much faster.

In some cases the eager fetch is just for syntactic sugar. In the rules_python requirement example, we could have just used @pypi__requests//:pkg in the deps, with no load statement at all.

You can file issues as well. This takes longer to resolve, but it's healthier for the ecosystem to report these problems. Often the maintainers of the ruleset you use don't know about the problem, because they only build targets that require fetching the repo and have a smaller project, so they just don't observe the negative effects. github.com/bazelbuild/rules_nodejs/issues/3.. is the issue for the example above.

Preventing regression

Eager fetches are subtle, and easy to introduce into your Bazel build. You only notice them when the external repository is invalidated, and are only outraged enough to file issues when you're building something you feel is unrelated. "Why am I re-compiling a Python interpreter from source just to run my Go test??"

You can't write a test within Bazel to catch this condition (as far as I know). But you can formulate something outside of Bazel, and then run this as a separate step/pipeline on CI. There are a couple of methods.

The first is clumsy but reproduces exactly what you observe: write a test to the effect of "If I do a clean Bazel build of target //:foo, then look in the external directory, I should not observe the presence of the unrelated repo bar." See github.com/aspect-build/bazel-examples/tree.. for both the working example, and a sample test_no_eager_fetch.sh script which makes the assertion.
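The shape of such a check could look like the sketch below. It is not the script from the linked example. It assumes a WORKSPACE-style layout in which each fetched repository appears as a directory named after it under the output base's external directory, and the function names repo_not_fetched and check_no_eager_fetch are hypothetical.

```shell
# Allow the Bazel binary to be overridden, e.g. BAZEL=bazelisk.
BAZEL="${BAZEL:-bazel}"

# Succeeds if <output-base>/external/<repo> does not exist.
# Usage: repo_not_fetched <output-base> <repo-name>
repo_not_fetched() {
  local output_base="$1"
  local repo="$2"
  if [ -d "${output_base}/external/${repo}" ]; then
    echo "FAIL: @${repo} was fetched" >&2
    return 1
  fi
  echo "OK: @${repo} was not fetched"
}

# Clean build of one target, then assert the unrelated repo is absent.
# Usage: check_no_eager_fetch <target> <repo-name>
check_no_eager_fetch() {
  local target="$1"
  local repo="$2"
  "$BAZEL" clean --expunge &&
    "$BAZEL" build "$target" &&
    repo_not_fetched "$("$BAZEL" info output_base)" "$repo"
}
```

On CI this would be invoked as check_no_eager_fetch //:foo bar. The expunge step makes it slow, which is one reason to run it as its own pipeline step.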

The second is to use a bazel query to detect a path through the dependency graph from a target to an undesired external repo. We query for all the packages whose BUILD and .bzl files some targets depend on, then grep for packages in the slow repo. If the result is non-empty, we have found an eager fetch:

bazel query --output=package 'let targets = set(//some:target //some/other:target) in buildfiles(deps($targets))' | sort -u | grep @slow_repo
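To turn the query into a CI check, the grep result has to be inverted, since grep exits successfully exactly when the unwanted repo is found. A small sketch of one way to do that: fail_if_repo_referenced is a hypothetical helper that reads the package list on stdin, and it assumes packages from the external repo are printed with an @slow_repo// prefix.

```shell
# fail_if_repo_referenced is a hypothetical helper, not a Bazel command.
# Reads package names on stdin, one per line.
# Returns 1 and lists the matches if any belong to @<repo-name>.
#
# Usage:
#   bazel query --output=package 'buildfiles(deps(//some:target))' \
#     | fail_if_repo_referenced slow_repo
fail_if_repo_referenced() {
  local repo="$1"
  local hits
  # De-duplicate, then keep only packages from the given repo.
  # "|| true" stops grep's no-match exit status from failing the caller.
  hits="$(sort -u | grep -F "@${repo}//" || true)"
  if [ -n "$hits" ]; then
    echo "BUILD graph loads from @${repo}:" >&2
    echo "$hits" >&2
    return 1
  fi
}
```

The function exits non-zero only when a match is found, so it can be the last command of a CI step.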