Bazel can write to the source folder!

Bazel is Google's open-sourced build tool. When used internally at Google, it comes with a set of idioms that Googlers naturally take for granted and associate with Bazel. These can accidentally become part of the accepted dogma around Bazel migration.

The one I see most often is the false perception that "Bazel cannot write to the source folder, so you can no longer check in generated files, nor keep them in the source tree but ignored by VCS".

Typically you shouldn't do it

Intermediate outputs in Bazel are meant to be used directly as inputs to another target in the build. For example, if you generate language-specific client stubs from a .proto file, those stay in the bazel-out folder and a later compiler step should be configured to read them from there.

However, there are plenty of cases where outputs do need to go in the source folder:

  • workaround for an editor plugin that only knows how to read from the source folder and can't be configured to look in bazel-out
  • "golden" or "snapshot" files used for tests
  • generated documentation that's checked in next to sources
  • files that you need to be able to search or browse from your version control GUI

Yes you can do it

If you restrict yourself to only bazel build and bazel test, then it's true that neither of these commands can mutate the source tree. Bazel is strictly a transform tool from the sources to its own bazel-out folder. However, bazel run has no such limitation, and in fact always sets an environment variable BUILD_WORKSPACE_DIRECTORY which makes it easy to find your sources and modify them.
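As a minimal sketch of what a program launched by bazel run can do with that variable, the bash function below writes a file into the source tree. The function name write_to_workspace and the paths are hypothetical, chosen only for illustration; BUILD_WORKSPACE_DIRECTORY is the only part that comes from Bazel.

```shell
#!/usr/bin/env bash

# Hypothetical helper: write text into a file under the workspace root.
#   $1 = path relative to the workspace root
#   $2 = content to write
write_to_workspace() {
  local rel_path="$1"
  local content="$2"
  # The variable is set by `bazel run`. If the script is started some
  # other way it may be missing, so fail loudly instead of guessing.
  if [ -z "${BUILD_WORKSPACE_DIRECTORY:-}" ]; then
    echo "BUILD_WORKSPACE_DIRECTORY is not set; use 'bazel run'" >&2
    return 1
  fi
  local dest="${BUILD_WORKSPACE_DIRECTORY}/${rel_path}"
  # Create any missing parent folders, then write the file.
  mkdir -p "$(dirname "$dest")"
  printf '%s\n' "$content" > "$dest"
}
```

Called from a script that bazel run launches, write_to_workspace docs/out.txt "hello" would create docs/out.txt under the workspace root.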

This leads us to the "Write to Sources" pattern for Bazel. We'll use bazel run to make the updates, and bazel test to make sure developers don't allow the file in the source folder to drift from what Bazel generates.
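Stripped of any Bazel specifics, the two halves of the pattern can be sketched as a pair of bash functions. This is only an illustration of the idea, not how any particular rule implements it; the names check_in_sync and update_source are made up for this sketch.

```shell
#!/usr/bin/env bash

# Check half (the role `bazel test` plays): succeed only when the
# checked-in file is byte-for-byte identical to the generated one.
check_in_sync() {
  local checked_in="$1"
  local generated="$2"
  if ! cmp -s "$checked_in" "$generated"; then
    echo "$checked_in is out of date. Please run:  bazel run //:update" >&2
    return 1
  fi
}

# Update half (the role `bazel run` plays): overwrite the checked-in
# file with the freshly generated one.
update_source() {
  local checked_in="$1"
  local generated="$2"
  cp -f "$generated" "$checked_in"
}
```

The check never modifies anything, so it is safe to run on CI, while the update is something a developer runs deliberately.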

Note that this pattern does have one downside, compared with build tools that allow a build to directly output into the source tree. Until you run the tests, it's possible that you're working against an out-of-date file in the source folder. This could mean you spend some time developing, only to find on CI that the generated file needs to be updated, and then after updating it, you have to make some fixes to the code you wrote.

The easiest way to use this pattern is with rules that already exist for this purpose. Aspect has a write_source_files rule, and another option is updatesrc from Chuck Grindel.

You can also assemble the parts yourself, directly in a BUILD.bazel file. Here's the basic recipe, which I've adapted to many scenarios. For example, many of the core Bazel rulesets now use this pattern to keep their generated API markdown files in sync with the sources.

load("@bazel_skylib//rules:diff_test.bzl", "diff_test")
load("@bazel_skylib//rules:write_file.bzl", "write_file")

# Config:
# Map from some source file to a target that produces it.
# This recipe assumes you already have some such targets.
_GENERATED = {
    "some-source": "//:generated.txt",
    # ...
}

# Create a test target for each file that Bazel should
# write to the source tree.
[
    diff_test(
        name = "check_" + k,
        # Make it trivial for devs to understand that if
        # this test fails, they just need to run the updater
        # Note, you need bazel-skylib version 1.1.1 or greater
        # to get the failure_message attribute
        failure_message = "Please run:  bazel run //:update",
        file1 = k,
        file2 = v,
    )
    for k, v in _GENERATED.items()
]

# Generate the updater script so there's only one target for devs to run,
# even if many generated files are in the source folder.
write_file(
    name = "gen_update",
    out = "update.sh",
    # Mark the generated script as executable, since sh_binary runs it.
    is_executable = True,
    content = [
        # This depends on bash, would need tweaks for Windows
        "#!/usr/bin/env bash",
        # Bazel gives us a way to access the source folder!
        # Quoted so that a workspace path containing spaces still works.
        "cd \"$BUILD_WORKSPACE_DIRECTORY\"",
    ] + [
        # Paths are now relative to the workspace.
        # We can copy files from bazel-bin to the sources
        "cp -fv bazel-bin/{1} {0}".format(
            k,
            # Convert label to path
            v.replace(":", "/"),
        )
        for k, v in _GENERATED.items()
    ],
)

# This is what you can `bazel run` and it can write to the source folder
sh_binary(
    name = "update",
    srcs = ["update.sh"],
    data = _GENERATED.values(),
)

You may want to tweak the recipe. For example, when the output files are Markdown, I append ".md" to the keys. If your files follow a naming convention, you might be able to configure it with just a list rather than a dictionary.
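One step in the recipe that is easy to misread is the "Convert label to path" replacement, so here is the same transformation as a small bash sketch. The function name label_to_bin_path is hypothetical. It assumes, as the recipe does, that labels are written in the full //pkg:name form and that the outputs land under bazel-bin.

```shell
#!/usr/bin/env bash

# Hypothetical helper mirroring the recipe's v.replace(":", "/"):
# turn a label such as //docs:api.md into a path under bazel-bin.
label_to_bin_path() {
  local label="$1"
  # Replace the ":" separator with "/", then prefix with bazel-bin/.
  printf 'bazel-bin/%s\n' "$(printf '%s' "$label" | tr ':' '/')"
}

label_to_bin_path "//:generated.txt"   # prints bazel-bin////generated.txt
label_to_bin_path "//docs:api.md"      # prints bazel-bin///docs/api.md
```

The leading // of the label is kept, so the result contains repeated slashes. Those are harmless here because file paths treat consecutive slashes as a single separator. A shorthand label such as :generated.txt would not map to the right package folder, which is why the full form is assumed.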