CODEOWNERS and Bazel

Gating submission of code changes on the right reviewers is a critical and nuanced problem.

Many companies use GitHub for code review. It reads ownership from a single CODEOWNERS file per branch, located in the repository root, docs/, or .github/. This clearly wasn't designed for monorepos, where each org wants to maintain its own "Ownership" semantics, and each team beneath it may want overrides. It ought to follow the example of many linters, which treat any source file as governed by the nearest ancestor configuration file.
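To make the "nearest ancestor" rule concrete, here is a minimal Python sketch. It is not how any linter or GitHub actually implements this; the function name and the idea of passing in a set of directories that contain an OWNERS file are assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import Optional


def nearest_owners_dir(path: str, owners_dirs: set[str]) -> Optional[str]:
    """Return the closest ancestor directory of `path` that has an OWNERS file.

    `owners_dirs` holds repo-relative directories; "" stands for the repo root.
    (Hypothetical helper, not part of GitHub or Bazel.)
    """
    # .parents walks upward: "a/b/c.py" -> "a/b", "a", "."
    for parent in PurePosixPath(path).parents:
        key = "" if str(parent) == "." else str(parent)
        if key in owners_dirs:
            return key
    return None  # no OWNERS file anywhere above this path


dirs = {"", "my_org", "my_org/team_a"}
print(nearest_owners_dir("my_org/team_a/src/main.py", dirs))
print(nearest_owners_dir("my_org/team_b/lib.py", dirs))
```

The team's own file wins for its subtree, and anything the team doesn't claim falls back to the org, then to the root.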

Bazel is a monorepo build and test tool, but it's closely related to code review as well, so this is the sort of problem we might expect it to solve. The naive answer is to encode ownership in Bazel's dependency graph, as https://github.com/zegl/rules_codeowners does (disclaimer: I'm a contributor there).

However, layering rules_codeowners on top of the dependency graph is not ideal. It requires a parent folder to list its children, and those are then listed in a big registration block in generate_codeowners. You don't want to declare such a dependency graph, because it inverts the graph and causes "eager fetches": in order to load the root package you accidentally have to load //my_org/your_slow_team and //other_org/made_bad_choices, making builds slow for everyone.

Note that Google's monorepo has a separate file, which is just a textproto called OWNERS. Bazel (a.k.a. Blaze) is not involved. It's similar to https://www.kubernetes.dev/docs/guide/owners/ from what I can tell.

What you really want is a "whole-repo operation" that reads the data files spread around the repository, and Bazel isn't a good choice for such operations since any given node in the dependency graph should have a limited transitive reachable scope based on the dependencies in the source code.

Paid options

I've used a standalone service like https://www.pullapprove.com/ before. It gives you a great deal of expressiveness in policies around code changes. However, it integrates with GitHub only as an additional status on PRs, the same as a CI system. It doesn't understand your GitHub teams or play with the built-in "Owned by" feature in the GitHub user interface: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners#about-code-owners

So let's say we really want CODEOWNERS, but we want it to work with a monorepo.

Okay, so what could we do instead?

There's a key observation: in reviewing a commit that modifies an OWNERS file, we don't need the new values to be "live" in evaluating the policies of who reviews that change. Quite the opposite: if I make a PR to remove your team from the set of required owners of some file, your team should be required to approve that change. This means we're fine with the OWNERS semantics applying only after the commit gets merged to the main branch.

This means we can treat CODEOWNERS as a continuous delivery problem. For any green commit on main we can aggregate OWNERS files from the whole repository into a correct CODEOWNERS file, then "deliver" that with a bot commit back into the repository whenever it changes.
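The aggregation step could be sketched roughly like this in Python. The OWNERS format here is an assumption (one GitHub handle or team per line, with # comments), as are all the function names; a real script would also want to skip build output directories and escape paths containing spaces. The one GitHub rule it relies on is that the last matching pattern in CODEOWNERS takes precedence, so deeper directories must be emitted after their ancestors.

```python
from pathlib import Path

OWNERS_FILENAME = "OWNERS"  # assumed name of the per-directory files


def parse_owners(text: str) -> list[str]:
    """Parse an assumed OWNERS format: one handle per line, '#' starts a comment."""
    owners = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            owners.append(line)
    return owners


def collect_owners(repo_root: Path) -> dict[str, list[str]]:
    """Map each repo-relative directory ("" = root) to the owners it declares."""
    result = {}
    for owners_file in repo_root.rglob(OWNERS_FILENAME):
        if not owners_file.is_file():
            continue
        rel_dir = owners_file.parent.relative_to(repo_root).as_posix()
        result["" if rel_dir == "." else rel_dir] = parse_owners(owners_file.read_text())
    return result


def render_codeowners(owners_by_dir: dict[str, list[str]]) -> str:
    """Render CODEOWNERS text with ancestors before descendants.

    Sorting works because an ancestor path is a string prefix of its
    descendants, so it always sorts first and the deeper rule wins.
    """
    lines = ["# Generated from OWNERS files. Do not edit by hand."]
    for directory in sorted(owners_by_dir):
        owners = owners_by_dir[directory]
        if not owners:
            continue  # an empty OWNERS file leaves the parent's rule in effect
        pattern = f"/{directory}/" if directory else "*"
        lines.append(f"{pattern} {' '.join(owners)}")
    return "\n".join(lines) + "\n"


def write_if_changed(target: Path, content: str) -> bool:
    """Write `content` and return True only when it differs from what is on disk."""
    if target.exists() and target.read_text() == content:
        return False
    target.write_text(content)
    return True
```

A CI job on main could call collect_owners, then render_codeowners, then write_if_changed, and have the bot commit only when the last call returns True.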