Versioning releases from a monorepo

Moving code into a monorepo is just the beginning of a transition. Ideally, each team gives up control of, and responsibility for, governing its own development process. In exchange, it gets a consistent, centrally supported experience, and is ultimately glad to leave that "DevInfra" work behind and focus on product. Management is happy too, and the team spreads monorepo enthusiasm through the organization.

"Monorepo" sounds like a version control concern, but I've written before about things that become "mono" in addition to the git repository, such as the CI build (blog.aspect.dev/monorepo-shared-green). Today's topic is simpler: how to apply a version to the artifacts you release from a monorepo.

Trunk-based development

Teams usually arrive in the monorepo with their prior workflow of tagging their own releases. Their version scheme is specific to their project, and may include Semantic Versioning. Their first-party dependencies come from other repos in the organization and have their own versions, which are handled exactly as if they were third-party packages from the internet. When both ends of such a dependency edge live together in the same repo, this causes a strange time-travel effect: an application released at a given commit depends on a library in ../some-lib at a version many commits in the past. When browsing the source repo, you can't just follow the ../some-lib link to reason about how the application interacts with the library. And the developers of some-lib make releases without knowing if they will work with applications at HEAD. This is like branching, even if there's never a release branch for some-lib, and eventually there is "merge hell".

There's also a performance penalty in Git when a repo has a large number of tags: even operations unrelated to tags, like git status, slow down. We can end up with a classic "tragedy of the commons" if developers keep cutting releases of their libraries and apps by pushing tags to the shared remote. Someone at Stripe knows all the details; if you'd like to hear them, just let me know.

The solution to this is sometimes called "trunk-based development", which means that dependencies should be at HEAD. When the last of a library's dependents moves into the monorepo, no one needs to reference that library by a version, and it can stop doing versioned releases altogether. What should it do instead?

Monoversion

The maintainers of the monorepo (e.g. a DevInfra team) can offer a single version across the entire monorepo. You won't convince every team to change their ways, but over time you can make this experience so simple and effective that you overcome change aversion.

There are two options to choose from:

Automated. Every commit in the monorepo gets a version.
Manual. Someone still chooses a tag to apply at a commit and pushes it.

Automated

We'll make versions that look like

2020.44.123+abc1234

[year].[week].[# commits so far that week]+[git SHA]

Benefits:

We can still have a version that "looks like semver", which means that any tool that parses version numbers will keep working. For example, your monitoring software will still show that your app started to crash-loop only after the new release was deployed.
We can make it easy to reason about "when is this release from" and "which of these two versions is later".
The short git SHA still appears in the "build metadata" field, so you can navigate to the sources as they existed when the release was built.
You can cut a release from anywhere without having to push commits or tags to the repo.
It avoids a proliferation of refs, which would slow down Git.
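The "which of these two versions is later" question can be sketched with a numeric sort on the three dot-separated fields. This is a minimal sketch, not part of the original workflow: later_version is a hypothetical helper name, and it assumes both inputs follow the [year].[week].[count]+[sha] format shown above. The build metadata after the plus sign plays no part in the ordering unless the three numeric fields tie, which is close to how semver treats it.

```shell
# Hypothetical helper: print the later of two monoversion strings.
# Assumes both arguments look like 2020.44.123+abc1234.
later_version() {
  printf '%s\n' "$1" "$2" |
    LC_ALL=C sort -t . -k 1,1n -k 2,2n -k 3,3n |
    tail -n 1
}

later_version 2020.44.123+abc1234 2020.45.2+def5678   # prints 2020.45.2+def5678
```

Sorting numerically rather than as text matters for the commit count: 10 commits into a week should come after 9.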

Downsides:

This isn't semantic versioning: we don't attempt to indicate what's a breaking change. We assume that it's infeasible to coordinate development across the monorepo to force teams to operate on a common cadence.

How to do it:

Tag the repo at the beginning of each week. If you use GitHub, the easy way is to drop in a .github/workflows/weekly-tag.yaml file that just calls an API endpoint to add the tag. (This avoids the expense of cloning the repo just to run git push.) Here's my solution: gist.github.com/alexeagle/ad3f1f4f90a5394a8..
Use this command to determine the version:

git describe --long --match="[0-9][0-9][0-9][0-9].[0-9][0-9]" | sed -e 's/-/./;s/-g/+/'
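The two steps above can be sketched roughly as follows. This is not the gist linked above: weekly_tag_name, create_ref_payload, and describe_to_version are hypothetical helper names, and using the ISO week number for the tag is an assumption. The payload has the shape that GitHub's "create a reference" REST endpoint (POST /repos/OWNER/REPO/git/refs) expects, so a workflow could send it with any HTTP client instead of cloning.

```shell
# Name for this week's tag, e.g. 2020.44 (ISO year and zero-padded ISO week).
# Assumption: ISO weeks; the original workflow may number weeks differently.
weekly_tag_name() {
  date -u +%G.%V
}

# JSON body for GitHub's "create a reference" endpoint.
# $1 = tag name, $2 = SHA of the object the tag should point at.
create_ref_payload() {
  printf '{"ref":"refs/tags/%s","sha":"%s"}\n' "$1" "$2"
}

# Turn `git describe --long` output into the version string:
# the first hyphen becomes a dot, and "-g" becomes the "+" before the SHA.
describe_to_version() {
  sed -e 's/-/./;s/-g/+/'
}

# Sample input, standing in for the output of:
#   git describe --long --match="[0-9][0-9][0-9][0-9].[0-9][0-9]"
echo "2020.44-123-gabc1234" | describe_to_version   # prints 2020.44.123+abc1234
```

One caveat with this sketch: git describe only considers annotated tags unless --tags is passed. A ref that points straight at a commit is a lightweight tag, so you would either create an annotated tag object first and point the ref at it, or add --tags to the describe command.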

If you use Bazel, you can tuck this into your workspace_status_command, and then Bazel's stamping feature will put the right value into your build outputs whenever you build with --stamp. Also note that build metadata (the plus character and everything after it) isn't legal in Docker tags, so you might need to use a hyphen instead. Strictly speaking, a hyphen marks the version as a pre-release, per the spec: semver.org/#spec-item-9
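Here is a minimal sketch of such a status script, under stated assumptions. Bazel runs the workspace_status_command and reads "KEY value" lines from its standard output, treating keys prefixed with STABLE_ as stable status. The key names STABLE_MONOREPO_VERSION and STABLE_DOCKER_TAG and both helper names are hypothetical, not taken from the original setup.

```shell
# Hypothetical helper: swap the "+" for a hyphen, since "+" isn't allowed in Docker tags.
docker_safe_tag() {
  printf '%s\n' "$1" | tr '+' '-'
}

# Print the key/value lines Bazel expects from a workspace status command.
# $1 = version string such as 2020.44.123+abc1234
emit_status() {
  echo "STABLE_MONOREPO_VERSION $1"
  echo "STABLE_DOCKER_TAG $(docker_safe_tag "$1")"
}

# In a real script the argument would come from the git describe pipeline;
# a fixed sample is used here so the sketch runs outside a repository.
emit_status "2020.44.123+abc1234"
```

Pointing --workspace_status_command at a script like this and building with --stamp makes both values available to stamped rules.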

Manual

You might still have a reason to tag releases yourself. For example, you may be shipping a product that your users will expect to follow Semantic Versioning.

Downsides:

You have to coordinate features and breaking changes across the entire repository, so that you follow Semantic Versioning.
You have to tag the repo, which means someone must decide what tag to apply (bump the major, minor, or patch?) and must have write permission to push it.

How to do it:

We use github.com/choffmeister/git-describe-semver in Aspect's monorepo. You just need to install that package and use it when cutting a release. We wrapped it with a Bash script (gist.github.com/alexeagle/041f116ecb576aed7..) so that Bazel's workspace_status_command can call it without making users install it themselves.
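Whatever tool computes the version between tags, someone still has to choose the next tag. The bump rules from the Semantic Versioning spec can be sketched as below. This is only an illustration of the decision, not part of git-describe-semver or of Aspect's script, and next_version is a hypothetical helper name. It assumes a plain MAJOR.MINOR.PATCH input with an optional leading "v", which it drops.

```shell
# Hypothetical helper: compute the next tag from the current one.
# $1 = current version (MAJOR.MINOR.PATCH, optional leading "v")
# $2 = which part to bump: major, minor, or patch
next_version() {
  version="${1#v}"
  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"
  case "$2" in
    major) echo "$((major + 1)).0.0" ;;              # breaking change
    minor) echo "${major}.$((minor + 1)).0" ;;       # backward-compatible feature
    patch) echo "${major}.${minor}.$((patch + 1))" ;; # backward-compatible fix
    *) echo "unknown bump kind: $2" >&2; return 1 ;;
  esac
}

next_version 1.4.2 minor   # prints 1.5.0
```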