Why would you want a hermetic C++ toolchain?
Bazel is usually thought of as a hermetic build system. But the default behaviour you get with Bazel's C/C++ built-in rules is not hermetic! Why does this matter? The short answer: Reproducibility and portability.
Developers want their code to compile correctly and reproducibly on CI systems and on other developers' machines. C++ toolchains typically work by relying on libraries found on the host system. There are two primary stages where this happens: compiling and linking.
- Compiling: the compiler produces an individual object file, usually with the .o extension, for each source file. For C and C++, the compiler relies on header files it finds on the host machine to collect information about the dependencies' APIs.
- Linking: the linker collects all the compiled object files and assembles them into the final output: either a main program that can be executed standalone or a shared library with the .so extension. During linking, the linker binds the final binary to the correct symbols, and those symbols are again found in files on the host machine. There are two ways this can be accomplished: static or dynamic linking. They are not mutually exclusive and are often mixed, hence the usage of terms like "fully static" and "mostly static." Some libraries, like libstdc++, have side effects when linked statically and are preferably linked dynamically (see below for more).
Aspect offers github.com/aspect-build/gcc-toolchain to configure Bazel with a hermetic C/C++ toolchain.
Why is it bad to rely on system libraries?
Relying on system libraries during the build is bad for reproducibility and portability. To solve this, we can use a "sysroot".
There will always be version skew in libraries between different machines. Even if it were true that "everyone is on the same OS version," there is no guarantee that the system libraries are the same. For example, someone may have installed a slightly different GCC on their system, enough to add a new symbol to the libstdc++.so file, and the linker will gladly use that new symbol. The output of the Bazel action will then differ and produce, at best, a cache miss and, at worst (but rarely), a different runtime result.
When we use a sysroot that provides a libstdc++.so during the build, the binary will always require the symbols it linked against from that libstdc++.so at runtime. The same is true for any other library in the sysroot used by the linker. This hermetic characteristic of the sysroot leads to a deterministic output that is reproducible between machines.
Take the example from "Why reproducibility?" and apply it here to libc.so. Every Linux system has a standard libc; it is one of the most important libraries in the system. While we don't want to link against the system libc at build time, it's stable enough to rely on at runtime. To accomplish this, we rely on the runtime search path (or rpath for short). That is, during the build we pass a -L flag so the linker finds the libc.so in the sysroot, but at runtime the ELF binary will find libc.so via the rpath (usually under /usr/lib/<arch>/libc.so). Because of how glibc handles API evolution, a binary linked against the symbols of an old glibc will be compatible with a newer glibc. The opposite is not true: linking against new symbols will cause runtime errors if those symbols are not present. To solve portability, the sysroot contained in this repository includes an old-enough version of glibc, which broadens the portability of the binaries produced.
Side effects of static linking
Every time we link a static archive (.a) into a binary, that binary will contain the symbols from the static archive, increasing the size of the final output. Even when stripping the binaries correctly, when the necessary symbols are duplicated across many binaries, the outputs tend to be much larger than when dynamically linking against the shared-object version of that library. This has a special impact on remote caching and remote build execution under Bazel. Unless the performance gain of static linking outweighs the losses in build time (and cost), dynamic linking is preferable.
The first feature a binary loses when statically linking libc is the ability to load other shared objects at runtime using dlopen. Since glibc uses dlopen extensively, statically linking it is not recommended. For extra context, musl does support static linking, but dlopen will still not be possible.
The standard C++ library is widely depended upon and will often be dynamically linked by many programs in the build graph under Bazel, e.g. tools and language interpreters. Language interpreters commonly allow native extensions, and more commonly still, those extensions are shipped as shared objects, subsequently loaded at runtime using dlopen.
Any shared object loaded at runtime that has been dynamically linked to libstdc++ (such as external pre-built binaries) will render the static linking effort useless. For users who understand these nuances well, static linking can be enabled by adding static_libstdcxx to the features attribute. See hello_world_cpp/BUILD.bazel.
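The referenced BUILD.bazel might look roughly like the following. This is a sketch; the target name and source file are assumptions based on the hello_world_cpp path, not the repository's actual contents:

```starlark
# hello_world_cpp/BUILD.bazel (sketch)
cc_binary(
    name = "hello_world_cpp",
    srcs = ["main.cpp"],
    # Opt this one target into static linking of libstdc++. Only do this
    # when no dlopen'ed dependency links libstdc++ dynamically.
    features = ["static_libstdcxx"],
)
```

Scoping the feature per target, rather than toolchain-wide, limits the blast radius if one of the caveats above applies.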
Always check if static linking is supported or advised for other libraries.