This post shows how much faster TypeScript builds can be when using remote execution, Bazel's unique ability to parallelize transpile and type-check work across a farm of machines. We hope that Bazel 6.0 will include fixes for symlink support, making it possible to use remote execution with Aspect's rules_ts.
How much faster?
Using a remote execution cluster of 100 executors provided by our partners at EngFlow, we benchmarked a large TypeScript application with 10M lines of code, representative of a large-scale enterprise application. The build was 8.4x faster with remote execution than when building locally on a 16-core MacBook Pro: 2 minutes and 13 seconds with remote execution vs. 18 minutes and 35 seconds locally. Full benchmark results are found further down in this post.
With remote execution, your build is no longer bound by the resources of your local or CI machines. You can scale your build compute horizontally by increasing the number of remote executors, keeping build times fast for even the largest TypeScript code bases.
The benchmarks in this post used 100 remote executors. Had we doubled the number of remote executors, we would expect to see roughly a 2x reduction in build times. How much you gain from increasing remote execution compute depends only on how wide your build graph is, that is, on how many actions can be run in parallel.
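The scaling argument above can be sketched with a rough lower bound: a build can never finish faster than its longest chain of dependent actions, nor faster than its total work divided evenly across executors. The snippet below is a minimal sketch of that bound. The names `BuildGraph` and `estimateBuildSeconds` and the numbers in `exampleGraph` are hypothetical and are not measurements from this benchmark.

```typescript
// Rough lower bound on wall-clock build time. All names and numbers here
// are hypothetical and only illustrate the shape of the scaling curve.
interface BuildGraph {
  totalWorkSeconds: number;    // sum of the durations of all actions
  criticalPathSeconds: number; // longest chain of dependent actions
}

function estimateBuildSeconds(graph: BuildGraph, executors: number): number {
  // Adding executors divides the parallel work, but the critical path
  // cannot be shortened by adding machines.
  return Math.max(graph.criticalPathSeconds, graph.totalWorkSeconds / executors);
}

// Illustrative graph: 12,000s of total work with a 60s critical path.
const exampleGraph: BuildGraph = { totalWorkSeconds: 12000, criticalPathSeconds: 60 };

for (const executors of [100, 200, 400]) {
  console.log(`${executors} executors: ${estimateBuildSeconds(exampleGraph, executors)}s`);
}
```

In this illustrative graph, doubling from 100 to 200 executors halves the estimate, while doubling again to 400 gains nothing because the critical path dominates. A wider graph keeps the benefit of extra executors for longer.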
Fixes coming in Bazel core
Why doesn't this work with Bazel 5?
rules_ts is built on top of rules_js, which uses a symlinked node_modules structure for linking. This means it inherently relies on Node.js tools, such as TypeScript, following symlinks to resolve npm dependencies.
Historically, Bazel turned symlink inputs into actual files on remote executors. This made it incompatible with any actions that depend on symlinks when executing, such as rules_js actions when resolving transitive npm dependencies.
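To see why turning symlinks into real files breaks transitive resolution, here is a toy in-memory model of a Node-style node_modules lookup. It is a sketch under stated assumptions: the pnpm-style `.store` layout, the package names `a` and `b`, and the helper names (`Layout`, `realpath`, `lookup`, `materialize`) are all hypothetical, and the lookup is a simplified version of what Node.js does rather than the actual implementation.

```typescript
// Toy in-memory file tree. The layout and directory names are illustrative
// assumptions, not the exact structure that rules_js creates.
interface Layout {
  dirs: string[];                   // real directories
  symlinks: Record<string, string>; // symlink path -> absolute target directory
}

function exists(layout: Layout, p: string): boolean {
  return layout.dirs.indexOf(p) !== -1 || layout.symlinks[p] !== undefined;
}

// Resolve symlinks one path component at a time (targets are real paths here).
function realpath(layout: Layout, p: string): string {
  let current = "";
  for (const part of p.split("/")) {
    if (part === "") continue;
    current = `${current}/${part}`;
    const target = layout.symlinks[current];
    if (target !== undefined) current = target;
  }
  return current;
}

// Simplified Node-style lookup: start at the importer's real directory and
// walk up, probing <dir>/node_modules/<name> at each level.
function lookup(layout: Layout, importerDir: string, name: string): string | undefined {
  let dir = realpath(layout, importerDir);
  while (dir !== "") {
    const base = dir.slice(dir.lastIndexOf("/") + 1);
    const candidate = `${dir}/node_modules/${name}`;
    if (base !== "node_modules" && exists(layout, candidate)) return candidate;
    dir = dir.slice(0, dir.lastIndexOf("/"));
  }
  return undefined;
}

// Model of the historical behavior: every symlink becomes a real directory.
function materialize(layout: Layout): Layout {
  return { dirs: layout.dirs.concat(Object.keys(layout.symlinks)), symlinks: {} };
}

const store = "/app/node_modules/.store";
const symlinked: Layout = {
  dirs: [`${store}/a@1.0.0/node_modules/a`, `${store}/b@1.0.0/node_modules/b`],
  symlinks: {
    "/app/node_modules/a": `${store}/a@1.0.0/node_modules/a`,
    [`${store}/a@1.0.0/node_modules/b`]: `${store}/b@1.0.0/node_modules/b`,
  },
};

// Package "a" requires its transitive dependency "b".
console.log(lookup(symlinked, "/app/node_modules/a", "b"));              // found next to a's real path
console.log(lookup(materialize(symlinked), "/app/node_modules/a", "b")); // undefined
```

With symlinks intact, the lookup starts from the real location of `a` inside the store and finds `b` beside it. Once the symlink is replaced by a plain directory, the lookup starts from `/app/node_modules/a` instead and never reaches the store, so `b` cannot be found.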
Thanks to recent work by Fabian Meumertzheim, symlinks are now supported with remote execution in Bazel 5.3.0. The last change outstanding for rules_js to work with remote execution is currently in review. This is a fix to keep unresolved symlinks relative in the sandbox & the runfiles trees.
Developer Productivity and Build Times
To benchmark remote execution with rules_ts we increased the number of actions used in the original rules_ts benchmarks by 20x so that a full clean build took around 20 minutes locally on a MacBook Pro. This is meant to represent a large-scale enterprise application.
Twenty minutes of waiting on build & test is a representative threshold at which many companies might start to consider bringing remote execution into their Bazel configuration to decrease build & test times and make their developers more productive.
Long waits on build & test can really kill developer productivity at an organization. At twenty minutes, a developer can only iterate three times an hour at best. When working on a difficult problem, quick iterations are crucial to flow and result in faster & better solutions. At twenty minutes, a developer is likely to context switch to other work rather than waiting for the next results. A developer may also settle on a less than ideal solution just so she can move past a problem rather than continuing to iterate slowly.
The benchmarks used for this post were run against a generated TypeScript code base that mimics a large enterprise scale. It has 100 features, 10 modules per feature, 10 components per module and 1001 lines of code per component. This makes for a total of 11,100 TypeScript files containing over 10 million lines of TypeScript in aggregate. That is a lot of TypeScript code!
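The size of the generated code base follows from those four numbers. The arithmetic below is a small sketch. The assumption that each module and each feature contributes one additional TypeScript file is ours, chosen because it is consistent with the stated total of 11,100 files.

```typescript
// Shape of the generated code base, using the figures from the text.
const features = 100;
const modulesPerFeature = 10;
const componentsPerModule = 10;
const linesPerComponent = 1001;

const modules = features * modulesPerFeature;          // 1,000
const components = modules * componentsPerModule;      // 10,000
const componentLines = components * linesPerComponent; // 10,010,000

// Assumption: one extra file per module and one per feature, which
// reproduces the 11,100 files quoted in the text.
const totalFiles = components + modules + features;    // 11,100

console.log({ modules, components, componentLines, totalFiles });
```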
For the Bazel build, each module maps to one Bazel target, for a total of 1100
A timestamp is written to each generated TypeScript source file in this benchmark to intentionally cause cache misses, so that actions are forced to re-run.
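As a minimal sketch of this cache-busting idea, the hypothetical generator below stamps each source file with its generation time. `renderComponentSource` and the file contents are illustrative and are not the benchmark's actual generator. Because Bazel keys cached action results on the content of their inputs, any change to the stamp yields different file contents and therefore a cache miss.

```typescript
// Hypothetical source generator: the function name and the generated
// contents are illustrative, not the benchmark's real generator.
function renderComponentSource(name: string, generatedAt: Date): string {
  return [
    `// generated at ${generatedAt.toISOString()}`,
    `export const ${name} = "${name}";`,
    "",
  ].join("\n");
}

// Two runs of the generator at different times produce different bytes,
// so actions consuming the file cannot be served from cache.
const firstRun = renderComponentSource("cmp0", new Date(0));
const secondRun = renderComponentSource("cmp0", new Date(1000));
console.log(firstRun === secondRun); // false
```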
These benchmarks were run on a MacBook Pro (16-inch 2019), 2.4 GHz 8-Core Intel Core i9, 64 GB 2667 MHz DDR4 running macOS Monterey 12.5.1.
The remote execution cluster, provided by our partners at EngFlow, was made up of 100 executors on AWS c6i.xlarge instances.
The versions of TypeScript and the rule sets used were:
- TypeScript 4.8.2
- rules_nodejs 5.5.0
- @bazel/typescript 5.5.0
- @bazel/concatjs 5.5.0
- aspect_rules_js 1.1.2
- aspect_rules_ts 1.0.0-rc2
- aspect_rules_swc PR#57 (a soon-to-be-landed performance enhancement which uses a new pure rust CLI for swc)
GitHub Actions Hosts
We also ran the benchmark on standard GitHub Actions machines with 2 cores and 7 GB of RAM. These machines were not powerful enough to run local actions in comparable times or without running out of memory, so only remote execution actions were benchmarked on GitHub Actions hosts.
The ability to run a large Bazel build on relatively small machines is one of the benefits of Bazel remote execution. Instead of allocating CI machines with hundreds of cores, you can instead run your build on very small CI hosts backed by a large auto-scaling remote execution cluster. If tuned well, this configuration can result in significant cost savings on compute.
rules_js, which reached 1.0.0 less than a month ago, is already in use by many companies that we've talked to. We've seen a lot of interest on the
For historical context, while remote execution was possible in some configurations with rules_nodejs, it has never worked well.
Originally, the Node.js toolchain was difficult to use if the host & execution platform did not match. This is the case, for example, if you run Bazel locally on a MacBook but the remote execution cluster uses Linux executors. This issue has now been fixed in the rules_nodejs toolchain layer, which is shared between rules_js and rules_nodejs rules.
Next, rules_nodejs historically enumerated all files in every npm dependency as individual inputs. This resulted in hundreds of thousands of input files to actions in large projects, which noticeably slowed down sandbox and runfiles tree creation. When rules_nodejs was updated to use source directories for npm dependency inputs, this resolved the excessive number of inputs, but the optimization was not compatible with remote execution since remote execution does not support source directory inputs.
Finally, rules_nodejs was updated to use declared directories for npm dependencies. This made it compatible with remote execution, but the additional overhead of making a directory copy of each npm dependency was a noticeable performance hit to the already slow, eager npm dependency fetching & linking. rules_nodejs also still suffered from all the other problems inherent in its in-action runtime linker.
Full builds vs. "devserver" builds
In these benchmarks we measure two different scenarios:
- A full clean build (bazel build ...) followed by an incremental bazel build ... after making a change to a leaf TypeScript file.
- A clean "devserver" build (bazel build :devserver), which emulates a typical developer workflow of building while running a tool such as a devserver, followed by an incremental bazel build :devserver after making a change to a leaf TypeScript file.
The "devserver" scenario is an important measure that emulates the typical local development workflow of coding while running tools such as a devserver or a test runner such as jest. These tools are often run in watch mode while making changes to source code. The faster the build is after a change, the shorter the round trip to get feedback on that change.
Ideal build times to maximize developer productivity are less than 1 second on changes to leaf nodes and less than 10 seconds on changes that affect large parts of the graph. With a good dependency graph, these ideal times can be preserved even as the project grows.
ts_project vs. ts_library
ts_project was originally developed in rules_nodejs as an alternative to ts_library to provide a cleaner API better suited for the many ways TypeScript is used outside of Google. While the API was better suited for the wild, it could not compete with ts_library, a heavily optimized and deeply integrated wrapper around the TypeScript compiler, on performance.
ts_project from rules_ts has significantly reduced the performance gap with ts_library by adding first-class support for Bazel workers and now support for remote execution. We'll refer to ts_project from rules_ts as simply ts_project in this blog post. The original ts_project from rules_nodejs we'll refer to as the @bazel/typescript
In these benchmarks, we'll measure both ts_project rules configured with swc as the transpiler. swc is an order of magnitude faster than TypeScript for pure transpilation, but it does not type-check, so TypeScript is still used for type checking in this split configuration.
The split configuration also removes type checking from the build graph for devserver and test targets, so only transpilation is needed to build them, reducing the round-trip-time on changes when running such targets by an order of magnitude. Type checking is handled in separate targets that can be run explicitly or with the catch-all bazel build ....
Here are the results of the benchmarks.
Full clean build
Fastest: ts_project + swc with remote execution
The fastest full clean build we observed was on the MacBook Pro host with ts_project + swc and remote execution. This full transpile & type-check of 10M lines of TypeScript code took just 2 minutes and 13 seconds, 8.4x faster than the equivalent build without remote execution running on the MacBook host, which took 18 minutes and 35 seconds.
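The 8.4x figure is the ratio of the two wall-clock times quoted above. A minimal sketch of the arithmetic follows. The helper names `toSeconds` and `speedup` are ours.

```typescript
// Convert a "minutes and seconds" duration to seconds.
function toSeconds(minutes: number, seconds: number): number {
  return minutes * 60 + seconds;
}

// Speedup is the baseline time divided by the improved time.
function speedup(baselineSeconds: number, improvedSeconds: number): number {
  return baselineSeconds / improvedSeconds;
}

const localSeconds = toSeconds(18, 35); // 1115
const remoteSeconds = toSeconds(2, 13); // 133

const ratio = speedup(localSeconds, remoteSeconds);
console.log(`${ratio.toFixed(1)}x faster`); // 8.4x faster
```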
In the ts_project + swc configuration, we ran the 11,100 swc transpile actions (one per TypeScript file) locally on the 16 MacBook Pro cores while the remote execution cluster ran the TypeScript type-check and declaration file emit actions. Transpiling with swc is so fast and short-lived that running it locally is faster than using remote execution, due to network latency and the overhead of uploading inputs and downloading outputs.
Runners-up: ts_project & ts_library with remote execution on GitHub Actions host
The ts_project build took only 2 minutes and 29 seconds, 7.5x faster than the equivalent build without remote execution running on the MacBook host, which took 18 minutes and 45 seconds. The ts_library build was virtually identical at 2 minutes and 31 seconds.
Running on GitHub Actions, with all actions executing remotely, was faster than running the same configuration on the MacBook host, which took 3 minutes and 2 seconds. The faster build times on GitHub Actions were due to the superior network connection of the GitHub Actions machines compared to the MacBook host running in my home office. The TL;DR is that when you're using remote execution, your uplink speed to the remote execution cluster matters.
Incremental full builds
ts_library had the best incremental full build time at 4.1s with its heavily optimized workers. @bazel/typescript ts_project with swc was the runner-up at 7.5s. ts_project with swc was a close third, taking 8.0s. In the smaller original rules_ts benchmark, ts_project with swc was 2nd fastest after
Incremental builds with remote execution were slower than local in this scenario as there were not many actions to run and the network overhead of using remote execution made the overall time slower.
tsc is quite a bit slower than every other configuration in this benchmark for incremental builds. It is configured as a single project. In a real-world scenario, you would likely split it into multiple invocations and use TypeScript project references between them, which may result in faster build times.
Clean "devserver" builds
Running swc actions on remote executors is a de-optimization on the MacBook Pro host with 16 cores: the network latency and upload/download time add more overhead than the extra executors save.
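A toy cost model makes this trade-off concrete. It is only a sketch: the `ActionBatch` type, both estimator functions, and every per-action number below are hypothetical and were not measured in this benchmark. Only the count of 11,100 swc actions comes from this post. The model assumes local time is compute divided across local cores, while remote time is compute divided across executors plus per-action transfer overhead that is shared by a limited number of parallel transfers.

```typescript
// Toy model: all names and per-action numbers are hypothetical.
interface ActionBatch {
  actions: number;         // number of independent actions
  execSeconds: number;     // compute time per action
  transferSeconds: number; // upload + download + round trip per remote action
}

function estimateLocalSeconds(batch: ActionBatch, localCores: number): number {
  return (batch.actions * batch.execSeconds) / localCores;
}

function estimateRemoteSeconds(
  batch: ActionBatch,
  executors: number,
  parallelTransfers: number,
): number {
  const compute = (batch.actions * batch.execSeconds) / executors;
  const transfer = (batch.actions * batch.transferSeconds) / parallelTransfers;
  return compute + transfer;
}

// Many tiny actions (swc-like) vs. fewer heavy actions (type-check-like).
const transpileBatch: ActionBatch = { actions: 11100, execSeconds: 0.05, transferSeconds: 0.2 };
const typeCheckBatch: ActionBatch = { actions: 1000, execSeconds: 10, transferSeconds: 0.2 };

for (const batch of [transpileBatch, typeCheckBatch]) {
  const local = estimateLocalSeconds(batch, 16);
  const remote = estimateRemoteSeconds(batch, 100, 50);
  console.log(local < remote ? "run locally" : "run remotely");
}
```

Under these illustrative numbers, the short-lived transpile actions are cheaper to run locally because transfer overhead dominates, while the heavier type-check actions benefit from the remote cluster.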
Incremental "devserver" builds
Like clean "devserver" builds, the most performant incremental "devserver" builds are the ones that use swc for transpilation with local actions. Both ts_project rules with swc clocked around 2 seconds. ts_library without remote execution was comparably fast at 3.2s.
Remote execution for incremental "devserver" builds was slower than local execution due to the additional network latency and upload/download time.
The Bottom Line
Remote execution with rules_ts on large TypeScript projects can make an order of magnitude impact on large build times. Your developers will be more productive and thank you when their wait time for large builds is reduced by 10x or more. As your project grows, remote execution makes it possible to easily scale your build compute horizontally to keep large builds fast.
For very fast incremental builds with few actions, however, developers may get faster build times by keeping actions running locally, depending on how fast their connection to the remote execution cluster is. Cloud-hosted development environments may solve this discrepancy in the future, since the developer machines can be located very close to the remote execution cluster to keep network overhead at a minimum.