The Many Caches of Bazel
As its "{fast, correct} — choose two" tagline promises, a major feature of Bazel is performance. Caching is a key technique Bazel uses to improve build speed. Bazel deploys several kinds and layers of caches. There are so many caches that it’s hard to keep them straight. Additionally, frequently used terms like “action cache” can be ambiguous. This blog post will lay out the major Bazel caching mechanisms.
In-memory caches
Internally, Bazel is built on an incremental evaluation framework called Skyframe.[1] Bazel splits the work of a build into a DAG of numerous Skyframe computations, each of which may be individually cached.
The in-memory Skyframe cache is Bazel’s most comprehensive cache layer. It caches many small steps of the build. For example, all these are cached by Skyframe:
- the results of `glob()` calls in `BUILD` files
- the action graph generated by rules
- build actions themselves
Most of this data does not have persistent caches and must be recreated whenever the Bazel server dies. Preserving these caches across commands is a primary reason Bazel runs a daemon in the background.
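The idea of caching each node of a DAG and recomputing only what an input change invalidates can be sketched in a few lines. This is a toy model, not Skyframe's API: the `Evaluator` class, its method names, and the node keys `glob`, `actions`, and `build` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Set


class Evaluator:
    """Toy incremental evaluator: memoizes node values and tracks reverse deps."""

    def __init__(self) -> None:
        self.functions: Dict[str, Callable[["Evaluator"], object]] = {}
        self.values: Dict[str, object] = {}      # the in-memory cache
        self.rdeps: Dict[str, Set[str]] = {}     # key -> nodes that requested it
        self.stack: List[str] = []               # nodes currently being computed
        self.computations = 0                    # how many node functions ran

    def define(self, key: str, fn: Callable[["Evaluator"], object]) -> None:
        """Register or replace a node function; dependents lose cached values."""
        self.functions[key] = fn
        self.invalidate(key)

    def get(self, key: str) -> object:
        if self.stack:
            # Record that the node being computed depends on `key`.
            self.rdeps.setdefault(key, set()).add(self.stack[-1])
        if key not in self.values:
            self.stack.append(key)
            try:
                self.values[key] = self.functions[key](self)
                self.computations += 1
            finally:
                self.stack.pop()
        return self.values[key]

    def invalidate(self, key: str) -> None:
        """Drop `key` and everything that transitively depends on it."""
        pending, seen = [key], set()
        while pending:
            current = pending.pop()
            if current in seen:
                continue
            seen.add(current)
            self.values.pop(current, None)
            pending.extend(self.rdeps.get(current, ()))


ev = Evaluator()
ev.define("glob", lambda e: ["a.cc", "b.cc"])
ev.define("actions", lambda e: [f"compile {src}" for src in e.get("glob")])
ev.define("build", lambda e: len(e.get("actions")))

first = ev.get("build")             # computes all three nodes
second = ev.get("build")            # served from memory, nothing recomputed
runs_after_two_builds = ev.computations

ev.define("glob", lambda e: ["a.cc"])   # a changed input invalidates dependents
third = ev.get("build")
```

Because everything lives in the `values` dictionary of one process, the cache disappears when that process exits, which mirrors why a long-lived server is needed to keep this layer warm.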
Repository cache
Bazel can pull in build sources from external archives through several `WORKSPACE` and `MODULE.bazel` mechanisms, most prominently the `http_archive` rule. The repository cache stores these archives in the filesystem, so multiple downloads of the same artifact can be avoided. It keeps its data in a filesystem tree that can be found by running `bazel info repository_cache`.
The repository cache can be shared across multiple workspaces and Bazel processes. However, since it only stores archives, a Bazel workspace must perform the costly expansion step for any archives that it uses.[3] Another defect of the repository cache is that there is no automated pruning process, so it grows without bound.[2]
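A download cache of this kind can be modeled as a directory keyed by the archive's checksum. The sketch below is an illustration only: the `fetch_archive` function, the `<sha256>/file` layout, and the injected `download` callable are assumptions made for the example, not a description of Bazel's on-disk format or code.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Callable, List


def fetch_archive(url: str, sha256: str, cache_dir: Path,
                  download: Callable[[str], bytes]) -> Path:
    """Return the cached archive path, downloading only on a cache miss."""
    cached = cache_dir / sha256 / "file"   # hypothetical layout
    if cached.exists():
        return cached
    data = download(url)
    actual = hashlib.sha256(data).hexdigest()
    if actual != sha256:
        raise ValueError(f"checksum mismatch for {url}: got {actual}")
    cached.parent.mkdir(parents=True, exist_ok=True)
    cached.write_bytes(data)
    return cached


downloads: List[str] = []


def fake_download(url: str) -> bytes:
    """Stand-in for a network fetch so the example runs offline."""
    downloads.append(url)
    return b"archive bytes"


expected = hashlib.sha256(b"archive bytes").hexdigest()
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    first = fetch_archive("https://example.com/dep.tar.gz", expected, cache, fake_download)
    # A second request for the same checksum, even from another URL, hits the cache.
    second = fetch_archive("https://mirror.example.com/dep.tar.gz", expected, cache, fake_download)
    same_file = first == second
```

Note that the sketch returns the archive itself. As described above, each workspace would still have to expand it, and nothing here ever deletes an entry.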
Output tree
The most prominent data Bazel can cache is the result of actions. The most basic form of action caching is Bazel’s ability to not rerun actions that have valid outputs already in the output tree, `bazel-out/`. If I've executed `bazel build //foo`, immediately running `bazel shutdown; bazel build //foo` will rerun no actions because Bazel recognizes that all results are still valid in the output tree. This is sometimes called the “action cache”, especially in Bazel internals. I prefer the term “output tree cache” to avoid ambiguity with other uses of “action cache”.
If Bazel is considering running an action for which an output in the output tree already exists, how does it know whether the output is still a valid result? Bazel maintains an index describing how files in the output tree were generated. Bazel persists this index in `$(bazel info output_base)/action_cache/`. The index uses a custom binary format[4]; its contents may be viewed with `bazel dump --action_cache`. Here’s an example entry from the output of the dump command for an action that generates `libstring-hjar.jar`:
The `actionKey` is a hash of non-file data related to the action, such as its command line arguments and mnemonic. Environment variables from the `--action_env` Bazel flag are hashed into the `usedClientEnvKey` field. The `digestKey` is a hash of the inputs and outputs of the action. When Bazel is considering executing the action to generate `libstring-hjar.jar`, it checks this action cache entry against its current in-memory version of the action. If all the digest fields match, the `libstring-hjar.jar` in the output tree is valid, and the action need not be rerun.
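The comparison described above can be sketched as follows. This is a simplified model under stated assumptions: the `Entry` fields mirror the three keys named in the text, but the hashing scheme, the `make_entry` and `needs_rerun` helpers, the mnemonic, the command line, and the file digests are all invented for the example and do not reflect Bazel's binary format.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


def digest(*parts: str) -> str:
    """Hash a sequence of strings, separating them so boundaries matter."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode())
        h.update(b"\0")
    return h.hexdigest()


@dataclass(frozen=True)
class Entry:
    action_key: str            # non-file data: mnemonic, command line
    used_client_env_key: str   # environment variables visible to the action
    digest_key: str            # input and output file digests


def make_entry(mnemonic: str, argv: List[str], env: Dict[str, str],
               file_digests: Dict[str, str]) -> Entry:
    return Entry(
        action_key=digest(mnemonic, *argv),
        used_client_env_key=digest(*(f"{k}={v}" for k, v in sorted(env.items()))),
        digest_key=digest(*(f"{p}:{d}" for p, d in sorted(file_digests.items()))),
    )


def needs_rerun(index: Dict[str, Entry], output: str, current: Entry) -> bool:
    """Rerun unless the persisted entry matches the current action exactly."""
    return index.get(output) != current


# Hypothetical action producing libstring-hjar.jar.
argv = ["header-compiler", "--output", "libstring-hjar.jar"]
env = {"PATH": "/usr/bin"}
files = {"String.java": "d1", "libstring-hjar.jar": "d2"}

entry = make_entry("ExampleMnemonic", argv, env, files)
index = {"libstring-hjar.jar": entry}   # what a previous build persisted

unchanged = needs_rerun(index, "libstring-hjar.jar", entry)
edited = make_entry("ExampleMnemonic", argv, env, {**files, "String.java": "d3"})
changed = needs_rerun(index, "libstring-hjar.jar", edited)
```

Since the index holds one entry per output, a later build that produces a different result for the same output simply overwrites the earlier entry.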
As the first layer of persistent caching for actions, the output tree cache is an important everyday optimization for Bazel users. However, it doesn’t have a long memory. For example, a sequence of three builds of `//foo`, in which the second invocation differs from the first and the third repeats the first, will build `//foo` three times even though the first and last Bazel invocations are identical. To address that problem as well as the desire to share caches across machines, the remote cache exists.
Remote caches
The remote cache has two primary stores: the action cache and the content addressable store (CAS). The interface to these stores is defined by the bazelbuild/remote-apis protocol buffers. Both stores can be thought of as maps of keys to blobs:
| Cache | Keys | Values |
|---|---|---|
| Action | digest of the action’s metadata and inputs[5] | serialized `ActionResult` protocol buffer messages |
| CAS | digest (typically SHA-256) of value | action output files |
To use the remote cache for an action, Bazel computes the action’s key. This key is a digest of the action’s metadata and inputs.[5] If the action cache does not have a result for an action key, there is a miss in the remote cache. Bazel executes the action, uploads the action’s outputs to CAS, and puts the appropriate `ActionResult` into the action cache. If the action cache does contain a value for the action key, Bazel uses the digests in `ActionResult` to download the action’s outputs from the CAS.
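The hit and miss paths just described can be modeled with two dictionaries. This is a minimal sketch, not the remote-apis protocol: `RemoteCache` and `run_with_cache` are hypothetical names, the action key is passed in as an opaque string, and a JSON document stands in for the serialized `ActionResult` message.

```python
import hashlib
import json
from typing import Callable, Dict


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class RemoteCache:
    """Two key-to-blob maps standing in for the action cache and the CAS."""

    def __init__(self) -> None:
        self.action_cache: Dict[str, bytes] = {}   # action key -> serialized result
        self.cas: Dict[str, bytes] = {}            # content digest -> blob


def run_with_cache(cache: RemoteCache, action_key: str,
                   execute: Callable[[], Dict[str, bytes]]) -> Dict[str, bytes]:
    serialized = cache.action_cache.get(action_key)
    if serialized is None:
        # Miss: run the action, upload outputs to the CAS, then record the result.
        outputs = execute()
        result = {}
        for path, blob in outputs.items():
            blob_digest = sha256(blob)
            cache.cas[blob_digest] = blob
            result[path] = blob_digest
        cache.action_cache[action_key] = json.dumps(result, sort_keys=True).encode()
        return outputs
    # Hit: follow the digests in the stored result to fetch outputs from the CAS.
    result = json.loads(serialized)
    return {path: cache.cas[blob_digest] for path, blob_digest in result.items()}


executions = 0


def compile_foo() -> Dict[str, bytes]:
    """Hypothetical action; counts how often it really runs."""
    global executions
    executions += 1
    return {"foo.o": b"object code"}


cache = RemoteCache()
key = sha256(b"hypothetical action metadata and input digests")
first = run_with_cache(cache, key, compile_foo)    # miss: executes and uploads
second = run_with_cache(cache, key, compile_foo)   # hit: downloads from the CAS
```

Splitting the stores this way means identical output files are stored once in the CAS no matter how many action results refer to them.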
Local disk cache
Counter-intuitively, Bazel can use the “remote” cache infrastructure completely locally. Bazel’s disk cache feature is built on remote cache concepts. The disk cache, a useful feature in itself[6], is also a simple way to explore remote caching without needing separate remote cache software. Using the `--disk_cache` flag while building Bazel itself, we can see the remote cache concepts in action[7]:
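To get a feel for what ends up in such a directory, a small script can tally the entries in each store. The sketch below rests on an assumption about the layout, namely that the directory contains `ac/` and `cas/` subtrees whose files are sharded into subdirectories; the `count_entries` helper and the file names it is run against are made up for the example.

```python
import tempfile
from pathlib import Path
from typing import Dict


def count_entries(disk_cache: Path) -> Dict[str, int]:
    """Count files under the ac/ and cas/ stores, across all shard directories."""
    counts = {}
    for store in ("ac", "cas"):   # assumed store directory names
        root = disk_cache / store
        if root.is_dir():
            counts[store] = sum(1 for p in root.rglob("*") if p.is_file())
        else:
            counts[store] = 0
    return counts


# Build a tiny fake cache so the example runs without Bazel.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for store, name in [("ac", "ab12"), ("cas", "12ef"), ("cas", "ff00")]:
        shard = root / store / name[:2]   # first two characters pick the shard
        shard.mkdir(parents=True, exist_ok=True)
        (shard / name).write_bytes(b"")
    counts = count_entries(root)
```

Pointing `count_entries` at a real `--disk_cache` directory before and after a build is one way to watch both stores grow.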
Remote execution
Remote execution is a simple extension to the concepts of remote caching. To execute an action remotely:
- Bazel stores an action’s input files in the CAS.
- The remote executor server executes the action, stores its results in the CAS, and returns an `ActionResult` protobuf to Bazel.
- Bazel downloads the results from the CAS.
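The three steps above can be sketched with a shared CAS dictionary and a stand-in executor. Everything here is hypothetical scaffolding: the `Executor` class and `build_remotely` function are invented names, the action is modeled as a plain Python callable rather than a command line, and a dictionary of path-to-digest pairs stands in for the `ActionResult` protobuf.

```python
import hashlib
from typing import Callable, Dict

Files = Dict[str, bytes]


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class Executor:
    """Stand-in for a remote executor that shares a CAS with the client."""

    def __init__(self, cas: Dict[str, bytes]) -> None:
        self.cas = cas

    def execute(self, input_digests: Dict[str, str],
                command: Callable[[Files], Files]) -> Dict[str, str]:
        # Fetch inputs from the CAS, run the action, store outputs in the CAS.
        inputs = {path: self.cas[d] for path, d in input_digests.items()}
        outputs = command(inputs)
        result = {}
        for path, blob in outputs.items():
            blob_digest = sha256(blob)
            self.cas[blob_digest] = blob
            result[path] = blob_digest
        return result   # plays the role of the ActionResult


def build_remotely(cas: Dict[str, bytes], executor: Executor,
                   input_files: Files, command: Callable[[Files], Files]) -> Files:
    # Step 1: the client stores the action's input files in the CAS.
    input_digests = {}
    for path, blob in input_files.items():
        blob_digest = sha256(blob)
        cas[blob_digest] = blob
        input_digests[path] = blob_digest
    # Step 2: the executor runs the action and returns output digests.
    action_result = executor.execute(input_digests, command)
    # Step 3: the client downloads the results from the CAS.
    return {path: cas[d] for path, d in action_result.items()}


def shout(inputs: Files) -> Files:
    """Hypothetical action: uppercase the single input file."""
    return {"out.txt": inputs["in.txt"].upper()}


cas: Dict[str, bytes] = {}
outputs = build_remotely(cas, Executor(cas), {"in.txt": b"hello"}, shout)
```

Only digests cross between client and executor in this model; file contents always travel through the CAS, which is what makes execution a small step beyond caching.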
Conclusion
We’ve explored many of Bazel’s caches to reach a better understanding of how Bazel reduces build times while preserving correctness.
1. See our 2023 BazelCon talk for more about Skyframe.
2. It is safe to remove the entire cache manually from the filesystem with `rm -rf $(bazel info repository_cache)`.
3. There’s a design document to improve this situation.
4. This makes this index essentially a simple, custom database, which is sometimes a frustrating engineering choice when bugs appear.
5. So, the remote cache action key is similar to, but different from, the `actionKey` in the output tree cache.
6. Unfortunately, like many Bazel caches described in this post, the disk cache’s growth is currently unbounded. Work is underway to add garbage collection.
7. The CAS and AC are stored split across many subdirectories. This is an old hack to work around filesystem scalability problems with large directories.