Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names and Runfiles¶
The first post in our Bzlmod migration series explained many of the problems that may arise when migrating your project. These next three posts will explore various solutions to problems arising from changes in how Bazel handles repository names under Bzlmod, beginning with runfiles paths. After applying the techniques in this post, your project should be well insulated from runfiles-path-related breakages, now and well into the future.
All posts in the "Migrating to Bazel Modules" series
- Migrating to Bazel Modules (a.k.a. Bzlmod) - The Easy Parts
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names and Runfiles
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names and rules_pkg
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names, Macros, and Variables
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Module Extensions
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Fixing and Patching Breakages
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names, Again…
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Toolchainization
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Maintaining Compatibility, Part 1
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Maintaining Compatibility, Part 2
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Maintaining Compatibility, Part 3
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Maintaining Compatibility, Part 4
Prerequisites¶
The following advice assumes that you're familiar with the runfiles concept, as well as the concepts presented in the following Bazel documents:
- External Dependencies Overview
- Canonical repository name
- Apparent repository name
- Bazel Modules
- Output Directory Layout
It also assumes that you've correctly declared all dependency targets in the
deps and/or data attributes of your BUILD rules.
To review many key differences between Bzlmod and the legacy WORKSPACE model,
see the comparison table from the "Module Extensions"
post.
Translating file paths using runfiles libraries¶
If you enable Bzlmod and your runfiles paths are immediately broken, you need to start using runfiles libraries.
Runfiles libraries help you translate runfile paths corresponding to Bazel
labels to actual file system paths, so your programs can find these files at
runtime. For example, from the rlocationpaths example
below, in which the main repository's module name from MODULE.bazel is
engflow:
- Bazel label: //data:0-foo.txt
- Runfile path: engflow/data/0-foo.txt
- Runfiles link: /home/mbland/.cache/bazel/_bazel_mbland/1234567890abcdef/execroot/_main/bazel-out/k8-fastbuild/bin/runfiles_demo.runfiles/_main/data/0-foo.txt
- Actual path: /home/mbland/src/EngFlow/example/runfiles/engflow/data/0-foo.txt
(I'll explain the patterns behind the runfile path and actual path below.)
Runfiles libraries are distributed as source code for different languages, enabling tests or other programs to locate their runfiles programmatically during execution. They originally came about to make Bazel runfiles portable to Windows, and were developed by my colleague László Csomor prior to the existence of EngFlow. Many of them now map runfiles paths to actual file system paths under Bzlmod (via the repo mapping mechanism), making them essential to using Bzlmod.
In this way, runfiles libraries prevent your code from breaking whenever there's a change in Bazel's internal repository name representation. From Bazel modules: Repository names and strict deps (emphasis theirs):
...the canonical name format is not an API you should depend on and is subject to change at any time. Instead of hard-coding the canonical name, use a supported way to get it directly from Bazel...
The canonical name format just changed again.
The change to no longer encode module versions in canonical repo names in Bazel
7.1.0 is a recent example of Bazel maintainers altering the format. The Bazel
maintainers just changed the canonical repo name format again to fix build
performance issues on Windows caused by the ~ characters. This will land in
Bazel 7.3.0 under the ‑‑incompatible_use_plus_in_repo_names flag, which implies
other canonical name format changes as well. See also: bazelbuild/bazel:
‑‑incompatible_use_plus_in_repo_names #23127.
Runfiles libraries for different languages¶
Here's where you can find the runfiles libraries for a few common languages. The links and notes below describe how to depend on and initialize these libraries; sections below describe how to use them.
Note that the links here are to files within the latest versions at the time of writing. Make sure that you're viewing the versions matching your project's actual dependencies.
C++¶
- Target: @bazel_tools//tools/cpp/runfiles
- Documentation: runfiles_src.h header comment
Requires initialization using the BAZEL_CURRENT_REPOSITORY macro symbol.
Java¶
- Target: @bazel_tools//tools/java/runfiles
- Documentation: Runfiles.java class comment
Requires using the @AutoBazelRepository annotation to generate a class
constant used during initialization.
Bash¶
- Target: @bazel_tools//tools/bash/runfiles
- Documentation: runfiles.bash header comment
Requires copying a preamble from the header comment to enable discovery at runtime.
Python¶
- Target: @rules_python//python/runfiles
- Documentation: rules_python/python/runfiles/README.md; the runfiles.py source file
Do not use the @bazel_tools Python runfiles library, as it is out of
date and does not support Bzlmod.
Go¶
- Target: @rules_go//go/runfiles:go_default_library
- Documentation: runfiles.go (the whole file)
Rust¶
- Target: @rules_rust//tools/runfiles
- Documentation: runfiles.rs header comment
Using predefined source/output path variables to pass paths to rlocation()¶
All runfiles libraries, once properly initialized, provide a standard
rlocation() or Rlocation() function. For example, here is the Rlocation()
function signature and docstring from rules_python:
The intended usage is to use the rlocationpath and rlocationpaths
predefined source/output path variables in your BUILD targets to
generate paths to pass to rlocation():
rlocationpath: The path a built binary can pass to the Rlocation function of a runfiles library to find a dependency at runtime.... ...[the result] always starts with the name of the repository....
The rlocationpath of a file in an external repository repo will start with repo/, followed by the repository-relative path.
Passing this path to a binary and resolving it to a file system path using the runfiles libraries is the preferred approach to find dependencies at runtime.
Pass the rlocationpath results to your program as command line arguments or
environment variables by:
- Specifying them in the args or env attribute of test or binary targets
- Including them in the cmd attribute of a genrule
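On the receiving side, the program only needs to gather those strings before handing each one to rlocation(). Here is a minimal sketch of that step, assuming the paths arrive as whitespace-separated lists. The function name and the DEMO_RUNFILE_PATHS variable are hypothetical placeholders, not taken from the example project:

```python
def collect_runfile_paths(argv, environ, env_var="DEMO_RUNFILE_PATHS"):
    """Gather rlocationpath values from command line args and one env var.

    env_var is a hypothetical name; use whatever key your BUILD target sets
    in its env attribute.
    """
    # Each command line argument is assumed to already be a single path.
    paths = list(argv)
    # An env value built from $(rlocationpaths ...) holds several paths in one
    # string, so split it on whitespace.
    paths.extend(environ.get(env_var, "").split())
    return paths
```

A program would typically call this as collect_runfile_paths(sys.argv[1:], os.environ) and then pass each result to its runfiles library.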
Generating compiled modules
It's also possible to generate text files including these paths, or source
files for different languages defining constants from these rlocationpath
values. I've done it—but ultimately found it to be unnecessary. See
the Passing known file path constants to
rlocation() section
below.
rlocationpaths example¶
To illustrate, we'll use a small example project containing a py_binary that
emits information about its runfiles, which are provided by rlocationpaths.
If you'd like to follow along with the example on your own machine, clone the EngFlow/example repo by running:
| Clone the EngFlow/example repo | |
|---|---|
Let's examine some of the files from this repo before running the demonstration program.
First, we define our engflow module in engflow/MODULE.bazel. It depends on
the frobozz module from the same git repository, using local_path_override to simulate an
external repository (lines 4 and 5).
Now let's look at the example program itself. Notice that it:
- Instantiates the _RUNFILES object using the rules_python runfiles library (lines 23 and 26)
- Prints information about runfiles related environment variables, the runfiles directory, and the current working directory (lines 50-62)
- Iterates over runfiles paths from both command line arguments and the RUNFILE_PATHS environment variable (lines 64-72)
- Prints information about individual runfiles, their runfiles directory links, and their actual file system paths (lines 36-43)
The engflow/BUILD file defines a py_binary for our demo program, with env
and args attributes that will be applied during bazel run. The runfiles
targets are specified in its data attribute.
It also contains a genrule that uses this binary and passes runfile paths as
environment variables and command line arguments in its cmd attribute. For
genrules, the runfile targets are specified in the srcs attribute.
Runfiles input attributes may vary.
Most rules have a data attribute to specify runfiles, but genrule
doesn't; it uses srcs instead. Other rules may or may not also populate
the runfiles directory with srcs. Check the documentation for the rule in
question to learn what's included in its runfiles. When in doubt, you can
modify this example project, or write your own from scratch, to get insight
into how specific rules manage their runfiles.
Let's build the package and see the resulting outputs.
We can see the generated runfiles directory (runfiles_demo.runfiles), as well
as the .repo_mapping (for Bzlmod) and .runfiles_manifest support files. The
runfiles libraries use these artifacts to translate relative runfile paths to
their actual paths within the execution environment.
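To give a feel for what the manifest contributes, here is a rough sketch of a manifest lookup. It assumes the simple line format of a runfile path, a space, and the actual path. It is not the code of any particular runfiles library, and the sample paths in the usage note are made up:

```python
def parse_manifest(text):
    """Map each runfile path in a manifest to its actual path.

    Assumes "<runfile path> <actual path>" lines. Any escaping that Bazel
    applies to paths containing spaces is ignored in this sketch.
    """
    entries = {}
    for line in text.splitlines():
        if not line:
            continue
        # Split on the first space only; the rest is the actual path.
        runfile_path, _, actual_path = line.partition(" ")
        entries[runfile_path] = actual_path
    return entries
```

For instance, parsing the hypothetical line "_main/data/0-foo.txt /src/engflow/data/0-foo.txt" yields a dictionary with one entry keyed by the runfile path.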
No runfiles directory by default on Windows
If you're running this program on Windows, you likely won't see the
runfiles_demo.runfiles directory. See the Enabling runfiles directories
on Windows section below for
details on how to enable runfiles directory generation.
Let's run the //:runfiles_demo Python binary. It receives one space-separated
list of rlocationpaths values from its command line args, and another passed
in via the RUNFILES_PATHS environment variable.
In the example output below, I've elided some output as follows to collapse details specific to my local build:
- OUTPUT_BASE represents the result of bazel info output_base, e.g., /home/mbland/.cache/bazel/_bazel_mbland/1234567890abcdef.
- ARCH represents the build architecture output path component, e.g., k8-fastbuild.
- RUNFILES_DIR in the runfiles link: paths is the value of runfiles dir: at the top of the output.
- EXAMPLE_DIR is where I've cloned the example repository, plus the runfiles directory containing the example, e.g., /home/mbland/src/EngFlow/example/runfiles.
This should make the output easier to understand, and help you see the same patterns in the output when running the example locally.
There are a few interesting things to notice here:
- The paths generated by rlocationpaths already include the canonical name of the frobozz external repository, which is frobozz~. (For now, that is; it will appear as frobozz+ in the near future.)
- The actual locations for runfiles in external repositories are direct paths into the external repository storage directory under OUTPUT_BASE.
- Runfile paths for files in our own repository begin with _main. This is because we're building our engflow module as our main repository.
- bazel run runs the runfiles_demo binary inside the _main subdirectory of its runfiles dir.
- RUNFILES_MANIFEST_FILE is defined instead of RUNFILES_DIR, so the runfiles library is using the manifest file to translate paths.
- The actual runfiles dir comes from stripping _manifest from RUNFILES_MANIFEST_FILE, and we can see that corresponding runfiles links do exist for each file.
Now let's run //:runfiles_demo as part of the genrule target and inspect the output. SANDBOX stands in for the sandbox path components.
Notice that:
- RUNFILES_DIR was defined in the environment; RUNFILES_MANIFEST_FILE was not.
- The rule did not run the //:runfiles_demo binary in the RUNFILES_DIR.
- Since the //:runfiles_demo binary executed during bazel build instead of bazel run, both rlocationpath outputs resolved to a sandbox path, not the typical output path.
- The args and env attributes of //:runfiles_demo weren't used, since the genrule executed the //:runfiles_demo binary directly, not via bazel run.
Passing known file path constants to rlocation()¶
While using rlocationpath is the preferred way to pass runfiles paths
through to the runfiles libraries, it's not strictly necessary. It can also
prove inconvenient in some cases, such as:
- Writing tests that refer to one or more input data files
- Providing a library that executes a binary on behalf of a larger program
In such cases, hardcoding paths to pass as arguments to rlocation() is easier
than plumbing through rlocationpath values. This is acceptable if the paths
aren't going to change beyond your control. Remember:
- For runfiles in external repositories, use the repository's apparent name as the first path component. The rest of the path should be relative to that repository's root.
- For runfiles within the same repository, use your repository's module name from MODULE.bazel as the first path component. The rest of the path should be relative to your repository's root.
These differ from paths generated with rlocationpath, which begin with the
canonical repository name or _main, respectively. The rlocation() runfiles
library function will then translate these path constants into actual file paths
at runtime, using the repo mapping mechanism.
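The translation step can be pictured with a small sketch. It assumes a repo mapping made of comma-separated lines (source repository canonical name, apparent name, target runfiles directory name), with an empty source name standing for the main repository. The sample mapping is hypothetical, modeled on the module names used in this post, and real runfiles libraries handle more cases than this:

```python
def parse_repo_mapping(text):
    """Parse repo mapping lines into {(source_repo, apparent_name): target}.

    Assumes each line reads "<source canonical>,<apparent>,<target>".
    """
    mapping = {}
    for line in text.splitlines():
        if not line:
            continue
        source, apparent, target = line.split(",")
        mapping[(source, apparent)] = target
    return mapping


def to_canonical_path(mapping, path, source_repo=""):
    """Swap a leading apparent repo name for its runfiles directory name."""
    first, sep, rest = path.partition("/")
    target = mapping.get((source_repo, first))
    if target is None:
        # Not an apparent name visible from source_repo; assume the path is
        # already in canonical form and leave it untouched.
        return path
    return target + sep + rest


# Hypothetical mapping, modeled on the module names used in this post.
SAMPLE_MAPPING = ",engflow,_main\n,frobozz,frobozz~\n"
```

With this sample mapping, a constant such as engflow/data/0-foo.txt becomes _main/data/0-foo.txt, while a path that already begins with _main passes through unchanged.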
Watch out for Windows!
If a runfile is an executable, you may need extra logic to add the .exe
extension on Windows. This is unnecessary when using rlocationpath, since
it will always generate the correct executable path.
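The extra logic can be as small as the following sketch. The helper name is hypothetical, and it assumes the only difference on Windows is a missing .exe suffix:

```python
import sys


def executable_runfile_path(path, platform=sys.platform):
    """Append .exe to a hardcoded runfile path when running on Windows."""
    # sys.platform is "win32" on Windows, including 64-bit builds.
    if platform.startswith("win") and not path.lower().endswith(".exe"):
        return path + ".exe"
    return path
```

The platform parameter exists only to make the helper easy to exercise on any machine; normal callers would omit it.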
Predefined path constants repo mapping example¶
To see the repo mapping mechanism in action, we'll use the example program to
simulate passing a path constant to rlocation(). This time, we'll run the
runfiles_demo binary directly from bazel-bin to avoid applying its args
and env attributes.
Notice:
- The runfiles link: constructed manually for each path does not exist. This is because the first path component of each path on the command line is an apparent repository name. The actual runfiles links contain a path component for the corresponding canonical repository names. rlocationpath values already contain translated repository names, which do produce valid runfiles links, as we saw in the output from bazel run //:runfiles_demo.
- The runfiles library translated these runfile paths to the same actual locations as the bazel run //:runfiles_demo example.
- Even though runfiles_demo isn't running in a sandbox, and symlinks exist under bazel-bin/runfiles_demo.runfiles, the runfiles library still returns the actual absolute path.
We can inspect the runfiles links by interpolating the correct path segments for each repository, and see that they point to the same files:
Constructing runfile paths without a runfiles library¶
If there isn't a runfiles library available for your language, or it isn't yet Bzlmod compatible, you can still construct runfiles paths manually.
However, this requires using rlocationpath to define runfile paths, then
passing them to the program as command line arguments or environment variables.
With Bzlmod enabled, that's the only reliable way to get the correct path
beginning with the canonical repo name or _main without a runfiles library.
Well, that's not the only way to pass canonical repo names...
Technically, you could write a genrule to emit rlocationpath output into
a text file that a program could read at runtime. Or you could use a
genrule, or write your own custom rule, to emit a source file defining
rlocationpath constants to compile into a target. Or you could write a
macro to invoke
Label.repo_name on
a target label and inject that. (I actually tried all of these before
realizing I only needed to use the @rules_python//python/runfiles library
instead of @bazel_tools//tools/python/runfiles.) You could do
these things, but it's likely more work than passing an argument or an
environment variable. Or you could be super cool and update the runfiles
library for your language, or contribute the first implementation if one
doesn't already exist.
Finding the runfiles directory¶
If RUNFILES_DIR is defined, that will be the location of your runfiles
directory. If it isn't, and ‑‑enable_runfiles is set to true
on your platform, stripping _manifest from the end of RUNFILES_MANIFEST_FILE
will produce the runfiles directory path. This is how the
runfiles_demo.py script determines the runfiles
directory when RUNFILES_DIR is undefined.
Alternatively, assuming sys.argv[0] is the full path to your program,
f'{sys.argv[0]}.runfiles' will be your runfiles directory if it exists.
(After translating this Pythonic syntax to the language of your choice, of
course. For example, in Bash, it would be $0.runfiles.)
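Put together, the three checks might look like the following sketch. This is an illustration of the steps described above, not the code from runfiles_demo.py, and the function name is hypothetical. It treats the existence of the derived directory as a stand-in for whether runfiles directories are enabled:

```python
import os


def find_runfiles_dir(environ, argv0):
    """Return the runfiles directory path, or None if it can't be found."""
    # 1. Prefer RUNFILES_DIR when it's defined.
    runfiles_dir = environ.get("RUNFILES_DIR")
    if runfiles_dir:
        return runfiles_dir

    # 2. Otherwise strip "_manifest" from RUNFILES_MANIFEST_FILE, accepting
    #    the result only if that directory actually exists.
    manifest = environ.get("RUNFILES_MANIFEST_FILE", "")
    if manifest.endswith("_manifest"):
        candidate = manifest[: -len("_manifest")]
        if os.path.isdir(candidate):
            return candidate

    # 3. Finally, try "<program path>.runfiles".
    candidate = argv0 + ".runfiles"
    if os.path.isdir(candidate):
        return candidate
    return None
```

A program would call it as find_runfiles_dir(os.environ, sys.argv[0]) and handle the None case itself.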
The Bash runfiles library bootstraps itself.
Fun fact: The runfiles.bash init code implements a minimal rlocation()
lookup for the runfiles.bash file itself, that demos all of these cases
in five lines of Bash. (Hat tip to László for pointing this
out.)
Enabling runfiles directories on Windows¶
Runfiles directories are disabled by default on Windows. This is because Bazel creates symlinks to actual files within the runfiles directory. Before Windows 10 Insiders build 14972, creating symlinks required using a console elevated to administrator mode. As mentioned above, making runfiles compatible with Windows in light of this restriction is what motivated the initial development of runfiles libraries.
However, you can now enable Developer Mode on later versions of Windows 10 and on Windows 11. This will allow symlink creation without admin privileges. Then explicitly set ‑‑enable_runfiles to enable Bazel to create runfiles directories.
Building the runfile path¶
Once you have the runfiles directory, join the result of rlocationpath to the
end of it to locate a specific runfile. As with using a runfiles library, it's
still incumbent upon your code to check the resulting runfile path for existence
before using it.
See the runfiles_demo.py program above, which manually constructs runfiles links alongside using a runfiles library, for an example of how to do this.
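In isolation, the join-and-check step could be sketched as follows. The helper name is hypothetical, and it assumes rlocationpath values use forward slashes as separators:

```python
import os


def runfile_link(runfiles_dir, rlocationpath):
    """Join an rlocationpath onto the runfiles dir; None if nothing is there."""
    # Split on "/" so the join uses the separator of the current platform.
    candidate = os.path.join(runfiles_dir, *rlocationpath.split("/"))
    return candidate if os.path.exists(candidate) else None
```

Returning None for a missing file keeps the existence check in one place, so callers can't accidentally skip it.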
Starting child processes that need runfiles¶
Check the documentation for your runfiles library for advice on starting child processes that also use runfiles.
Pass a manually located runfiles directory as RUNFILES_DIR
If you aren't using a runfiles library, and located the runfiles directory
manually, then add its path to the child process's environment as
RUNFILES_DIR.
In most cases, the library will provide a function to access runfiles related
environment variables (e.g., EnvVars(), getEnvVars() or Env()). Add these
variables to the child process's environment, as they may not have been defined
in the parent process's environment.
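The merging step itself needs only the standard library. In this sketch, runfiles_env stands in for whatever dictionary your runfiles library returns, or for a manually built {"RUNFILES_DIR": ...} entry. The function name is hypothetical:

```python
import os
import subprocess
import sys


def run_child_with_runfiles_env(cmd, runfiles_env):
    """Run cmd with runfiles variables merged into a copy of our environment."""
    # Copy first so the parent process's own environment stays unchanged.
    env = dict(os.environ)
    env.update(runfiles_env)
    return subprocess.run(
        cmd, env=env, capture_output=True, text=True, check=True
    )
```

For example, passing {"RUNFILES_DIR": "/tmp/demo.runfiles"} (a made-up path) makes that variable visible to the child even if the parent never had it defined.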
Adapting the example from rules_python/python/runfiles/README.md, launching a subprocess in Python would look something like this:
| Passing runfiles env vars to a child process in Python | |
|---|---|
Conclusion¶
At this point, all of your targets depending on runfiles should build and run successfully under Bzlmod. Future changes to the canonical repo name format shouldn't break your targets. They should remain portable if built as an external dependency of another repo.
In the next post, we'll learn how to use
rules_pkg properly to avoid the
need for file path macros when building archives under Bzlmod. Following that,
we'll learn how to inject the path to an external repo into our BUILD rules
when we really have to.
Postscript¶
If you want to learn how to write rules that make use of runfiles, see Jay Conrod's post Writing Bazel rules: data and runfiles. (You can also see that I ripped off his style of listing the links to every post in this series.)
Updates¶
2025-10-09¶
- Put the list of all posts in the series into the collapsible All posts in the "Migrating to Bazel modules" series info block.
- Added a suggestion to review the Module Extensions comparison table to the Prerequisites section.