Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names, Macros, and Variables

The previous two posts in this series showed how to use runfiles mechanisms and rules_pkg mechanisms to avoid dealing with canonical repository names under Bzlmod. However, one special case remains: when you need to depend on the name of a repository directory, either at build time or runtime. This post explains how to access canonical repository names in a portable way to solve such problems. We'll use a macro when we can, and a custom Make Variable when we can't, including when dealing with alias targets.

This article is part of the series "Migrating to Bazel Modules (a.k.a. Bzlmod)":

Prerequisites

As usual with this series, it's important to be acquainted with the following concepts pertaining to external repositories:

If you wish to follow along with the examples, clone the EngFlow/example repository and change to the project directory like so:

Clone EngFlow/example and change to the example directory
git clone https://github.com/EngFlow/example
cd example/bzlmod/canonical-repo-name-injection

Examples of specific problems solved by macros or custom Make variables

The techniques covered throughout this post come from solutions to two specific problems in our codebase.

Custom JavaScript module loader for rules_nodejs's @npm//:node_modules

The first post in this series mentioned that we're migrating our JavaScript rules from the deprecated rules_nodejs to aspect_rules_js. We also use esbuild to bundle our artifacts, and use a custom plugin to resolve our JavaScript module imports.

We'll talk about this later.

I'll cover how we got rules_nodejs to work with Bzlmod in a later blog post. It turns out that it can exist in the same project as aspect_rules_js, which is good news as far as avoiding a big bang migration. There are other tricky aspects to completing the migration, but thankfully, that's not one of them.

The problem is that this module loader plugin needs to scan the node_modules directory in the @npm repo exported by rules_nodejs (i.e., @npm//:node_modules). This plugin runs during the build, and there's no env or other attribute we can use to pass the correct repository path through. So we need to inject this path, including the canonical repo name for @npm, into the plugin's source.

The //:genrule-targets target described in this post models our solution to this problem. We're using a genrule and a custom Make variable to generate a JavaScript module containing an exported constant with the correct path.

This problem doesn't exist in aspect_rules_js.

aspect_rules_js creates a node_modules directory under bazel-bin, avoiding this specific problem. But this approach is still good to know in case you're facing a similar problem with another external repository.

Updating an environment variable in a cmake build target

We have a vendored dependency that normally builds with cmake, which we build using the cmake rule from rules_foreign_cc (currently version 0.10.1). This target also depends on GNU Bison, which we use via rules_bison.

Bison requires access to some data files at build time, at a location defined by the BISON_PKGDATADIR environment variable. rules_bison normally manages this variable, but relying on the default produces the following error for this particular target:

bison error before setting BISON_PKGDATADIR properly
bison: external/rules_bison~~bison_repository_ext~bison_v3.3.2__cfg6F051E8B/
data/m4sugar/m4sugar.m4: cannot open: No such file or directory

The actual BISON_PKGDATADIR files reside within the runfiles directory for the @bison_v3.3.2//bin:bison target created by rules_bison. However, this target, which users depend upon, is an alias to a target in another repository generated by rules_bison that users should not depend upon. The runfiles actually reside in that generated repository, as seen in the error output above, not the @bison_v3.3.2 repository.

To solve this problem, we use a custom Make variable generated by the custom repo_name_variable Rule (described below) to create the correct BISON_PKGDATADIR path.

cmake rule injecting $(bison-repo) into env var
load("@rules_foreign_cc//foreign_cc:defs.bzl", "cmake")
load("//:repo-names.bzl", "repo_name_variable")

# ...snip...

cmake(
    # ...snip...
    env = {
        "BISON_PKGDATADIR": "$$EXT_BUILD_ROOT/" +
            "$(execpath @bison_v3.3.2//bin:bison).runfiles/" +
            "$(bison-repo)/data",
    },
    # ...snip...
    toolchains = [ ":bison-repo" ],
)

repo_name_variable(
    name = "bison-repo",
    dep = "@bison_v3.3.2//bin:bison",
)

Look what I found after sending this post for review...

You can see that we prefix BISON_PKGDATADIR with the (not well documented) EXT_BUILD_ROOT variable from rules_foreign_cc. While waiting for final approval for this post, I idly searched around in the rules_foreign_cc code some more for EXT_BUILD_ROOT references. I happened to find the private _expand_locations_in_string() function, used to replace "$(execpath " in env values with "$$EXT_BUILD_ROOT$$/$(execpath ". Removing $$EXT_BUILD_ROOT/ from the env values of our actual target did work. So, if you see EXT_BUILD_ROOT in your cmake() or other rules_foreign_cc rules, you may be able to remove it. Try it and see.

Do not hardcode canonical repo names

In many cases, it may seem easy and expedient to hardcode the canonical repo name. But as always, we must remember this warning from Bazel modules: Repository names and strict deps (emphasis theirs):

Note that the canonical name format is not an API you should depend on and is subject to change at any time. Instead of hard-coding the canonical name, use a supported way to get it directly from Bazel...

It's also worth remembering that the canonical name format has changed recently, and it will change again soon. We'll also see in this post examples of more complex canonical names that are even trickier to hardcode.

Look closely at the quoted documentation...

The Bazel documentation quoted above actually already uses the new, upcoming canonical repo name format, with + replacing the use of ~. The new format isn't yet the default, which is why the examples in this post still use ~. This inconsistency between the current default and the current documentation underscores the very point made here to not hardcode canonical repo names.

Runfiles libraries do not resolve directory paths

The macro and custom Make variable methods are necessary because, unfortunately, we can't use runfiles libraries to help us resolve external repository directories. We'll use our example program from the Repo Names and Runfiles post to illustrate this. (EXAMPLE_DIR in the output below refers to the cloned location of the runfiles directory.)

Runfiles library resolves a file, not its directory
$ pushd ../../runfiles/engflow
$ bazel build //:runfiles_demo
$ bazel-bin/runfiles_demo frobozz/1-gue.txt frobozz

RUNFILES_MANIFEST_FILE: EXAMPLE_DIR/engflow/bazel-bin/runfiles_demo.runfiles_manifest
runfiles dir:           EXAMPLE_DIR/engflow/bazel-bin/runfiles_demo.runfiles
current working dir:    EXAMPLE_DIR/engflow

From the command line arguments:
  runfile path:  frobozz/1-gue.txt
  runfiles link: RUNFILES_DIR/frobozz/1-gue.txt
  link exists:   False
  actual path:   OUTPUT_BASE/external/frobozz~/1-gue.txt
  exists:        True

  runfile path:  frobozz
  runfiles link: RUNFILES_DIR/frobozz
  link exists:   False
  actual path:   None
  exists:        False

# Don't forget to change back to the original example directory!
$ popd

Even though the runfiles library successfully found the actual file, it could not find its parent directory specifically.

You might be able to hack a runfile path to get a repo name...

Of course, the runfiles paths returned from runfiles libraries follow a pattern, as do the paths returned from rlocationpath. It would be possible to parse the canonical repo name from these paths in a somewhat portable way. However, it's not really worth the effort. The other approaches covered in this post are easier to implement and apply, while being more future proof as well.
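
For the curious, here is a minimal JavaScript sketch of that hack. It assumes an rlocationpath-style value whose first segment is the canonical repo name, such as the aspect_rules_js~~pnpm~pnpm/pnpm_/pnpm value that appears later in this post. The repoNameFromRlocationPath name is hypothetical, and the first-segment layout is an observed pattern rather than a guaranteed API.

```javascript
// Hypothetical helper, not part of any runfiles library.
// Assumes the path looks like "<canonical repo name>/<path inside the repo>".
function repoNameFromRlocationPath(rlocationPath) {
  const firstSlash = rlocationPath.indexOf('/');
  // A path without a slash is treated as a bare repo name.
  return firstSlash === -1 ? rlocationPath : rlocationPath.slice(0, firstSlash);
}

// prints "aspect_rules_js~~pnpm~pnpm"
console.log(repoNameFromRlocationPath('aspect_rules_js~~pnpm~pnpm/pnpm_/pnpm'));
```

Because this depends on the shape of the path string, it would need revisiting whenever that shape changes, which is one more reason to prefer the macro and Make variable approaches.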

JavaScript currently has no compatible runfiles libraries

In my Repo Names and Runfiles post, I listed runfiles libraries for several common programming languages. JavaScript was not one of them.

If you're using Bazel to build JavaScript, it seems doubtful you're using runfiles libraries anyway. The lack of a Bzlmod-compatible library seems to indicate a lack of demand. That means hacking a path returned by a Bzlmod-aware runfiles library isn't an option to begin with.

Accessing canonical repo names via macros or custom Make variables

We can use either macros or custom Make variables (generated via custom Rules) to access canonical repo names in a portable way. We'll see the differences between the two approaches, and when you must use a custom rule instead of a macro.

I'll use the bzlmod/canonical-repo-name-injection project in the EngFlow/example repo to demonstrate and compare these approaches. The example code in that project is directly inspired by the code we use to solve the specific problems described near the beginning of this post.

Copy the repo-names.bzl file (or parts of it) into your own code base.

The repo-names.bzl file within the project directory contains all of the macros and rules described in this post. If you find these macros and rules useful, you're welcome to copy this file, in whole or in part, into your own code base. Either way, just make sure to preserve the copyright notice at the top of the file.

Using macros

For many (most?) external repository issues, these very straightforward Bazel macros will do the trick.

Macros from //:repo-names.bzl
# These macros execute during the loading phase. For `alias` targets, they
# return the name of the repo containing the `alias`, not the repo of the
# target from its `actual` attribute.

def canonical_repo(target_label):
    """Return the canonical repository name corresponding to target_label."""
    return Label(target_label).repo_name

def workspace_root(target_label):
    """Return a target's repo workspace path relative to the execroot."""
    return Label(target_label).workspace_root

To see these macros in action, run:

Executing //:repo-macros and examining the macro output
$ bazel build //:repo-macros && cat bazel-bin/repo-macros.txt

[ ...snip... ]
@pnpm
  canonical_name: aspect_rules_js~~pnpm~pnpm
  workspace_root: external/aspect_rules_js~~pnpm~pnpm

This executes the following genrule that converts the value of the repo_target variable using the above macros:

Using the canonical_repo() and workspace_root() macros
load("//:repo-names.bzl", "canonical_repo", "workspace_root")

# This value propagates through the rules below.
# Try it with other targets of your choosing!
repo_target = "@pnpm"

# Demonstrates the macro-based approach.
genrule(
    name = "repo-macros",
    outs = ["repo-macros.txt"],
    cmd = "printf '%s\n  canonical_name: %s\n  workspace_root: %s\n' >$@" % (
      repo_target, canonical_repo(repo_target), workspace_root(repo_target)
    )
)

Any repo name from MODULE.bazel generated by one of the following functions is fair game:

Macros require only a repo name

These macros only need a valid apparent repository name from MODULE.bazel, not an existing BUILD target. This is different from the custom Make variable approach below, which requires an existing target, since Bazel will resolve it during the analysis phase.

Here's the Frobozz Magic Remote Caching and Execution Platform Company's MODULE.bazel file.

bzlmod/canonical-repo-name-injection/MODULE.bazel
"""Example module for canonical repo name injection"""

# Main module
module(name = "frobozz", version = "1.2.3")

# aspect_rules_js
bazel_dep(name = "aspect_rules_js", version = "2.0.1", repo_name = "rules_js")

pnpm = use_extension("@rules_js//npm:extensions.bzl", "pnpm")
use_repo(pnpm, "pnpm")

# rules_bison
bazel_dep(name = "rules_bison", version = "0.2.2")

bison = use_extension(
    "@rules_bison//bison/extensions:bison_repository_ext.bzl",
    "bison_repository_ext",
)
bison.repository(version = "3.3.2")
use_repo(bison, "bison_v3.3.2")

Here are the results of running the same command for other values of repo_target using the other repos from MODULE.bazel:

//:repo-macros output for different repo targets
@frobozz
  canonical_name:
  workspace_root:

@rules_js
  canonical_name: aspect_rules_js~
  workspace_root: external/aspect_rules_js~

@rules_bison
  canonical_name: rules_bison~
  workspace_root: external/rules_bison~

@bison_v3.3.2//bin:bison
  canonical_name: rules_bison~~bison_repository_ext~bison_v3.3.2
  workspace_root: external/rules_bison~~bison_repository_ext~bison_v3.3.2

Note that:

  • The main repository, @frobozz, returns the empty string in both cases.
  • The assigned repo name @rules_js resolved to the canonical repo name of the underlying @aspect_rules_js repo.
  • For external repositories, the workspace_root is the canonical_name prefixed with external/. However, it's probably best to use workspace_root where possible, as it seems more future proof than relying upon the external/ path prefix.
  • @bison_v3.3.2//bin:bison is an alias to a target in a private, generated repo. The macros produce values from the alias, not from the target to which it points, since the alias isn't resolved during the loading phase.

Macros do not evaluate underlying alias targets

The macro method may be perfect for your use case. There is one case in which it breaks down, however: when a target is an alias to a target in another repository. This is the case with the @bison_v3.3.2//bin:bison target in our example.

In this case, there's no way a Label constructed in a macro during the loading phase can know the specified target is an alias. This is because an alias is a Rule, and rules aren't executed until the analysis phase. You need access to the actual Target provided by the alias rule during analysis. This means you need to write a custom Rule that depends on the alias target.

From the documentation for attributes of type attr.label:

At analysis time (within the rule's implementation function), when retrieving the attribute value from ctx.attr, labels are replaced by the corresponding Targets. This allows you to access the providers of the current target's dependencies.

Using custom Make variables

Unlike macros, custom Make variables are generated by custom Rules during the analysis phase. They'll work in genrules, and in any rule attribute marked as "Subject to 'Make variable' substitution".

The following rules define custom Make variables corresponding to the canonical_repo() and workspace_root() macros.

Custom Make variable rules from //:repo-names.bzl
# These rules will produce the repo name of the target of an `alias` rule,
# defined by its `actual` attribute.

def _variable_info(ctx, value):
    """Return a TemplateVariableInfo provider for Make variable rules."""
    return [platform_common.TemplateVariableInfo({ctx.attr.name: value})]

repo_name_variable = rule(
    implementation = lambda ctx: _variable_info(
        ctx, ctx.attr.dep.label.repo_name
    ),
    doc = "Defines a custom variable for its dependency's repository name",
    attrs = {
        "dep": attr.label(
            mandatory = True,
            doc = "target for which to extract the repository name",
        ),
    },
)

repo_dir_variable = rule(
    implementation = lambda ctx: _variable_info(
        ctx, ctx.attr.dep.label.workspace_root
    ),
    doc = "Defines a custom variable for its dependency's repository dir",
    attrs = {
        "dep": attr.label(
            mandatory = True,
            doc = "target for which to extract the repository dir",
        ),
    },
)

To see these variables in action, run:

Executing //:repo-vars and examining the custom variables output
$ bazel build //:repo-vars && cat bazel-bin/repo-vars.txt

[ ...snip... ]
@pnpm
  repo-name: aspect_rules_js~~pnpm~pnpm
  repo-dir:  external/aspect_rules_js~~pnpm~pnpm

Apart from the field labels, this matches the //:repo-macros output so far, but the implementation is quite different.

Using the repo_name_variable and repo_dir_variable rules
# Demonstrates the custom Make variable-based approach. Depends on the
# ":repo-target" alias to show how the variables have access to the underlying
# target.
genrule(
    name = "repo-vars",
    outs = ["repo-vars.txt"],
    cmd = r"""printf '%%s\n' \
          '%s' \
          '  repo-name: $(repo-name)' \
          '  repo-dir:  $(repo-dir)' >$@""" % repo_target,
    toolchains = [
        ":repo-name",
        ":repo-dir",
    ]
)

# This alias demonstrates how the rules from `repo-names.bzl` work with resolved
# aliases during the analysis phase. Calling the `canonical_repo()` and
# `workspace_root()` macros with ":repo-target" in this file will return the
# empty string, since macros execute during the loading phase.
alias(
    name = "repo-target",
    actual = repo_target,
)

# These rules create custom variables that other rules can use by adding
# ":repo-name" and ":repo-dir" to their `toolchains` attribute.
# - https://bazel.build/reference/be/make-variables#custom_variables
# - https://bazel.build/rules/lib/providers/TemplateVariableInfo
repo_name_variable(
    name = "repo-name",
    dep = ":repo-target",
)

repo_dir_variable(
    name = "repo-dir",
    dep = ":repo-target",
)

Running this command with different repo_target values produces similar output—except we must specify a fully accessible target, not just an apparent repo name. This is because the repo_target is evaluated during the analysis phase; if it doesn't exist, the build will break.

//:repo-vars output
@frobozz//:repo-macros
  repo-name:
  repo-dir:

@rules_js//npm:defs
  repo-name: aspect_rules_js~
  repo-dir:  external/aspect_rules_js~

@rules_bison//bison:current_bison_toolchain
  repo-name: rules_bison~
  repo-dir:  external/rules_bison~

Technically, all of the results above come from the //:repo-target alias, whose target is set to the repo_target variable. This confirms that the custom Make variable rules return the repository values for the underlying target.

For a more complex example, here's the output for the @bison_v3.3.2//bin:bison alias target.

//:repo-vars output for the bison alias
@bison_v3.3.2//bin:bison
  repo-name: rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62
  repo-dir:  external/rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62

Several points to notice:

  • In this case, :repo-target is an alias to an alias. The rules have access to the actual target at the end of the alias chain.
  • @bison_v3.3.2//bin:bison is an alias to a target in a private, generated repo. As opposed to the macros, the custom Make variable rules return the correct repository information.
  • Hardcoding the canonical name of the repository to which the bison alias refers would be an extra nightmare, given the fingerprint suffix. This fingerprint can and will change based on the state of the main repository.

Injecting canonical repo names via generated source files

The next step is to inject these values into a program, either via environment variables, command line arguments, or generated source files.

The runfiles post covered using environment variables and command line arguments, including using the env and args attributes of builtin test and binary targets. The js_binary rule from aspect_rules_js provides several attributes for encoding environment variables and command line arguments. We'll cover generating source files here.

Generating source files with runfiles paths usually isn't necessary.

Generating a source file is also an option for injecting runfiles paths for actual files (not directories), but it's usually unnecessary in that case. This is because runfiles libraries will interpret the first path segment as the apparent repository name, so such paths can be safely hardcoded. In JavaScript, this currently isn't an option. However, you can still pass rlocationpath values via environment variables or command line arguments and join them to JS_BINARY__RUNFILES (or whatever environment variable is available).
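
As a rough sketch of that last suggestion, the following helper joins an rlocationpath passed through the environment to JS_BINARY__RUNFILES. The resolveRunfile name and the DATA_FILE_RLOCATIONPATH variable are hypothetical. The sketch also assumes a runfiles directory tree exists on disk, as opposed to a manifest-only setup.

```javascript
const path = require('node:path');

// Hypothetical helper: joins an rlocationpath value from the environment to
// the runfiles directory set by js_binary. Returns undefined if either value
// is missing.
function resolveRunfile(env, rlocationVarName) {
  const runfilesDir = env.JS_BINARY__RUNFILES;
  const rlocationPath = env[rlocationVarName];
  if (!runfilesDir || !rlocationPath) {
    return undefined;
  }
  return path.join(runfilesDir, rlocationPath);
}

// DATA_FILE_RLOCATIONPATH is an assumed name; it would be set from the BUILD
// target to an rlocationpath value for the file in question.
console.log(resolveRunfile(process.env, 'DATA_FILE_RLOCATIONPATH'));
```

Passing the environment in as a parameter, rather than reading process.env inside the helper, keeps the function easy to exercise outside of Bazel.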

The example project has a target that generates a JavaScript source file, which a small example program then imports. Running the program as shown below produces the output that follows, with these details elided:

  • OUTPUT_BASE represents the result of bazel info output_base, e.g., /home/mbland/.cache/bazel/_bazel_mbland/1234567890abcdef.
  • ARCH represents the build architecture output path component, e.g., k8-fastbuild.
  • RUNFILES_DIR in the PWD: path is the value of the runfiles: path shown just above.

Running the example program, producing info on the @pnpm repo
$ bazel run //:repo-dir-check

[ ...snip... ]
INFO: Running command line: bazel-bin/repo-dir-check_/repo-dir-check
rule name: genrule-constants

target:    @pnpm
location:  aspect_rules_js~~pnpm~pnpm/pnpm_/pnpm

macroName: aspect_rules_js~~pnpm~pnpm
macroDir:  external/aspect_rules_js~~pnpm~pnpm

repoName:  aspect_rules_js~~pnpm~pnpm
repoDir:   external/aspect_rules_js~~pnpm~pnpm

runfiles:  OUTPUT_BASE/execroot/_main/bazel-out/ARCH/bin/repo-dir-check_/
             repo-dir-check.runfiles
PWD:       RUNFILES_DIR/_main
binDir:    bazel-out/ARCH/bin

result:    PWD/external/aspect_rules_js~~pnpm~pnpm

Here's the breakdown of what these output fields are for:

Field      Description
rule name  BUILD rule used to build the program
target     value of repo_target from the BUILD file
location   rlocationpath of :repo-target (alias of repo_target)
macroName  result of the canonical_repo(repo_target) macro call
macroDir   result of the workspace_root(repo_target) macro call
repoName   repo-name custom Make variable or rule dependency target
repoDir    repo-dir custom Make variable or rule dependency target
runfiles   JS_BINARY__RUNFILES env var value, set by js_binary
PWD        working directory of the running repo-dir-check program
binDir     $(BINDIR) Make variable or ctx.bin_dir.path
result     actual repository directory path, including canonical name

In the above output, we can see that:

  • The program runs within the _main directory of its runfiles tree when run via bazel run, hence the value of PWD.
  • The macros and the custom Make variables produce the same values, since @pnpm isn't an alias target.
  • The repo_target directory itself resides directly within PWD, so binDir isn't required to locate it.
  • The @pnpm repository path in this case is equivalent to PWD/macroDir or PWD/repoDir.

We'll return to this program shortly to explain how it's constructed, and then run it with different repo_target values.

Using genrule() to generate a source file

The //:genrule-constants target converts both macro and custom Make variable values into JavaScript constants. Its output file, genrule-constants.js, is renamed to constants.js and then included in the data attribute of the //:repo-dir-check target. This program can also run independently of bazel build or bazel run invocations.

This is a comprehensive example, but your genrule may be simpler.

This genrule is somewhat complex since it illustrates how to apply a mixture of different Make variables and macros. Your own genrules need not be so complex; take from this example only what you need.

# A genrule producing a constants module illustrating how to incorporate:
# - the `BINDIR` predefined Make variable
# - the `rlocationpaths` predefined variable, called on the ":repo-target" alias
# - the `canonical_repo()` and `workspace_root()` macros, called during the
#   loading phase on a variable
# - the `repo_name_variable` and `repo_dir_variable` targets, evaluated during
#   the analysis phase, which supply custom variables as `toolchains`
#   attribute targets
#
# Replacing `repo_target` with the string ":repo-target" in the macros below
# will produce the empty string, since the macros won't see the resolved alias.
genrule(
    name = "genrule-constants",
    srcs = [":repo-target"],
    outs = ["genrule-constants.js"],
    cmd  = r"""printf 'module.exports.%%s;\n' \
        'ruleName = "genrule-constants"' \
        'target = "%s"' \
        'binDir = "$(BINDIR)"' \
        'location = "$(rlocationpaths :repo-target)"' \
        'macroName = "%s"' \
        'macroDir = "%s"' \
        'repoName = "$(repo-name)"' \
        'repoDir = "$(repo-dir)"' >$@""" % (
            repo_target,
            canonical_repo(repo_target),
            workspace_root(repo_target)
        ),
    toolchains = [
        ":repo-name",
        ":repo-dir",
    ],
)

This rule produces the following bazel-bin/genrule-constants.js module from the @pnpm repo target. This file is renamed to bazel-bin/constants.js by the //:constants-impl rule, which //:repo-dir-check depends on directly. (We'll see why this rule exists later; your own builds need not implement such an intermediary rule.)

Contents of bazel-bin/genrule-constants.js
module.exports.ruleName = "genrule-constants";
module.exports.target = "@pnpm";
module.exports.binDir = "bazel-out/ARCH/bin";
module.exports.location = "aspect_rules_js~~pnpm~pnpm/pnpm_/pnpm";
module.exports.macroName = "aspect_rules_js~~pnpm~pnpm";
module.exports.macroDir = "external/aspect_rules_js~~pnpm~pnpm";
module.exports.repoName = "aspect_rules_js~~pnpm~pnpm";
module.exports.repoDir = "external/aspect_rules_js~~pnpm~pnpm";

The repo-dir-check.mjs example program validates the location of the repo_target directory, both when run via bazel run and when run directly from the repository root. It builds directory paths using the various constant values and returns one that actually exists.

repo-dir-check.mjs
import {
  ruleName, target, location, macroName, macroDir, repoName, repoDir, binDir,
} from './constants.js';

import * as path from 'node:path';
import * as fs from 'node:fs/promises';
import * as process from 'node:process';

console.log(`rule name: ${ruleName}\n`);

console.log(`target:    ${target}`);
console.log(`location:  ${location}\n`);

console.log(`macroName: ${macroName}`);
console.log(`macroDir:  ${macroDir}\n`);

console.log(`repoName:  ${repoName}`);
console.log(`repoDir:   ${repoDir}\n`);

console.log(`runfiles:  ${process.env.JS_BINARY__RUNFILES}`)
console.log(`PWD:       ${path.resolve()}`);
console.log(`binDir:    ${binDir}\n`);

async function checkDir() {
  const dirname = path.resolve(...arguments);
  await fs.access(dirname, fs.constants.R_OK | fs.constants.X_OK);
  return dirname;
}

try {
  const result = await Promise.any([
    checkDir(macroDir),
    checkDir(repoDir),
    checkDir(binDir, macroDir),
    checkDir(binDir, repoDir),
  ])
  console.log(`result:    ${result}`);

} catch (err) {
  console.error('repo directory not found:')
  console.error(err.errors ? err.errors.join('\n'): err);
  process.exit(1);
}

The next batch of example output results from executing node bazel-bin/repo-dir-check.mjs directly. In this output:

  • EXAMPLE_DIR is where I've cloned the example repository, plus the parent of the project directory (i.e., $HOME/example/bzlmod).
  • BINDIR in the PWD: path is the value of the binDir: path shown just above.

Running repo-dir-check.mjs directly
$ node bazel-bin/repo-dir-check.mjs

rule name: genrule-constants

target:    @pnpm
location:  aspect_rules_js~~pnpm~pnpm/pnpm_/pnpm

macroName: aspect_rules_js~~pnpm~pnpm
macroDir:  external/aspect_rules_js~~pnpm~pnpm

repoName:  aspect_rules_js~~pnpm~pnpm
repoDir:   external/aspect_rules_js~~pnpm~pnpm

runfiles:  undefined
PWD:       EXAMPLE_DIR/canonical-repo-name-injection
binDir:    bazel-out/ARCH/bin

result:    PWD/BINDIR/external/aspect_rules_js~~pnpm~pnpm

We can see from this output that when running the program directly:

  • PWD is our project root, since this is where we execute node.
  • runfiles: is undefined, because the program isn't run via bazel run.
  • The @pnpm path in this example is equivalent to either PWD/BINDIR/macroDir or PWD/BINDIR/repoDir.
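
Putting the two sets of observations together, the expected location of the repository directory depends on how the program was launched. The sketch below encodes that rule of thumb. The expectedRepoDir helper is hypothetical, and it assumes that JS_BINARY__RUNFILES is defined only when the program runs via bazel run, as in the outputs above. The actual repo-dir-check.mjs takes the more robust approach of probing each candidate directory.

```javascript
const path = require('node:path');

// Hypothetical helper summarizing the two cases observed above:
// - via `bazel run` (runfilesDir defined): PWD/repoDir
// - via `node` from the project root:      PWD/binDir/repoDir
function expectedRepoDir({ cwd, binDir, repoDir, runfilesDir }) {
  return runfilesDir !== undefined
    ? path.join(cwd, repoDir)
    : path.join(cwd, binDir, repoDir);
}

// Example call using the current process state; the binDir and repoDir
// values here are placeholders for the generated constants.
console.log(expectedRepoDir({
  cwd: process.cwd(),
  binDir: 'bazel-out/ARCH/bin',
  repoDir: 'external/aspect_rules_js~~pnpm~pnpm',
  runfilesDir: process.env.JS_BINARY__RUNFILES,
}));
```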

Observing the difference between macros and custom Make variables

To show the difference between macros and custom Make variables in the context of our genrule, we'll set the repo_target to the @bison_v3.3.2//bin:bison alias target.

First we'll run it via bazel run:

bazel run //:repo-dir-check output for the bison alias
$ bazel run //:repo-dir-check

[ ...snip... ]
rule name: genrule-constants

target:    @bison_v3.3.2//bin:bison
location:  rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62/bin/bison

macroName: rules_bison~~bison_repository_ext~bison_v3.3.2
macroDir:  external/rules_bison~~bison_repository_ext~bison_v3.3.2

repoName:  rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62
repoDir:   external/rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62

runfiles:  OUTPUT_BASE/execroot/_main/bazel-out/ARCH/bin/repo-dir-check_/repo-dir-check.runfiles
PWD:       RUNFILES_DIR/_main
binDir:    bazel-out/ARCH/bin

result:    PWD/external/rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62

Here we see that, as opposed to running it with @pnpm:

  • The macro values and the custom Make variable values differ.
  • The program finds PWD/repoDir, not PWD/macroDir.

And by running the program directly via node:

node bazel-bin/repo-dir-check.mjs output for the bison alias
rule name: genrule-constants

target:    @bison_v3.3.2//bin:bison
location:  rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62/bin/bison

macroName: rules_bison~~bison_repository_ext~bison_v3.3.2
macroDir:  external/rules_bison~~bison_repository_ext~bison_v3.3.2

repoName:  rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62
repoDir:   external/rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62

runfiles:  undefined
PWD:       EXAMPLE_DIR/canonical-repo-name-injection
binDir:    bazel-out/ARCH/bin

result:    PWD/BINDIR/external/rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62
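
In both runs, the macro and custom Make variable values diverge only because the target is an alias into another repository. A consumer of constants.js could use that divergence as a signal. The following is a small sketch under that assumption; pickRepoDir is a hypothetical helper, not part of the example project.

```javascript
// Hypothetical helper: prefers the analysis-phase value (repoDir) and reports
// whether the loading-phase value (macroDir) disagrees with it, which in the
// examples above happens only for an alias into another repository.
function pickRepoDir({ macroDir, repoDir }) {
  return { dir: repoDir, viaAlias: macroDir !== repoDir };
}

const picked = pickRepoDir({
  macroDir: 'external/rules_bison~~bison_repository_ext~bison_v3.3.2',
  repoDir: 'external/rules_bison~~bison_repository_ext~bison_v3.3.2__cfg00000B62',
});
console.log(picked.dir, picked.viaAlias);
```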

Extra credit: Using a custom Rule to generate a source file

In many (most?) cases, if you have to generate a source file, writing a single bespoke genrule will be all you need to do. However, if you find yourself writing multiple such genrules, you may consider writing a reusable custom Rule to generate source files instead.

In this case, defining custom Make variables isn't necessary if the rule receives repo targets via attributes of type:

Recall that this is because attr.label values resolve to Target instances during the analysis phase, when the Rule executes. The same holds true for attr.label values included in instances of these other attributes.

I won't copy all of the code into this post, but here is what using gen_js_constants() from repo-names.bzl looks like:

//:custom-rule-constants target definition
# A custom rule producing a constants module from its `vars`, `repo_names`, and
# `repo_dirs` attributes.
#
# This rule illustrates how to use:
# - The `rlocationpaths` predefined variable, called on the ":repo-target" alias
# - The `canonical_repo()` and `workspace_root()` macros, called during the
#     loading phase on a variable
# - The `repo_names` and `repo_dirs` attributes of type
#   `attr.label_keyed_string_dict`, whose keys are resolved Target values
#   (including aliases) during the analysis phase
#
# Note that:
# - For the `vars` attribute, the keys become constant names, and the values
#   become constant values.
# - For the `repo_names` and `repo_dirs` attributes, the keys become constant
#   values, and the values become constant names.
#
# Replacing `repo_target` with the string ":repo-target" in the macros below
# will produce the empty string, since the macros won't see the resolved alias.
gen_js_constants(
    name  = "custom-rule-constants",
    deps = [":repo-target"],
    vars  = {
        "target":    repo_target,
        "location":  "$(rlocationpaths :repo-target)",
        "macroName": canonical_repo(repo_target),
        "macroDir":  workspace_root(repo_target),
    },
    repo_names = {":repo-target": "repoName"},
    repo_dirs  = {":repo-target": "repoDir"},
)

The intermediate //:constants-impl rule plays a role in switching between the genrule-based and custom rule-based constants.js generators using a custom command line flag. Running the following will select the //:custom-rule-constants implementation, which produces (almost) exactly the same output as the //:genrule-constants implementation.

Running the example program with //:custom-rule-constants
bazel run --//:constants=custom-rule //:repo-dir-check
cat bazel-bin/custom-rule-constants.js
vimdiff bazel-bin/{genrule,custom-rule}-constants.js
node bazel-bin/repo-dir-check.mjs

You probably don't need a custom command line flag.

You do not need a custom command line flag in your own project. I've included it here for ease of comparison between the genrule and custom Rule approaches. That, and it's a neat trick to know about—but you do not need to use it yourself.

Conclusion

This post concludes our series on repairing broken repository paths after enabling Bzlmod. Hopefully these posts provide enough information to overcome such problems in your own build.

Broken repository paths aren't the only class of Bzlmod challenges requiring hands-on intervention, of course. Next we'll see how to replace some of our WORKSPACE stanzas with module extensions, particularly for dependencies that haven't yet been adapted to handle Bzlmod. We'll learn how to make sense of—and ultimately avoid—circular dependency errors in the process.

Credit where it's due

Ricard Solé developed the esbuild loader and the original genrule, which he used to inject $(BINDIR). I piggybacked on top of that existing genrule to inject the canonical name of the @npm repository from rules_nodejs.

I got the idea for writing a custom rule to generate a source file containing constants from Bazel's tools/python/gen_runfile_constants.bzl.