Migrating to Bazel Modules (a.k.a. Bzlmod) - Maintaining Compatibility, Part 2¶
In the previous post, we reviewed guidelines for
maintaining compatibility with both Bzlmod and legacy WORKSPACE
builds, and
older and newer dependency versions. I promised that in this post and the next,
we'd discuss testing approaches to help ensure that this remains the case.
However, a discussion in the Bazel Slack workspace has revealed a Bzlmod and
legacy WORKSPACE compatibility issue I'd missed in the previous post. So in
this post, I'll discuss what to do with the class of legacy WORKSPACE
configuration macros that use Label with computed repository names.
As we'll see, this one issue alone ended up meriting a substantial post in
itself. We'll cover adding dependency attributes to repository rules, generating
.bzl files to resolve Labels, chaining together module extensions, and using
macros in generated BUILD files. The first two options are relatively
straightforward, but we cover the last two in case your use case requires them.
Such legacy WORKSPACE
macros commonly seem to pertain to toolchain
configuration, selecting repositories to instantiate based on user defined
parameters. So we'll use a small example project to illustrate these solutions
as they apply specifically to toolchain configuration.
In the next post, I promise to get into the tasty testing stuff, in what promises to be the third of a four part trilogy.
This article is part of the series "Migrating to Bazel Modules (a.k.a. Bzlmod)":
- Migrating to Bazel Modules (a.k.a. Bzlmod) - The Easy Parts
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names and Runfiles
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names and rules_pkg
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names, Macros, and Variables
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Module Extensions
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Fixing and Patching Breakages
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Repo Names, Again…
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Toolchainization
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Maintaining Compatibility, Part 1
- Migrating to Bazel Modules (a.k.a. Bzlmod) - Maintaining Compatibility, Part 2
Prerequisites¶
As always, please acquaint yourself with the following concepts pertaining to external repositories if you've yet to do so:
The inciting incident¶
I have an existing macro that i'm trying to convert to an extension, which (broadly speaking) is a toolchain that declares a heavyweight http_archive, then gets a label to it, then makes a lightweight toolchain repo with references to the backend repo. I did run into to problems, and I was curious if there were reasonable workarounds that don't involve massively changing the impl (since, at least at the moment, I need to support both bzlmod and workspaces).
It took time for more details to emerge, but in retrospect, I can more clearly
see the original problem from this initial statement. Specifically, a legacy
WORKSPACE
macro:
- computes a repository name,
- then creates a Label from that repo name
- to generate file paths for another repo.
The Label
then breaks when calling the macro from a module extension, since
MODULE.bazel
doesn't invoke use_repo on the computed repository name.
Since Mike mentioned that this macro instantiates toolchains, I suspected that
the solution to this problem would resemble the @rules_scala_toolchains repo.
He mentioned not being able to update use_repo directives using
extension_metadata and 'bazel mod tidy'. The Label evaluation breaks the
extension before it can return the extension_metadata required by bazel mod
tidy (a "chicken-and-egg" problem).
However, in my experience with Toolchainization, exposing toolchain dependency repositories, even those with stable names, is a code smell exposing a design flaw. A properly designed module extension can encapsulate these dependencies so they're available to the generated toolchain repository, without requiring clients to use them directly.
What I somehow failed to mention in that post, however, was that the BUILD
files in @rules_scala_toolchains do not contain concrete targets. Instead,
they contain macros that instantiate toolchains (and other related targets),
and those macros can expand the Label values instead of the module extension.
Later in the conversation, Mike suggested that a 'load' statement to read values from one repo while instantiating another might help:
like, in theory i could do something like:
That instantly reminded me of creating a chain of separate module
extensions. This is when one module extension creates a repository using a
standard name so that other extensions can access its files and BUILD
targets.
In this case:
- One extension can instantiate an intermediate repository (using the standard name) that embeds information about the repository with the calculated name in a .bzl file.
- Another extension can then load this .bzl file from the intermediate repository to access the information it needs.
Then while writing this blog post, two more (far less complex) ideas occurred to me:
- Adding a dependency attribute to the repository rule that needs information about the repository with the calculated name
- Emitting the calculated repository name into the other repository for evaluation by a Label or a load statement
I recreated the original problem using the example project described in this post, and verified that all of these approaches could work. Whichever solution you may choose to apply is ultimately up to your specific use case, constraints, and taste.
The upshot is that existing legacy WORKSPACE
macros using Label
with
computed repository names will need to change for Bzlmod compatibility. In many
cases, they shouldn't necessarily require massive changes.
Why did I get so obsessed with this problem?
Throughout this series, I've covered repository name handling (ad nauseam),
module extensions, toolchainization, and compatibility between Bzlmod and
legacy WORKSPACE
builds. This problem stands within the intersection of
all four concerns, and I doubt Mike is the only person that has encountered
or will encounter it. Add all this up, and you have the recipe for my brand
of catnip.
Why legacy WORKSPACE macros using Label with computed repo names break Bzlmod¶
Bazel allows both Bazel module extension implementations and legacy WORKSPACE
macros to instantiate multiple repositories. In a legacy WORKSPACE
macro, it's
possible to compute a repository name, create the repository, then use a
Label with that computed repo name. The macro can, for example, use
Label.workspace_root to apply that repo's directory path when instantiating
another repository.
In contrast, module extensions can only invoke Label accessors for a
repository if MODULE.bazel invokes use_repo to bring it into scope. The
use_repo call can be for the same module extension that creates the repo,
but either way, it's required for the Label to work. This is generally only
possible, or at least convenient, when using standard, unchanging repo names.
Therefore, legacy WORKSPACE
macros that pass computed repository names to
Label
will break when invoked within a module extension.
This largely applies to toolchain configuration macros. The repositories
required by toolchain targets tend to vary based on configuration parameters.
This is the case with both Mike Lundy's toolchain configuration macro and with
the rules_scala configuration macros and module extensions. At the same
time, repos required only by toolchains need not be visible to any other code
outside of the package that defines them. Therefore the advice here applies
largely to updating toolchain configuration macros, to avoid having to invoke
use_repo
on all their configuration dependent repositories.
The first two solutions below are far less complex than the last two. If the dependency attribute or emitted repo name solutions work for you, you won't need to extract separate extensions or write toolchain macros. As with every post in this series, use only as much as you need, and feel free to ignore the rest.
Example project for reproducing the problem and trying the solutions¶
Here's a small example that demonstrates what happens. We'll update this project as we examine and apply different techniques. I've highlighted significant lines in several files as we examine each technique, mostly lines changed from the immediately previous appearance of the file.
Every solution we'll apply to the example project works with Bazel 7.4.0 and later. The calculated repository path may differ between Bazel versions due to changes in the canonical repository name format, but all the results are equivalent.
Bazel versions older than 7.4.0 don't support the dependency attribute
solution, but all other solutions will work. Also, Bazel 6 doesn't support the
--[no]enable_workspace
flag. Use only --[no]enable_bzlmod
with Bazel 6, or
better yet, migrate to Bazel 7.4.0 or later first, then return to this post.
This example employs the "hub repository" model described in the
Toolchainization post. (Strangely, I didn't mention that pattern name in
that other blog post, but Mike Lundy reminded me of it.) Each version of it
generates a @toolchain_repo containing a package with a toolchain target,
automatically registered by MODULE.bazel. The legacy WORKSPACE file invokes
register_toolchain_repo_toolchains to achieve the same effect.
The hub repository model avoids unnecessary use_repo calls.
As mentioned above, having to invoke use_repo
to bring toolchain
dependencies into scope in MODULE.bazel
is a code smell. It should not be
necessary for modules to depend on these dependency repositories
directly, including the module defining the toolchain. Ideally, modules
other than the module defining the toolchain shouldn't necessarily have to
depend directly on the hub repository itself, i.e., @toolchains_repo
.
You can define alias targets from one of your module's permanent
packages to expose toolchains that aren't registered by default. For
example, //toolchains:testing_toolchain
could be an alias to a target
within @toolchains_repo//testing
. This way users need not use
@toolchains_repo
directly to access such toolchains.
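As a minimal sketch of such an alias, the `BUILD` file for the `//toolchains` package might contain something like the following. The target name inside `@toolchains_repo//testing` isn't specified above, so `testing_toolchain` is an assumption for illustration.

```starlark
# toolchains/BUILD (sketch; the target name within the hub repo is hypothetical)
alias(
    name = "testing_toolchain",
    actual = "@toolchains_repo//testing:testing_toolchain",
)
```

The module that defines this alias still needs the hub repository in scope within its own `MODULE.bazel`, but its clients can then refer to `//toolchains:testing_toolchain` in that module without calling `use_repo` themselves.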
If you're not already using it, consider using Bazelisk. It's a Bazel
wrapper that uses the .bazelversion
file to select the exact Bazel version for
the build. You can override the .bazelversion
value on the command line by
prefixing bazel
commands with the USE_BAZEL_VERSION
environment variable.
[Listing: .bazelversion]
[Listing: .bazelrc]
[Listing: WORKSPACE]
Note that Bzlmod only allows register_toolchains calls in MODULE.bazel. The
@toolchain_repo//...:all specifier enables register_toolchains to discover
and register all toolchain targets in the resulting target set.
[Listing: MODULE.bazel]
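As a rough sketch of the kind of `MODULE.bazel` file described here, assuming the extension lives in `//:toolchain_repo_ext.bzl` and the module is named `toolchain_repo_examples` (both names appear later in this post, but the exact file contents are my assumption):

```starlark
# MODULE.bazel (sketch, not the exact file from the example project)
module(name = "toolchain_repo_examples")

toolchain_repo_ext = use_extension(
    "//:toolchain_repo_ext.bzl",
    "toolchain_repo_ext",
)

# Bring both generated repos into this module's scope.
use_repo(toolchain_repo_ext, "backend_repo", "toolchain_repo")

# Register every toolchain target found anywhere in the hub repo.
register_toolchains("@toolchain_repo//...:all")
```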
The backend_repo
generates a config.bzl
file in which it records its own
REPO_PATH
, which we'll use in one of the solutions.
Note that in backend_repo.bzl and toolchain_repo.bzl, the rctx parameter
is of type repository_ctx.
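A minimal sketch of what such a `backend_repo` rule might look like, assuming it needs nothing beyond an empty `BUILD` file and the generated `config.bzl`. The real rule would presumably also fetch or generate the actual backend files.

```starlark
# backend_repo.bzl (sketch)
def _backend_repo_impl(rctx):
    # An empty BUILD file makes the repo root a package, so that other
    # repos can load //:config.bzl from it.
    rctx.file("BUILD", "")

    # Record this repo's own path. rctx.path("") resolves relative to
    # the repository directory.
    rctx.file(
        "config.bzl",
        'REPO_PATH = "%s"\n' % rctx.path(""),
    )

backend_repo = repository_rule(
    implementation = _backend_repo_impl,
)
```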
toolchain_repo
uses the BUILD.toolchain_repo
template to generate a BUILD
file containing the backend_repo_path
. The instantiate_toolchain_repo
macro
is the original legacy WORKSPACE
macro we'll eventually update for Bzlmod
compatibility.
Note how we inject the original module's canonical repository name into
the template via Label("//:all").repo_name
. This enables the generated BUILD
file within @toolchain_repo
to load
the repository_path_toolchain.bzl
file.
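Here's a sketch of how `toolchain_repo` might apply the template. The `{REPO_PATH}` placeholder appears later in this post, but `{MODULE_REPO_NAME}` and the private `_build_template` attribute are names I've made up for illustration.

```starlark
# toolchain_repo.bzl (sketch)
def _toolchain_repo_impl(rctx):
    rctx.template(
        "BUILD",
        rctx.attr._build_template,
        substitutions = {
            # Canonical name of the repo containing this .bzl file, so the
            # generated BUILD file can load .bzl files from it.
            "{MODULE_REPO_NAME}": Label("//:all").repo_name,
            "{REPO_PATH}": rctx.attr.backend_repo_path,
        },
        executable = False,
    )

toolchain_repo = repository_rule(
    implementation = _toolchain_repo_impl,
    attrs = {
        "backend_repo_path": attr.string(mandatory = True),
        "_build_template": attr.label(
            default = Label("//:BUILD.toolchain_repo"),
            allow_single_file = True,
        ),
    },
)
```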
repository_path_toolchain
provides its repo_path
attribute as a field within
ToolchainInfo. This is the attribute that @toolchain_repo
sets to the
@backend_repo
path via its backend_repo_path
attribute.
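A toolchain rule of this shape can be quite small. The following is a sketch under the assumption that `repo_path` is a plain string attribute:

```starlark
# repository_path_toolchain.bzl (sketch)
def _repository_path_toolchain_impl(ctx):
    # Expose the configured path as a field of ToolchainInfo.
    return [
        platform_common.ToolchainInfo(
            repo_path = ctx.attr.repo_path,
        ),
    ]

repository_path_toolchain = rule(
    implementation = _repository_path_toolchain_impl,
    attrs = {
        "repo_path": attr.string(mandatory = True),
    },
)
```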
repository_path
records the repo_path
from its configured toolchain into a
text file. This is what we'll use to verify that the @toolchain_repo
successfully passes the @backend_repo
path to its toolchain.
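As a sketch of such a rule, assuming the toolchain type target is `//:toolchain_type` and the output file is named after the target (so a target named `repo_path` produces `repo_path.txt`):

```starlark
# repository_path.bzl (sketch; the file name is an assumption)
def _repository_path_impl(ctx):
    # Look up the toolchain selected by toolchain resolution.
    toolchain = ctx.toolchains["//:toolchain_type"]
    out = ctx.actions.declare_file(ctx.label.name + ".txt")
    ctx.actions.write(out, toolchain.repo_path + "\n")
    return [DefaultInfo(files = depset([out]))]

repository_path = rule(
    implementation = _repository_path_impl,
    toolchains = ["//:toolchain_type"],
)
```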
toolchain_repo_ext
is a thin wrapper around the instantiate_toolchain_repo
macro.
[Listing: toolchain_repo_ext.bzl]
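A thin wrapper of this kind could look roughly like the following sketch, assuming the macro takes no arguments and lives in `//:toolchain_repo.bzl`:

```starlark
# toolchain_repo_ext.bzl (sketch)
load("//:toolchain_repo.bzl", "instantiate_toolchain_repo")

def _toolchain_repo_ext_impl(mctx):
    # Reuse the legacy WORKSPACE macro as the extension's implementation.
    instantiate_toolchain_repo()

toolchain_repo_ext = module_extension(
    implementation = _toolchain_repo_ext_impl,
)
```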
Finally, here's the BUILD
file that defines our toolchain_type
and the
:repo_path
target that generates bazel-bin/repo_path.txt
.
[Listing: BUILD]
Here's the source of the problem:
- The original legacy WORKSPACE macro instantiate_toolchain_repo() first creates @backend_repo.
- It then assigns Label("@backend_repo").workspace_root to the backend_repo_path attribute of @toolchain_repo.
[Listing: The original legacy WORKSPACE macro using Label("@backend_repo")]
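The two steps above can be sketched as follows. I'm assuming the macro takes no arguments and sits in the same file as the `toolchain_repo` rule; a real macro would compute the backend repo name from its parameters rather than hardcode it.

```starlark
# toolchain_repo.bzl (sketch of the macro only)
load("//:backend_repo.bzl", "backend_repo")

# toolchain_repo is assumed to be defined earlier in this same file.

def instantiate_toolchain_repo():
    # Step 1: create the backend repo.
    backend_repo(name = "backend_repo")

    # Step 2: resolve the backend repo's root via a Label evaluated by the
    # macro itself. This is the part that needs use_repo under Bzlmod.
    toolchain_repo(
        name = "toolchain_repo",
        backend_repo_path = Label("@backend_repo").workspace_root,
    )
```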
This works under Bzlmod, for now, because MODULE.bazel brings backend_repo
into scope via use_repo. Note that the resulting REPO_PATH values are
relative to bazel info output_base.
[Listing: Building successfully under WORKSPACE and Bzlmod]
Now let's remove backend_repo
from the use_repo
call, simulating what would
happen if the macro computed the repository name.
[Listing: Removing backend_repo from use_repo in MODULE.bazel]
The Label
within instantiate_toolchain_repo
continues to work under a legacy
WORKSPACE
build, but fails under Bzlmod:
So if you have a legacy WORKSPACE
configuration macro that fits this
description, there are a few options for making it Bzlmod compatible. None of
the following methods amount to major surgery, and all continue to hide the
computed repository name while retaining legacy WORKSPACE
compatibility.
Add a dependency attribute to the repository rule¶
If you control the repository rule that needs information from another
repository's Label
, try updating it to take a dependency attribute
instead. This should eliminate the need for the macro to create its own Label
.
A dependency attribute is a repository rule attribute of type:
Within a repository rule, each of these attributes will resolve to Label
objects, with Label.workspace_root and other Label methods available.
This is the easiest, least complex solution.
If the dependency attribute solution works for you, there's no need to read the rest of the blog post after this section. Take the easy win and run with it!
This only works for Bazel 7.4.0 and later.
This solution doesn't work for Bazel versions older than 7.4.0. That release
includes bazelbuild/bazel#23585: [7.4.0] Let repo rule attributes reference
extension apparent names, which this solution depends on. Older Bazel versions
will still produce the same unknown repo 'backend_repo' error. All of the other
solutions will work with Bazel 6.5.0 and Bazel 7.1.0 and later, however.
If you're using Bazel 6, consider migrating to Bazel 7.4.0 or later first, then coming back to this problem. Bazel 7 has far more complete Bzlmod support to begin with, and Bazel 7.4.0 and later will enable this dependency attribute solution.
In our example, updating toolchain_repo and instantiate_toolchain_repo thus
fixes the Bzlmod build without updating the legacy WORKSPACE file. The
backend_repo attribute of toolchain_repo is now a Label, and {REPO_PATH}
becomes rctx.attr.backend_repo.workspace_root.
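A sketch of the updated rule and macro under these assumptions: the template is supplied through a private `_build_template` attribute (a name I've invented), and the backend repo still has the fixed name `backend_repo`.

```starlark
# toolchain_repo.bzl (sketch of the dependency attribute approach)
load("//:backend_repo.bzl", "backend_repo")

def _toolchain_repo_impl(rctx):
    rctx.template(
        "BUILD",
        rctx.attr._build_template,
        substitutions = {
            # The attribute value is a Label, resolved by Bazel rather
            # than by the macro or module extension.
            "{REPO_PATH}": rctx.attr.backend_repo.workspace_root,
        },
        executable = False,
    )

toolchain_repo = repository_rule(
    implementation = _toolchain_repo_impl,
    attrs = {
        "backend_repo": attr.label(mandatory = True),
        "_build_template": attr.label(
            default = Label("//:BUILD.toolchain_repo"),
            allow_single_file = True,
        ),
    },
)

def instantiate_toolchain_repo():
    backend_repo(name = "backend_repo")
    toolchain_repo(
        name = "toolchain_repo",
        # Pass a string, not a Label object.
        backend_repo = "@backend_repo",
    )
```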
Do not pass a Label to a dependency attribute!
Notice that we're passing the string "@backend_repo", not
Label("@backend_repo"). If you do pass an actual Label object to a
dependency attribute, hilarity will ensue. Try it, and prepare to lose
your mind when you see the no repository visible as '@backend_repo' error.
Always prefix repo names with '@' when passing them to a Label!
Forgetting the @ prefix when applying a repo name to Label or a
dependency attribute makes it look like a target in the current package.
In other words, Label("backend_repo") will look like :backend_repo. Its
workspace_root will point to the repo containing the .bzl file with the
Label expression, not the @backend_repo root. Try it and see!
Bazel >= 7.4.0 users: STOP READING HERE IF YOU CAN¶
Success
If you're using Bazel 7.4.0 or later, and the dependency attribute method works for your use case, then please don't continue reading! You're all done! Enjoy your life!
Failure
If you're using Bazel < 7.4.0, you'll have to try the next solution instead. Hopefully then you can stop reading. (Or, upgrade to Bazel 7.4.0 first, then try the dependency attribute solution before the next one.)
Danger
All of the other solutions are significantly more complex, perhaps unnecessarily so for your purposes. Proceed only if you have a really good reason to try them.
Emit the apparent repo name for evaluation within the generated repository¶
Another possible solution is to emit the computed repository name for evaluation
within the @toolchain_repo repository. First, we update BUILD.toolchain_repo
to load the REPO_PATH from a local config.bzl file:
Then we update toolchain_repo to emit config.bzl, which uses a Label to
evaluate @backend_repo.
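As a sketch of this idea, `toolchain_repo` could take the backend repo name as a plain string and write the Label expression into the generated file, so that Bazel evaluates it from inside `@toolchain_repo`. The `_build_template` attribute name is hypothetical, and the template is assumed to begin with `load(":config.bzl", "REPO_PATH")`.

```starlark
# toolchain_repo.bzl (sketch of the emitted repo name approach)
def _toolchain_repo_impl(rctx):
    # Emit the Label expression as text. It's evaluated later, when the
    # generated BUILD file loads config.bzl, not by this rule.
    rctx.file(
        "config.bzl",
        'REPO_PATH = Label("%s").workspace_root\n' % rctx.attr.backend_repo,
    )
    rctx.template("BUILD", rctx.attr._build_template, executable = False)

toolchain_repo = repository_rule(
    implementation = _toolchain_repo_impl,
    attrs = {
        # A string such as "@backend_repo", not a label attribute.
        "backend_repo": attr.string(mandatory = True),
        "_build_template": attr.label(
            default = Label("//:BUILD.toolchain_repo"),
            allow_single_file = True,
        ),
    },
)
```

This relies on repos created by the same module extension being able to see one another by their apparent names, which is why the Label resolves inside the generated repo without a `use_repo` call.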
Loading generated values directly from the backend repo¶
As an alternative, let's presume that the backend repo already provides a
.bzl
file containing the information you need. This is true of the
@backend_repo
from our example project. In this case, you can have
@toolchain_repo
load @backend_repo//:config.bzl
directly, without using
Label.workspace_root
.
Again, we first update BUILD.toolchain_repo
to load
the file from
@backend_repo
:
And now we pass the backend_repo
string attribute of toolchain_repo
as the
{REPO_NAME}
field of BUILD.toolchain_repo
:
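A sketch of the rule side of this variation, assuming the template's first line reads `load("{REPO_NAME}//:config.bzl", "REPO_PATH")` and that `_build_template` is a private attribute I've named for illustration:

```starlark
# toolchain_repo.bzl (sketch of loading directly from the backend repo)
def _toolchain_repo_impl(rctx):
    rctx.template(
        "BUILD",
        rctx.attr._build_template,
        substitutions = {
            # For example, "@backend_repo", so the generated BUILD file
            # loads @backend_repo//:config.bzl.
            "{REPO_NAME}": rctx.attr.backend_repo,
        },
        executable = False,
    )

toolchain_repo = repository_rule(
    implementation = _toolchain_repo_impl,
    attrs = {
        "backend_repo": attr.string(mandatory = True),
        "_build_template": attr.label(
            default = Label("//:BUILD.toolchain_repo"),
            allow_single_file = True,
        ),
    },
)
```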
Emitting either the absolute or relative path from the backend repo¶
If you're keeping track, the current backend_repo implementation uses
rctx.path("") to inject its absolute path into its REPO_PATH value. This
results in the following, where <OUTPUT_BASE> is the value of bazel info
output_base:
[Listing: @backend_repo emitting its absolute path]
If you control the repository rule for your @backend_repo equivalent, you can
use Label instead of rctx.path("") to drop the <OUTPUT_BASE>/. The package
and target passed to Label need not actually exist, but the target string must
begin with //:
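One way to read this is that the Label expression gets written into the generated `config.bzl`, so that it's evaluated in the context of `@backend_repo` itself. The following sketch assumes that interpretation; the `//:all` target string is arbitrary.

```starlark
# backend_repo.bzl (sketch of emitting a relative path)
def _backend_repo_impl(rctx):
    rctx.file("BUILD", "")

    # Written as text, then evaluated inside the generated repo. There,
    # "//" refers to the backend repo, so workspace_root is its path
    # relative to the output base.
    rctx.file(
        "config.bzl",
        'REPO_PATH = Label("//:all").workspace_root\n',
    )

backend_repo = repository_rule(
    implementation = _backend_repo_impl,
)
```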
Bazel < 7.4.0 users: STOP READING HERE IF YOU CAN¶
Success
If you're using a Bazel version older than 7.4.0, and injecting the repo name for evaluation in the repository's files works, please stop here.
Danger
As with the earlier plea to Bazel 7.4.0 and later users, please don't use a more complex solution than necessary. Quit while we're all ahead, for the good of humanity.
Extract new macros and module extensions and chain them together¶
It's quite common for one legacy WORKSPACE
macro to depend on a repository
generated by another. Having one module extension depend upon a repository from
another is the logical derivative of this concept. In other words, we can
achieve the same effect under Bzlmod by extracting multiple macros and module
extensions, and effectively chaining them together.
For example, in rules_scala, users must instantiate the
@rules_scala_config
repository by invoking scala_config()
before invoking
any other configuration macros. The //scala:toolchains.bzl
file loads the
Scala version and other configuration parameters from
@rules_scala_config//:config.bzl
:
This works in a legacy WORKSPACE
build because it fully executes each
statement in order. However, load
is forbidden in MODULE.bazel
files, and a
module extension can't load
a file from a repository generated by the
extension itself. So the load
statements move to the top of each module
extension implementation file, and the Bzlmod API looks like this:
What's happening here is:
- The first module extension, scala_config, loads and invokes the original scala_config macro to instantiate the @rules_scala_config repo.
- MODULE.bazel calls use_repo on the scala_config extension to bring the @rules_scala_config repo into the module's scope.
- The second module extension, scala_deps, loads //scala:toolchains.bzl, which can now load @rules_scala_config//:config.bzl.
Parallel macros and extensions nudge users closer to Bzlmod adoption.
This process of designing macros and module extensions together aligns the
legacy WORKSPACE
API more closely with the new module extension API.
Note the similarities between the Bzlmod and legacy WORKSPACE
APIs in the
rules_scala
example above. This effectively forces legacy WORKSPACE
users to take a step closer to Bzlmod adoption, without losing functionality
or forcing an immediate Bzlmod migration.
MAKE SURE YOU REALLY WANT TO DO THIS¶
Now remember that the problem we're trying to solve is when the macro computes the repository name based on its configuration arguments. So we need to generate an intermediate repository with a stable name, containing references to the repository with the calculated name. Another level of indirection and all that.
There are two variations on this theme of extracting a new macro and module
extension from the original legacy WORKSPACE
macro's implementation.
Danger
Did you try adding a dependency attribute or emitting the backend repo name for evaluation in the generated repo first? If either of those techniques would work on their own, there's no need to keep reading.
But if you have your reasons for wanting to extract a new macro and module extension anyway, hopefully this helps you do it well.
Instantiate backend repo in the first extension, reference it in the second¶
For the first variation, we'll instantiate the backend repo in the first macro/extension, and reference it in the second.
- Extract a macro that computes the backend repository's name and instantiates it.
- Have this macro produce the intermediate repository using a stable name, containing .bzl files and/or alias targets referring to the calculated repository name.
- Have the original macro's implementation depend upon the .bzl files from the intermediate repository, and/or have its generated repository's files reference the intermediate repository.
- Have the legacy WORKSPACE file load and invoke the new macro, before the original macro call.
- Once this works under a legacy WORKSPACE build, create separate module extensions that load and invoke each macro.
- Invoke use_repo on the first extension to bring the intermediate repository into scope so the second extension can access it.
Here's how our example project looks after this transformation. For starters,
here are the new MODULE.bazel
and legacy WORKSPACE
files:
Updating MODULE.bazel
and the legacy WORKSPACE
file is actually the last
step in the process. However, it's useful to have the end result in mind before
we discuss the macro and module extension transformations.
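As a sketch of that end result, a chained `MODULE.bazel` might look like this. The extension name `toolchain_config_ext`, the `config` tag, and the `"example"` value are all hypothetical. Only the file names and `config_value` come from this post.

```starlark
# MODULE.bazel (sketch of two chained extensions)
module(name = "toolchain_repo_examples")

# First extension: creates the backend repo under its computed name, plus
# the intermediate @toolchain_config repo under a stable name.
toolchain_config_ext = use_extension(
    "//:toolchain_config_repo_ext.bzl",
    "toolchain_config_ext",
)
toolchain_config_ext.config(config_value = "example")
use_repo(toolchain_config_ext, "toolchain_config")

# Second extension: its implementation can now load
# @toolchain_config//:config.bzl.
toolchain_repo_ext = use_extension(
    "//:toolchain_repo_ext.bzl",
    "toolchain_repo_ext",
)
use_repo(toolchain_repo_ext, "toolchain_repo")

register_toolchains("@toolchain_repo//...:all")
```

In the legacy WORKSPACE file, the equivalent ordering constraint is that the new configuration macro runs before the `load` statement for any file that itself loads from `@toolchain_config`.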
First we add the new toolchain_config_repo.bzl
file, with the new
instantiate_toolchain_config_repo
instantiating a backend_repo
. Notice that
this new macro accepts a contrived configuration parameter, config_value
, used
to define the name of the backend_repo
instance. The rule embeds this
"calculated" name in a Label
within @toolchain_config//:config.bzl
.
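Here's a sketch of what the new file might contain. The `"backend_repo_" + config_value` naming scheme and the `backend_repo` string attribute are assumptions on my part.

```starlark
# toolchain_config_repo.bzl (sketch)
load("//:backend_repo.bzl", "backend_repo")

def _toolchain_config_repo_impl(rctx):
    rctx.file("BUILD", "")

    # Embed the calculated name in a Label expression, evaluated when
    # another .bzl file loads @toolchain_config//:config.bzl.
    rctx.file(
        "config.bzl",
        'REPO_PATH = Label("%s").workspace_root\n' % rctx.attr.backend_repo,
    )

toolchain_config_repo = repository_rule(
    implementation = _toolchain_config_repo_impl,
    attrs = {
        "backend_repo": attr.string(mandatory = True),
    },
)

def instantiate_toolchain_config_repo(config_value):
    # Hypothetical scheme for computing the repo name.
    backend_name = "backend_repo_" + config_value
    backend_repo(name = backend_name)
    toolchain_config_repo(
        name = "toolchain_config",
        backend_repo = "@" + backend_name,
    )
```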
Use a dependency attribute with Bazel 7.4.0 or later.
If you're using Bazel 7.4.0 or later, you can use a dependency
attribute in toolchain_config_repo
and directly embed the path string
instead.
Notice that toolchain_repo.bzl
now loads REPO_PATH
from
@toolchain_config//:config.bzl
and applies it to BUILD.toolchain_repo
.
This means it also no longer needs its own repository rule attribute for the
backend repo path or name:
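A sketch of the simplified file, again assuming a private `_build_template` attribute that I've named for illustration:

```starlark
# toolchain_repo.bzl (sketch, loading from the intermediate repo)
load("@toolchain_config//:config.bzl", "REPO_PATH")

def _toolchain_repo_impl(rctx):
    rctx.template(
        "BUILD",
        rctx.attr._build_template,
        # REPO_PATH was already resolved when config.bzl was loaded.
        substitutions = {"{REPO_PATH}": REPO_PATH},
        executable = False,
    )

toolchain_repo = repository_rule(
    implementation = _toolchain_repo_impl,
    attrs = {
        "_build_template": attr.label(
            default = Label("//:BUILD.toolchain_repo"),
            allow_single_file = True,
        ),
    },
)

def instantiate_toolchain_repo():
    toolchain_repo(name = "toolchain_repo")
```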
We also restore BUILD.toolchain_repo to its original state, in which it
took a {REPO_PATH} parameter:
toolchain_config_repo_ext.bzl
is a bit more complex, since we need to create
the tag_class plumbing required by the module extension. We need this to
pass config_value
through to the instantiate_toolchain_config_repo
macro.
Note that mctx
is an abbreviation for the module_ctx object.
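As a sketch of that plumbing, assuming a single tag class named `config` that only the root module may use (the tag and extension names are hypothetical):

```starlark
# toolchain_config_repo_ext.bzl (sketch)
load(
    "//:toolchain_config_repo.bzl",
    "instantiate_toolchain_config_repo",
)

_config = tag_class(
    attrs = {
        "config_value": attr.string(mandatory = True),
    },
)

def _toolchain_config_ext_impl(mctx):
    # Collect config tags from the root module only.
    root_tags = [
        tag
        for mod in mctx.modules
        if mod.is_root
        for tag in mod.tags.config
    ]
    if len(root_tags) != 1:
        fail("expected exactly one config tag in the root module")
    instantiate_toolchain_config_repo(root_tags[0].config_value)

toolchain_config_ext = module_extension(
    implementation = _toolchain_config_ext_impl,
    tag_classes = {"config": _config},
)
```

Failing when the root module supplies no tag is only one possible policy. An extension could just as well fall back to a default value.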
More module extension writing tips
For more advice on writing module extensions, see the Module
Extensions post from this series. For examples of more advanced module
extension helpers, see rules_scala's //scala/macros:private/bzlmod.bzl
helpers and their usage in rules_scala
's module extensions.
And now it's time to reap the fruits of our labors:
[Listing: Using @toolchain_config under WORKSPACE and Bzlmod]
Configure backend repo in the first extension, instantiate it in the second¶
It's actually still possible to create the backend repo in the
instantiate_toolchain_repo
macro, if you'd prefer to do that. With a few
changes, @toolchain_config
could emit configuration information on how
instantiate_toolchain_repo
should instantiate the backend repo. This would
make it much more similar to how scala_config
and scala_deps
work for
rules_scala.
The process is similar to the previous option, but the first few steps are slightly different:
- Extract a macro that computes the backend repository's name without instantiating it.
- Have this macro produce the intermediate repository using a stable name, publishing the computed backend repository name in an accessible .bzl file.
- Have the original macro's implementation load the computed backend repository name from the intermediate repo's .bzl file and instantiate the backend repo.
- Have the original macro's repository rule emit a .bzl file to resolve the computed backend repository name references via Label or load.
The rest of the process is the same: The legacy WORKSPACE
file invokes the new
macros, separate module extensions invoke each macro, and use_repo
brings the
intermediate repository into scope.
Here's the new toolchain_config_repo
, no longer instantiating the
backend_repo
, and emitting the repo name into config.bzl
:
toolchain_repo
loads the REPO_NAME
from @toolchain_config
, instantiates the
backend_repo
, and emits a Label
into //:config.bzl
.
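A sketch of the `toolchain_repo.bzl` side of this variation. It assumes `@toolchain_config//:config.bzl` contains a line like `REPO_NAME = "backend_repo_example"` (no leading `@`), and that the `BUILD` template loads `REPO_PATH` from the local `config.bzl`. The attribute names are mine.

```starlark
# toolchain_repo.bzl (sketch of the second variation)
load("@toolchain_config//:config.bzl", "REPO_NAME")
load("//:backend_repo.bzl", "backend_repo")

def _toolchain_repo_impl(rctx):
    # Emit a Label expression for evaluation inside the generated repo.
    rctx.file(
        "config.bzl",
        'REPO_PATH = Label("@%s").workspace_root\n' %
        rctx.attr.backend_repo_name,
    )
    rctx.template("BUILD", rctx.attr._build_template, executable = False)

toolchain_repo = repository_rule(
    implementation = _toolchain_repo_impl,
    attrs = {
        "backend_repo_name": attr.string(mandatory = True),
        "_build_template": attr.label(
            default = Label("//:BUILD.toolchain_repo"),
            allow_single_file = True,
        ),
    },
)

def instantiate_toolchain_repo():
    # The backend repo is created here, not in the config macro.
    backend_repo(name = REPO_NAME)
    toolchain_repo(
        name = "toolchain_repo",
        backend_repo_name = REPO_NAME,
    )
```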
Use a dependency attribute with Bazel 7.4.0 or later.
Again, with Bazel 7.4.0 and later, toolchain_repo
could take REPO_NAME
as a dependency attribute and inject the workspace_root
directly. It
could inject this path into //:config.bzl
as in this example, or inject it
directly into BUILD.toolchain_repo
.
Extract a macro to generate targets¶
This last technique converts the repository rule so that its generated BUILD
files invoke a macro that can evaluate the necessary Label values.
When a repository rule currently emits complex BUILD
targets directly, such as
toolchain and other supporting targets, macroization may yield other
benefits as well:
- Macros invoked by a BUILD file can evaluate Label instances after Bazel finishes instantiating all repositories, and can perform arbitrarily complex target generation. (This is essentially equivalent to the "emitting the backend repo name for evaluation in the generated repo" solution.)
- They enable easy reuse of toolchain definition logic between regular BUILD files and toolchain repository BUILD files. This enables the project to configure standard toolchains for specific use cases, so the user won't need to define such configurations.
- They may enable users to define their own custom toolchain configurations using dependencies other than the ones provided by your own project.
- They work the same way under both WORKSPACE and Bzlmod.
I didn't cover toolchain macros in the Toolchainization post.
rules_scala
's toolchain setup macros predated my work on Bzlmodifying it;
several previously resided in the @rules_scala//scala
package. In
other words, I didn't create the scala_toolchain macros, so I didn't
think to focus on them in the Toolchainization post.
In our Bazel Slack conversation, I didn't think about recommending toolchain
setup macros until Mike posted examples from his toolchain repo. The
targets he shared illustrating the output of the existing legacy WORKSPACE
macro contained attributes specifying repo paths like this:
Based on my rules_scala experience, when I saw this, my mind leapt to using
macros to resolve Labels and instantiate targets. Though any of the
techniques we've covered so far might work, it may still be worth seeing
how target-generating macros might help.
STOP NOW IF YOU DON'T REALLY NEED THIS¶
Danger
Again, if dependency attributes or emitting the backend repo name for evaluation in the generated repo already work for your needs, please stop here. You win. Enjoy your victory. Only proceed if you indeed need new toolchain macros to define standard toolchains or want to provide users the option of defining their own. (And there may be a way to enable user defined custom toolchains using module extensions instead of macros, though I've yet to try it.)
Using a legacy macro¶
So here's the setup_repo_path_toolchain macro, which is considered a legacy
macro under Bazel 8. If you have to support building with Bazel 6 or 7,
this is the only kind of macro that will work.
Notice the two very different uses of Label
:
- It uses native.package_relative_label to interpret the repository name in the context of the BUILD file invoking the macro. In our case, this will be the BUILD file for the top level @toolchain_repo package, created by toolchain_repo_ext. This is the context required to resolve the calculated backend repo name.
- It uses Label to reference the //:toolchain_type target from the toolchain_repo_examples repository itself as a constant. This value should not change depending on the context of the macro's caller.
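A sketch of a macro with this shape follows. The file name, the `repo_name` parameter, the `_impl` target suffix, and the public visibility are assumptions for illustration.

```starlark
# setup_repo_path_toolchain.bzl (sketch; file name is hypothetical)
load(
    "//:repository_path_toolchain.bzl",
    "repository_path_toolchain",
)

def setup_repo_path_toolchain(name, repo_name):
    # Resolved relative to the BUILD file calling this macro, so a string
    # like "@backend_repo" is looked up from within @toolchain_repo.
    backend = native.package_relative_label(repo_name)

    repository_path_toolchain(
        name = name + "_impl",
        repo_path = backend.workspace_root,
        visibility = ["//visibility:public"],
    )

    native.toolchain(
        name = name,
        toolchain = ":" + name + "_impl",
        # Resolved relative to this .bzl file's repo, regardless of
        # which BUILD file calls the macro.
        toolchain_type = Label("//:toolchain_type"),
    )
```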
To Label, or not to Label, that is the question.
For more advice on when to wrap target strings in some kind of Label
,
see When to wrap target strings in a Label.
Now we have a much simplified BUILD.toolchain_repo
that merely invokes
setup_repo_path_toolchain
:
[Listing: BUILD.toolchain_repo: Invoking setup_repo_path_toolchain]
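The template could then shrink to something like this sketch. `{MODULE_REPO_NAME}` is a hypothetical placeholder for the canonical name of the repo that holds the macro, and the macro's file name and the `toolchain` target name are assumptions as well.

```starlark
# BUILD.toolchain_repo (template sketch). The toolchain_repo rule replaces
# {MODULE_REPO_NAME} and {REPO_NAME} before writing the BUILD file.
load(
    "@@{MODULE_REPO_NAME}//:setup_repo_path_toolchain.bzl",
    "setup_repo_path_toolchain",
)

setup_repo_path_toolchain(
    name = "toolchain",
    repo_name = "{REPO_NAME}",
)
```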
And finally, we update toolchain_repo
to emit the {REPO_NAME}
parameter
instead of the config.bzl
file.
Another rules_scala example...
//scala:toolchains_repo.bzl defines the scala_toolchains_repo
repository rule. All of its BUILD
file templates are stored as string
constants. It may provide ideas and inspiration for generating BUILD
files
that invoke toolchain macros more complex than the macro from this example.
Using symbolic macros with Bazel 8¶
If you're using Bazel 8, you can convert setup_repo_path_toolchain
to be a
symbolic macro instead. Much like using a dependency attribute,
using a Label
attribute for repo_name
would then eliminate the need for
native.package_relative_label
:
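A sketch of the symbolic macro version, under the same naming assumptions as before (the file name, the `_impl` suffix, and the public visibility are mine). With `configurable = False`, the attribute arrives as a plain Label rather than a `select()`.

```starlark
# setup_repo_path_toolchain.bzl (sketch of a Bazel 8 symbolic macro)
load(
    "//:repository_path_toolchain.bzl",
    "repository_path_toolchain",
)

def _setup_repo_path_toolchain_impl(name, visibility, repo_name):
    # repo_name is already a Label, resolved in the caller's context.
    repository_path_toolchain(
        name = name + "_impl",
        repo_path = repo_name.workspace_root,
        visibility = ["//visibility:public"],
    )

    native.toolchain(
        name = name,
        toolchain = ":" + name + "_impl",
        toolchain_type = Label("//:toolchain_type"),
        visibility = visibility,
    )

setup_repo_path_toolchain = macro(
    implementation = _setup_repo_path_toolchain_impl,
    attrs = {
        "repo_name": attr.label(
            mandatory = True,
            configurable = False,
        ),
    },
)
```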
Advantages of defining multiple module extensions, as opposed to just one¶
You may notice that, for our example project, you could still use one macro and one module extension, if you really wanted to. Why go through the extra complication of extracting the separate macros and module extensions?
Here are a few possible answers to that question:
- Perhaps you don't need to do this for your use case! Try one of the earlier methods to inject references to the first repository into the second repository and see how well it works for you.
- Maintaining a strict separation between well-defined repository instantiation stages may improve maintainability. Each stage becomes easier to reason about, and easier to evolve separately, with a well-defined interface between them.
  This is the same argument for breaking long, complicated functions and classes into smaller ones, each with a more focused responsibility. In other words, separate extensions may prove more readable and maintainable, depending on the level of detail involved in instantiating their repositories.
  This argument becomes stronger when distinct sets of configuration data apply to different concerns within your Bazel module. Defining separate tag_classes can help with this, too, but tag_classes still require an extension implementation to compile them. If the extension implementation grows large and complicated, that may be a code smell indicating that splitting the extension would help with maintainability.
  For example, examine the //scala/extensions:config.bzl and //scala/extensions:deps.bzl extension files within rules_scala. Yes, they technically could've been part of a single extension, but I find the separate extensions and files much easier to reason about.
- Module extensions in general provide much more flexible, yet much more precise and reliable semantics. Their behavior is defined entirely by the extension implementation, not by the order in which they appear in the module graph. This gives maintainers much more freedom to define the exact behavior, while making it easier for users to reason about the behavior as well.
  This also means separate extensions can define completely separate semantics. One extension could take tag_class values only from the root module. When the root module doesn't use it, it could either break the build (e.g., the example repo) or supply default values (e.g., scala_deps). Another extension could compile information from across the module graph, and not necessarily require the root module to use it (e.g., scala_toolchains).
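To make the difference between those two kinds of semantics concrete, here is a minimal sketch of two extensions sharing one tag_class. Everything in it is hypothetical and exists only for illustration: the file name, the settings tag class, its enabled_toolchains attribute, the _config_repo rule, and the repo names. None of it is taken from the example repo or from rules_scala.

```starlark
# extensions.bzl (hypothetical file; all names below are assumptions)

def _config_repo_impl(repository_ctx):
    # Emit an empty BUILD file plus a .bzl file recording the chosen values.
    repository_ctx.file("BUILD", "")
    repository_ctx.file(
        "config.bzl",
        "ENABLED_TOOLCHAINS = %s\n" % repr(repository_ctx.attr.enabled_toolchains),
    )

_config_repo = repository_rule(
    implementation = _config_repo_impl,
    attrs = {"enabled_toolchains": attr.string_list()},
)

_settings = tag_class(
    attrs = {"enabled_toolchains": attr.string_list()},
)

def _root_only_ext_impl(module_ctx):
    # Honor tags from the root module only; ignore every other module.
    enabled = None
    for mod in module_ctx.modules:
        if mod.is_root and mod.tags.settings:
            enabled = mod.tags.settings[0].enabled_toolchains

    if enabled == None:
        # Alternative policy: supply defaults here instead of failing.
        fail("the root module must configure root_only_ext")

    _config_repo(name = "root_config", enabled_toolchains = enabled)

root_only_ext = module_extension(
    implementation = _root_only_ext_impl,
    tag_classes = {"settings": _settings},
)

def _graph_wide_ext_impl(module_ctx):
    # Merge tags from every module that uses this extension.
    enabled = {}
    for mod in module_ctx.modules:
        for tag in mod.tags.settings:
            for name in tag.enabled_toolchains:
                enabled[name] = True

    _config_repo(
        name = "graph_config",
        enabled_toolchains = sorted(enabled.keys()),
    )

graph_wide_ext = module_extension(
    implementation = _graph_wide_ext_impl,
    tag_classes = {"settings": _settings},
)
```

Both policies live entirely in the implementation functions, so changing one extension's behavior doesn't touch the other.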
The best answer may be that it could simplify configuration for users of your Bazel module, while better hiding your module's implementation details. This is because under Bzlmod, unlike building under the legacy WORKSPACE model, the root module isn't responsible for configuring every repository in the build.
The legacy WORKSPACE model requires the main repository to invoke setup macros for all repositories. In contrast, the MODULE.bazel file of each module can use its own module extensions directly. These extensions may be available to users, or they could reside within packages private to the module. Either way, users aren't necessarily required to use a particular module extension, unless the extension implementation enforces its usage.
For example, another project depending on our example project's Bazel module might need to use only the first module extension:
[Listing: MODULE.bazel file using the example repo as a dependency]
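As a rough sketch of what such a client MODULE.bazel might look like: the client module name, the version numbers, the //:extensions.bzl label, the toolchain_config_ext extension name, and the settings tag are all assumptions made for illustration, not names confirmed by the example repo.

```starlark
# MODULE.bazel of a hypothetical client project

module(name = "client_project", version = "0.0.0")

# The version number is a placeholder.
bazel_dep(name = "toolchain_repo_examples", version = "1.0.0")

# The extension's .bzl label and exported name are assumed.
toolchain_config = use_extension(
    "@toolchain_repo_examples//:extensions.bzl",
    "toolchain_config_ext",
)

# A hypothetical tag carrying the client's configuration choices.
toolchain_config.settings(enabled_toolchains = ["foo", "bar"])

# Deliberately no use_repo() call: the client never names the generated
# repositories directly.
```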
Note that this example doesn't invoke use_repo on either @toolchain_config or @toolchain_repo. The toolchain_repo_examples module could refuse to declare these repos as part of the public API. In fact, toolchain_repo_ext could be declared private to the toolchain_repo_examples module for extra clarity. The MODULE.bazel file of toolchain_repo_examples would be the sole consumer. Clients wouldn't ever have to worry about @toolchain_repo at all.
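On the providing side, a minimal sketch of how toolchain_repo_examples might consume its own extension could look like the following. The //private:extensions.bzl location, the version number, and the //:all target pattern are assumptions for illustration. Only the toolchain_repo_ext and @toolchain_repo names come from the discussion above.

```starlark
# MODULE.bazel of toolchain_repo_examples (sketch)

module(name = "toolchain_repo_examples", version = "1.0.0")

# Loading the extension from a package named "private" (an assumed layout)
# signals that it isn't intended for direct use by clients.
toolchain_repo_ext = use_extension(
    "//private:extensions.bzl",
    "toolchain_repo_ext",
)

# Only this module brings the generated repo into its own scope.
use_repo(toolchain_repo_ext, "toolchain_repo")

# Register the generated toolchains on behalf of every client.
register_toolchains("@toolchain_repo//:all")
```

With this arrangement, the module itself handles toolchain registration, so a client's MODULE.bazel needs no reference to @toolchain_repo.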
Back to the original problem¶
At this point, Mike Lundy and I have made lots of progress on the original problem, but it's not totally solved yet. It continues to be a great conversation, a gift that keeps on giving, much like Bzlmod itself. If we're able to arrive at a working solution, I'll update this post with an announcement just below.
At the moment, Mike appears to be making progress on applying the dependency attribute solution in the context of a single extension. We'll eventually discuss how partitioning @toolchain_repo into different packages would enable MODULE.bazel to register only a subset of toolchains by default. So it's too early to declare victory, but there's good reason to be optimistic at this point!
Lots of credit to Jay Conrod for emphasizing the simpler solution.
I owe a lot to my colleague Jay Conrod, who's done so many of the reviews for this Bzlmod blog series. In this case, he emphasized his strong preference for promoting the dependency attribute approach, the least complex option of the four. This significantly influenced my conversation with Mike Lundy from that point as well.
Conclusion¶
In the end, whether to choose one solution over another comes down to existing build configurations, constraints, and personal taste. Any one of them may seem more appealing for a specific use case for a variety of reasons. Better yet, you can try them all and see which you like best.
But seriously, try to stick with using dependency attributes or emitting the first repo name for evaluation in the second repo if you can. Resort to defining separate module extensions or a new target-generating macro only if you really need their additional benefits. The less accidental complexity in the world, the better!
As always, I'm open to questions, suggestions, corrections, and updates relating to this series of Bzlmodification posts. It's easiest to find me lurking in the #bzlmod channel of the Bazel Slack workspace. I'd love to hear how your own Bzlmod migration is going—especially if these blog posts have helped!