Bazel Community Day – San Francisco
On May 23, 2023, EngFlow teamed up with Snap to organize a Bazel Community Day for the Bay Area. Over 70 people attended the event, hosted at Snap's offices in beautiful San Francisco.
The meetup started and ended with informal discussion and snacks. The main event was a series of talks.
Migrating (parts of) an Embedded Linux Distribution to Bazel - Kyle Teske (Roku)
Roku is in the process of migrating their embedded operating system from a home-grown Make build system to Bazel. In this talk, Kyle spoke about the challenges they encountered. They were able to generate BUILD files for some libraries automatically, but Bazel has strict requirements on dependencies, and Roku found that they needed to fix cyclic dependencies and includes of private headers.
The migration is ongoing, but Roku expects to see significantly better reliability using Bazel, together with an improvement in build time.
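To illustrate the kind of cyclic dependency that has to be untangled before Bazel will accept a build graph, here is a minimal Python sketch of cycle detection over a dependency map. It is not Roku's tooling: the function name `find_cycle` and the target labels in the usage note are hypothetical, and the input is assumed to be a plain dictionary from each target to its direct dependencies.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of targets, or None if there is none.

    `deps` maps each target to its direct dependencies (hypothetical input format).
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, fully explored
    color = {node: WHITE for node in deps}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in deps.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the path loops back to it.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(deps):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

Calling `find_cycle({"//lib:a": ["//lib:b"], "//lib:b": ["//lib:a"]})` reports the loop between the two made-up targets, while an acyclic map returns `None`. A report like this only locates the cycle; breaking it, for example by splitting a library, is still manual work.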
Bazel UX: Flags & Console Output - Ulf Adams (EngFlow)
Ulf Adams, the co-founder and CTO of EngFlow and one of the original authors of Bazel, shared his thoughts on the Bazel command line user experience.
In the first half of the talk, Ulf recommended reorganizing Bazel's nearly 600 flags into six categories to aid understanding and to surface flags that are frequently used.
- Debug - flags that change how Bazel displays information about what is happening during a build.
- Input - flags that change how Bazel sees the work, affecting workspace layout, what is considered source, which targets are built, and which platforms they are built for.
- Rules - language-specific flags that change Bazel's behavior for built-in rules.
- Migration - flags that support the controlled rollout of semantic changes to Bazel and its rules, including changes to Starlark, the rule API, and rule semantics.
- Strategy - flags that change how Bazel executes actions, including caching, sandboxing, remote execution, and test sharding.
- Meta - low-level flags that affect how Bazel works internally, mostly dealing with in-memory caches and out-of-memory errors.
In the second half of the talk, Ulf spoke about Bazel's terminal output during a build and suggested ways it could more clearly surface errors and other relevant information to users. There are many different personas and use cases for terminal output, though, so Ulf encouraged the audience to think about which output matters to them and to contribute to the Bazel open source project if it interests them.
Migrating away from rules_docker to rules_oci - Alex Eagle (Aspect)
Alex Eagle, the founder and co-CEO of Aspect and the maintainer of rules_oci and many other rule sets, spoke about building container images with Bazel using rules_oci. At the beginning of the talk, Alex tagged the v1.0.0 release!
rules_docker has been maintained by Google, but it has been difficult for Google engineers and for the community to keep it up to date. It relies on checked-in binaries that aren't open source and on many language-specific rules.
The new rules_oci is smaller and simpler, relying on pkg_tar for building layers and on capable image manipulation tools like Crane.
Taming node_modules in RBE: Airbnb's Journey - Sharmila Jesupaul (Airbnb)
Sharmila Jesupaul, a software engineer at Airbnb, spoke about building and testing TypeScript code at scale using Bazel's remote execution. The team is in the process of optimizing their CI system to test a large volume of front-end code within a 15-minute time budget. Sharmila shared several techniques they used to make large actions run faster.
Packaging dependencies as tarballs and pruning unneeded files cut type checking actions from 35 minutes to 12 minutes.
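As a rough illustration of packaging with pruning, the Python sketch below writes a directory into a gzipped tarball while skipping files that match a prune list. The talk summary does not say which files Airbnb removed, so the suffixes and directory names here are assumptions, as are the names `keep` and `pack_pruned`.

```python
import tarfile
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical prune rules; the files that are safe to drop depend on the project.
PRUNE_SUFFIXES = (".md", ".map")
PRUNE_DIRS = {"docs", "__tests__"}


def keep(member_name: str) -> bool:
    """Decide whether a tar member (file or directory) should be packaged."""
    parts = PurePosixPath(member_name).parts
    if any(part in PRUNE_DIRS for part in parts):
        return False
    return not member_name.endswith(PRUNE_SUFFIXES)


def pack_pruned(src_dir: str, out_path: str) -> None:
    """Write src_dir to out_path as a gzipped tarball rooted at node_modules/."""

    def prune(info: tarfile.TarInfo) -> Optional[tarfile.TarInfo]:
        # Returning None excludes the member; an excluded directory is not descended into.
        return info if keep(info.name) else None

    with tarfile.open(out_path, "w:gz") as tar:
        tar.add(src_dir, arcname="node_modules", filter=prune)
```

A smaller tarball means fewer bytes to upload, download, and extract for every remote action that depends on it, which is presumably where the savings come from.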
Sharmila and her team found that actions still spent a significant amount of time extracting tarballs of node_modules directories. They compromised on hermeticity for the sake of speed by caching extracted directories on each remote execution worker outside the execution root. This allowed remote actions to skip extraction when an earlier action had extracted the same tarball, cutting unit testing from 30 minutes to 22 minutes. The approach still had some drawbacks: workers' caches are lost when they are scaled down, and changes to tarballs invalidate the cache.
The last technique Sharmila shared was packaging a read-only tarball cache in the Docker image used by remote workers. This allowed even newly started worker machines to hit the cache, bringing test times down to 16 minutes.
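The two caching ideas above can be sketched together in Python: extracted directories are keyed by the tarball's content digest, any read-only caches (such as one baked into the worker image) are checked first, and extraction into a writable worker-local cache happens only on a miss. This is an illustration under assumptions, not Airbnb's implementation. The function names, the SHA-256 keying, and the directory layout are all hypothetical.

```python
import hashlib
import os
import shutil
import tarfile
import tempfile
from pathlib import Path
from typing import Sequence


def sha256_of(path: str) -> str:
    """Content digest of a file, used as the cache key."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def lookup_or_extract(
    tarball: str, writable_cache: str, readonly_caches: Sequence[str] = ()
) -> Path:
    """Return a directory holding the tarball's contents, extracting only on a miss."""
    digest = sha256_of(tarball)
    # Read-only caches (e.g. shipped in the worker image) come first, then the local one.
    for root in (*readonly_caches, writable_cache):
        candidate = Path(root) / digest
        if candidate.is_dir():
            return candidate  # cache hit: skip extraction entirely

    Path(writable_cache).mkdir(parents=True, exist_ok=True)
    # Extract into a staging directory so readers never see a half-written entry.
    staging = Path(tempfile.mkdtemp(dir=writable_cache, prefix=digest + ".tmp"))
    with tarfile.open(tarball) as tar:
        tar.extractall(staging)  # tarballs are assumed to be trusted build outputs
    target = Path(writable_cache) / digest
    try:
        os.rename(staging, target)
    except OSError:
        # Another action published the same entry first; discard our copy.
        shutil.rmtree(staging, ignore_errors=True)
    return target
```

Keying on the digest is what makes a changed tarball miss the cache, which matches the drawback noted above. The writable cache disappears with the worker, while the read-only one is available as soon as a new worker starts.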