See notes on the release for how to get started.
NOTE: For releases prior to 0.10.1, please also see these notes.
The toolchain can automatically detect your OS and arch type, and use the right pre-built binary LLVM distribution. See the section on "Bring Your Own LLVM" below for more options.
See in-code documentation in rules.bzl for available
attributes to llvm_toolchain.
LLVM does not publish prebuilt distributions for every host architecture in each version. In particular, patch versions often come with few prebuilt packages. This means that a single version is probably not enough to cover all the hosts you want to support.
This can be solved by providing a target/version map with a default version.
The example below selects 15.0.6 as the default version for all targets not
specified explicitly. This is equivalent to providing llvm_version = "15.0.6",
as in the example at the top. However, here we provide two more entries that
map their respective target to a distinct version:
llvm_toolchain(
    name = "llvm_toolchain",
    llvm_versions = {
        "": "15.0.6",
        "darwin-aarch64": "15.0.7",
        "darwin-x86_64": "15.0.7",
    },
)

We currently offer limited customizability through attributes of the llvm_toolchain_* rules. You can send us a PR to add more configuration attributes.
The MODULE.bazel example below demonstrates how to use an LLVM release that
is not yet present in the bundled distribution table. The required SHA-256s
can be obtained with utils/extra_distributions.sh -v <version>
(see Distribution data and scripts below).
llvm = use_extension("@toolchains_llvm//toolchain/extensions:llvm.bzl", "llvm", dev_dependency = True)
llvm.toolchain(
    name = "llvm_toolchain",
    llvm_version = "20.1.4",
    extra_llvm_distributions = {
        "LLVM-20.1.4-Linux-ARM64.tar.xz": "4de80a332eecb06bf55097fd3280e1c69ed80f222e5bdd556221a6ceee02721a",
        "LLVM-20.1.4-Linux-X64.tar.xz": "113b54c397adb2039fa45e38dc8107b9ec5a0baead3a3bac8ccfbb65b2340caa",
        "LLVM-20.1.4-macOS-ARM64.tar.xz": "debb43b7b364c5cf864260d84ba1b201d49b6460fe84b76eaa65688dfadf19d2",
        "clang+llvm-20.1.4-x86_64-pc-windows-msvc.tar.xz": "2b12ac1a0689e29a38a7c98c409cbfa83f390aea30c60b7a06e4ed73f82d2457",
    },
)

The following WORKSPACE snippet shows how to add a specific version for a specific target before
the version was added to the bundled distribution data under
toolchain/distributions/.
llvm_toolchain(
    name = "llvm_toolchain",
    llvm_version = "19.1.6",
    sha256 = {"linux-x86_64": "d55dcbb309de7ade4e3073ec3ac3fac4d3ff236d54df3c4de04464fe68bec531"},
    strip_prefix = {
        "linux-x86_64": "LLVM-19.1.6-Linux-X64",
    },
    urls = {
        "linux-x86_64": [
            "https://github.com/llvm/llvm-project/releases/download/llvmorg-19.1.6/LLVM-19.1.6-Linux-X64.tar.xz",
        ],
    },
)

Most of the complexity of this project comes from making it generic enough for multiple use cases. For one-off experiments with new architectures, cross-compilations, new compiler features, etc., my advice would be to look at the toolchain configurations generated by this repo, and copy-paste/edit them to make your own in any package in your own workspace.
bazel query --output=build @llvm_toolchain//:all | grep -v -e '^#' -e '^ generator'

Besides defining your toolchain in your package BUILD file, and until this
issue is resolved, you would
also need a way for bazel to access the tools in LLVM distribution as relative
paths from your package without using .. up-references. For this, you can
create a symlink that uses up-references to point to the LLVM distribution
directory, and also create a wrapper script for clang such that the actual
clang invocation is not through the symlinked path. See the files in the
@llvm_toolchain//: package as a reference.
# See generated files for reference.
ls -lR "$(bazel info output_base)/external/llvm_toolchain"
# Create symlink to LLVM distribution.
cd _your_package_directory_
ln -s ../....../external/llvm_toolchain_llvm llvm
# Create CC wrapper script.
mkdir bin
cp "$(bazel info output_base)/external/llvm_toolchain/bin/cc_wrapper.sh" bin/cc_wrapper.sh
vim bin/cc_wrapper.sh # Review to ensure relative paths, etc. are good.

See bazel tutorial for how CC toolchains work in general.
Version attributes can be requirements of the form first, first:<condition>,
latest or latest:<condition>.
In case of latest, the latest distribution matching the optional condition
will be selected.
In case of first, the first distribution matching the optional condition
will be selected.
The condition consists of a comma separated list of semver version comparisons
supporting <, <=, >, >=, ==, !=. Examples:
- latest
- latest:>=20.1.0
- latest:>17.0.4,!=19.1.7,<=20.1.0
- first:>=15.0.6,<16
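
To make the selection rules above concrete, here is a minimal Python sketch of how such a requirement could be resolved. It is not the toolchain's own implementation (which lives in Starlark). The function names, the padding of short versions such as 16 to 16.0.0, and the assumption that the candidate list is sorted oldest first (so that "first" means the first entry in that order) are all illustrative guesses, and the list of available versions is made up.

```python
import operator
import re

# Comparison operators named in the text above.
_OPS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


def parse_version(text):
    """Turn "20.1" into (20, 1, 0); missing components are assumed to be 0."""
    parts = [int(p) for p in text.split(".")]
    return tuple(parts + [0] * (3 - len(parts)))


def matches(version, condition):
    """Return True if `version` satisfies every comma-separated comparison."""
    for clause in filter(None, condition.split(",")):
        m = re.fullmatch(r"(<=|>=|==|!=|<|>)(\d+(?:\.\d+){0,2})", clause.strip())
        if m is None:
            raise ValueError("bad comparison: %r" % clause)
        op, bound = m.groups()
        if not _OPS[op](parse_version(version), parse_version(bound)):
            return False
    return True


def resolve(requirement, available):
    """Resolve a requirement against `available` (assumed sorted oldest first)."""
    kind, _, condition = requirement.partition(":")
    if kind not in ("first", "latest"):
        return requirement  # a plain version such as "15.0.6"
    candidates = [v for v in available if matches(v, condition)]
    if not candidates:
        raise ValueError("no distribution matches %r" % requirement)
    if kind == "first":
        return candidates[0]
    return max(candidates, key=parse_version)


# Hypothetical list of known versions, not the real bundled table.
available = ["15.0.6", "15.0.7", "16.0.0", "17.0.4", "19.1.7", "20.1.0", "20.1.4"]
print(resolve("latest:>17.0.4,!=19.1.7,<=20.1.0", available))  # 20.1.0
print(resolve("first:>=15.0.6,<16", available))  # 15.0.6
```

Comparing versions as integer tuples avoids the string-comparison trap where "9.0.0" would sort after "10.0.0".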
It is also possible to read the version or requirement from an environment
variable, with a fallback version or requirement. In this case it is important to
also pass the bazel flag --repo_env=LLVM_VERSION=version_or_requirement. Both
must be used correctly, because otherwise the resulting builds are not
reproducible. The main purpose of encoding the version in an environment
variable is integration or batch testing on multiple platforms, where multiple
LLVM versions should be tested.
getenv(ENVIRONMENT_VARIABLE_NAME,fallback)
Example MODULE.bazel
llvm.toolchain(
    name = "llvm_toolchain",
    llvm_versions = {
        "": "getenv(LLVM_VERSION,latest:>=17.0.0,<20)",
        "darwin-x86_64": "15.0.7", # Verify this works as opposed to using one version.
    },
)

In this example, macOS x86_64 machines have their LLVM version hard-coded to
15.0.7. For all other targets the LLVM version is read from the environment
variable LLVM_VERSION, which must be passed on the Bazel command line as
explained above. If the variable is not present, the LLVM version defaults
to the requirement expression latest:>=17.0.0,<20.
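
The getenv(...) form can be pictured with the short Python sketch below. It only illustrates the documented behaviour (use the variable if present, otherwise the fallback) and is not the extension's real code. The function name expand_getenv is hypothetical, and treating an empty variable the same as an unset one is an assumption. Note that the fallback itself may contain commas, so only the first comma separates the variable name from the fallback.

```python
import os
import re

# getenv(NAME,fallback): everything after the first comma is the fallback.
_GETENV = re.compile(r"getenv\(([A-Za-z_][A-Za-z0-9_]*),(.*)\)")


def expand_getenv(value, environ=None):
    """Expand a getenv(NAME,fallback) expression; return other values unchanged."""
    environ = os.environ if environ is None else environ
    m = _GETENV.fullmatch(value)
    if m is None:
        return value  # a plain version or requirement
    name, fallback = m.groups()
    # Assumption: an empty variable is treated like an unset one.
    return environ.get(name) or fallback


expr = "getenv(LLVM_VERSION,latest:>=17.0.0,<20)"
print(expand_getenv(expr, {}))  # latest:>=17.0.0,<20
print(expand_getenv(expr, {"LLVM_VERSION": "18.1.8"}))  # 18.1.8
```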
If toolchains are registered (see Quickstart section above), you do not need to
do anything special for bazel to find the toolchain. You may want to check once
with the --toolchain_resolution_debug flag to see which toolchains were
selected by bazel for your target platform.
For specifying unregistered toolchains on the command line, please use the
--extra_toolchains flag. For example,
--extra_toolchains=@llvm_toolchain//:cc-toolchain-x86_64-linux.
The following mechanisms are available for using an LLVM toolchain:
- Host OS information is used to find the right pre-built binary distribution
  from llvm.org, given the llvm_version or llvm_versions attribute. The LLVM
  toolchain archive is downloaded and extracted as a separate repository with
  the suffix _llvm. The detection logic for llvm_version is not perfect, so
  you may have to use llvm_versions for some host OS type and versions. We
  expect the detection logic to grow through community contributions. We
  welcome PRs.
- You can use the urls attribute to specify your own URLs for each OS type,
  version and architecture. For example, you can specify a different URL for
  Arch Linux and a different one for Ubuntu. Just as with the option above,
  the archive is downloaded and extracted as a separate repository with the
  suffix _llvm.
- You can also specify your own bazel package paths or local absolute paths
  for each host os-arch pair through the toolchain_roots attribute (without
  bzlmod) or the toolchain_root module extension tags (with bzlmod). Note that
  the keys here are different and less granular than the keys in the urls
  attribute. When using a bazel package path, each of the values is typically
  a package in the user's workspace or configured through local_repository or
  http_archive; the BUILD file of the package should be similar to
  @toolchains_llvm//toolchain:BUILD.llvm_repo. If using only http_archive,
  maybe consider using the urls attribute instead to get more flexibility if
  you need.
- All the above options rely on host OS information, and are not suited for
  docker based sandboxed builds or remote execution builds. Such builds will
  need a single distribution version specified through the distribution
  attribute, or URLs specified through the urls attribute with an empty key,
  or a toolchain root specified through the toolchain_roots attribute with an
  empty key.
A sysroot can be specified through the sysroot attribute (without bzlmod) or
the sysroot module extension tag (with bzlmod). This can be either a path on
the user's system, or a bazel filegroup like label. One way to create a
sysroot is to use docker export to get a single archive of the entire
filesystem for the image you want. Another way is to use the build scripts
provided by the Chromium
project.
The toolchain supports cross-compilation if you bring your own sysroot. When cross-compiling, we link against the libstdc++ from the sysroot (single-platform build behavior is to link against libc++ bundled with LLVM). The following pairs have been tested to work for some hello-world binaries:
- {linux, x86_64} -> {linux, aarch64}
- {linux, aarch64} -> {linux, x86_64}
- {darwin, x86_64} -> {linux, x86_64}
- {darwin, x86_64} -> {linux, aarch64}
A recommended approach would be to define two toolchains, one without sysroot
for single-platform builds, and one with sysroot for cross-compilation builds.
Then, when cross-compiling, explicitly specify the toolchain with the sysroot
and the target platform. For example, see the MODULE.bazel
file for llvm_toolchain_with_sysroot and the test
script for cross-compilation.
bazel build \
    --platforms=@toolchains_llvm//platforms:linux-x86_64 \
    --extra_toolchains=@llvm_toolchain_with_sysroot//:cc-toolchain-x86_64-linux \
    //...

By default a single LLVM distribution (the "toolchain root") provides both the
clang/lld binaries that run (the exec configuration) and the libraries that
get linked into the produced binaries (the target configuration). When the
target needs a different distribution than the exec tools (for example a
target-arch build of libc++ or compiler-rt), specify it separately through the
target_toolchain_roots attribute (without bzlmod) or the
target_toolchain_root module extension tag (with bzlmod). It is the
per-target counterpart of toolchain_roots / toolchain_root, and falls back
to the exec toolchain root when unset.
When linking against libstdc++ from a sysroot, three attributes (each keyed by target OS/arch pair) tune how it is found and linked:
- stdlib: in addition to the values described under Sysroots, dynamic-stdc++
  (optionally dynamic-stdc++-<ver>) behaves like stdc++ but links
  libstdc++.so instead of the default static libstdc++.a.
- multiarch: overrides the multiarch tuple used to construct the sysroot
  include and library paths. Useful when the sysroot uses a non-standard
  tuple, e.g. Yocto's aarch64-oe4t-linux.
- cxx_include_layout: selects how libstdc++ headers and the gcc runtime libs
  are laid out in the sysroot. debian (the default) expects
  /usr/include/<multiarch>/c++/<ver> and /usr/lib/gcc/<multiarch>/<ver>;
  yocto expects /usr/include/c++/<ver>/<multiarch> and
  /usr/lib/<multiarch>/<ver>.
All three are optional: omit them to get static libstdc++ linking, the builtin
multiarch tuple, and the debian layout.
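
The two layouts differ only in where the multiarch tuple and the version sit in the path. The Python sketch below spells that out using the path patterns quoted above. It is an illustration, not the toolchain's code: the function name libstdcxx_paths, the /sysroot prefix and the version 12 are made-up examples, and the real toolchain may add further include and library directories.

```python
import posixpath

# (header directory, gcc runtime library directory), relative to the sysroot.
_LAYOUTS = {
    "debian": ("usr/include/{multiarch}/c++/{ver}", "usr/lib/gcc/{multiarch}/{ver}"),
    "yocto": ("usr/include/c++/{ver}/{multiarch}", "usr/lib/{multiarch}/{ver}"),
}


def libstdcxx_paths(sysroot, multiarch, ver, layout="debian"):
    """Return (include_dir, gcc_lib_dir) for the given cxx_include_layout."""
    include_tmpl, lib_tmpl = _LAYOUTS[layout]
    fields = {"multiarch": multiarch, "ver": ver}
    return (
        posixpath.join(sysroot, include_tmpl.format(**fields)),
        posixpath.join(sysroot, lib_tmpl.format(**fields)),
    )


print(libstdcxx_paths("/sysroot", "aarch64-linux-gnu", "12"))
# ('/sysroot/usr/include/aarch64-linux-gnu/c++/12', '/sysroot/usr/lib/gcc/aarch64-linux-gnu/12')
print(libstdcxx_paths("/sysroot", "aarch64-oe4t-linux", "12", layout="yocto"))
# ('/sysroot/usr/include/c++/12/aarch64-oe4t-linux', '/sysroot/usr/lib/aarch64-oe4t-linux/12')
```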
The toolchain supports multi-platform builds through the combination of the
exec_os, exec_arch attribute pair, and either the distribution attribute,
or the urls attribute. This allows one to run their builds on one platform
(e.g. macOS) and their build actions to run on another (e.g. Linux), enabling
remote build execution (RBE). For example, see the MODULE.bazel
file for llvm_toolchain_linux_exec and the test
script for running the build actions on
Linux even if the build is being run from macOS.
bazel build \
    --platforms=@toolchains_llvm//platforms:linux-x86_64 \
    --extra_execution_platforms=@toolchains_llvm//platforms:linux-x86_64 \
    --extra_toolchains=@llvm_toolchain_linux_exec//:cc-toolchain-x86_64-linux \
    //...

The following is a rough (untested) list of steps:
- To help us detect if you are cross-compiling or not, note the arch string as
  given by python3 -c 'import platform; print(platform.machine())'.
- Edit SUPPORTED_TARGETS in toolchain/internal/common.bzl with the os and the
  arch string from above.
- Add target_system_name, etc. in toolchain/cc_toolchain_config.bzl.
- For cross-compiling, add a platform bazel type for your target platform in
  platforms/BUILD.bazel, and add an appropriate sysroot entry to your
  llvm_toolchain repository definition.
- If not cross-compiling, bring your own LLVM (see section above) through the
  toolchain_roots or urls attribute.
- Test your build.
Sandboxing the toolchain introduces a significant overhead (100ms per action,
as of mid 2018). To overcome this, one can use
--experimental_sandbox_base=/dev/shm. However, not all environments might
have enough shared memory available to load all the files in memory. If this is
a concern, you may set the attribute for using absolute paths, which will
substitute templated paths to the toolchain as absolute paths. When running
bazel actions, these paths will be available from inside the sandbox as part of
the / read-only mount. Note that this will make your builds non-hermetic.
The toolchain is tested to work with rules_go, rules_rust, and
rules_foreign_cc.
The LLVM distribution also provides several tools like clang-format. You can
depend on these tools directly in the bin directory of the distribution. When
not using the toolchain_roots attribute, the distribution is available in the
repo with the suffix _llvm appended to the name you used for the
llvm_toolchain rule. For example, @llvm_toolchain_llvm//:bin/clang-format
is a valid and visible target in the quickstart example above.
When using the toolchain_roots attribute, there is currently no single target
that you can reference, and you may have to alias the tools you want with a
select clause in your workspace.
As a convenience, some targets are aliased appropriately in the configuration
repo (as opposed to the LLVM distribution repo) for you to use and will work
even when using toolchain_roots. The complete list is in the file
aliases.bzl. If your repo is named llvm_toolchain,
then they can be referenced as:
- @llvm_toolchain//:omp
- @llvm_toolchain//:clang-format
- @llvm_toolchain//:llvm-cov
The toolchain supports Bazel's layering_check feature, which relies on
Clang modules to implement strict
deps (also known as "depend on what you use") for cc_* rules. Turn it on by
enabling the layering_check feature on a per-target, per-package or global
basis.
The list of LLVM releases this toolchain knows about lives as JSONC data
under toolchain/distributions/. A repository
rule merges every JSONC file into a single lookup table at module-load time:
| file | role |
|---|---|
| pre_github.jsonc | LLVM 6.x–9.x hosted on releases.llvm.org. Hand-maintained, frozen. |
| github_legacy.jsonc | LLVM 10.x–18.x with pre-19.x irregular naming. Hand-maintained, frozen. |
| github.jsonc | LLVM 19.x and newer. Regenerated end-to-end by utils/update_distributions.sh. |
| extra.jsonc | Empty by default. Downstream slot for additional bundled releases; loaded last, so any key here overrides the same key in the other files. |
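
The override rule for extra.jsonc amounts to a last-writer-wins merge. The Python sketch below shows that idea only; it is not the repository rule itself. The function name merge_distributions and the placeholder checksums are hypothetical, and only the position of extra.jsonc (last) is taken from the table.

```python
def merge_distributions(tables):
    """Merge per-file tables in load order; a later file wins on key clashes."""
    merged = {}
    for table in tables:
        merged.update(table)
    return merged


# Placeholder entries for illustration, not real checksums.
bundled = {
    "LLVM-20.1.4-Linux-X64.tar.xz": "<sha256 from github.jsonc>",
    "LLVM-20.1.4-Linux-ARM64.tar.xz": "<sha256 from github.jsonc>",
}
extra = {"LLVM-20.1.4-Linux-X64.tar.xz": "<sha256 from extra.jsonc>"}

table = merge_distributions([bundled, extra])
print(table["LLVM-20.1.4-Linux-X64.tar.xz"])  # <sha256 from extra.jsonc>
print(table["LLVM-20.1.4-Linux-ARM64.tar.xz"])  # <sha256 from github.jsonc>
```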
Each file has the shape:

{
  "_meta": {
    "description": "...",
    "base_url": {
      "": "https://example.com/llvm-{version}/",
      "<version>": "https://override/{version}/", // optional per-version
    },
  },
  "<tarball-basename>": "<sha256>",
  "<full-url-or-path>": "<sha256>",
}

base_url is optional. Its "" key sets a per-file default URL template;
the optional "<version>" keys override individual releases. Templates may
contain {version} (substituted at materialization time) and should end
with /, because the basename is appended directly. Files that omit base_url
fall back to the standard GitHub release URL
(https://github.com/llvm/llvm-project/releases/download/llvmorg-{version}/).
Entry keys are either a tarball basename (URL derived via base_url)
or a full URL/path (used verbatim, bypassing base_url). Comments are
stripped before parsing, and trailing commas are tolerated.

- utils/update_distributions.sh: refreshes
  toolchain/distributions/github.jsonc by paging through the GitHub releases
  API for llvm/llvm-project and rewriting the file in place. Use this when
  contributing a new LLVM release to the bundled list. The script also
  regenerates the test golden file so the diff stays self-contained. No
  tarballs are downloaded; checksums come from GitHub's release-asset .digest
  field, with existing values preserved for older assets that predate that
  field. Set GITHUB_TOKEN to avoid the unauthenticated API rate limit. See -h
  for details.
- utils/extra_distributions.sh: prints checksums for a single LLVM release,
  formatted to paste straight into the extra_llvm_distributions attribute
  shown earlier. Use this only when the version you want is not yet bundled in
  github.jsonc; if it is, just bump llvm_version and let the toolchain pick up
  the existing entries. Falls back to downloading tarballs and computing
  SHA-256 locally for older assets that don't have a .digest. See -h for
  details.

Both scripts run on Linux, macOS, and on Windows under Git Bash / MSYS2 /
WSL. They require bash, curl, jq, and awk;
utils/extra_distributions.sh additionally needs sha256sum (Linux, Git
Bash) or shasum (macOS).

Other examples of toolchain configuration: