kernel.org containers
Upstream containers with compiler toolchains from kernel.org

Many moons ago, some discussions were being held around having common container images with toolchains maintained by the upstream Linux kernel community. It didn’t quite happen back then; TuxMake was launched by Linaro soon after with images forked from the KernelCI ones, and the two have been diverging ever since. Where do things stand now?

A renewed interest

There does seem to be a trend forming in 2024. The debate around using GitLab CI for testing the upstream kernel has been revived, and TuxMake has now started supporting containers with the kernel.org toolchains. It looked like all the stars were finally lining up, so I picked this topic up again and took the time to articulate it with a proposal: first as an email discussion, then as a Linux Plumbers talk. Both were well received, with some constructive feedback and ideas. Concrete steps are now forming down that path; let’s take a look at a potential way forward.

Yellow Brick Road

Lots of things have already been done to improve development workflows and quality control for the Linux kernel. Some of them are directly available upstream and have become second nature, such as Git, test frameworks (kselftest, KUnit) and HTML documentation. As the software world keeps turning, new tools become de facto standards among developers, and the upstream kernel community is no exception - within reason, of course. One of the most obvious ones is containers: Docker was voted the most used tool in the 2023 Stack Overflow survey, and again in 2024.

How can the Linux kernel benefit from containers? Most people are using them to run builds and tests in reproducible, controlled, containerized environments. If an automated test system finds an issue, having a consistent way for developers to reproduce it under the exact same conditions is paramount when looking for a fix.

This also applies to cutting-edge kernel development, for example with features in linux-next moving hand-in-hand with particular compiler changes - typically Clang and more recently Rust, but also RISC-V support and new architectures, among other topics. It’s also a neat way of working with multiple compiler toolchain versions.

The broader concept here is to create a continuum between individual developers and automated services. For toolchain containers, this can be achieved via a number of small steps.

📦 Packages

A rather straightforward improvement would be to create standard packages with the kernel.org toolchains, e.g. .deb, .ipk or .rpm, in addition to the existing tarballs. The main benefit would be to facilitate adding or removing them while taking care of dependencies automatically. The LLVM toolchains page provides an informal list of things to be installed manually, which could typically be formalised with regular packaging metadata.
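
As a purely illustrative sketch of what that metadata could look like, here is a hypothetical control stanza for a Debian package of one of the LLVM toolchains. The package name and the dependency list are made up for the example and would need to be derived from the actual requirements:

$ cat DEBIAN/control
Package: korg-llvm-18
Version: 18.1.8-1
Architecture: amd64
Depends: libc6, libstdc++6, zlib1g, libxml2
Description: kernel.org LLVM/Clang 18.1.8 toolchain
 Prebuilt LLVM/Clang toolchain from kernel.org for building the
 Linux kernel with LLVM=1.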

It would of course take a bit of maintenance work to keep them alive, as well as some additional storage space if several package formats were provided for all the toolchains. Maybe only the most popular ones could be maintained initially, with coverage expanding as needed. Actually producing the packages can easily be achieved using tools such as the Open Build Service.

This step is somewhat optional as all the subsequent ones can be done regardless; having packages would just make things a bit easier.

🚒 Containerfiles

Container images are created using Containerfiles, also known as Dockerfiles. Since the OCI (Open Container Initiative) has defined an open and authoritative image specification, there is no vendor lock-in when using containers. We’ll just refer to Containerfiles in a generic way going forward.
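
To give an idea of what such a Containerfile can look like, here is a minimal sketch that installs one of the kernel.org GCC cross-toolchains on top of a Debian base image. It is only an illustration: the tarball URL, paths and package list are assumptions rather than the contents of the repository described below.

$ cat Containerfile
FROM debian:bookworm-slim

# Host packages needed to run the kernel build itself
RUN apt-get update && \
    apt-get install --no-install-recommends -y \
        bc bison flex make gcc libc6-dev libssl-dev libelf-dev \
        xz-utils ca-certificates curl && \
    rm -rf /var/lib/apt/lists/*

# Unpack a kernel.org cross-toolchain; example URL, check the crosstool
# index on mirrors.edge.kernel.org for the actual file names
RUN curl -sSL https://mirrors.edge.kernel.org/pub/tools/crosstool/files/bin/x86_64/14.2.0/x86_64-gcc-14.2.0-nolibc-aarch64-linux.tar.xz \
    | tar -xJ -C /opt
# The cross-compiler is then available as aarch64-linux-gcc
# (i.e. CROSS_COMPILE=aarch64-linux-)
ENV PATH="/opt/gcc-14.2.0-nolibc/aarch64-linux/bin:${PATH}"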

Proof of Concept

As an experiment to see how viable this would be, I’ve started a kernel.org containers repository which can build an initial set of images. They include x86 host toolchains with GCC 14 for arm64 and x86_64 targets as well as Clang 18 from the kernel.org toolchains. The images are automatically built using GitLab CI and stored in the project’s container registry. For example, to use the default gcc image from within a kernel source tree:

$ docker run -it -v $PWD:/home/kbuild/linux -w /home/kbuild/linux \
    registry.gitlab.com/gtucker/korg/gcc \
    make defconfig bzImage -j$(nproc)
[... pulling image the first time ...]
[... HOSTCC ...]
*** Default configuration is based on 'x86_64_defconfig'
#
# configuration written to .config
#
[... build goes on ...]
  BUILD   arch/x86/boot/bzImage
Kernel: arch/x86/boot/bzImage is ready  (#1)

A full kernel can be built, although the dependencies for any extras aren’t included yet (kselftest, perf, documentation, etc.).
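
Similarly, the Clang image can be used together with the kernel’s LLVM=1 option. The registry path below is an assumption based on the project layout, so check the repository for the exact image names:

$ docker run -it -v $PWD:/home/kbuild/linux -w /home/kbuild/linux \
    registry.gitlab.com/gtucker/korg/clang \
    make LLVM=1 defconfig bzImage -j$(nproc)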

To rebuild the images from scratch locally and have your own tags:

$ git clone https://gitlab.com/gtucker/korg.git
$ cd korg
$ make PREFIX=korg-
  BUILD   korg-toolchain-base
  BUILD   korg-clang:18.1.8-x86
  TAG     korg-clang:18
  TAG     korg-clang
  BUILD   korg-gcc:14.2-x86-x86
  TAG     korg-gcc:14
  TAG     korg-gcc
  BUILD   korg-gcc:14.2-x86-arm64
  TAG     korg-gcc:14-arm64
  TAG     korg-gcc:arm64

Then the previous example becomes:

$ docker run -it -v $PWD:/home/kbuild/linux -w /home/kbuild/linux korg-gcc make defconfig
*** Default configuration is based on 'x86_64_defconfig'
#
# No change to .config
#

Each image is a combination of host architecture, target architecture and compiler version. Only a subset of the kernel.org toolchains is covered for now as a proof of concept. It’s also possible to use Podman instead of Docker; see the project’s README for more details.
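
Since Podman’s command line is largely compatible with Docker’s, the earlier example should translate more or less directly, as in this sketch (the README remains the reference for any image-specific details):

$ podman run -it -v $PWD:/home/kbuild/linux -w /home/kbuild/linux \
    registry.gitlab.com/gtucker/korg/gcc \
    make defconfig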

Upstream

The aim is to make upstream-friendly container definitions, prepare an RFC patch series and continue the discussion with the community on the mailing lists. Typically, this could be added under tools/containers (see the sketch after the list below). Particular care has been taken in the following respects:

  • the Containerfile is vendor agnostic
  • image builds are managed with Make
  • external dependencies are kept to standard Debian (other distros may be used)
  • only kernel.org toolchains are being installed
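
As a sketch of how this might be used once merged, assuming the same Make-based workflow is kept under tools/containers (a directory which does not exist upstream today):

$ cd tools/containers
$ make PREFIX=korg-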

⚙️ Git Hooks

There are ongoing discussions around adding a .gitlab-ci.yml file upstream, and the proof-of-concept repository uses GitLab CI to automatically build and push Docker images. However, this is unlikely to meet broad adoption as things currently stand. A more community-oriented approach would be to use Git hooks: they’re vendor neutral since they’re purely based on Git, they’re optional, and each maintainer can enable them as they see fit in their own trees.

I went ahead and made an experimental post-receive hook which blocks the client on a git push while images are being built. This isn’t quite good enough for the real world; it should instead start a task in the background, maybe with an email reply when it’s done, but it’s the most basic way of wiring things up and ItWorks™:

$ git push
Enumerating objects: 67, done.
Counting objects: 100% (67/67), done.
Delta compression using up to 8 threads
Compressing objects: 100% (59/59), done.
Writing objects: 100% (67/67), 12.69 KiB | 764.00 KiB/s, done.
Total 67 (delta 26), reused 0 (delta 0), pack-reused 0
remote: Building Docker images [6852976572e37c9b7e340f65155d9190c32c9f31]
remote: 6852976572e3 Makefile: rename base to korg-toolchain-base
remote:   BUILD   korg-toolchain-base
remote:   BUILD   gtucker/clang:18.1.8-x86
[...more BUILD and TAG...]
remote: Pushing Docker images
remote:   PUSH    gtucker/clang:18
[...more PUSH...]
remote:   PUSH    gtucker/korg-gcc:latest

The logic of that hook could easily be wrapped into another runtime environment, for example a Docker-in-Docker container, which is what the GitLab CI pipeline does. Even simpler, it could just be started as a detached process, although this of course has some security implications. It could also start a pipeline somewhere, for example in Tekton. Each kernel maintainer or developer could pick and choose what works for them.
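
To illustrate the general shape of such a hook, here is a simplified sketch rather than the actual experimental one: it checks out the pushed revision into a temporary work tree and runs the image builds as a detached background job, with all the caveats mentioned above:

$ cat hooks/post-receive
#!/bin/sh
# Simplified sketch: build the container images for the newly pushed
# revision in the background so the client doing "git push" isn't blocked.
read oldrev newrev refname
worktree=$(mktemp -d)
git --work-tree="$worktree" checkout -f "$newrev"
(
    cd "$worktree" &&
    make PREFIX=korg-
    # pushing the images to a registry would go here
) < /dev/null > /tmp/korg-build.log 2>&1 &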

🗄️ Container registry

Building upon the concept of running Git hooks to automatically update container images, the next logical step would be to set them up on git.kernel.org, either in a dedicated repository or, if merged upstream, in kernel trees such as mainline, linux-next and some maintainer trees. There would then need to be an appropriate registry in which to store these images. Still keeping in mind the same concerns of avoiding vendor lock-in and using standard tools, hosting a dedicated registry.kernel.org would seem to make more sense than relying on Docker Hub or GitLab. A popular solution for this would be Harbor. The container registry protocol is also specified by the OCI, so other alternatives may be adopted seamlessly.
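
Assuming such a registry existed, publishing locally built images would then just be a matter of using the standard tools, for example (registry.kernel.org and the image path are purely hypothetical at this point):

$ docker tag korg-gcc:14.2-x86-arm64 registry.kernel.org/korg/gcc:14-arm64
$ docker push registry.kernel.org/korg/gcc:14-arm64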

🪄 Makefile

This last step was motivated by an important point which Nathan raised as part of the initial email thread.

On Mon, Jul 08, 2024 at 22:30:31 -0700, Nathan Chancellor wrote:

Having first party Dockerfiles could be useful but how would they be used? Perhaps building a kernel in such a container could be plumbed into Kbuild, such that the container manager could be invoked to build the image if it does not exist then build the kernel in that image? This might be a lofty idea but it would remove a lot of the friction of using containers to build the kernel so that more people would adopt it?

I haven’t looked into actually building the container image directly as a side effect of running make. However, I did some experiments to wrap the build in a container by providing the image name, as per this patch:

0001-RFC-Makefile-wrap-in-container-if-CONTAINER-is-defin.patch

which is also available on GitLab.
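
The gist of the approach is roughly as follows (a simplified illustration rather than the actual patch): when CONTAINER is set on the command line, the top-level Makefile runs the requested goals inside docker run with the source tree mounted at /src, and otherwise it behaves exactly as before.

ifneq ($(CONTAINER),)

# Simplified sketch: forward all the requested goals to a make
# invocation running inside the container. A complete version would
# also need to forward variables such as O= and handle signals.
$(or $(MAKECMDGOALS),__all): __container-run
	@:
__container-run:
	docker run -it -v $(CURDIR):/src -w /src $(CONTAINER) \
		make $(MAKECMDGOALS)
.PHONY: __container-run

else
# ... regular top-level Makefile ...
endif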

For example, from within a kernel tree:

$ git fetch https://gitlab.com/gtucker/linux.git linux-6.11-make-container --depth=1
$ git checkout FETCH_HEAD
$ mkdir -p build/x86
$ make CONTAINER=registry.gitlab.com/gtucker/korg/gcc:14 O=build/x86 defconfig
Running in docker registry.gitlab.com/gtucker/korg/gcc:14
Unable to find image 'registry.gitlab.com/gtucker/korg/gcc:14' locally
14: Pulling from gtucker/korg/gcc
[...]
Digest: sha256:928a79af1de7550ffa1e0955177c6ef11b6a2b49334d4bced3f41c60b37a31e2
Status: Downloaded newer image for registry.gitlab.com/gtucker/korg/gcc:14
make[1]: Entering directory '/src/build/x86'
  GEN     Makefile
  HOSTCC  scripts/basic/fixdep
[...]
*** Default configuration is based on 'x86_64_defconfig'
#
# configuration written to .config
#
make[1]: Leaving directory '/src/build/x86'

After that, it’s kernel building as usual:

$ make CONTAINER=registry.gitlab.com/gtucker/korg/gcc:14 O=build/x86 bzImage -j$(nproc)
Running in docker registry.gitlab.com/gtucker/korg/gcc:14
make: warning: -j2 forced in submake: resetting jobserver mode.
make[1]: Entering directory '/src/build/x86'
  GEN     Makefile
[...]

If you’ve built the container images locally or have your own tags, you may even reduce the example to its most minimal form:

$ make CONTAINER=korg-gcc defconfig
$ make CONTAINER=korg-gcc

There are of course already a few known issues with this:

  • docker -it produces output with CRLF (DOS) line endings

Dropping the interactive mode should be possible; this was a quick way of propagating Ctrl-C to the container.

  • The container uid (1000) may not match the current user

A user id mapping or namespace would be needed to address this; see the sketch after this list.

  • The source tree is mounted directly

Preventing containers from modifying the source tree would be better, typically with overlays.

  • The if / else in the top-level Makefile is rather invasive

Having a separate Makefile would mean zero changes when not using containers.

  • Other glitches

There are a few more rough edges: using the -j option causes a submake warning, only one target can be specified per command line (e.g. no bzImage modules) and interrupting the build with Ctrl-C doesn’t quite work reliably (related to the docker -it issue).
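
For reference, the uid issue could probably be addressed with plain docker run options rather than changes to the images; something along these lines (a rough sketch, not something the current images are tuned for):

$ docker run -it --user "$(id -u):$(id -g)" \
    -v $PWD:/home/kbuild/linux -w /home/kbuild/linux \
    registry.gitlab.com/gtucker/korg/gcc \
    make defconfig

Protecting the source tree could similarly be approximated by mounting it read-only (-v $PWD:/home/kbuild/linux:ro) combined with an out-of-tree O= build, although an overlay would be the cleaner solution.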

Still, it’s an interesting step towards facilitating the adoption of upstream-supported build containers. The next question is whether this could fly and whether it’s worth pursuing further efforts accordingly. Having the actual container images available would seem like a prerequisite, so I’ll work on that first.

Going Further

While mentioning these ideas to people at Plumbers, I thought I might meet some resistance as this could seem a bit too disruptive. In fact, the opposite happened: all I got was encouragement and more ideas to go even beyond builds. For example, adding container images with QEMU to quickly test the kernel images (how about make boot CONTAINER=korg-qemu:x86 then?), and others with test dependencies pre-installed such as kselftest, in particular for the eBPF special case.
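
As a rough sketch of what the QEMU idea could look like, with a hypothetical korg-qemu:x86 image that simply ships qemu-system-x86_64 (nothing like this exists yet):

$ docker run -it -v $PWD:/home/kbuild/linux -w /home/kbuild/linux korg-qemu:x86 \
    qemu-system-x86_64 -nographic -append "console=ttyS0" \
    -kernel arch/x86/boot/bzImage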

We’ll see where this takes us; there are probably going to be many more aspects to take into consideration, but the fun is in the journey. Looking forward to some follow-up email discussions, and feel free to reach out directly at gtucker@gtucker.io as always.


Last modified on 2024-09-30