gtucker.io
Building and Running a Linux Kernel
Overview of what it takes to build a Linux kernel from source and boot it using QEMU and a minimal user-space

Have you ever wondered what it takes for a Linux kernel source tree to turn into something that can run a computer? How can a bunch of C files produce something able to initialise all the hardware resources and make them available to applications? Even without any desire to become a skilled developer, this first adventure can be a highly educational and perhaps slightly magical one to go through. There are basically three steps: get the code, build it and run it. Let’s take a look.

Note: This assumes some familiarity with Linux in general and compiling C programs. The sample commands have been tested with Debian Bookworm and Fedora 40.

1. Show me the code

Tux jigsaw piece 1

First of all, we must start by getting a local copy of the source code. We’ll be using Git for this, and here’s how to install it on Debian, Ubuntu or any other Debian-like system:

sudo apt update && sudo apt install -y git

Similarly, on Fedora and friends:

sudo dnf install git

In this blog post we’ll be using the latest mainline release as of today, which is identified by the v6.7 tag. We’ll be creating a “shallow clone” using the --depth=1 option to avoid downloading years of past kernel Git history. Bear in mind that this is still a download of about 250 MiB, so it can take a little while, but it’s much more lightweight than a full git clone. Here’s one way to do it:

Note: The ~/src/linux directory choice is entirely arbitrary for the sample commands provided here.

mkdir -p $HOME/src/linux
cd $HOME/src/linux
git init .
git remote add origin https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
git fetch origin v6.7 --depth=1
git checkout FETCH_HEAD -b linux-6.7

This repository should now have a linux-6.7 branch with a local copy of all the source code. To double check that it went as expected, if you run git log --no-decorate there you should get the following output:

commit 0dd3ee31125508cd67f7e7172247f05b7fd1753a
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Sun Jan 7 12:18:38 2024 -0800

    Linux 6.7

Nice one! You’ve reached the first checkpoint.

The magic of open-source software.

2. Building

Tux jigsaw piece 2

Next, let’s get all the tools needed to compile this kernel. Compiling means turning the source code written by humans (the text files we just got) into a kernel image that can be run by a machine (binary file). We mostly need a C compiler and some utilities to orchestrate the compilation. In our case, we’ll use the GCC compiler. There are many ways to set up a build environment but we’ll just focus on the simplest option here and directly install the bare minimum required.

On Debian, Ubuntu and other Debian-like distributions:

sudo apt update && sudo apt install --no-install-recommends -y \
    bash bc bison flex gcc git libelf-dev libssl-dev make

On Fedora and friends:

sudo dnf install -y \
    bash bc bison diffutils flex gcc git make \
    elfutils-libelf-devel openssl-devel

Note: Many more packages may be required to build other parts of the kernel than the main image such as the documentation, kselftest, device trees, perf, Rust drivers etc. but that’s way beyond the scope of this post.

The following sample commands are run from the top of the source tree just like the git commands. Building a Linux kernel can take different shapes. The first aspect is the compiler, which we’ve chosen to be GCC so this won’t change. Then the CPU architecture needs to be set. By default it’s the one of the machine running the build, although it’s possible to build for another CPU architecture (a.k.a. cross-compiling). We won’t get into that and instead just assume we’re running on an Intel x86 compatible machine. I might make a follow-up blog post about doing an arm64 native build one day…

Then the next thing every kernel build needs is a configuration file. This will specify which parts of the kernel to build into the main image, or not build at all, or build as a separate module to load at runtime. Some options will even directly affect how parts of the source files are compiled. In our case, we’ll rely on the default config or defconfig that comes with the source tree. It doesn’t really require modules so we’ll just ignore them for now.

To make the config file:

make defconfig

It should do a few things to resolve its own dependencies and end with these messages:

*** Default configuration is based on 'x86_64_defconfig'
#
# configuration written to .config
#

This as well as all the kernel build commands shown here rely on the make tool which has rules defined in the Makefile. It’s a standard way for C programs in particular to define their compilation steps. So make will implicitly load the Makefile which contains a defconfig target which is used to produce the .config file with the actual kernel config.
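To get a feel for what ended up in the generated .config, here is a small sketch that counts the three kinds of choices mentioned earlier: built into the image (=y), built as a module (=m) and not built at all (# ... is not set). The config_summary name is made up for this post, and the function simply assumes the usual .config line format.

```shell
# Summarise a kernel config file: options built into the image (=y),
# built as loadable modules (=m) and explicitly disabled ("is not set").
# config_summary is a hypothetical helper name, not a kernel tool.
config_summary() {
    cfg="${1:-.config}"
    n_yes=$(grep -c '^CONFIG_[A-Za-z0-9_]*=y$' "$cfg")
    n_mod=$(grep -c '^CONFIG_[A-Za-z0-9_]*=m$' "$cfg")
    n_off=$(grep -c '^# CONFIG_[A-Za-z0-9_]* is not set$' "$cfg")
    printf 'built-in: %s\nmodules: %s\nnot set: %s\n' \
        "$n_yes" "$n_mod" "$n_off"
}
```

Running config_summary from the top of the source tree reads .config by default. Options with string or numeric values are deliberately left out of the counts.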

Next, we’ll build the main kernel image. We’ll tell make how many CPUs are available using nproc to compile up to the same number of source files in parallel with the -j option:

make -j$(nproc)

Note: You may also use NCPU=$(ls -1 /sys/class/cpuid | grep -c cpu); make -j${NCPU} if you don’t have nproc installed on your system.
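If you’d rather not think about which tool is installed, the job count can be wrapped in a small function. This is only a sketch: job_count is a name invented here, and it assumes that getconf from the C library understands _NPROCESSORS_ONLN, which is the case with glibc.

```shell
# Print the number of parallel make jobs to use: nproc if present,
# otherwise getconf, otherwise fall back to a single job.
# job_count is a hypothetical helper name.
job_count() {
    if command -v nproc >/dev/null 2>&1; then
        nproc
    elif command -v getconf >/dev/null 2>&1; then
        getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1
    else
        echo 1
    fi
}
```

The build command then becomes make -j"$(job_count)".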

The build may take anywhere from under a minute on a fast workstation to an hour on a low-power device. Regardless, it should eventually end with something like this:

  LD      arch/x86/boot/compressed/vmlinux
  ZOFFSET arch/x86/boot/zoffset.h
  OBJCOPY arch/x86/boot/vmlinux.bin
  AS      arch/x86/boot/header.o
  LD      arch/x86/boot/setup.elf
  OBJCOPY arch/x86/boot/setup.bin
  BUILD   arch/x86/boot/bzImage
Kernel: arch/x86/boot/bzImage is ready  (#1)

bzImage is the name of the kernel image file on x86 systems. Here’s a quick sanity check to confirm it is indeed in this path:

$ ls -lh ~/src/linux/arch/x86/boot/bzImage
-rw-r--r-- 1 gtucker gtucker 13M Feb 25 13:16 /home/gtucker/src/linux/arch/x86/boot/bzImage
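For a slightly stronger check than the file merely existing, we can look for the HdrS signature which the x86 boot protocol places at byte offset 0x202 of the image. The sketch below assumes that layout, as described in the x86 boot protocol documentation shipped with the kernel tree, and the has_x86_boot_header name is made up for this post.

```shell
# Return success if the file carries the "HdrS" signature that the x86 boot
# protocol places at byte offset 0x202 (514 in decimal).
# has_x86_boot_header is a hypothetical helper name.
has_x86_boot_header() {
    magic=$(dd if="$1" bs=1 skip=514 count=4 2>/dev/null)
    [ "$magic" = "HdrS" ]
}
```

For example: has_x86_boot_header arch/x86/boot/bzImage && echo "looks like a bootable x86 kernel image".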

Note: You may build the modules at this point with make modules but that’s not covered by this tutorial.

Good job! You’ve reached the second checkpoint.

It comes with your very own kernel image as a small digital reward.

3. Running

Tux jigsaw piece 3

A kernel is only there to provide a software abstraction layer on top of the hardware for applications to be able to use it. Well, this may be over-simplifying things quite a bit, and in this tutorial we’re using QEMU rather than directly running on real hardware, but basically a kernel is not very useful on its own without the user-space part of the operating system. So let’s fit that last piece of the puzzle in place to get the full picture.

There are of course many, many Linux-based operating system variants available out there and apart from maybe a few obscure cases they all include a user-space image on top of the kernel. Here we’ll use one supplied with this blog post, copied from kernelci.org and created using buildroot. It’s already packaged as a disk image and all good to go except it’s compressed to save space. You can build your own too or use another framework or distribution, the steps to run it should essentially remain the same.

Just like we installed git earlier on, let’s install curl and xz and run this command to download and decompress our user-space:

curl https://gtucker.io/bin/buildroot-x86.ext2.xz | xzcat > buildroot-x86.ext2
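Since the download is piped straight into xzcat, a failed transfer can leave an empty or truncated file behind without much noise. A cheap way to catch that is to look for the ext2 superblock magic number, 0xEF53, which is stored little-endian at byte offset 1080 of the image (the superblock starts at 1024 and the magic field sits 56 bytes in). The looks_like_ext2 name is invented for this sketch, and it only checks those two bytes rather than the integrity of the whole file system.

```shell
# Return success if the file has the ext2/3/4 superblock magic 0xEF53,
# stored little-endian at byte offset 1080.
# looks_like_ext2 is a hypothetical helper name.
looks_like_ext2() {
    magic=$(od -An -tx1 -j1080 -N2 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "53ef" ]
}
```

For example: looks_like_ext2 buildroot-x86.ext2 || echo "download looks broken, try again".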

We’ll need one more package, qemu-system-x86, which can equally be installed with apt or dnf like we did before. Then we can finally boot our system:

qemu-system-x86_64 \
  -kernel arch/x86/boot/bzImage \
  -drive file=buildroot-x86.ext2 \
  -append "console=ttyS0 root=/dev/sda" \
  -nographic \
  -m 512

Let’s take a quick look at each option:

  • -kernel is to specify the path to the kernel image
  • -drive is to create a virtual disk drive, here from the buildroot image
  • -append is to append arguments to the kernel command line (see below)
  • -nographic is to disable any virtual displays and just use serial console
  • -m 512 is to allocate 512 MiB of RAM to QEMU

You may also add -enable-kvm to make use of KVM acceleration, if available.
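Whether KVM is “available” mostly comes down to /dev/kvm existing and being accessible to your user. Here’s a sketch that adds the option only in that case and otherwise runs exactly the same command as before. The kvm_flag and boot_kernel names are made up for this post, and the device-node test is an assumption about what counts as available rather than an official check.

```shell
# Print -enable-kvm when /dev/kvm is usable by the current user, else nothing.
# kvm_flag and boot_kernel are hypothetical helper names.
kvm_flag() {
    if [ -r /dev/kvm ] && [ -w /dev/kvm ]; then
        echo "-enable-kvm"
    fi
}

# Boot the kernel with the same options as above, plus KVM when available.
# Paths are relative, exactly as in the original command.
boot_kernel() {
    # $(kvm_flag) is left unquoted on purpose so that an empty result
    # expands to no argument at all.
    qemu-system-x86_64 \
        -kernel arch/x86/boot/bzImage \
        -drive file=buildroot-x86.ext2 \
        -append "console=ttyS0 root=/dev/sda" \
        -nographic \
        -m 512 \
        $(kvm_flag)
}
```

Nothing runs until boot_kernel is called, so the functions can safely live in a file that gets sourced from your shell.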

A few more important things to note:

The kernel command line is a string passed to the kernel before booting it. It can contain arbitrary values, most of which will be parsed and used by the kernel itself. In this case, root=/dev/sda is to tell the kernel to mount the /dev/sda block device as the root file system. We know in this case that it corresponds to the buildroot image provided via the -drive option. The console=ttyS0 part is to tell the kernel where to send its messages and this is then reused for the interactive console in user-space.

The root file system is persistent and mounted from the image file on the host computer, so it may get corrupted and the kernel may then fail to mount it and hang during the next boot. That can be fixed using losetup on the host and e2fsck or simply by deleting the image and creating a fresh one from the original.

It can be handy to mount the buildroot image on the host to share files with QEMU. It’s safer to only do that once the kernel has been shut down, or to mount it read-only on the host to avoid unexpected behaviour, e.g.:

sudo mount buildroot-x86.ext2 -o ro /mnt

To do a clean shutdown of the system, just run halt. This is good practice to avoid corrupting the root file system. In fact it will only stop the kernel and QEMU will still be running, which lets you do things like reboot instead. To actually exit QEMU at any time, typically after halting the system or a crash, press Ctrl+A then C to switch to the QEMU monitor, then type q and press Enter.
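Coming back to the corrupted image case mentioned above, the “fresh one from the original” route is easiest if an untouched copy is kept next to the working image. A minimal sketch, assuming a copy was saved right after downloading with cp buildroot-x86.ext2 buildroot-x86.ext2.orig (both the .orig file name and the restore_image function are made up for this post):

```shell
# Replace the working disk image with a fresh copy of the pristine one.
# restore_image and the default file names are illustrative only.
restore_image() {
    pristine="${1:-buildroot-x86.ext2.orig}"
    work="${2:-buildroot-x86.ext2}"
    if [ ! -f "$pristine" ]; then
        echo "missing pristine image: $pristine" >&2
        return 1
    fi
    cp -- "$pristine" "$work"
}
```

Only call restore_image while QEMU is not running, since it overwrites the file backing the virtual disk.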

You should have seen the kernel log by now which will shed some light on what is going on during boot and confirm that your kernel is actually working. Now let’s just run a couple of sanity checks in the QEMU console:

/ # dmesg | head
[    0.000000] Linux version 6.7.0 (root@06da23279f88) (gcc (GCC) 14.0.1 20240217 (Red Hat 14.0.1-0), GNU ld version 2.41-34.fc40) #1 SMP PREEMPT_DYNAMIC Sun Feb 25 13:31:09 UTC 2024

Note: This kernel was built with Fedora so it says the GCC version is 14.0 from Red Hat. With Debian you’ll have a different one but this has no significant impact here.

To check that the kernel command line is as we requested it via QEMU:

/ # cat /proc/cmdline
console=ttyS0 root=/dev/sda
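User-space programs can read individual parameters from /proc/cmdline in much the same way. The function below is a small sketch in plain POSIX shell, so it should also work with the shell inside the buildroot image. The cmdline_value name is invented here, and it makes no attempt to handle quoted values containing spaces or glob characters.

```shell
# Print the value of KEY from a kernel command line file
# (default /proc/cmdline). Returns 1 if KEY=... is not present.
# cmdline_value is a hypothetical helper name.
cmdline_value() {
    key="$1"
    file="${2:-/proc/cmdline}"
    # Unquoted on purpose: split the command line into words.
    for word in $(cat "$file"); do
        case "$word" in
            "$key"=*)
                # Strip everything up to and including the first "=".
                printf '%s\n' "${word#*=}"
                return 0
                ;;
        esac
    done
    return 1
}
```

With the command line shown above, cmdline_value root prints /dev/sda and cmdline_value console prints ttyS0.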

Congratulations! You’ve reached the third and last checkpoint.

Hope you enjoyed the ride.

What next?

Tux full jigsaw puzzle

The initial reason for me to write this tutorial was to help engineers who are working on topics around the kernel to become more familiar with it. Typically, people developing KernelCI would be seeing test results and discussions around kernel bugs and concepts that are hard to grasp without a minimum amount of real kernel experience. It’s important to feel comfortable to walk in the users’ shoes sometimes.

Obviously this post is barely scratching the surface: the Linux kernel is a very large and complex project. It’s arguably the largest true open-source project to this day. Let’s think of a few small things to do next if you want to keep going.

In practice, you can try modifying the kernel config with make menuconfig (you’ll need the ncurses development package) and rebuilding it with different options. You can also make changes to the source code as an experiment, or add your own module, build it, copy it to the disk image and insmod it at runtime. You could try to reproduce a known bug on a particular kernel revision by checking it out from Git and doing a new build, maybe enable some debug options and help find or test a fix. And with a bit more time you could even boot a hardware device with your kernel, such as an x86 laptop. Please note that a full distro will need a much bigger kernel build though.

More generally speaking, you can always read the official documentation and in particular the user & admin guide part. There are also plenty of books available if you’re serious about doing kernel programming, such as the all-time classic Linux Device Drivers 3rd Edition, which is ageing reasonably well, and the just-released Linux Kernel Programming 2nd Edition. Finally, you might want to take a look at the LKML, the Linux kernel mailing list, which opens up yet another large topic around kernel development workflows. I think I might also write a tutorial about that at some point later this year. But that’s all for today.

Happy hacking!


Last modified on 2024-03-04