Linux From Scratch with Training Wheels
or, "Linux from scratch using virtual machines on an Apple Mac M1 laptop"
Update March 31, 2022. Qemu 6.2 ships with support for native acceleration on Mac M1s but it seems that newer versions of the kernel shipping with Ubuntu don't work well with it. However, the original gist mentioned in this article that uses an older qemu along with Alexander Graf's patch set seems to work fine.
Steps
Working LFS hackers should probably read this article in reverse-section order, just getting to the meat of things, and then reading the introductory material for more context.
Introduction
Linux from scratch (LFS) is a step-by-step tutorial for building your own Linux distribution. Using the LFS approach, you start with a major Linux distribution that you prefer (e.g. Ubuntu) and bootstrap your own custom distribution. Your final LFS distribution will have nothing from the original distribution you started your bootstrapping from.
LFS is an excellent learning tool even if you have been developing on or administering Linux systems for a long time. There is some kind of magic in the "learning by doing" approach - your brain will pick up a bunch of subtleties after you create a distribution where you, for example, configure every single aspect of the boot process. LFS is also a really good way to learn, in no particular order:
- Kernel hacking - you will have created the operating system around the kernel that you do hacking on, which indirectly helps you reason about custom kernels that you make.
- Custom kernels - for the same reason above, you will get a deep understanding of why and how to customize a kernel
- Building Linux software from source - yes, we all know how to run "make" and "make install" but, trust me, there's more to it than that. If you want to level-up your knowledge on how to build packages on Linux this is a great way.
- How to build cross-compilers - you will simulate the process of building a distribution targeting an entirely new machine. This will include bootstrapping the entire GNU toolchain (gcc, libc, etc)
LFS using Virtual Machines
Today there is a stunning number of open-source virtualization tools for whatever operating system you are running, which lets you perform the steps in the LFS book in VMs.
This is a good thing because it lets you experiment with disk partitioning, grub and the boot process (including UEFI), and rescuing your unbootable distribution more easily than on your main box.
I'm calling this approach "LFS with Training Wheels". Believe me, if you experiment enough and learn the concepts deeply, you will end up accidentally erasing disks, adding a typo to a critical boot script, or making other natural errors that turn out to be catastrophic. In a virtual environment you are almost always able to recover from any of these. Learning to recover from them is good knowledge that you can apply in a "real" environment.
LFS on VMs on a Mac M1 (ARM 64 / Cortex)
This blog post will show specifics on how to do all this on a Mac M1 laptop. This is only for my convenience; hopefully all the steps here are easy to translate to other host machines with the help of the Google machine.
Overall Approach
The approach described here is just one way that works, not the only one. It takes the following steps:
- Run two guest VMs on your computer, one with a major Linux distribution, the second your custom LFS distribution. We will pretend that the first guest running the major Linux distribution is a physical host computer. Every time the LFS book talks about the host, you will need to mentally translate that to the first guest VM. Yes, it's turtles all the way down.
- The second guest will start just as a virtual disk image. Think of adding a physical hard drive to your laptop/desktop and building a bootable Linux on that disk. We will attach a blank, unformatted, raw virtual disk image to the first guest VM (our pretend host) to start, and build it into a working, independently bootable image.
- We will play some games with virtual flash files / drives so that we can safely play with grub and UEFI booting. This will include, at key points, restoring a backup file after grub goes crazy formatting our first guest VM's boot partition. There are slicker ways to do this, like making our first host VM dual-boot, but our method will be simple.
Details
This article augments the LFS book. Give the book a skim now, then use the steps here.
Supplemental Step 1: Before you start the LFS book:
Create a guest VM using qemu that serves as the "host".
On a Mac M1, currently (March 2022), this github gist is the best guide. (The guide uses a slightly older version of qemu along with a patchset by Alexander Graf. This seems to work better than using an unpatched version of the latest Qemu. For me, building qemu from source along with the patchset as described works well.)
At the end of the process, hopefully after your brand-new Ubuntu image is bootable, you can create a final script to run your host VM:
start_host.sh: (before second drive)
qemu-system-aarch64 \
-machine virt,accel=hvf,highmem=off \
-cpu cortex-a72 -smp 8 -m 2G \
-device virtio-keyboard-pci \
-drive "format=raw,file=edk2-aarch64-code.fd,if=pflash,readonly=on" \
-drive "format=raw,file=ovmf_vars.fd,if=pflash" \
-drive "format=qcow2,file=virtual-disk.qcow2" \
-nic hostfwd=tcp:127.0.0.1:9922-0.0.0.0:22 \
-nographic
This script, line by line:
- Starts the ARM64 CPU emulator
- Uses the generic "virt" machine type, accelerated by the macOS Hypervisor framework (hvf)
- Uses the ARM Cortex A72 CPU model, with 8 virtual CPUs and 2 gigs of memory
- Includes a keyboard device
- Includes 2 flash drives to simulate UEFI booting. The first is the read-only UEFI boot code, the second is a writable drive where that code stores its variables.
- Includes a hard drive for the machine
- Allows localhost 9922 to be used to ssh into the device
- Turns off the display, so everything is displayed in the terminal
You can modify this step to use a different kind of virtualization if you would like, for example VirtualBox.
Now we add a second blank drive that will be built out into our LFS distribution.
qemu-img create -f qcow2 virtual-lfs-disk.qcow2 30G
Note we are reserving 30G for the drive, but it will only take up space that is needed.
We add this disk to our start_host.sh script and we are on our way:
start_host.sh : (now with second drive)
qemu-system-aarch64 \
-machine virt,accel=hvf,highmem=off \
-cpu cortex-a72 -smp 8 -m 2G \
-device virtio-keyboard-pci \
-drive "format=raw,file=edk2-aarch64-code.fd,if=pflash,readonly=on" \
-drive "format=raw,file=ovmf_vars.fd,if=pflash" \
-drive "format=qcow2,file=virtual-disk.qcow2" \
-drive "format=qcow2,file=virtual-lfs-disk.qcow2" \
-nic hostfwd=tcp:127.0.0.1:9922-0.0.0.0:22 \
-nographic
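The two versions of start_host.sh differ only in their list of disks, so the command can be built from parameters. Here is a minimal sketch, assuming file names without spaces. The function name `qemu_cmd` and the dry-run behaviour (printing the command instead of running it) are my own inventions, not something from the gist.

```shell
#!/bin/sh
# Hypothetical helper: print the qemu command line for a VM.
# Usage: qemu_cmd VARS_FILE SSH_PORT DISK [DISK...]
qemu_cmd() {
    vars_file=$1
    ssh_port=$2
    shift 2

    # Fixed part: same machine, CPU, memory and UEFI code as start_host.sh.
    cmd="qemu-system-aarch64 -machine virt,accel=hvf,highmem=off"
    cmd="$cmd -cpu cortex-a72 -smp 8 -m 2G"
    cmd="$cmd -device virtio-keyboard-pci"
    cmd="$cmd -drive format=raw,file=edk2-aarch64-code.fd,if=pflash,readonly=on"
    cmd="$cmd -drive format=raw,file=$vars_file,if=pflash"

    # One -drive option per qcow2 image, in the order given.
    for disk in "$@"; do
        cmd="$cmd -drive format=qcow2,file=$disk"
    done

    cmd="$cmd -nic hostfwd=tcp:127.0.0.1:${ssh_port}-0.0.0.0:22 -nographic"
    printf '%s\n' "$cmd"
}

# Dry run: print the command for the host VM with the LFS disk attached.
qemu_cmd ovmf_vars.fd 9922 virtual-disk.qcow2 virtual-lfs-disk.qcow2
```

Printing first makes it easy to eyeball which disks are attached before anything boots. To actually launch, the printed line can be pasted into the terminal or passed to `sh -c`.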
At this point you should follow the LFS book until you get to Chapter 2.4 "Creating a new partition".
Supplemental Step 2: Partitioning the LFS disk
Once you reach 2.4 "Creating a new partition" in the LFS book:
While SSH'ed into the VM (host VM) from step 1, as root, run "cfdisk" (a curses-based fdisk tool) on /dev/vdb (NOT /dev/vda), like so:
cfdisk /dev/vdb
Note that whatever mistakes you make will not affect the main hard drive running your major Linux distribution.
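Picking /dev/vda by accident is the one mistake here that would hurt the host VM, so a small guard can help. This is a sketch, not a complete safety net: the function name `disk_is_unmounted` is made up, it only looks at device names listed in /proc/mounts, and it will not see through LVM or device-mapper names.

```shell
#!/bin/sh
# Hypothetical guard: succeed only if neither DISK nor any of its
# partitions appears as a mounted device.
# Usage: disk_is_unmounted DISK [MOUNTS_FILE]
disk_is_unmounted() {
    disk=$1
    mounts=${2:-/proc/mounts}
    # Field 1 of each line in /proc/mounts is the source device.
    # A prefix match catches /dev/vdb itself as well as /dev/vdb1, /dev/vdb2, ...
    if awk -v d="$disk" 'index($1, d) == 1 { found = 1 } END { exit !found }' "$mounts"; then
        echo "$disk is in use; refusing" >&2
        return 1
    fi
    return 0
}

# Only start cfdisk when the check passes (run as root inside the host VM):
# disk_is_unmounted /dev/vdb && cfdisk /dev/vdb
```

On the host VM the check should fail for /dev/vda, because the running system has partitions mounted from it, and pass for the still-blank /dev/vdb.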
I create 3 partitions:
- An "EFI" system partition 1M big (will eventually be /boot/efi)
- A "Linux Filesystem" partition 200M big (will eventually be /boot)
- A "Linux Filesystem" partition with the rest of the disk (will eventually be /)
As the book says, it's completely up to you how you want to partition the disk. One single partition over the whole disk will work fine for our purposes, or you can add multiple partitions as described in the book.
I would advise experimenting with cfdisk, writing out a few partitions, nuking them, rewriting a different set, just playing with it until you are comfortable.
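cfdisk is interactive, so for repeatable experiments it can help to write the layout down as a script for sfdisk, which ships in the same util-linux package. The sketch below only prints such a script for the three partitions I described. It assumes a GPT label and uses the standard GPT type GUIDs for "EFI System" and "Linux filesystem"; the function name `lfs_layout` is hypothetical.

```shell
#!/bin/sh
# Print an sfdisk script for the three-partition layout:
# 1M EFI system partition, 200M /boot, and the rest of the disk for /.
lfs_layout() {
    cat <<'EOF'
label: gpt
size=1MiB, type=C12A7328-F81F-11D2-BA4B-00A0C93EC93B
size=200MiB, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4
type=0FC63DAF-8483-4772-8E79-3D69D8477DE4
EOF
}

# Practice on a throwaway image file instead of a real device:
# truncate -s 1G scratch.img
# lfs_layout | sfdisk scratch.img
# sfdisk --list scratch.img
```

Working against a scratch image file means you can nuke and rewrite the table as often as you like without touching /dev/vdb until you are happy with the result.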
At this point the steps in 2.5 and above should work fine.
Supplemental Step 3: Advice on a custom kernel
Take this step while on Chapter 10.3 Linux-5.13.12, where you create a kernel for your new machine.
You can iteratively make lots of custom kernels after finding success once. I would advise doing the minimum amount of customization at first. Your instinct will be to make the leanest, meanest kernel possible but hold off on that for future iterations.
Just go with the defaults from "make menuconfig", and also add these drivers to support qemu virtualization: (basically get all the major virtio drivers in)
CONFIG_VIRTIO_INPUT=y
CONFIG_VIRTIO_PCI=y
CONFIG_VIRTIO_PCI_LEGACY=y
CONFIG_VIRTIO_CONSOLE=y
CONFIG_DRM_VIRTIO_GPU=m
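It is easy to lose track of an option in menuconfig, so here is a small sketch that checks a kernel .config for the five options above. The function name `missing_virtio_options` is hypothetical, and the check treats both "=y" and "=m" as enabled.

```shell
#!/bin/sh
# Hypothetical check: print every required option that is not enabled
# (=y or =m) in the given kernel config file.
# Usage: missing_virtio_options [CONFIG_FILE]
missing_virtio_options() {
    config=${1:-.config}
    for opt in CONFIG_VIRTIO_INPUT CONFIG_VIRTIO_PCI CONFIG_VIRTIO_PCI_LEGACY \
               CONFIG_VIRTIO_CONSOLE CONFIG_DRM_VIRTIO_GPU; do
        # Enabled options appear as "CONFIG_FOO=y" or "CONFIG_FOO=m";
        # disabled ones appear as "# CONFIG_FOO is not set" or not at all.
        grep -Eq "^${opt}=[ym]\$" "$config" || echo "$opt"
    done
}

# Run from the top of the kernel source tree after "make menuconfig":
# missing_virtio_options .config
```

No output means all five are present. Anything it prints can be switched on in menuconfig, or with the kernel tree's own `scripts/config --enable` helper.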
Supplemental Step 4: Installing GRUB and UEFI booting safely
Follow these steps in the next section, 10.4 "Using GRUB to Set Up the Boot Process".
This is the trickiest part. We are going to:
- Back up our virtual flash drives for the working UEFI booting of our host VM (first, non LFS VM)
- Install grub for UEFI, destructively wiping out our host flash drive
- Copy this new flash for our new LFS VM, and create a qemu script to run our LFS machine
- Swap our backup copy from step 1 back in so that our host VM still works
Step 1: Back up boot flash drive:
- Shut down the host VM
- In the directory with all of our qemu files:
cp ovmf_vars.fd ovmf_vars.fd.host
- Start the host VM
Step 2: Install grub in UEFI mode
In LFS chapter 10.4, follow the link to Chapter 5 of "Beyond Linux From Scratch" to install UEFI booting using grub. This will overwrite "ovmf_vars.fd" in our VM (hence the backup in step 1).
Step 3: Make a copy of the flash boot drive for LFS machine booting:
- Shut down the host VM
- In the directory with all of our qemu files:
cp ovmf_vars.fd ovmf_vars.fd.lfs
You should be able to create a start script for your new LFS VM at this point. Mine looks like this:
start_lfs.sh:
qemu-system-aarch64 \
-machine virt,accel=hvf,highmem=off \
-cpu cortex-a72 -smp 8 -m 2G \
-device virtio-keyboard-pci \
-drive "format=raw,file=edk2-aarch64-code.fd,if=pflash,readonly=on" \
-drive "format=raw,file=ovmf_vars.fd.lfs,if=pflash" \
-drive "format=qcow2,file=virtual-lfs-disk.qcow2" \
-nic hostfwd=tcp:127.0.0.1:9923-0.0.0.0:22 \
-nographic
Step 4: Restore the Backup:
cp ovmf_vars.fd.host ovmf_vars.fd
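The copy commands in Steps 1, 3 and 4 are easy to run in the wrong order, so they can be wrapped with a couple of sanity checks. This is a sketch under the file names used in this article. The function names are hypothetical, and both VMs are assumed to be shut down whenever these run.

```shell
#!/bin/sh
# Hypothetical helpers for the flash-file shuffle in Steps 1-4.
# Each takes the directory holding the qemu files (default: current directory).

# Step 1: keep a copy of the host VM's working UEFI variable store.
backup_host_vars() {
    dir=${1:-.}
    # Refuse to clobber an existing backup: it may be the only good copy.
    if [ -e "$dir/ovmf_vars.fd.host" ]; then
        echo "backup already exists: $dir/ovmf_vars.fd.host" >&2
        return 1
    fi
    cp "$dir/ovmf_vars.fd" "$dir/ovmf_vars.fd.host"
}

# Steps 3 and 4: save the grub-modified flash for the LFS VM,
# then put the host VM's original flash back.
swap_vars_for_lfs() {
    dir=${1:-.}
    if [ ! -e "$dir/ovmf_vars.fd.host" ]; then
        echo "no host backup found in $dir" >&2
        return 1
    fi
    cp "$dir/ovmf_vars.fd" "$dir/ovmf_vars.fd.lfs" &&
        cp "$dir/ovmf_vars.fd.host" "$dir/ovmf_vars.fd"
}
```

Run `backup_host_vars` before installing grub and `swap_vars_for_lfs` afterwards. The first refuses to overwrite an existing backup, and the second refuses to run if no backup exists.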
Other notes:
- There are some idiosyncrasies on the aarch64 architecture. I won't go into too many details here, but this article is worth a careful read when you get to chapter 9.4, managing devices: "What is ttyama0?"
- Don't run both the host machine and LFS machine at the same time! (unless you remove the lfs drive from the host machine first.)
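One way to guard against running both at once is to look for a qemu process that already has the LFS disk open before starting the other VM. This is a rough sketch that reads a process listing on stdin; the function name `lfs_disk_in_use` is made up, and matching on the command line is only a heuristic.

```shell
#!/bin/sh
# Hypothetical check: succeed if the process listing on stdin contains a
# qemu-system-aarch64 command line that attaches virtual-lfs-disk.qcow2.
lfs_disk_in_use() {
    grep -q 'qemu-system-aarch64.*file=virtual-lfs-disk\.qcow2'
}

# Example use at the top of start_lfs.sh, on the Mac:
# if ps -A -o args | lfs_disk_in_use; then
#     echo "another VM already has the LFS disk attached" >&2
#     exit 1
# fi
```

The same check works in start_host.sh if the host script still lists the LFS drive.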