• Run on bare metal

    • By default, compiled languages build a program to run on the host operating system, that is, on an OS running on a CPU with a specific architecture.
    • The compiler therefore assumes that the program has access to the libraries the OS provides for things like threads, files and networking.
    • For a kernel this makes no sense. The kernel is expected to run in a bare metal[1] environment. It can depend on the CPU architecture, but it must be completely independent of the host operating system, or of any other operating system for that matter.
    • In Rust, such an executable is called a “freestanding” or “bare metal” executable.
    • Rust comes with a standard library that has abstractions for threads, IO etc. The standard library in turn depends on the underlying OS to implement these features, so we cannot use it.
    • We can still use the Rust features that are independent of any OS, such as iterators and the memory safety guarantees.
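    • A minimal sketch of that last point: the function below (checksum is a hypothetical name, not from any library) only uses iterators and integer methods that live in core, so it should compile unchanged in a #![no_std] crate. It is shown here as ordinary host code so that it can be run.

```rust
/// Sums all bytes with wrap-around on overflow.
/// Only `core` functionality is used: slice iterators, `fold`
/// and `u8::wrapping_add`. Nothing here needs an operating system.
pub fn checksum(bytes: &[u8]) -> u8 {
    bytes.iter().fold(0u8, |acc, &b| acc.wrapping_add(b))
}
```

    • Nothing in it allocates or touches files or threads, which is the kind of code that stays available without the standard library.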
  • Disabling the use of the standard library[2]

    • To disable the standard library in Rust, use the #![no_std] attribute.
    • By default Rust also uses a runtime that sets up stack overflow protection, processes command line arguments and spawns the main thread before main is invoked. In a no_std environment this runtime is not available.
    • I have seen the Rust docs use libc to get no_std code to compile. This shouldn’t be done here, because libc is designed to run in userspace and invokes kernel routines to get its job done. Since our code runs in kernel space, libc won’t work.
    • Issues with no_std

      • Compiling the program at this stage fails, with the compiler complaining that the required panic handler is missing.
      • Define a panic handler with the #[panic_handler] attribute. The panic handler must never return, so it is a diverging function (return type !).
      • use core::panic::PanicInfo;
         
        #[panic_handler]
        fn panic(_info: &PanicInfo) -> ! {
            loop {}
        }
      • One other issue is the compiler complaining about missing language item eh_personality.
        • This is used to implement stack unwinding[3].
        • Stack unwinding is a complex process that depends on OS libraries.
        • We disable stack unwinding in Rust by asking the program to abort on panic.
          • [profile.dev]
            panic = "abort"
             
            [profile.release]
            panic = "abort"
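        • To see what we give up with panic = "abort", here is a small host-side sketch (it uses std, so it is not kernel code). With the default unwinding strategy a panic walks back up the stack and runs destructors on the way. Resource, risky and unwinding_ran_destructors are made-up names for illustration, and the sketch assumes it is built with the default panic = "unwind".

```rust
use std::panic;
use std::sync::atomic::{AtomicBool, Ordering};

// Set by the destructor so we can observe that it ran.
static CLEANED_UP: AtomicBool = AtomicBool::new(false);

struct Resource;

impl Drop for Resource {
    fn drop(&mut self) {
        CLEANED_UP.store(true, Ordering::SeqCst);
    }
}

fn risky() {
    let _guard = Resource; // lives in this stack frame
    panic!("something went wrong");
}

/// Returns true if the panic was caught and the destructor of
/// `Resource` ran while the stack was being unwound.
pub fn unwinding_ran_destructors() -> bool {
    CLEANED_UP.store(false, Ordering::SeqCst);
    let result = panic::catch_unwind(risky);
    result.is_err() && CLEANED_UP.load(Ordering::SeqCst)
}
```

        • With panic = "abort" the program stops at the panic instead, so no destructors run and the unwinding machinery (and eh_personality) is not needed.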
      • When a Rust program that links the standard library is run, execution starts in two stages:
        • Stage 1: The C runtime, crt0, starts and sets up the stack and the registers.
        • Stage 2: crt0 invokes the start language item of the Rust runtime, which in turn invokes the main function.
      • Since our program doesn’t link the standard library, we have access to neither the C runtime nor the Rust runtime.
    • The #![no_main] attribute tells Rust to skip its usual expectations about the main function and lets the developer define the entry point manually.
    • The #[no_mangle] attribute tells Rust not to mangle the name of the entry point function.
    • We name the entry point _start, because that is the default entry point name on most systems. _start is usually provided by the C runtime; since crt0 is not available to us, we have to provide our own implementation of _start.
    • Marking it extern "C" tells the compiler to use the C ABI for this function.
    • #![no_main]
      // Diverging function: it is called directly by the bootloader, so there
      // is nothing to return to. Loop forever instead of returning.
      // Note the outer attribute: #[no_mangle], not #![no_mangle].
      #[no_mangle]
      pub extern "C" fn _start() -> ! {
          loop {}
      }
    • Compiling the program at this stage fails with a linker error about multiple definitions of _start.
      • This happens because the Rust compiler targets the host operating system by default and links the program against the C runtime, which already defines _start. It could be avoided by passing arguments to the linker telling it not to include the C runtime.
      • We do not want to do that, because we want to compile the program for a bare metal environment instead of the host operating system.
      • Rust makes use of LLVM’s ability to cross compile. It uses something called a target triple to describe the environment.
      • We are going to define a target for a bare metal environment.
  • The boot process

    • x86 has two firmware standards:
      • BIOS (Basic Input/Output System)
      • UEFI (Unified Extensible Firmware Interface), which can also emulate BIOS for backward compatibility.
    • The BIOS searches for bootable disks and transfers control to the bootloader, which is stored in the first 512 bytes of the disk.
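    • As a small illustration of how a BIOS decides that a disk is bootable: on legacy BIOS systems the first 512-byte sector is treated as a boot sector when its last two bytes are 0x55 0xAA. The sketch below checks for that signature. SECTOR_SIZE and has_boot_signature are hypothetical names, and the signature detail is background knowledge rather than something from these notes.

```rust
/// Size of the boot sector that the BIOS loads from disk.
pub const SECTOR_SIZE: usize = 512;

/// Returns true if the sector ends with the legacy BIOS boot
/// signature 0x55 0xAA (bytes 510 and 511).
pub fn has_boot_signature(sector: &[u8; SECTOR_SIZE]) -> bool {
    sector[510] == 0x55 && sector[511] == 0xAA
}
```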
  • Bootloader

    • Since most bootloaders are larger than 512 bytes, they are split into a first stage, which fits in the first 512 bytes, and a second stage, which the first stage loads and which can be larger.
    • Under BIOS the CPU starts in 16-bit real mode. The bootloader is responsible for switching to 32-bit protected mode (virtual memory, paging, multitasking support) and then to 64-bit long mode (access to the entire memory and 64-bit registers).
    • Writing a bootloader requires the use of assembly language.
    • We use the bootimage crate to make a bootable image.
      • It uses the bootloader crate internally and packs the kernel together with the bootloader.
    • The Multiboot standard

      • An open standard that defines an interface between the bootloader and the OS.
      • GNU GRUB is an implementation of Multiboot.
      • Supports only 32-bit protected mode.
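      • To make the bootloader/OS interface a little more concrete: in the original Multiboot (version 1) specification, the kernel image carries a header with a magic number, a flags field and a checksum, and the three must add up to zero modulo 2^32. A rough sketch of that rule follows. The function names are hypothetical and the header layout is background knowledge, not something covered in these notes.

```rust
/// Magic number of a Multiboot (version 1) header.
pub const MULTIBOOT_MAGIC: u32 = 0x1BAD_B002;

/// Computes the checksum field so that
/// magic + flags + checksum == 0 (mod 2^32).
pub fn header_checksum(flags: u32) -> u32 {
    0u32.wrapping_sub(MULTIBOOT_MAGIC).wrapping_sub(flags)
}

/// Checks the magic number and the checksum rule, the way a
/// bootloader might validate the three header fields.
pub fn header_is_valid(magic: u32, flags: u32, checksum: u32) -> bool {
    magic == MULTIBOOT_MAGIC
        && magic.wrapping_add(flags).wrapping_add(checksum) == 0
}
```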
  • Cross compilation

  • As mentioned above, Rust uses a target triple to describe the environment it compiles for.
  • A target triple usually has the format <arch><sub>-<vendor>-<sys>-<env>, describing the CPU architecture (arm, x86_64), the vendor (apple, ibm), the operating system (linux, darwin, win32) and the ABI.
  • Target specification for a bare metal x86_64 environment:
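  • // x86_64-mini_os.json (the // comments are annotations only; JSON has no comment syntax, so leave them out of the real file)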
    {
        "llvm-target": "x86_64-unknown-none", //target triple
        "data-layout": "e-m:e-i64:64-f80:128-n8:16:32:64-S128",
        "arch": "x86_64",
        "target-endian": "little",
        "target-pointer-width": "64",
        "target-c-int-width": "32",
        "os": "none", // expect no os
        "executables": true,
        "linker-flavor": "ld.lld", //to use cross-platform lld linker shipped with rust
        "linker": "rust-lld", //to use cross-platform lld linker shipped with rust
        "panic-strategy": "abort", // abort on panic and skip stack unwinding
        "disable-redzone": true, //TODO need to read more about this, probably has something to do with exception handling
        "features": "-mmx,-sse,+soft-float" // disabling features with - and enabling with +
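      // mmx, sse and hardware floating point operations use the SIMD registers
      // (up to 128 bit), so they are disabled. soft-float emulates floats in software.
      // SIMD is disabled for performance reasons: the kernel would otherwise have to
      // save and restore the SIMD state on every interrupt. Apps running on top will have access in the future.
      // About disable-redzone: the red zone is a 128 byte area below the stack pointer that
      // functions may use without adjusting it. An interrupt would overwrite it, so kernels turn it off.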
    }
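  • The <arch>-<vendor>-<sys>-<env> format described above can be pulled apart with a few lines of Rust. This is a deliberately naive sketch: TargetTriple and parse_triple are made-up names, and real triples are messier (some omit the vendor, for example), so it only illustrates the naming scheme.

```rust
/// The components of a target triple, borrowed from the input string.
#[derive(Debug, PartialEq)]
pub struct TargetTriple<'a> {
    pub arch: &'a str,
    pub vendor: &'a str,
    pub sys: &'a str,
    /// The environment/ABI part is optional.
    pub env: Option<&'a str>,
}

/// Splits a triple on '-' into at most four parts.
/// Returns None if fewer than three parts are present.
pub fn parse_triple(triple: &str) -> Option<TargetTriple<'_>> {
    let mut parts = triple.splitn(4, '-');
    let arch = parts.next()?;
    let vendor = parts.next()?;
    let sys = parts.next()?;
    let env = parts.next();
    Some(TargetTriple { arch, vendor, sys, env })
}
```

  • For x86_64-unknown-none this gives arch x86_64, vendor unknown, sys none and no environment part, which matches the "no OS" intent of the target above.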
  • Build using cargo build -Zbuild-std=core,compiler_builtins --target x86_64-mini_os.json.
  • -Zbuild-std=core,compiler_builtins rebuilds the core crate (compiler_builtins is a dependency of core), which normally ships prebuilt for the host OS. This switch recompiles it for the bare metal environment. -Z flags are unstable, so this needs a nightly toolchain.
  • Memory functions like memset, memcpy and memcmp are normally provided by the C library. Since it is no longer available, we could provide our own implementations marked #[no_mangle], or use the ones that Rust provides through the compiler-builtins-mem feature, enabled with the switch -Zbuild-std-features=compiler-builtins-mem.
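  • As a rough sketch of what providing our own implementation could look like, here is a byte-wise comparison with the same contract as C's memcmp. The name sketch_memcmp is made up so that the code can run on the host without clashing with libc; in the kernel it would be named memcmp and marked #[no_mangle]. The versions from compiler-builtins-mem are probably the safer choice in practice.

```rust
/// Compares the first `n` bytes of `a` and `b`.
/// Returns 0 if they are equal, a negative value if the first
/// differing byte in `a` is smaller, and a positive value otherwise.
///
/// # Safety
/// Both pointers must be valid for reads of `n` bytes.
pub unsafe extern "C" fn sketch_memcmp(a: *const u8, b: *const u8, n: usize) -> i32 {
    for i in 0..n {
        // Read one byte from each buffer at offset i.
        let (x, y) = unsafe { (*a.add(i), *b.add(i)) };
        if x != y {
            return x as i32 - y as i32;
        }
    }
    0
}
```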
  • VGA text buffer

    • TODO

Footnotes

  1. Bare metal refers to a computer executing instructions directly on the hardware without any OS abstractions.

  2. https://docs.rust-embedded.org/book/intro/no-std.html

  3. Stack unwinding is the process of removing function entries from the call stack, one frame at a time, and disposing of the resources local to each function.