=Paper= {{Paper |id=Vol-1746/paper-15 |storemode=property |title=Utilizing Rust Programming Language for EFI-Based Bootloader Design |pdfUrl=https://ceur-ws.org/Vol-1746/paper-15.pdf |volume=Vol-1746 |authors=Tunç Uzlu,Ediz Şaykol |dblpUrl=https://dblp.org/rec/conf/rtacsit/UzluS16 }} ==Utilizing Rust Programming Language for EFI-Based Bootloader Design== https://ceur-ws.org/Vol-1746/paper-15.pdf
    Utilizing Rust Programming Language for EFI-Based
                     Bootloader Design

                                        Tunç Uzlu and Ediz Şaykol
                          Beykent University, Department of Computer Engineering,
                                      Ayazağa, 34396, İstanbul, Turkey
                              tuncuzlu9@gmail.com; ediz.saykol@beykent.edu.tr



                                  Abstract

    Rust, as a systems programming language, offers memory safety at zero
    cost and without any runtime penalty, unlike other languages such as C,
    C++ or Cyclone. Systems programming languages are mainly used for
    low-level tasks such as the design of operating system components, web
    browsers and game engines, and for time-critical missions like signal
    processing. The main disadvantages of the existing systems languages
    are that they are memory unsafe and have a low-level design. Rust, on
    the other hand, offers high-level language semantics and an advanced
    standard library with a modern skill set that includes most of the
    features and functional elements of widely used programming languages.
    Moreover, Rust can be used as a scripting language like Python, as a
    functional language like Haskell, or as a low-level procedural language
    like C or C++, since it is both imperative and functional and has no
    garbage collector. These design choices make Rust a suitable match for
    low-level tasks while retaining high-level scalability and
    maintainability. Meanwhile, the EFI (Extensible Firmware Interface)
    specification aims to remove the limitations of legacy hardware. Hence,
    we present our analysis of utilizing the Rust language in EFI-based
    bootloader design for the x86 architecture, to make it useful for both
    practitioners and technology developers.

1   Introduction

The Rust programming language was designed by Graydon Hoare and is
currently being actively developed by the Mozilla Foundation. It is also
being used in Servo, the Mozilla Foundation's massively parallel web
browsing engine, which is unique because of its concurrent rendering and
compositing steps [JML15]. Rust, as a systems programming language, is able
to operate at the lowest level without any runtime penalty, like C, C++ or
Cyclone, but unlike these languages it offers complete memory safety.
Systems programming languages are crucial for time-critical tasks like
signal processing and for bare-metal operations such as the design of
operating system components, web browsers and game engines, where raw
hardware access is a must. Existing systems languages are memory unsafe and
extremely complicated because of their low-level nature.
   Systems programming languages are considered essential for embedded
systems because of low memory availability and exiguous processing power
[HL15]. The main reason is that they lack a garbage collector, which causes
non-deterministic delays [LAC+ 15]. Garbage collectors provide very safe
memory management, but they manage the memory space poorly and run
unpredictably in the background. This design choice also affects energy
consumption, which is very important for embedded systems, and changes the
operating system design paradigm [LMP+ 05].
   Rust, on the other hand, is both an imperative and a functional
language. Although it combines different flavors, Rust is highly scalable,
with a capable standard library comparable to those of high-level
languages. Rich language semantics and the absence of a garbage collector
make Rust a suitable match for low-level tasks while keeping
maintainability high. Moreover, Rust can be used as a scripting language
like Python or as a functional language like Haskell, because its inherited
skill set has mostly been adopted from modern languages.
   C++ is the most powerful systems programming language today. Because of
its multi-paradigm design and zero-cost runtime performance, it is widely
used by numerous organizations and by people with different backgrounds.
C++ has features with complicated runtime support, like RTTI and
exceptions, which are disabled in most bootloader applications. As it
includes every element of its predecessor, the C language, it also includes
every memory safety pitfall of C. This variation makes C++ even more
vulnerable to memory unsafety, especially since architects with a C
background widely rely on these language elements. Cyclone, on the other
hand, was developed as an extension to the C language to provide a
Rust-like memory safety mechanism, with the ability to port from C to
Cyclone without much effort. However, this design choice caused the
language semantics to become restrictive and unwieldy.
   Another popular language, which is somehow racing with Rust because of
its gentle learning curve, is Go. Go is supported by Google and is a
high-level language that can be compared to Python or Ruby. Go neither has
generic types nor provides safety over its concurrency model, Goroutines.
Rust has generics with monomorphisation, so they are statically dispatched
and have good runtime performance [Bal15].
   Here, we present our analysis of utilizing the Rust language in
EFI-based bootloader design for the x86 architecture, to make it useful for
both practitioners and technology developers. Our analysis starts by
presenting the basics of the Rust language in detail in Section 2. Then,
bootloading basics are presented in Section 3. Since the main idea behind
using Rust is programming a critical-and-safe low-level task with
high-level programming concepts, we found bootloader design a typical
application for this purpose, and we discuss the design choices that make
Rust suitable in Section 4. Finally, Section 5 concludes our paper and
states future work.

2   Rust Language Details

Rust is an open source programming language with an issue system for bug
reporting and a separate RFC tracker for language standardization, both of
which are located in its GitHub repository. With the help of numerous
contributors around the world, Rust provides a precompiled development
environment for Linux, Windows and OS X. It is also possible to cross
compile Rust for iOS, Android, Raspberry Pi and other operating systems. As
Rust is a development toolchain separate from the operating system, it is
radically closer to a deterministic code generation process. Hence, Rust is
completely decoupled in this respect. Languages like C or C++, on the other
hand, depend on header files and libraries provided through the operating
system, so many applications, along with various operating system
distributions and updates, might influence the collection.
   The Rust ecosystem includes not only the Rustc compiler but also a very
powerful package manager, Cargo, with its registry web page for crates,
Rustfmt for code formatting, and Rustdoc for automatic document generation.
Cargo has very good dependency management, as it allows strict versions of
dependencies to be defined. It allows arbitrary flags to be passed to
Rustc, the Rust compiler, but most importantly, with the target argument
[HL15] it is possible to cross compile to a system that differs from the
host operating system. There is also a features argument for conditional
compilation. Cargo reads a project's meta information from a TOML file,
which is very much like JSON but more suitable for human editing than for
data serialization.

2.1   Rust Programming Concepts

Ownership is one of the most important language semantics of Rust. Every
value has one unique owner. Values can be moved, and they can be borrowed
immutably any number of times as long as they are not already borrowed as
mutable, which can happen only once at a time. Ownership also works on
resources like files or sockets, and across threads. Rust provides traits
to offer functionality similar to inheritance [JML15]. For example, to
duplicate an object Rust has the Clone trait [LAC+ 15], and there is also
the Copy trait for bitwise copying. Anonymous closure functions are also
defined in terms of traits in Rust, such as Fn or FnMut depending on
mutability, and FnOnce if the closure may be called only once. They cannot
be used directly as a return value, so they should be enclosed in a Box,
which allocates space from heap memory [Lig15].
   Rust has Structs that are very similar to those of C. The main
difference is that the data structure itself may be public whereas its
elements may be private in the code space. Rust offers an algebraic Enum,
which is more functional and much more advanced than that of C++, which
only has type checking. The generic Option type is a special Enum type with
a maybe characteristic. It is used as a selector between a value, Some, and
its absence, None, while the related Result type selects between a return
value, Ok, and an error value, Err. These Option and Result types are
suitable for representing Null pointers, so that it is impossible for safe
Rust to have Null pointer errors. This paradigm is also suitable for Null
pointer optimization, as Rust uses the LLVM compiler infrastructure and
benefits from the same backend optimizations as the C language family.
Pointer safety is guaranteed by lifetimes. Like types in type inference,
reference lifetimes can be inferred by the Rust compiler, which is called
lifetime elision. Sometimes explicit lifetime annotations are required, as
a reference must not outlive the binding it originates from.
   Concurrency is at the core of Rust. The same ownership mechanism applies
across threads, and Rust offers thread safety mostly at compile time. A
Channel, for example, allows data to be sent safely across threads if the
type satisfies the Send marker trait. Markers are Rust's internals for
enforcing safety rules. Other important markers are Sync, for types that
can be shared across threads, and Sized, for types with a size known at
compile time. When multiple threads need to modify the same region of
memory, classical lock mechanisms like Mutex or RwLock are provided. The
key point is that locking in Rust works on the data itself, not on the
code. Software architects using C++ try to prevent data races by locking
the code itself by design.
   A well-known analysis of the cost of software testing [Pat01] states
that if a design error costs about zero to 10 cents at the specification
phase, it costs 1 to 10 dollars at the software testing phase. However, if
the error is found by the eventual user, the cost is at least 100 dollars;
hence the cost grows roughly tenfold with each phase. To help reduce
errors, Rust is designed to be a strongly and statically typed language.
Dynamic languages suffer from a lack of compiler aid or a lack of typing,
depending on the language design. They have a gentle learning curve and
high portability or embeddability. On the other hand, strongly typed
languages such as Rust or Haskell have a steeper learning curve but provide
superior type safety at the compilation stage. Compilers are far better at
catching bugs than the human eye. There are also weakly and statically
typed languages. They offer automatic type conversion, and this
unpredictability causes bugs just as in dynamic languages. Undefined
behaviors have always been spots for hard-to-find bugs. For example, the
C++ language, unlike Rust, does not define the size of its main integer
type, int, and the char type can be signed or unsigned depending on various
factors like the compiler, operating system or build flags.
   Charles Petzold described the telegraph relay as a device in which a
lazy operator connected a clicker and a sounder magnet with a stick,
because the two were moving simultaneously [Pet00]. Without such a
mechanical aid, it is understandable that an operator who listens to Morse
code all day and clicks the matching dots and dashes makes mistakes.
Dynamic languages are somehow the same. Compiler support with strong type
checking, like the relay device, is seriously important for preventing
human errors. Rust takes this a step further by providing compile-time
memory and thread safety. Runtime checks are done only if there is no other
choice, as with bounds checking for arrays.
   Rust has also borrowed functional elements from various languages, for
example Iterators. They are lazily evaluated and offer a number of
higher-order functions once an iterator is defined or a collection is
converted into one. The functional flavor is harder for a systems
programming audience. Like borrowing a master chef's knife, the imperative
paradigm is powerful when used correctly, but it tends to fail because of
its destructive nature on global data [Oka99].

2.2   Comparing Rust with C and C++

Rust is, by design, a remedy for numerous systems programming bugs. The
first one is buffer overflow or underflow on arrays. C++ has no bounds
checking for arrays, so writing or reading out of bounds may cause
corruption or a page fault, depending on the operation. Rust checks array
bounds at runtime whenever the index cannot be verified at compile time.
Rust also does not allow indexing with a negative argument: array elements
are accessed through the Index trait, and this trait is not defined for
negative values. Finally, integer overflow remains. Fortunately, Rust
checks for arithmetic overflow in debug builds and offers explicitly
checked arithmetic operations. This type of corruption has been the main
source of buffer-related attacks for years.
   The second is iterator invalidation. In C++, modifying a collection
while an iterator is looping over it invalidates the iterator. Depending on
the operation, the data is corrupted or the iterator goes into an infinite
loop. In Rust, as the collection is borrowed by the iterator, it cannot be
borrowed mutably by modifier functions like push [Bei15].
   The last one is use-after-free memory bugs. High-level languages prevent
this kind of error by using a garbage collector, while Rust has its unique
ownership and lifetime semantics to prevent this memory pitfall at zero
runtime performance cost. Rust also has hygienic macros, and the macros are
part of the AST transformation [Lig15].
   Rust has unsafe blocks for non-ideal conditions like dereferencing raw
pointers, type transmutation or the foreign function interface. In Rust, it
is not possible to cause a data race outside of an unsafe block, even if
the design of the application is tremendously bad. Raw pointers are ideal
for storing the addresses of MMIO regions, the interrupt controller or
system tables, as these are located at constant memory locations. The C
language does not prevent pointers from being accessed outside of their
lifetime; in Rust this is a problem only when unsafe is used. Rust also
offers a strong foreign function interface to the C language through the
extern keyword, and talking to C has no runtime performance cost. This
makes calling foreign functions from EFI extremely simple with a small
binding module.
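   The safety properties discussed in this section can be illustrated with
a minimal sketch. It is not taken from a bootloader: the function names
(read_at, next_offset, append_marker, count_in_parallel) are hypothetical
and chosen only for illustration, and the sketch assumes that the Rust
standard library is available, which is not the case in a bare-metal
environment.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Bounds-checked read: `get` returns `None` instead of reading past the end.
fn read_at(buffer: &[u8], index: usize) -> Option<u8> {
    buffer.get(index).copied()
}

/// Overflow-checked addition: `checked_add` returns `None` on overflow,
/// in debug and release builds alike.
fn next_offset(offset: u32, len: u32) -> Option<u32> {
    offset.checked_add(len)
}

/// Ownership: the vector is moved into the function and handed back,
/// so exactly one binding owns it at any time.
fn append_marker(mut bytes: Vec<u8>) -> Vec<u8> {
    bytes.push(0xAA);
    bytes
}

/// The lock wraps the data itself: the counter is reachable only
/// through the guard returned by `lock()`.
fn count_in_parallel(workers: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Each thread receives its own reference-counted handle.
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

   Calling read_at(&[1, 2, 3], 3) yields None rather than reading past the
buffer, and next_offset(u32::MAX, 1) yields None instead of silently
wrapping around. In count_in_parallel the shared counter cannot be touched
without taking the lock, which is what locking the data rather than the
code means in practice.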
3   Bootloading Basics

3.1   Legacy Bootloading

Bootloaders are responsible for building the memory map, finding system
tables and launching the operating system kernel. For backwards
compatibility reasons, CPUs with the x86 architecture start in 16-bit real
mode, which only has access to 1 MB of memory. The typical routine of a
bootloader should begin by enabling higher memory through the A20 gate
[Cor16]. Bootloading concepts rely heavily on chipset specifications and
BIOS interrupts. As these are designed by different hardware vendors,
conflicts exist between different systems. Such units have grown
organically over the years and are poorly standardized.
   The next step should be enabling protected mode, which provides 32-bit
addressing and paging. Activation of paging is mandatory and also very
useful, as it separates the pages of the kernel from those of user
applications in terms of permissions. Paging is also the key to virtual
memory and to the creation of non-executable pages, which prevent code
execution from data pages at runtime. Paging is also used at a higher
level; for example, guard pages are used to grow the stack when a page
fault exception occurs at the end of the program stack. In real mode there
is another memory management scheme called segmentation. It works by using
different selectors for sectioning areas of code and data blocks. After the
switch to protected mode, segmentation is obsolete, but it is still active
and has to be configured to provide flat addressing. Some segment registers
are still used in the Linux kernel to detect buffer overflows over the
function call return address on the stack.
   Lastly, there is long mode, which provides 64-bit addressing in
canonical form and removes historical features like BCD [Cor16]. Different
kernels have strict requirements about the state in which they are started.
There are also various sub-modes, such as virtual-8086 mode for emulating
real mode interrupts in protected mode, or system management mode for
emulating complicated driver-required devices in early modes. Between these
mode switches, the interrupt controller must be reconfigured correctly. In
the old days, real mode interrupts that invoked the appropriate BIOS
support were used in place of device drivers in order to talk to the
hardware.
   As devices became much more complicated, operating systems took over all
hardware interaction, and the BIOS came to be used only as bootloader
firmware. Its complex nature, which was such a burden, and its lack of
interaction with modern technology, such as network access, led Intel to
design the EFI specification, a modern platform firmware for bootloading.
EFI can run applications just like an operating system and, most
importantly, runs the system in long mode.

3.2   Unified Extensible Firmware Interface (UEFI)

The EFI specification was designed by Intel in 1999 and is now maintained
by the UEFI consortium, which includes more than 160 companies [ZRM11]. EFI
has many modern features such as networking, human interface device support
and a bootloader driver model. It provides a safer way to update firmware
with packages, called Capsules, that enforce EEPROM validation [BZ15]. The
flowchart of the EFI-based bootloading process is shown in Figure 1.

Figure 1: The flowchart of EFI (Source:
https://en.wikipedia.org/wiki/Unified_Extensible_Firmware_Interface).

   EFI is built up of numerous modules, of which the boot, runtime and
driver modules are mandatory. The boot module is the key to generating the
memory map and locating system tables. The x86 memory model, depending on
the memory controller or chipset, has many gaps in the memory [YZ15]. These
include MMIO, configuration registers for PCI devices, legacy timers, video
frame buffers, and regions that belong to ACPI or interrupt controller
tables (reclaimable or not). As brute-forcing a memory map is extremely
unstable, EFI provides the map out of the box. The driver model allows
drivers to be created for file systems or NIC devices for a richer
bootloading environment, while the runtime module offers monotonic timers,
system time, power supply commands and firmware updating.
   EFI bootloader applications can be developed with Rust like any other
application that uses the foreign function interface, but, as with all
types of operating system development, without the standard library. The
Rust library is as rich as those of high-level languages. Most of the
language characteristics are provided by the standard library rather than
embedded in the language itself. Rust binaries should be linked into a
final Portable Executable (PE). The PE file format is used in the Windows
operating system and offers sectioning along with relocation [Hah14].

4   Designing EFI-based Bootloader with Rust

In order to create an EFI application with Rust, Libcore should first be
compiled for the target platform. Libcore is the bare-metal subset of the
Rust standard library that has no operating system dependency. A few memory
functions are needed to build Libcore, which can be obtained from Rlibc. It
is also possible to use their C counterparts. The EFI application, the
Rlibc library and Libcore should be cross-compiled to the target system
with the correct triplet. Although x86_64-pc-windows-gnu is the most
suitable triplet for such a bootloader application (because of the future
PE linkage), it is not sufficient.
   There should be a custom target triplet definition file in JSON format,
and it should disable a few language features.

  • The first of them is Compiler-rt, because otherwise the LLVM compiler
    infrastructure's helper library or the Rust language itself would have
    to be reconfigured and recompiled for the target architecture even
    though there is no need.

  • The second one is Morestack. As there is no high-level memory
    management, Morestack is not declared by the application and the stack
    is managed manually, so the compiler should not define Morestack.

  • The third one is stack unwinding, as there is little to no chance to
    recover when an exception occurs in a bootloader. It is also known as
    landing pads in Rust and can also be set with a compiler flag.

  • Finally, floating point operations and optimizations must be disabled
    in the triplet configuration file. It has been found that floating
    point optimizations corrupt interrupt handlers in bare-metal Rust
    [HL15]. Also, in the bootloader environment the floating point stack or
    coprocessor has not yet been configured, and most operating system
    kernels do not provide floating point functionality in kernel space.
    Along with the FPU stack and SSE, there are also other mathematical
    floating point units such as MMX and 3DNow, depending on the CPU model.
    LLVM does not allow us to disable floating point support in such a
    state because the Libcore library has floating point code. It should be
    modified and cleaned of floating point code in order to be used in
    kernel or bootloader programming. One example is that the Fxsave and
    Fxrstor instructions copy every FPU storage register to or from the
    stack between function calls.

   The EFI application can then be linked with the subsystem 10 flag, put
onto a FAT32 drive and tested on a computer or a virtual machine. OVMF is
an open source firmware for QEMU with EFI support. QEMU's nographic option
makes it easy to integrate into any development environment. There is also
a tool called Multirust, which creates Rust version overrides for folders.
It makes it easier to switch between nightly versions and the stable
release of Rust. EFI also has a shell, which is a helper for bootloader
design. For example, the Pci command lists PCI device paths and Memmap
shows the memory map. EFI Capsules also support I2C, which can be used to
flash ROMs belonging to other hardware.
   Historically, bootloaders consisted of two or three phases. They were
loaded into memory step by step, upgraded the system to a higher mode and
prepared
the environment for the next phase. This is no longer required with EFI,
but it is possible to keep this design. As an EFI application relies on its
own binary structure and calling convention, it may be beneficial to use a
second-stage bootloader that is started from EFI. This second-stage
application is not subject to the EFI specification and is just a small
kernel intended to run the real kernel.
   There are numerous resources on operating system design with Rust,
including [HL15] and [Lig15]. All resources for the C language are
applicable to Rust, since the syntactic elements of these two languages are
similar. Rust's strong foreign function interface also provides strong
interaction. C is the lingua franca of systems languages. It has very good
runtime performance and raw memory management capability. Its abstract
machine model fits perfectly onto current hardware, which utilizes a
program counter, registers and addressable memory, but its type system has
aged [Pos14]. Rust, on the other hand, is fresh and brings many modern
features from newer high-level designs. It offers safety at compile time,
and its abstractions are zero-cost at runtime.

5   Conclusion and Future Work

In this paper, the advanced semantics of the Rust programming language are
presented to clarify their possible use within the EFI-based bootloader
design process. Various design alternatives and choices are mentioned, and
the points that make Rust a better choice are discussed. Since one of the
main ideas behind using Rust is programming a critical-and-safe low-level
task with high-level programming concepts, we found bootloader design a
typical application for this purpose.
   As discussed, Rust offers high-level language semantics and an advanced
standard library with a modern skill set that includes most of the features
and functional elements of widely used programming languages. Moreover,
Rust can be used as either a scripting language or a functional language.
Additionally, it can be used as a low-level procedural language, since it
is both imperative and functional and has no garbage collector. These
design choices make Rust a suitable match for low-level tasks while
retaining high-level scalability and maintainability.
   From the bootloading perspective, the future seems to be based on EFI on
x86 hardware. It currently allows end users to download an operating system
from the Internet and install it easily. Today memory unsafety causes
serious problems; hence the adoption of Rust is not an economical or social
matter, it is an intellectual one. As future work, we plan to develop a
prototype based on this design process and validate the use of Rust via
performance experiments.

References

[Bal15]   I. Balbaert. Rust Essentials. Packt Publishing, May 2015.

[Bei15]   A. Beingessner. You can't spell trust without Rust. Master's
          thesis, Carleton University, Department of Computer Science,
          2015.

[BZ15]    M. Bulusu and V. Zimmer. Challenges for UEFI and the cloud. In
          UEFI Plugfest 2015, May 2015.

[Cor16]   Intel Corporation. Intel 64 and IA-32 architectures software
          developer's manual volume 3 (3A, 3B, 3C and 3D): System
          programming guide. Technical report, Order Number: 325384-058US,
          April 2016.

[Hah14]   K. Hahn. Robust static analysis of portable executable malware.
          Master's thesis, HTWK Leipzig, Department of Computer Science,
          December 2014.

[HL15]    H.W. Hoiby and S. Lefsaker. Rustygecko - developing Rust on
          bare-metal - an experimental embedded software platform. Master's
          thesis, Norwegian University of Science and Technology, 2015.

[JML15]   T.B.L. Jespersen, P. Munksgaard, and K.F. Larsen. Session types
          for Rust. In Proceedings of the 11th ACM SIGPLAN Workshop on
          Generic Programming, WGP 2015, pages 13–22, New York, NY, USA,
          2015. ACM.

[LAC+ 15] A. Levy, M.P. Andersen, B. Campbell, D. Culler, P. Dutta,
          B. Ghena, P. Levis, and P. Pannuto. Ownership is theft:
          Experiences building an embedded OS in Rust. In Proceedings of
          the 8th Workshop on Programming Languages and Operating Systems,
          PLOS'15, pages 21–26, New York, NY, USA, 2015. ACM.

[Lig15]   A. Light. Reenix: Implementing a Unix-like operating system in
          Rust. Master's thesis, Brown University, Department of Computer
          Science, April 2015.

[LMP+ 05] P. Levis, S. Madden, J. Polastre, R. Szewczyk, A. Woo, D. Gay,
          J. Hill, M. Welsh, E. Brewer, and D. Culler. TinyOS: An operating
          system for sensor networks. In Ambient Intelligence, pages
          115–148. Springer Verlag, 2005.

[Oka99]   C. Okasaki. Purely Functional Data Structures. Cambridge
          University Press, 1999.

[Pat01]   R. Patton. Software Testing. Sams Publishing, 2001.

[Pet00]   C. Petzold. Code: The Hidden Language of Computer Hardware and
          Software. Microsoft Press, 2000.

[Pos14]   R. Poss. Rust for functional programmers.
          http://science.raphael.poss.name/rust-for-functional-programmers.html,
          July 2014.

[YZ15]    J. Yao and V. Zimmer. A tour beyond BIOS memory map design in
          UEFI BIOS. Technical report, Intel Corporation, February 2015.

[ZRM11]   V. Zimmer, M. Rothman, and S. Marisetty. Beyond BIOS: Developing
          with the Unified Extensible Firmware Interface, 2nd Edition.
          Intel Press, January 2011.