Kernel Undefined Behaviour Sanitizer, Report 1

For GSoC 2018, I’m working on the Kernel Undefined Behavior Sanitizer (KUBSan) project, which integrates Undefined Behavior regression testing into the NetBSD amd64 kernel.
This article gives a brief introduction to Undefined Behavior, summarizes what has been done up to this point (Phase 1 Evaluation) and lists future goals.

First things first, let’s get started.

The mailing list project presentation

What is Undefined Behavior?

For Turing-complete languages we cannot reliably decide offline whether a program has the potential to execute an error; we have to run it and see.

Undefined Behavior in C is, basically, whatever the C standard imposes no requirements on. Code containing Undefined Behavior can look like perfectly ordinary C: it compiles, often without a single warning, and still causes real trouble. The standard places no limits on what such code may do once it runs. It’s whatever the compiler doesn’t moan about, but which causes hard-to-locate bugs at run time.

The important and scary thing to realize is that just about any optimization based on undefined behavior can start being triggered on buggy code at any time in the future. Inlining, loop unrolling, memory promotion and other optimizations will keep getting better, and a significant part of their reason for existing is to expose secondary optimizations of this kind.
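As a small illustration of the point above, here is a minimal sketch of an overflow check that itself relies on undefined behavior, next to a well-defined alternative. The function names are made up for this example; they do not come from the project.

```c
#include <limits.h>

/* Buggy check: when x == INT_MAX, x + 1 overflows a signed int,
 * which is undefined behavior. An optimizer is allowed to assume
 * this never happens and reduce the function to "return 0". */
int next_wraps_naive(int x)
{
    return x + 1 < x;
}

/* Well-defined check: compare against the limit before doing
 * any arithmetic, so no overflow can occur. */
int next_wraps_checked(int x)
{
    return x == INT_MAX;
}
```

Both functions agree for every input except INT_MAX, which is exactly the input the naive version was written to detect.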

Solution: Make a UB Sanitizer

What we can do to find undefined behavior errors in our code is use a sanitizer.
Fortunately, both Clang and GCC have taken care of such “dream” tools, covering the majority of undefined behavior cases in a meaningful manner.
They allow us to pass the -fsanitize=undefined option when we build our code; the instrumented program then “spits out” simple reports at run time for us to see.
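To make this concrete, here is a minimal sketch of a function the sanitizer can catch misbehaving. The names and the file name are hypothetical. Built with something like `cc -fsanitize=undefined shift.c`, a call such as `shift_left(1, 40)` should produce a run-time report on a platform with 32-bit int, while the plain build stays silent.

```c
#include <limits.h>
#include <stdbool.h>

/* Shifts value left by amount bits. Undefined when value or amount
 * is negative, when amount is at least the width of int, or when
 * the result does not fit in an int. */
int shift_left(int value, int amount)
{
    return value << amount;
}

/* Returns true when shift_left(value, amount) is well defined. */
bool shift_left_is_defined(int value, int amount)
{
    int width = (int)(sizeof(int) * CHAR_BIT);

    if (value < 0 || amount < 0 || amount >= width)
        return false;
    /* The shifted result must still fit in a non-negative int. */
    return value <= (INT_MAX >> amount);
}
```

The second function is only there to spell out the conditions the sanitizer checks for us automatically.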

Adding ATF Tests for Userland UBSan

This was my first deliverable for the integration of KUBSan. The concept was to include tests built around simple C programs that exhibit Undefined Behavior, such as overflows, erroneous shifts and out-of-bounds array accesses (VLAs, actually).
The ATF framework is not a real “sweetheart” to learn, so this preliminary step took me longer than expected. The good news was that I had enough time to understand Undefined Behavior in reasonable depth and to research ideas extensively.
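For a flavour of the kind of program such a test might build and run, here is a minimal sketch of the VLA case. It is not one of the actual tests and the function name is hypothetical; the ATF wrapper around it is left out.

```c
#include <stddef.h>

/* Fills a variable-length array of n ints with 0..n-1 and returns
 * the element at index. The access is undefined behavior when
 * index >= n, which is what a bounds check is meant to flag.
 * n must be greater than zero. */
int vla_element(size_t n, size_t index)
{
    int values[n];

    for (size_t i = 0; i < n; i++)
        values[i] = (int)i;
    return values[index];
}
```

A test would call this once with a valid index and once with an index equal to n, and expect a sanitizer report only in the second case.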

Addition of Example Kernel Module Panic_String

Next on our roadmap was understanding NetBSD’s loadable kernel modules. For this, I created a kernel module that reads a string from a device named /dev/panic and, after syncing the system, calls the kernel’s panic(9) with it as the argument. This took a long time, but in the process I had the privilege of reading

Compiling the kernel with -fsanitize=undefined

I compiled the kernel with the proper option to catch UB bugs, and we got one, which was reported to the tech-kern mailing list in this thread.

Adding the option to compile the Kernel with KUBSan

Our last deliverable for GSoC’s first evaluation was getting the amd64 kernel to boot with the KUBSan option enabled.
This was tricky. The checks inserted by the compiler call into runtime handler functions, so we needed appropriate dummy functions to provide those symbols when linking the kernel.
At first I created KUBSan as a loadable kernel module, but the chaotic structure of our codebase was too much for me: I spent 4 whole days searching for a way to link the exported symbols into the kernel build and was unsuccessful :(
But everything happens for a reason, because that one failure pushed me to search for all the available UBSan implementations, and I was able to locate the initial support of the KUBSan functionality for:

This, in turn, made me realise that the module was not necessary, since I could add the KUBSan functionality to our /sys infrastructure. I did, it worked, and it allowed me to boot and run a fully KUBSan-ed kernel.
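To give an idea of what the dummy functions mentioned above can look like, here is a minimal sketch. It is not the code from my fork. The handler names are the symbols the compiler emits calls to; the argument types are simplified (the first argument really points to a compiler-generated descriptor), and the counter and its helpers are hypothetical additions for illustration.

```c
#include <stddef.h>

/* Counts how many times any stub was entered (illustrative only). */
static unsigned long kubsan_stub_hits;

/* Hypothetical helper: a full implementation would decode the
 * descriptor and print a report instead of just counting. */
static void kubsan_stub_note(void)
{
    kubsan_stub_hits++;
}

/* Called by instrumented code on signed addition overflow. */
void __ubsan_handle_add_overflow(void *data, void *lhs, void *rhs)
{
    (void)data; (void)lhs; (void)rhs;
    kubsan_stub_note();
}

/* Called by instrumented code on an invalid shift. */
void __ubsan_handle_shift_out_of_bounds(void *data, void *lhs, void *rhs)
{
    (void)data; (void)lhs; (void)rhs;
    kubsan_stub_note();
}

/* Called by instrumented code on an out-of-bounds array index. */
void __ubsan_handle_out_of_bounds(void *data, void *index)
{
    (void)data; (void)index;
    kubsan_stub_note();
}

/* Hypothetical accessor so the counter can be inspected. */
unsigned long kubsan_stub_count(void)
{
    return kubsan_stub_hits;
}
```

With stubs like these present, the linker can resolve every call the instrumentation adds, even though nothing useful is reported yet.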

It hasn’t been upstreamed yet, but you can have a look at my local (and totally messy) fork.

Summary and Future Goals

This first month of GSoC has been a great experience. I also participated last year, with a project trying to “revamp” support for Scheme R7RS in the Eclipse IDE (we later tried to create a Kawa-Scheme Language Server (LSP), but that’s a sad story), and my experience was not the best (I had to quit mid-July).

This time the collaboration is much friendlier, more cooperative and more productive.
I’m incredibly happy about that.

In brief: the kernel booted with KUBSan, and I now know all the tools needed to extend that functionality.
That’s all ye need to know up to this point.

Future goals include:

  • Making a full implementation of KUBSan, aiming to surpass the other existing implementations,
  • Clearing up any license issues,
  • Finishing the amd64 implementation and switching focus to i386,
  • Spreading the NetBSD hype

Finally, I would like to thank my mentors Kamil and Christos for their advice and help with the project, but most of all for their incredible attitude towards the problems I went through this past month.
Much love :)

Further Reading: