sanitizer-fun

Fun with -fsanitize=undefined and Picolibc

Both GCC and Clang support the -fsanitize=undefined flag which instruments the generated code to detect places where the program wanders into parts of the C language specification which are either undefined or implementation defined. Many of these are also common programming errors. It would be great if there were sanitizers for other easily detected bugs, but for now, at least the undefined sanitizer does catch several useful problems.

Supporting the sanitizer

The sanitizer can be built to either trap on any error or call handlers. In both modes, the same problems are identified, but when trap mode is enabled, the compiler inserts a trap instruction and doesn't expect the program to continue running. When handlers are in use, each identified issue is tagged with a bunch of useful data and then a specific sanitizer handling function is called.

The specific functions are not all that well documented, nor are the parameters they receive. Maybe this is because both compilers provide an implementation of all of the functions they use and don't really expect external implementations to exist? However, to make these useful in an embedded environment, picolibc needs to provide a complete set of handlers that support all versions both gcc and clang as the compiler-provided versions depend upon specific C (and C++) libraries.

Of course, programs can be built in trap-on-error mode, but that makes it much more difficult to figure out what went wrong.

Fixing Sanitizer Issues

Once the sanitizer handlers were implemented, picolibc could be built with them enabled and all of the picolibc tests run to uncover issues within the library.

As with the static analyzer adventure from last year, the vast bulk of sanitizer complaints came from invoking undefined or implementation-defined behavior in harmless ways:

Computing pointers past &array[size+1]. I found no cases where the resulting pointers were actually used, but the mere computation is still undefined behavior. These were fixed by adjusting the code to avoid computing pointers like this. The result was clearer code, which is good.
Signed arithmetic overflow in PRNG code. There are several linear congruential PRNGs in the library which used signed integer arithmetic. The rand48 generator carefully used unsigned short values. Of course, in C, the arithmetic performed on them is done with signed ints if int is wider than short. C specifies signed overflow as undefined, but both gcc and clang generate the expected code anyways. The fixes here were simple; just switch the computations to unsigned arithmetic, adjusting types and inserting casts as required.
Passing pointers to the middle of a data structure. For example, free takes a pointer to the start of an allocation. The management structure appears just before that in memory; computing the address of which appears to be undefined behavior to the compiler. The only fix I could do here was to disable the sanitizer in functions doing these computations -- the sanitizer was mis-detecting correct code and it doesn't provide a way to skip checks on a per-operator basis.
Null pointer plus or minus zero. C says that any arithmetic with the NULL pointer is undefined, even when the value being added or subtracted is zero. The fix here was to create a macro, enabled only when the sanitizer is enabled, which checks for this case and skips the arithmetic.
Discarded computations which overflow. A couple of places computed a value, then checked if that would have overflowed and discard the result. Even though the program doesn't depend upon the computation, its mere presence is undefined behavior. These were fixed by moving the computation into an else clause in the overflow check. This inserts an extra branch instruction, which is annoying.
Signed integer overflow in math code. There's a common pattern in various functions that want to compare against 1.0. Instead of using the floating point equality operator, they do the computation using the two 32-bit halves with ((hi - 0x3ff00000) | lo) == 0. It's efficient, but because most of these functions store the 'hi' piece in a signed integer (to make checking the sign bit fast), the result is undefined when hi is a large negative value. These were fixed by inserting casts to unsigned types as the results were always tested for equality.

Signed integer shifts

This is one area where the C language spec is just wrong.

For left shift, before C99, it worked on signed integers as a bit-wise operator, equivalent to the operator on unsigned integers. After that, left shift of negative integers became undefined. Fortunately, it's straightforward (if tedious) to work around this issue by just casting the operand to unsigned, performing the shift and casting it back to the original type. Picolibc now has an internal macro, lsl, which does this:

    #define lsl(__x,__s) ((sizeof(__x) == sizeof(char)) ?                   \
                          (__typeof(__x)) ((unsigned char) (__x) << (__s)) :  \
                          (sizeof(__x) == sizeof(short)) ?                  \
                          (__typeof(__x)) ((unsigned short) (__x) << (__s)) : \
                          (sizeof(__x) == sizeof(int)) ?                    \
                          (__typeof(__x)) ((unsigned int) (__x) << (__s)) :   \
                          (sizeof(__x) == sizeof(long)) ?                   \
                          (__typeof(__x)) ((unsigned long) (__x) << (__s)) :  \
                          (sizeof(__x) == sizeof(long long)) ?              \
                          (__typeof(__x)) ((unsigned long long) (__x) << (__s)) : \
                          __undefined_shift_size(__x, __s))

Right shift is significantly more complicated to implement. What we want is an arithmetic shift with the sign bit being replicated as the value is shifted rightwards. C defines no such operator. Instead, right shift of negative integers is implementation defined. Fortunately, both gcc and clang define the >> operator on signed integers as arithmetic shift. Also fortunately, C hasn't made this undefined, so the program itself doesn't end up undefined.

The trouble with arithmetic right shift is that it is not equivalent to right shift of unsigned values. Here's what Per Vognsen came up with using standard C operators:

    int
    __asr_int(int x, int s) {
        return x < 0 ? ~(~x >> s) : x >> s;
    }

When the value is negative, we invert all of the bits (making it positive), shift right, then flip all of the bits back. Both GCC and Clang seem to compile this to a single asr instruction. This function is replicated for each of the five standard integer types and then the set of them wrapped in another sizeof-selecting macro:

    #define asr(__x,__s) ((sizeof(__x) == sizeof(char)) ?           \
                          (__typeof(__x))__asr_char(__x, __s) :       \
                          (sizeof(__x) == sizeof(short)) ?          \
                          (__typeof(__x))__asr_short(__x, __s) :      \
                          (sizeof(__x) == sizeof(int)) ?            \
                          (__typeof(__x))__asr_int(__x, __s) :        \
                          (sizeof(__x) == sizeof(long)) ?           \
                          (__typeof(__x))__asr_long(__x, __s) :       \
                          (sizeof(__x) == sizeof(long long)) ?      \
                          (__typeof(__x))__asr_long_long(__x, __s):   \
                          __undefined_shift_size(__x, __s))

The lsl and asr macros use sizeof instead of the type-generic mechanism to remain compatible with compilers that lack type-generic support.

Once these macros were written, they needed to be applied where required. To preserve the benefits of detecting programming errors, they were only applied where required, not blindly across the whole codebase.

There are a couple of common patterns in the math code using shift operators. One is when computing the exponent value for subnormal numbers.

for (ix = -1022, i = hx << 11; i > 0; i <<= 1)
    ix -= 1;

This code computes the exponent by shifting the significand left by 11 bits (the width of the exponent field) and then incrementally shifting it one bit at a time until the sign flips, which indicates that the most-significant bit is set. Use of the pre-C99 definition of the left shift operator is intentional here; so both shifts are replaced with our lsl operator.

In the implementation of pow, the final exponent is computed as the sum of the two exponents, both of which are in the allowed range. The resulting sum is then tested to see if it is zero or negative to see if the final value is sub-normal:

hx += n << 20;
if (hx >> 20 <= 0)
    /* do sub-normal things */

In this case, the exponent adjustment, n, is a signed value and so that shift is replaced with the lsl macro. The test value needs to compute the correct the sign bit, so we replace this with the asr macro.

Because the right shift operation is not undefined, we only use our fancy macro above when the undefined behavior sanitizer is enabled. On the other hand, the lsl macro should have zero cost and covers undefined behavior, so it is always used.

Actual Bugs Found!

The goal of this little adventure was both to make using the undefined behavior sanitizer with picolibc possible as well as to use the sanitizer to identify bugs in the library code. I fully expected that most of the effort would be spent masking harmless undefined behavior instances, but was hopeful that the effort would also uncover real bugs in the code. I was not disappointed. Through this work, I found (and fixed) eight bugs in the code:

setlocale/newlocale didn't check for NULL locale names
qsort was using uintptr_t to swap data around. On MSP430 in 'large' mode, that's a 20-bit type inside a 32-bit representation.
random() was returning values in int range rather than long.
m68k assembly for memcpy was broken for sizes > 64kB.
freopen returned NULL, even on success
The optimized version of memrchr was always performing unaligned accesses.
String to float conversion had a table missing four values. This caused an array access overflow which resulted in imprecise values in some cases.
vfwscanf mis-parsed floating point values by assuming that wchar_t was unsigned.

Sanitizer Wishes

While it's great to have a way to detect places in your C code which evoke undefined and implementation defined behaviors, it seems like this tooling could easily be extended to detect other common programming mistakes, even where the code is well defined according to the language spec. An obvious example is in unsigned arithmetic. How many bugs come from this seemingly innocuous line of code?

    p = malloc(sizeof(*p) * c);

Because sizeof returns an unsigned value, the resulting computation never results in undefined behavior, even when the multiplication wraps around, so even with the undefined behavior sanitizer enabled, this bug will not be caught. Clang seems to have an unsigned integer overflow sanitizer which should do this, but I couldn't find anything like this in gcc.

Summary

The undefined behavior sanitizers present in clang and gcc both provide useful diagnostics which uncover some common programming errors. In most cases, replacing undefined behavior with defined behavior is straightforward, although the lack of an arithmetic right shift operator in standard C is irksome. I recommend anyone using C to give it a try.