Finding a Libc for tiny embedded ARM systems

You'd think this problem would have been solved a long time ago. All I wanted was a C library to use in small embedded systems -- those with a few kB of flash and even fewer kB of RAM.

Small system requirements

A small embedded system has a different balance of needs:

  • Stack space is limited. Each thread needs a separate stack, and it's pretty hard to move them around. I'd like to be able to reliably run with less than 512 bytes of stack.

  • Dynamic memory allocation should be optional. I don't like using malloc on a small device because failure is likely and usually hard to recover from. Just make the linker tell me if the program is going to fit or not.

  • Stdio doesn't have to be awesomely fast. Most of our devices communicate over full-speed USB, which maxes out at about 1MB/sec. A stdio setup designed to write to the page cache at memory speeds is over-designed, and likely involves lots of buffering and fancy code.

  • Everything else should be fast. A small CPU may run at only 20-100MHz, so it's reasonable to ask for optimized code. They also have very fast RAM, so cycle counts through the library matter.

Available small C libraries

I've looked at:

  • ╬╝Clibc. This targets embedded Linux systems, and also appears dead at this time.

  • musl libc. A more lively project; still, definitely targets systems with a real Linux kernel.

  • dietlibc. Hasn't seen any activity for the last three years, and it isn't really targeting tiny machines.

  • newlib. This seems like the 'normal' embedded C library, but it expects a fairly complete "kernel" API and the stdio bits use malloc.

  • avr-libc. This has lots of Atmel assembly language, but is otherwise ideal for tiny systems.

  • pdclib. This one focuses on small source size and portability.

Current AltOS C library

We've been using pdclib for a couple of years. It was easy to get running, but it really doesn't match what we need. In particular, it uses a lot of stack space in the stdio implementation as there's an additional layer of abstraction that isn't necessary. In addition, pdclib doesn't include a math library, so I've had to 'borrow' code from other places where necessary. I've wanted to switch for a while, but there didn't seem to be a great alternative.

What's wrong with newlib?

The "obvious" embedded C library is newlib. Designed for embedded systems with a nice way to avoid needing a 'real' kernel underneath, newlib has a lot going for it. Most of the functions have a good balance between speed and size, and many of them even offer two implementations depending on what trade-off you need. Plus, the build system 'just works' on multi-lib targets like the family of cortex-m parts.

The big problem with newlib is the stdio code. It absolutely requires dynamic memory allocation and the amount of code necessary for 'printf' is larger than the flash space on many of our devices. I was able to get a cortex-m3 application compiled in 41kB of code, and that used a smattering of string/memory functions and printf.

How about avr libc?

The Atmel world has it pretty good -- avr-libc is small and highly optimized for atmel's 8-bit avr processors. I've used this library with success in a number of projects, although nothing we've ever sold through Altus Metrum.

In particular, the stdio implementation is quite nice -- a 'FILE' is effectively a struct containing pointers to putc/getc functions. The library does no buffering at all. And it's tiny -- the printf code lacks a lot of the fancy new stuff, which saves a pile of space.

However, much of the places where performance is critical are written in assembly language, making it pretty darn hard to port to another processor.

Mixing code together for fun and profit!

Today, I decided to try an experiment to see what would happen if I used the avr-libc stdio bits within the newlib environment. There were only three functions written in assembly language, two of them were just stubs while the third was a simple ultoa function with a weird interface. With those coded up in C, I managed to get them wedged into newlib.

Figuring out the newlib build system was the only real challenge; it's pretty awful having generated files in the repository and a mix of autoconf 2.64 and 2.68 version dependencies.

The result is pretty usable though; my STM 32L discovery board demo application is only 14kB of flash while the original newlib stdio bits needed 42kB and that was still missing all of the 'syscalls', like read, write and sbrk.

Here's gitweb pointing at the top of the tiny-stdio tree:

gitweb

And, of course you can check out the whole thing

git clone git://keithp.com/git/newlib

'master' remains a plain upstream tree, although I do have a fix on that branch. The new code is all on the tiny-stdio branch.

I'll post a note on the newlib mailing list once I've managed to subscribe and see if there is interest in making this option available in the upstream newlib releases. If so, I'll see what might make sense for the Debian libnewlib-arm-none-eabi packages.