From Zero to main(): How to Write a Bootloader from Scratch


This is the third post in our Zero to main() series,
where we bootstrap a working firmware from zero code on a
Cortex-M series microcontroller.

Previously, we wrote a startup file to bootstrap our C environment, and a linker
script to get the right data at the right addresses. These two will allow us to
write a monolithic firmware which we can load and run on our microcontrollers.

In practice, this is not how most firmware is structured. Digging through vendor
SDKs, you’ll notice that they all recommend using a bootloader to load your
applications. A bootloader is a small program which is responsible for loading
and starting your application.

In this post, we will explain why you may want a bootloader, how to implement
one, and cover a few advanced techniques you may use to make your bootloader
more useful.


Why you may need a bootloader

Bootloaders serve many purposes, ranging from security to software architecture.

Most commonly, you may need a bootloader to load your software. Some
microcontrollers like Dialog’s DA14580 have little to no onboard flash and
instead rely on an external device to store firmware code. In that case, it is
the bootloader’s job to copy code from non-executable storage, such as a SPI
flash, to an area of memory that can be executed from, such as RAM.

Bootloaders also allow you to decouple parts of the program that are mission
critical, or that have security implications, from application code which
changes regularly. For example, your bootloader may contain firmware update
logic so your device can recover no matter how bad a bug ships in your
application firmware.

Last but certainly not least, bootloaders are an essential component of a
trusted boot architecture. Your bootloader can, for example, verify a
cryptographic signature to make sure the application has not been replaced or
tampered with.

A minimal bootloader

Let’s build a simple bootloader together. To start, our bootloader must do two
things:

  1. Execute on MCU boot
  2. Jump to our application code

We’ll need to decide on a memory map, write some bootloader code, and update our
application to make it bootload-able.

Setting the stage

For this example, we’ll be using the same setup as we did in our previous Zero
to Main posts.

Deciding on a memory map

We must first decide on how much space we want to dedicate to our bootloader.
Code space is precious (your application may come to need more of it), and you
will not be able to change this without updating your bootloader, so make
this as small as you possibly can.

Another important factor is your flash sector size: you want to make sure you
can erase app sectors without erasing bootloader data, or vice versa.
Consequently, your bootloader region must end on a flash sector boundary
(typically 4kB).
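
The sector-boundary rule can be checked with a little arithmetic. The sketch
below is only an illustration: FLASH_SECTOR_SIZE and both helper names are
assumptions made for this example, and the 4kB figure should be replaced by the
erase granularity from your MCU's datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed erase granularity for illustration; check your datasheet. */
#define FLASH_SECTOR_SIZE 0x1000u

/* Round a requested bootloader size up to the next sector boundary.
 * Relies on FLASH_SECTOR_SIZE being a power of two. */
static uint32_t round_up_to_sector(uint32_t size) {
  return (size + FLASH_SECTOR_SIZE - 1u) & ~(FLASH_SECTOR_SIZE - 1u);
}

/* True when a region starting at `start` ends exactly on a sector boundary,
 * so erasing app sectors can never touch bootloader data. */
static bool ends_on_sector_boundary(uint32_t start, uint32_t size) {
  return ((start + size) % FLASH_SECTOR_SIZE) == 0u;
}
```

With these assumptions, a bootloader that needs 0x3001 bytes would be given a
0x4000-byte region, which is four whole sectors.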

I decided to go with a 16kB region, leading to the following memory map:

        0x0 +---------------------+
            |                     |
            |     Bootloader      |
            |                     |
     0x4000 +---------------------+
            |                     |
            |                     |
            |     Application     |
            |                     |
            |                     |
    0x40000 +---------------------+

We can transcribe that memory map into a linker script:

/* memory_map.ld */
MEMORY
{
  bootrom  (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00004000
  approm   (rx)  : ORIGIN = 0x00004000, LENGTH = 0x0003C000
  ram      (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00008000
}

__bootrom_start__ = ORIGIN(bootrom);
__bootrom_size__ = LENGTH(bootrom);
__approm_start__ = ORIGIN(approm);
__approm_size__ = LENGTH(approm);

Since linker scripts are composable, we will be able to include that memory
map into the linker scripts we write for our bootloader and our application.

You’ll notice that the linker script above declares some variables. We’ll need
those for our bootloader to know where to find the application. To make them
accessible in C code, we declare them in a header file:

/* memory_map.h */
#pragma once

extern int __bootrom_start__;
extern int __bootrom_size__;
extern int __approm_start__;
extern int __approm_size__;

Implementing the bootloader itself

Next up, let’s write some bootloader code. Our bootloader needs to start
executing on boot and then jump to our app.

We know how to do the first part from our previous post: we need a valid stack
pointer at address 0x0, and the address of a valid Reset_Handler function
setting up our environment at address 0x4. We can reuse our previous startup
file and linker script, with one change: we use memory_map.ld rather than
define our own MEMORY section.

We also need to put our code in the bootrom region from our memory map rather
than the rom region in our previous post.

Our linker script therefore looks like this:

/* bootloader.ld */
INCLUDE memory_map.ld

/* Section Definitions */
SECTIONS
{
    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
        _etext = .;
    } > bootrom
  ...
}

To jump into our application, we need to know where the Reset_Handler of the
app is, and what stack pointer to load. Again, we know from our previous post
that those should be the first two 32-bit words in our binary, so we just need
to dereference those addresses using the __approm_start__ variable from our
memory map.

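/* bootloader.c */
#include <stdint.h>
#include "memory_map.h"

int main(void) {
  /* The address of the linker symbol is the start of approm */
  uint32_t *app_code = (uint32_t *)&__approm_start__;
  uint32_t app_sp = app_code[0];    /* first word: initial stack pointer */
  uint32_t app_start = app_code[1]; /* second word: Reset_Handler address */
  /* TODO: Start app */
  /* Not Reached */
  while (1) {}
}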
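
Before jumping, a bootloader may want to sanity-check those two words, since an
erased or half-written application region would otherwise send the core into
garbage. The following is a minimal sketch and not part of the original
bootloader: the function name is hypothetical, and the RAM and flash bounds are
copied from the memory map above as plain constants so the check can be
exercised on a host machine.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bounds copied from memory_map.ld above (assumed unchanged). */
#define RAM_START 0x20000000u
#define RAM_END   (RAM_START + 0x8000u)
#define APP_START 0x00004000u
#define APP_END   (APP_START + 0x3C000u)

/* Hypothetical helper: true if the first two words of an image look like a
 * plausible Cortex-M vector table (initial SP, then Reset_Handler). */
static bool app_image_looks_valid(const uint32_t *image) {
  uint32_t sp = image[0];
  uint32_t reset = image[1];

  /* Erased flash reads back as all ones. */
  if (sp == 0xFFFFFFFFu || reset == 0xFFFFFFFFu) return false;

  /* The stack grows down, so the initial SP may equal the end of RAM.
   * It must also be at least word-aligned. */
  if (sp <= RAM_START || sp > RAM_END || (sp & 0x3u) != 0u) return false;

  /* Cortex-M only executes Thumb code, so bit 0 of the handler is set. */
  if ((reset & 1u) == 0u) return false;

  /* With the Thumb bit cleared, the handler must live inside approm. */
  uint32_t addr = reset & ~1u;
  return addr >= APP_START && addr < APP_END;
}
```

On target, this would be called with a pointer to the start of approm before
the jump; what to do on failure (stay in the bootloader, signal an error) is a
design choice left open here.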

Next we need to load that stack pointer and jump to the code. This will require
a bit of assembly code.

ARM MCUs use the msr instruction to copy a general-purpose register into a
system register, in this case the MSP register or “Main Stack Pointer”.

Jumping to an address is done with a branch, in our case with a bx instruction.

We wrap those two into a start_app function which accepts our pc and sp as
arguments, and get our minimal bootloader:

/* bootloader.c */
#include <stdint.h>
#include "memory_map.h"

/* naked: no compiler prologue, so pc is still in r0 and sp in r1 */
__attribute__((naked)) static void start_app(uint32_t pc, uint32_t sp) {
    __asm("           \n\
          msr msp, r1 /* load r1 into MSP */\n\
          bx r0       /* branch to the address at r0 */\n\
    ");
}

int main(void) {
  /* The address of the linker symbol is the start of approm */
  uint32_t *app_code = (uint32_t *)&__approm_start__;
  uint32_t app_sp = app_code[0];
  uint32_t app_start = app_code[1];
  start_app(app_start, app_sp);
  /* Not Reached */
  while (1) {}
}

Note: hardware resources initialized in the bootloader must be de-initialized
before control is transferred to the app. Otherwise, you risk breaking
assumptions the app code is making about the state of the system.
Making our app bootloadable

We must update our app to take advantage of our new memory map. This is again
done by updating our linker script to include memory_map.ld and changing our
sections to go to the approm region rather than rom.

/* app.ld */
INCLUDE memory_map.ld

/* Section Definitions */
SECTIONS
{
    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
        _etext = .;
    } > approm
  ...
}

We also need to update the vector table used by the microcontroller. The vector
table contains the address of every exception and interrupt handler in our
system. When an interrupt signal comes in, the ARM core will call the address
at the corresponding offset in the vector table.

For example, the offset for the HardFault handler is 0xC, so when a hard
fault is hit, the ARM core will jump to the address contained in the table at
that offset.

By default, the vector table is at address 0x0, which means that when our chip
powers up, only the bootloader can handle exceptions or interrupts! Fortunately,
ARM provides the Vector Table Offset Register (VTOR) to dynamically change the
address of the vector table. The register is at address 0xE000ED08 and has a
simple layout:

31                                  7              0
+-----------------------------------+--------------+
|                                   |              |
|              TBLOFF               |   Reserved   |
|                                   |              |
+-----------------------------------+--------------+

Where TBLOFF is the address of the vector table. In our case, that’s the start
of our text section, or _stext. To set it in our app, we add the following to
our Reset_Handler:

/* startup_samd21.c */
/* Set the vector table base address */
uint32_t *vector_table = (uint32_t *) &_stext;
volatile uint32_t *vtor = (volatile uint32_t *)0xE000ED08;
/* TBLOFF occupies bits [31:7]; the low 7 bits are reserved */
*vtor = ((uint32_t) vector_table & 0xFFFFFF80);

One quirk of the ARMv7-M architecture is the alignment requirement for the
vector table, as specified in section B1.5.3 of the reference manual:

The Vector table must be naturally aligned to a power of two whose alignment
value is greater than or equal to (Number of Exceptions supported x 4), with a
minimum alignment of 128 bytes. The entry at offset 0 is used to initialize the
value for SP_main, see The SP registers on page B1-8. All other entries must
have bit [0] set, as the bit is used to define the EPSR T-bit on exception
entry (see Reset behavior on page B1-20 and Exception entry behavior on page
B1-21 for details).

Our SAMD21 MCU has 28 interrupts on top of the 16 system reserved exceptions,
for a total of 44 entries in the table. Multiply that by 4 and you get 176. The
next power of 2 is 256, so our vector table must be 256-byte aligned.
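
The same arithmetic can be written as a small function, which is handy if you
port the code to a part with a different interrupt count. This is a sketch of
the rule quoted above; the function name is made up for the example.

```c
#include <stdint.h>

/* Required vector table alignment in bytes: the smallest power of two that
 * is >= (number of entries * 4), with a floor of 128 bytes. */
static uint32_t vector_table_alignment(uint32_t num_entries) {
  uint32_t table_bytes = num_entries * 4u;
  uint32_t alignment = 128u;
  while (alignment < table_bytes) {
    alignment <<= 1;
  }
  return alignment;
}
```

For the 44 entries counted above this yields 256. In a GNU linker script, that
requirement is typically met by placing an ALIGN(256) ahead of the vectors.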

Putting it all together

Because it is hard to witness the bootloader execute, we add a print line to
each of our programs:

/* bootloader.c */
#include <stdint.h>
#include <stdio.h>
#include "memory_map.h"

/* naked: no compiler prologue, so pc is still in r0 and sp in r1 */
__attribute__((naked)) static void start_app(uint32_t pc, uint32_t sp) {
    __asm("           \n\
          msr msp, r1 /* load r1 into MSP */\n\
          bx r0       /* branch to the address at r0 */\n\
    ");
}

int main(void) {
  serial_init();
  printf("Bootloader!\n");
  serial_deinit();

  /* The address of the linker symbol is the start of approm */
  uint32_t *app_code = (uint32_t *)&__approm_start__;
  uint32_t app_sp = app_code[0];
  uint32_t app_start = app_code[1];

  start_app(app_start, app_sp);

  // should never be reached
  while (1);
}

and:

/* app.c */
#include <stdbool.h>
#include <stdio.h>

int main(void) {
  serial_init();
  set_output(LED_0_PIN);

  printf("App!\n");
  while (true) {
    port_pin_toggle_output_level(LED_0_PIN);
    /* crude busy-wait delay between toggles */
    for (volatile int i = 0; i < 100000; ++i) {}
  }
}

Note that the bootloader must deinitialize the serial peripheral before
starting the app, or you’ll have a hard time trying to initialize it again.

You can compile both these programs and load the resulting elf files with gdb,
which will put them at the correct address. However, the more convenient thing
to do is to build a single binary which contains both programs.

To do that, you need to go through the following steps:

  1. Pad the bootloader binary to the full 0x4000 bytes
  2. Build the app binary
  3. Concatenate the two

Creating a binary from an elf file is done with objcopy. To
accommodate our use case, objcopy has some handy options:

$ arm-none-eabi-objcopy --help | grep -C 2 pad
  -b --byte <num>                  Select byte <num> in every interleaved block
     --gap-fill <val>              Fill gaps between sections with <val>
     --pad-to <addr>               Pad the last section up to address <addr>
     --set-start <addr>            Set the start address to <addr>
    {--change-start|--adjust-start} <incr>

The --pad-to option will pad the binary up to an address, and --gap-fill will
allow you to specify the byte value to fill the gap with. Since we are writing
our firmware to flash memory, we should fill with 0xFF, which is the erase
value of flash, and pad to the max address of our bootloader.

We implement those rules in our Makefile, to avoid having to type them out every
time:

# Makefile
$(BUILD_DIR)/$(PROJECT)-app.bin: $(BUILD_DIR)/$(PROJECT)-app.elf
  $(OCPY) $< $@ -O binary
  $(SZ) $<

$(BUILD_DIR)/$(PROJECT)-boot.bin: $(BUILD_DIR)/$(PROJECT)-boot.elf
  $(OCPY) --pad-to=0x4000 --gap-fill=0xFF -O binary $< $@
  $(SZ) $<

Last but not least, we need to concatenate our two binaries. As funny as that
may sound, this is best achieved with cat:

# Makefile
$(BUILD_DIR)/$(PROJECT).bin: $(BUILD_DIR)/$(PROJECT)-boot.bin $(BUILD_DIR)/$(PROJECT)-app.bin
  cat $^ > $@
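
To make the resulting layout concrete, here is a host-side sketch in C of what
the pad-then-concatenate steps produce. It is not a replacement for objcopy and
cat, only an illustration; the function name and buffer-based interface are
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BOOT_REGION_SIZE 0x4000u /* size of bootrom in the memory map */
#define FLASH_ERASED     0xFFu   /* erase value of flash */

/* Lay out bootloader + padding + app in `out`.
 * Returns the number of bytes written, or 0 if something does not fit. */
static size_t build_combined_image(const uint8_t *boot, size_t boot_len,
                                   const uint8_t *app, size_t app_len,
                                   uint8_t *out, size_t out_cap) {
  if (boot_len > BOOT_REGION_SIZE) return 0;
  if (app_len > out_cap || BOOT_REGION_SIZE > out_cap - app_len) return 0;

  /* Bootloader first, then 0xFF up to the end of its region. */
  memcpy(out, boot, boot_len);
  memset(out + boot_len, FLASH_ERASED, BOOT_REGION_SIZE - boot_len);

  /* The app starts exactly at offset 0x4000, matching approm's origin. */
  memcpy(out + BOOT_REGION_SIZE, app, app_len);
  return BOOT_REGION_SIZE + app_len;
}
```

The key property is that the first byte of the app always lands at offset
0x4000 of the combined file, whatever the bootloader's actual size.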

Beyond the MVP

Our bootloader isn’t too useful so far: it only loads our application. We could
do just as well without it. In the following sections, I will go through a few
useful things you can do with a bootloader.

Message passing to catch reboot loops

A common thing to do with a bootloader is monitor stability. This can be done
with a relatively simple setup:

  1. On boot, the bootloader increments a persistent counter
  2. After the app has been stable for a while (e.g. 1 minute), it resets the
    counter to 0
  3. If the counter reaches a set threshold (say, 3), the bootloader does not
    start the app but instead signals an error.
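
The three steps above can be sketched as a pair of functions. This is a
host-runnable illustration rather than the article's implementation: the
struct stands in for whatever persistent storage is used, and
BOOT_COUNT_LIMIT and both function names are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_COUNT_LIMIT 3u /* assumed threshold for illustration */

/* Stand-in for the persistent counter shared by bootloader and app. */
struct boot_state {
  uint8_t boot_count;
};

/* Steps 1 and 3: called by the bootloader on every boot.
 * Returns true if the app should be started. */
static bool bootloader_should_start_app(struct boot_state *state) {
  /* Saturate so a very long reboot loop cannot wrap back to 0. */
  if (state->boot_count < UINT8_MAX) {
    state->boot_count++;
  }
  return state->boot_count < BOOT_COUNT_LIMIT;
}

/* Step 2: called by the app once it has been stable for a while. */
static void app_mark_stable(struct boot_state *state) {
  state->boot_count = 0;
}
```

With this ordering (increment, then compare), the app gets two attempts before
the third boot trips the limit; comparing before incrementing would allow
three. Either is a reasonable choice as long as it is deliberate.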

This requires shared, persistent data between the application and the bootloader
which is retained across reboots. On some architectures, non-volatile registers
are available which make this easy. This is the case on all STM32
microcontrollers which have RTC backup registers.

More often than not, we can use a region of RAM to get the same result. As long
as the system remains powered, the RAM will keep its state even if the device
reboots.

First, we carve out some RAM for shared data in our memory map:

/* memory_map.ld */
MEMORY
{
  bootrom  (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00004000
  approm   (rx)  : ORIGIN = 0x00004000, LENGTH = 0x0003C000
  shared   (rwx) : ORIGIN = 0x20000000, LENGTH = 0x1000
  ram      (rwx) : ORIGIN = 0x20001000, LENGTH = 0x00007000
}

/* shared data starts at the origin of the shared region */
_shared_data_start = ORIGIN(shared);

We can then create a data structure and assign it to this section, with getters
to read it:

/* shared.h */

#include <stdint.h>

uint8_t shared_data_get_boot_count(void);

void shared_data_increment_boot_count(void);

void shared_data_reset_boot_count(void);

/* shared.c */

#include "shared.h"

extern uint32_t _shared_data_start;

#pragma pack (push)
struct shared_data {
  uint8_t boot_count;
};
#pragma pack (pop)

/* The address of the linker symbol is the start of the shared region */
struct shared_data *sd = (struct shared_data *)&_shared_data_start;

uint8_t shared_data_get_boot_count(void) {
  return sd->boot_count;
}

void shared_data_increment_boot_count(void) {
  sd->boot_count++;
}

void shared_data_reset_boot_count(void) {
  sd->boot_count = 0;
}

We compile the shared module into both our app and our bootloader, and can
read the boot count in both programs.
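
One caveat with the RAM approach: after a power cycle, RAM contents are
undefined, so the counter may start at a random value. A common remedy is to
guard the structure with a magic number and reset it when the marker is
missing. The sketch below assumes a variant of the structure with an extra
field; SHARED_DATA_MAGIC, the struct name, and the function name are all
hypothetical and not part of the shared module above.

```c
#include <stdint.h>

/* Arbitrary marker value, assumed for illustration. */
#define SHARED_DATA_MAGIC 0xB007C0DEu

/* Variant of the shared structure with a validity marker in front. */
struct shared_data_checked {
  uint32_t magic;
  uint8_t boot_count;
};

/* Called early in the bootloader: if the marker is missing (for example
 * after a cold boot), start from a clean state. Otherwise keep the data. */
static void shared_data_init_if_invalid(struct shared_data_checked *sd) {
  if (sd->magic != SHARED_DATA_MAGIC) {
    sd->magic = SHARED_DATA_MAGIC;
    sd->boot_count = 0;
  }
}
```

Because the shared region sits outside the ram region in the memory map, the
usual startup code that zeroes .bss and copies .data should leave it alone,
which is what lets the counter survive a warm reset.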

Relocating our app from flash to RAM

More commonly, bootloaders are used to relocate applications before they are
executed. Relocation involves copying the application code from one location to
another in order to execute it. This is useful when your application is stored
in non-executable memory like a SPI flash.

Consider the following memory map:

/* memory_map.ld */
MEMORY
{
  bootrom  (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00010000
  approm   (rx)  : ORIGIN = 0x00010000, LENGTH = 0x00004000
  ram      (rwx) : ORIGIN = 0x20000000, LENGTH = 0x00004000
  eram     (rwx) : ORIGIN = 0x20004000, LENGTH = 0x00004000
}

__bootrom_start__ = ORIGIN(bootrom);
__bootrom_size__ = LENGTH(bootrom);
__approm_start__ = ORIGIN(approm);
__approm_size__ = LENGTH(approm);
__eram_start__ = ORIGIN(eram);
__eram_size__ = LENGTH(eram);

In this case, approm is our app storage and eram is our executable RAM,
where we want to copy our program. Our bootloader needs to copy the code from
approm to eram before executing it.

We know from our previous blog post that executable code typically ends up in
the .text section, so we must tell the linker that this section is stored in
approm but executed from eram so our program can execute correctly.

This is similar to our .data section, which is stored in rom but lives in
ram while the program is running. We use the AT linker command to specify
the storage region (where the section is loaded from) and the > operator to
specify the region it runs from. This is the resulting linker script section:

/* app.ld */
SECTIONS {
    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
    } > eram AT > approm
    ...
}

We then update our bootloader to copy our code from one to the other before
starting the app:

  /* bootloader.c */

  /* copy app code to eram */
  uint32_t *src = (uint32_t*) &__approm_start__;
  uint32_t *dst = (uint32_t*) &__eram_start__;
  /* the symbol's address encodes the size of approm */
  size_t size = (size_t) &__approm_size__;
  printf("Copying firmware from %p to %p\n", (void *)src, (void *)dst);
  memcpy(dst, src, size);

  /* get app start & SP */
  uint32_t app_sp = dst[0];
  uint32_t app_start = dst[1];

  /* cleanup peripherals here we may have initialized */

  /* start the app */
  start_app(app_start, app_sp);
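
The copy above works because approm and eram happen to be the same size in this
memory map. If the two regions can differ, it may be worth guarding the copy.
The sketch below is one hedged way to do that on plain buffers; relocate_app is
a hypothetical name, and the read-back comparison is an extra precaution rather
than something the original bootloader does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy an app image into executable RAM.
 * Returns false if the image cannot hold a stack pointer and reset vector
 * (two 32-bit words), or if it does not fit in the destination. */
static bool relocate_app(const uint8_t *src, size_t src_len,
                         uint8_t *dst, size_t dst_cap) {
  if (src_len < 8u || src_len > dst_cap) {
    return false;
  }
  memcpy(dst, src, src_len);
  /* Read back to catch a destination that did not take the write. */
  return memcmp(dst, src, src_len) == 0;
}
```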

Locking the bootloader with the MPU

Last but not least, we can protect the bootloader using the memory protection
unit to make it read-only from the app. This prevents accidentally erasing or
overwriting the bootloader during execution.

If you do not know about the MPU, check out Chris’s excellent blog post from a
few weeks ago.

Remember that our MPU regions must be power-of-2 sized. Fortunately, our
bootloader already is! 0x4000 is 2^14 bytes.

We add the following MPU code to our bootloader:

/* bootloader.c */
int main(void) {
  /* ... */
  base_addr = 0x0;
  /* VALID bit (bit 4) set, region number 1 */
  *mpu_rbar = (base_addr | 1 << 4 | 1);
  //  AP=0b110 to make the region read-only regardless of privilege
  //  TEXSCB=0b000010 because the Code is in "Flash memory"
  //  SIZE=13 because we want to cover 16kiB (region size is 2^(SIZE+1))
  //  ENABLE=1
  *mpu_rasr = (0b110 << 24) | (0b000010 << 16) | (13 << 1) | 0x1;

  start_app(app_start, app_sp);

  /* Not reached */
  while (1) {}
}
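
The SIZE=13 value follows from the MPU's encoding, where a region covers
2^(SIZE+1) bytes. Here is a small sketch that derives the field and assembles
the same register value as above. The function names are made up for this
example, and the 32-byte minimum is the ARMv7-M limit, so treat it as an
assumption to check against your core's documentation.

```c
#include <stdint.h>

/* SIZE field for a region of `region_bytes`, where size = 2^(SIZE+1).
 * Returns 0 for sizes that are not a power of two or are below 32 bytes. */
static uint32_t mpu_size_field(uint32_t region_bytes) {
  if (region_bytes < 32u || (region_bytes & (region_bytes - 1u)) != 0u) {
    return 0u;
  }
  uint32_t exponent = 0u;
  while ((1u << exponent) < region_bytes) {
    exponent++;
  }
  return exponent - 1u;
}

/* Assemble a region attribute value with the same field positions as the
 * bootloader code: AP at bit 24, TEX/S/C/B at bit 16, SIZE at bit 1,
 * ENABLE at bit 0. */
static uint32_t mpu_rasr_value(uint32_t ap, uint32_t texscb,
                               uint32_t region_bytes) {
  return (ap << 24) | (texscb << 16) |
         (mpu_size_field(region_bytes) << 1) | 1u;
}
```

For the 16kB bootloader, mpu_size_field(0x4000) gives 13, matching the
hand-written constant.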

Closing

We hope reading this post has given you a good idea of how bootloaders work, and
what you can do with them. As with previous posts, code examples are available
on Github in the zero to main repository.

What cool things does your bootloader do? Tell us all about it in the comments,
or at interrupt@memfault.com.

Next time in the series, we’ll talk about bootstrapping the C library!

EDIT: Post written! - Bootstrapping libc with Newlib



François Baldassari has worked on the embedded software teams at Sun, Pebble, and Oculus. He is currently the CEO of Memfault.
