STM32 Build Process Explained – A Practical Guide for Firmware Engineers

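When you begin working with STM32 or any ARM Cortex-M microcontroller, one of the first low-level structures you encounter is the vector table. On classic Cortex-M devices the core expects it at address 0x0000_0000 after reset; on STM32 parts booting from internal Flash it physically lives at the start of Flash, typically 0x0800_0000, which is either aliased to address zero or selected through the boot configuration.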
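To make the idea concrete, a Cortex-M vector table can be pictured as a constant array whose first entry is the initial stack pointer and whose following entries are handler addresses. The sketch below is a host-compilable illustration only. The names vector_t, demo_stack, and simulate_reset are hypothetical, and a real startup file also places the table in a dedicated section (for example .isr_vector in ST's GCC startup files or .intvec in IAR projects), which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*handler_t)(void);

/* One table slot: entry 0 holds a stack address,
   every other entry holds a handler address. */
typedef union {
    void     *initial_sp;
    handler_t handler;
} vector_t;

static uint32_t demo_stack[256];   /* hypothetical stack area */
static int reset_calls;            /* counts simulated resets */

static void Reset_Handler(void)     { reset_calls++; }
static void NMI_Handler(void)       { }
static void HardFault_Handler(void) { }

static const vector_t vector_table[] = {
    { .initial_sp = demo_stack + 256 },  /* 0: initial main stack pointer */
    { .handler = Reset_Handler },        /* 1: reset                      */
    { .handler = NMI_Handler },          /* 2: non-maskable interrupt     */
    { .handler = HardFault_Handler },    /* 3: hard fault                 */
};

/* Mimics what the core does after reset: fetch entry 1 and jump to it. */
void simulate_reset(void)
{
    assert(vector_table[1].handler != NULL);
    vector_table[1].handler();
}
```

Calling simulate_reset() on the host simply runs Reset_Handler once. On real hardware the core performs the equivalent fetch by itself, which is why the order of the entries matters.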

In most toolchains, such as IAR Embedded Workbench, Keil, or GCC, this table appears automatically, almost like magic. But in embedded systems, nothing is magic. Every byte placed in memory is intentional, and every startup step has a clear reason behind it.

As your project grows, you will eventually need to override the default startup code, modify the vector table, or take full control of the memory layout. To do this with confidence, you must understand the STM32 build process, how simple .c files are compiled, linked, and finally transformed into the binary image that runs on your microcontroller.

This article breaks down that STM32 build process in a clear, practical way, rooted in real industry experience, the kind that comes from decades of working with microcontrollers and bare-metal systems.

Host vs Target – Understanding Cross-Development:

One of the first concepts every embedded engineer must understand is the difference between the host and the target.
In desktop development, both are usually the same machine. But in embedded development—especially when working with MCUs like the STM32H5—these two environments are fundamentally different.

Host (Development Machine):

This is your PC, where you write code and use tools such as:

IAR Embedded Workbench (EWARM)
ARM GCC / GNU Embedded Toolchain
CMake + GCC
STM32CubeIDE (if applicable)

All compilation, assembling, linking, and debugging tools run on the host.

Target (Execution Platform):

The target is the actual hardware device (STM32 or any MCU) that runs your firmware. It receives a binary image built on the host and executes it in real hardware conditions.

Now the Million-Dollar Question 🤔

Why Not Compile Directly on the MCU?

It is a question every beginner eventually asks, “If the firmware runs on the MCU, why can’t the MCU compile it?”

The answer lies in the hardware limitations of microcontrollers:

Very limited RAM (compared to a PC)
Restricted Flash memory
Far lower processing capability
No OS or filesystem for hosting toolchains

A compiler is a large and complex program that requires a lot of memory, storage, and processing power, things that simply are not available on a microcontroller. That is why the compiler, assembler, and linker all run on your PC, not on the MCU itself.

What Is Cross-Compilation?

When a toolchain running on the host generates machine code for a different CPU architecture (in this case, ARM Cortex-M33), the process is called cross-compilation.

Cross-compilation is the backbone of embedded development because:

The host and target architectures are different
The target cannot build programs for itself
We need binaries formatted specifically for the MCU’s memory map and instruction set

Most MCU firmware projects you write follow this model:

Develop on the host → build using a cross-compiler → flash the binary to the target → run on MCU
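One way to see cross-compilation from inside the code is through the macros each compiler predefines for its target. The sketch below is an illustration, not part of any toolchain: describe_build and target_arch are hypothetical helper names, while __ICCARM__, __arm__, __thumb__, __aarch64__, and __x86_64__ are macros predefined by IAR, GCC, and Clang for the respective targets.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Name of the architecture this file was compiled FOR,
   which is not necessarily the machine the compiler ran ON. */
static const char *target_arch(void)
{
#if defined(__ICCARM__)
    return "ARM (IAR iccarm)";
#elif defined(__arm__) || defined(__thumb__)
    return "ARM 32-bit";
#elif defined(__aarch64__)
    return "AArch64";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86-64";
#else
    return "unknown";
#endif
}

/* Writes a one-line build description into buf; returns snprintf's result. */
int describe_build(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "target=%s, pointer=%u bits",
                    target_arch(), (unsigned)(sizeof(void *) * 8u));
}
```

Built with a desktop compiler, the string names the PC's architecture. Built with a cross-compiler for Cortex-M, the same source reports an ARM target with 32-bit pointers, even though the compiler itself ran on the PC.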

Embedded Build Pipeline and Build Process:

The build process for embedded systems, including the STM32 series, may seem straightforward at first glance: compile your code, link it, and flash it. Behind the scenes, however, it is a multi-stage process that takes your human-readable source files and turns them into the precise set of instructions that the microcontroller runs directly.

High-Level Build Flow:

This is the simple sequence that converts your human-written C/C++ code into actual machine instructions that run on an STM32 microcontroller.

+------------------+     +-------------------+     +---------------------+
|   Source Files   | --> |   Compiler        | --> |   Object Files (.o) |
|  (.c, .cpp, .s)  |     |  (iccarm, GCC)    |     |   Relocatable ELF   |
+------------------+     +-------------------+     +---------------------+
                                                       |
                                                       v
                        +------------------------------------------------+
                        |                    Linker                      |
                        |   (ilinkarm, ld) combines all .o + libraries   |
                        +------------------------------------------------+
                                                       |
                                                       v
                       +-----------------------------------------------+
                       |         Final Executable (ELF) c.out          |
                       |     + HEX/ BIN depending on conversion        |
                       +-----------------------------------------------+
                                                       |
                                                       v
                                            Flash to STM32 Target

Let’s walk through each step.

Compiler (Compilation stage):

When you build even a simple STM32 firmware project, you typically start with a few source files, such as:

main.c
delay.c
startup_stm32h563xx.s

During the compilation stage, each of these files is processed independently: the C files by the compiler and the .s file by the assembler. This is an important design choice in all modern embedded toolchains, whether you are using IAR (iccarm), GCC (arm-none-eabi-gcc), or LLVM.

The outputs of this stage are object files, for example:

main.o
delay.o

These object files are structured using the powerful ELF (Executable and Linkable Format) and contain:

Relocatable Machine Instructions: Raw code that has not been assigned a final address yet.
Symbol Tables: Lists of functions and variables defined or referenced by the file.
Sections: Groupings for different types of data (.text, .rodata, .data, .bss).
Relocation Records: Instructions for the linker on what addresses need to be fixed later.

Key point: Object files are relocatable. They know what to do, but not where they will live.
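The section names above map quite directly onto ordinary C definitions. The following sketch shows where each kind of definition typically ends up with common ARM toolchains. The variable and function names are hypothetical, and the exact section names can differ between compilers and option settings.

```c
#include <assert.h>
#include <stdint.h>

uint32_t boot_count = 5;     /* initialized global      -> typically .data   */
uint32_t error_count;        /* zero-initialized global -> typically .bss    */

/* Read-only table -> typically .rodata (kept in Flash on an MCU) */
const uint32_t baud_table[3] = { 9600u, 57600u, 115200u };

/* The machine code of this function -> .text */
uint32_t select_baud(uint32_t index)
{
    assert(index < 3u);
    boot_count++;            /* writes to RAM, so boot_count cannot stay in Flash */
    return baud_table[index];
}
```

After compiling this file, a dump of the object file should list all four sections, each still without a final address.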

Understanding ELF: What’s Inside Those .o and .out Files?

If you open any .o file from IAR’s Debug\Obj folder in a plain text editor, the first readable characters you will usually see are:

ELF

These three letters, preceded by the byte 0x7F, form the four-byte magic number that identifies the ELF (Executable and Linkable Format), one of the most important and widely used file formats in embedded development. ELF is the standard used by modern toolchains, including IAR, GCC, and many others.
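The same check can be done programmatically. The sketch below looks at the first bytes of a file image: the magic number 0x7F 'E' 'L' 'F', followed later in the header by the file type (relocatable object or executable) and the machine code (40 for ARM). The function names are hypothetical, and the sketch assumes a little-endian ELF file, which is what Cortex-M toolchains normally produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { ELF_ET_REL = 1, ELF_ET_EXEC = 2, ELF_EM_ARM = 40 };

/* Returns 1 if buf starts with the ELF magic bytes 0x7F 'E' 'L' 'F'. */
int elf_has_magic(const uint8_t *buf, size_t len)
{
    assert(buf != NULL);
    return len >= 4 && buf[0] == 0x7F && buf[1] == 'E' &&
           buf[2] == 'L' && buf[3] == 'F';
}

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* A header is usable here if it has the magic, is long enough,
   and declares little-endian data (byte 5 == 1). */
static int elf_header_ok(const uint8_t *buf, size_t len)
{
    return elf_has_magic(buf, len) && len >= 20 && buf[5] == 1;
}

/* e_type lives at offset 16: 1 = relocatable (.o), 2 = executable. */
uint16_t elf_type(const uint8_t *buf, size_t len)
{
    return elf_header_ok(buf, len) ? read_le16(buf + 16) : 0;
}

/* e_machine lives at offset 18: 40 = ARM. */
uint16_t elf_machine(const uint8_t *buf, size_t len)
{
    return elf_header_ok(buf, len) ? read_le16(buf + 18) : 0;
}
```

Fed the first 20 bytes of main.o, elf_type should report a relocatable file, while the linked executable should report an executable one.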

What an ELF File Contains:

An ELF file is much more than raw machine code. It includes several structured sections:

.text – program code
.data – initialized global/static variables
.bss – zero-initialized globals/statics
.rodata – constants and string literals
.debug_* – debugging info for stepping through code
Symbol tables – list of functions, variables, addresses
Relocation tables – information the linker uses to fix addresses

Because an ELF file contains many additional sections, such as debug data, symbol tables, and relocation information, the size of a .o file is usually much larger than the actual machine code that will run on the MCU.

Your program’s real flash and RAM usage becomes accurate only after the linker finishes combining all object files. This information is captured in the linker map file, which shows the final memory layout as it will appear on the STM32 device.
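The totals in a map file follow a simple rule of thumb, sketched below. It assumes the common layout in which the initial values of .data are stored in Flash and copied to RAM at startup, and it ignores stack, heap, and any custom sections. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Section sizes in bytes, as reported after linking. */
typedef struct {
    uint32_t text;
    uint32_t rodata;
    uint32_t data;
    uint32_t bss;
} section_sizes_t;

/* Flash holds code, constants, and the initial values of .data. */
uint32_t flash_usage(const section_sizes_t *s)
{
    assert(s != NULL);
    return s->text + s->rodata + s->data;
}

/* RAM holds the run-time copy of .data plus the zeroed .bss. */
uint32_t ram_usage(const section_sizes_t *s)
{
    assert(s != NULL);
    return s->data + s->bss;
}
```

Note that .data is counted twice, once in Flash and once in RAM, which is a frequent source of surprise when a large initialized array is added to a project.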

What “Relocatable Code” Really Means:

A very important part of the embedded build process, especially on MCUs like the STM32, is the idea of relocatable code. This is why the code inside a .o file is not the final machine code that runs on the chip. The final executable (the .out file) always looks different.

Let’s understand this with a simple example.

Calling a Function: What Actually Happens

Suppose main.c contains the following function call:

delay_ms(100);

The compiler converts this into a BL (Branch with Link) instruction.

But at compile time, the compiler does not know:

where delay_ms() will finally be placed in Flash
what the exact address of the call instruction will be
how the linker will arrange all functions

Because the final position of delay_ms() is unknown, the compiler cannot calculate the real branch offset yet.

So instead, it places a placeholder value in the .o file. For example, IAR often uses:

0x07FFFFFE

This means: “The linker will fill in the correct value later.”

The Linker Fixes the Instruction

During linking, all code sections are assigned their final addresses according to the linker script.

Now the linker knows exactly:

where delay_ms() is located
where the BL instruction in main() is located

So, it calculates the correct branch offset:

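offset = address(delay_ms) − address(current_PC)

Here current_PC is the value the core sees while executing the BL, which in Thumb state is the address of the BL instruction plus 4.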

Using this, the linker rewrites the placeholder with a proper Thumb-2 encoded BL instruction.

This is why:

The BL instruction in main.o looks incomplete or incorrect.
The BL instruction in c.out is fully resolved and executable.
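The fix-up can be reproduced by hand. The sketch below computes the branch offset and packs it into the two 16-bit halfwords of a Thumb BL instruction (the 32-bit T1 encoding). It is an illustration of the arithmetic, not the code of any real linker, and the function names are hypothetical. In Thumb state the PC used for the calculation is the address of the BL plus 4.

```c
#include <assert.h>
#include <stdint.h>

/* Offset from the BL at bl_addr to target_addr.
   Bit 0 of a Thumb function address is the Thumb flag, so it is masked off. */
int32_t thumb_bl_offset(uint32_t bl_addr, uint32_t target_addr)
{
    return (int32_t)((target_addr & ~1u) - (bl_addr + 4u));
}

/* Packs an offset into the two halfwords of a Thumb BL (T1 encoding). */
void thumb_bl_encode(int32_t offset, uint16_t *hw1, uint16_t *hw2)
{
    /* BL reaches roughly +/-16 MB and targets are halfword aligned. */
    assert(offset >= -16777216 && offset <= 16777214 && (offset & 1) == 0);
    assert(hw1 != 0 && hw2 != 0);

    uint32_t imm   = (uint32_t)offset;
    uint32_t s     = (imm >> 24) & 1u;      /* sign bit              */
    uint32_t i1    = (imm >> 23) & 1u;
    uint32_t i2    = (imm >> 22) & 1u;
    uint32_t imm10 = (imm >> 12) & 0x3FFu;  /* upper offset bits     */
    uint32_t imm11 = (imm >> 1)  & 0x7FFu;  /* lower offset bits     */
    uint32_t j1    = (~(i1 ^ s)) & 1u;      /* J1 = NOT(I1 XOR S)    */
    uint32_t j2    = (~(i2 ^ s)) & 1u;      /* J2 = NOT(I2 XOR S)    */

    *hw1 = (uint16_t)(0xF000u | (s << 10) | imm10);
    *hw2 = (uint16_t)(0xD000u | (j1 << 13) | (j2 << 11) | imm11);
}
```

For example, a BL at 0x08000100 calling a function at 0x08000200 needs an offset of 252 bytes, which encodes as the halfwords 0xF000 and 0xF87E.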

How the Linker Resolves Symbols:

When you build an STM32 project, the compiler generates multiple object files (.o). Each of these object files contains information about the symbols it defines and the symbols it needs from somewhere else.

To understand the linker, you only need to understand these two categories.

1. Exported Symbols

These are the symbols an object file defines. The linker can treat them as “available resources.”

Functions: main, delay_ms, init_uart
Global variables: adcBuffer, timerConfig
Special runtime symbols: __vector_table

2. Imported Symbols

These are symbols that the file uses but does not define. The linker must find them elsewhere.

Library functions: memcpy, printf
Functions from other modules: delay_ms
Startup/runtime symbols: __iar_program_start

How the Linker Resolves Everything

When linking begins, the linker maintains two lists:

Defined (exported) symbols
Undefined (imported) symbols

It then walks through all object files and libraries. For every undefined symbol, it tries to find a matching exported symbol from any other file.

This process continues until:

all symbols are resolved, or
a symbol remains unresolved → linker error

This is how the final executable becomes a fully connected program.
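The matching of imported against exported symbols can be modeled in a few lines. The sketch below is a deliberately simplified model, assuming each object file is just two NULL-terminated lists of names. Real linkers also handle weak symbols, archives, and duplicate definitions, and the type and function names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char  *name;       /* object file name, for reference only   */
    const char **defined;    /* NULL-terminated list: exported symbols */
    const char **undefined;  /* NULL-terminated list: imported symbols */
} object_file_t;

/* Returns 1 if any object file exports sym. */
static int is_defined(const object_file_t *objs, size_t count, const char *sym)
{
    for (size_t i = 0; i < count; i++)
        for (const char **d = objs[i].defined; *d != NULL; d++)
            if (strcmp(*d, sym) == 0)
                return 1;
    return 0;
}

/* Returns the first imported symbol that no object file exports,
   or NULL when every symbol resolves (the link would succeed). */
const char *first_unresolved(const object_file_t *objs, size_t count)
{
    assert(objs != NULL);
    for (size_t i = 0; i < count; i++)
        for (const char **u = objs[i].undefined; *u != NULL; u++)
            if (!is_defined(objs, count, *u))
                return *u;
    return NULL;
}
```

If main.o imports delay_ms and memcpy but only delay.o is supplied, the function returns "memcpy", which corresponds to the familiar undefined-symbol error. Adding an object that exports memcpy makes it return NULL.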

Note: The linker does not include an entire library like libc.a in your firmware. An “.a” file contains many object modules, and the linker adds only those referenced by your code. Unused modules are skipped, keeping embedded firmware small, efficient, and optimized for limited STM32 flash memory.

Memory Placement:

Just writing correct code does not finish your work. The other important and often invisible task is deciding where that code and its data will reside in the microcontroller’s memory.

Don’t worry about it 🤔.

This responsibility is handled by the linker, which follows a configuration file called the linker script (*.icf in IAR, *.ld in GCC).

The linker is the Architect of Memory. While the compiler translates code into instructions, the linker determines the final physical memory map of your firmware inside the microcontroller. Without a correct linker script, even perfectly written code will fail to boot or silently corrupt memory.

Let’s Understand What a Linker Script Is.

What the Linker Script Is and What It Really Does:

A linker script is a configuration file that tells the linker how to arrange your program in memory. It defines where each part of your firmware—code, data, stack, and heap—will be placed inside the microcontroller.

The compiler only converts source code into machine instructions. It does not know where Flash or RAM is located on the device. The linker script fills this gap by describing the memory layout of the microcontroller and mapping program sections to the correct memory regions.
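A memory region in a linker script boils down to a name, a start address, and a length, and the linker refuses any placement that does not fit. The sketch below models that check in C. The region values (2 MB of Flash at 0x0800_0000 and 640 KB of RAM at 0x2000_0000) are assumptions chosen for an STM32H563-class device and should be checked against the reference manual. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    origin;   /* first address of the region */
    uint32_t    length;   /* size in bytes               */
} mem_region_t;

/* Assumed example values, not taken from a real linker script. */
static const mem_region_t flash_region = { "FLASH", 0x08000000u, 0x00200000u };
static const mem_region_t ram_region   = { "RAM",   0x20000000u, 0x000A0000u };

/* Returns 1 if [addr, addr + size) lies entirely inside the region. */
int region_fits(const mem_region_t *r, uint32_t addr, uint32_t size)
{
    assert(r != NULL);
    return addr >= r->origin &&
           size <= r->length &&
           (addr - r->origin) <= (r->length - size);
}
```

A section that starts near the end of Flash and runs past it fails the check, which is the situation a linker reports as a region overflow.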

The linker script is responsible for the following:

Placing program code (.text) in Flash memory
Allocating data sections (.data, .bss) in RAM, with the initial values of .data stored in Flash
Defining the vector table location required during CPU startup
Setting stack and heap boundaries
Creating custom memory regions, such as bootloaders, external Flash, or retained memory

Note: The linker script defines memory layout and addresses; it does not control runtime execution or initialization logic.
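This division of labor is easiest to see in the startup code, which uses addresses supplied by the linker to prepare RAM before main() runs. The sketch below shows the idea on the host, with plain arrays standing in for Flash and RAM. On a real target the five pointers would come from linker-defined symbols (ST's GCC startup files use names such as _sidata, _sdata, _edata, _sbss, and _ebss, while IAR performs the equivalent work inside its own runtime initialization). The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copies the initial values of .data from their load address (Flash)
   to their run address (RAM), then clears .bss. */
void init_ram_sections(const uint32_t *data_load,
                       uint32_t *data_start, uint32_t *data_end,
                       uint32_t *bss_start,  uint32_t *bss_end)
{
    assert(data_load != NULL);
    assert(data_start <= data_end && bss_start <= bss_end);

    for (uint32_t *dst = data_start; dst < data_end; )
        *dst++ = *data_load++;          /* .data: Flash -> RAM */

    for (uint32_t *dst = bss_start; dst < bss_end; )
        *dst++ = 0u;                    /* .bss: zero fill     */
}
```

The linker script decides where these regions begin and end. The loop that actually moves the bytes belongs to the startup code.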

Inspecting Object Files and Final Images:

Up to this point, we have followed the firmware build journey from source code → object files → final ELF image. The next essential step is to inspect these build outputs and confirm that the toolchain has produced exactly what we intended.

In embedded systems, particularly on modern MCUs such as STM32, this step is not optional. It is often the quickest way to diagnose startup failures, missing interrupts, or unexpected memory growth.

By now you might be wondering why inspecting build outputs matters. Don’t worry, I will explain.

Embedded firmware runs in a highly constrained environment:

No operating system protection
Fixed and limited memory
Strict startup and security requirements

A small mistake in the startup code or linker script can lead to silent failures, such as boot issues or unpredictable behavior. By inspecting object files and ELF images, you can verify the memory layout, symbol placement, and generated machine code, instead of relying on assumptions.

If you are using IAR Embedded Workbench, it provides a utility called ielfdumparm that allows you to inspect both:

Individual object files (.o)
The final linked executable (.out in IAR, often .elf in GCC-based projects)

Example command
ielfdumparm --all delay.o > delay.txt

This command generates a readable text dump of the object file, making it easier to understand how the compiler and linker have interpreted your source code. You can examine the following things using the generated output.

Disassembly: Verify the exact ARM instructions generated for each function.
Symbol tables: Identify which symbols are defined locally, exported to other modules, or still unresolved.
Sections: Confirm the placement of .text, .data, .bss, the vector table, and any custom sections.
Relocation entries: Understand how symbol addresses are adjusted during the final linking stage.
Constants and literal pools: Useful when investigating unexpected Flash or RAM usage.
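Some of these checks are simple enough to automate. As one example, the sketch below inspects the first two words of a raw binary image, which on a Cortex-M device are the initial stack pointer and the reset vector. It checks that the stack pointer lies in RAM and that the reset vector has its Thumb bit set and points into Flash. The function name, the error codes, and the memory map passed in are hypothetical, and the image is assumed to be little-endian and to start with the vector table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t flash_base, flash_size;
    uint32_t ram_base,   ram_size;
} mem_map_t;

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 if the first two vector table entries look plausible,
   or a negative code identifying the first failed check. */
int check_vector_table(const uint8_t *image, size_t len, const mem_map_t *m)
{
    assert(image != NULL && m != NULL);
    if (len < 8)
        return -1;                               /* image too short */

    uint32_t sp    = read_le32(image);           /* entry 0 */
    uint32_t reset = read_le32(image + 4);       /* entry 1 */

    /* The initial SP may equal the top of RAM (one past the last byte). */
    if (sp < m->ram_base || sp - m->ram_base > m->ram_size)
        return -2;
    if ((reset & 1u) == 0u)
        return -3;                               /* Thumb bit missing */

    uint32_t entry = reset & ~1u;
    if (entry < m->flash_base || entry - m->flash_base >= m->flash_size)
        return -4;                               /* reset handler not in Flash */
    return 0;
}
```

An erased image (all 0xFF bytes) fails the first address check immediately, which makes this a quick way to catch a binary that was built or flashed with the wrong layout.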