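When you begin working with STM32 or any ARM Cortex-M microcontroller, one of the first low-level structures you encounter is the vector table. By default the core looks for it at address 0x0000_0000; on STM32 devices that boot from internal Flash, this address is typically an alias of Flash at 0x0800_0000, which is where the table physically resides.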
In most toolchains, such as IAR Embedded Workbench, Keil, or GCC, this table appears automatically, almost like magic. But in embedded systems, nothing is magic. Every byte placed in memory is intentional, and every startup step has a clear reason behind it.
As your project grows, you will eventually need to override the default startup code, modify the vector table, or take full control of the memory layout. To do this with confidence, you must understand the STM32 build process: how simple .c files are compiled, linked, and finally transformed into the binary image that runs on your microcontroller.
This article breaks down that STM32 build process in a clear, practical way, rooted in real industry experience, the kind that comes from decades of working with microcontrollers and bare-metal systems.
Host vs Target – Understanding Cross-Development:
One of the first concepts every embedded engineer must understand is the difference between the host and the target.
In desktop development, both are usually the same machine. But in embedded development—especially when working with MCUs like the STM32H5—these two environments are fundamentally different.
Host (Development Machine):
This is your PC, where you write code and use tools such as:
- IAR Embedded Workbench (EWARM)
- ARM GCC / GNU Embedded Toolchain
- CMake + GCC
- STM32CubeIDE (if applicable)
All compilation, assembling, linking, and debugging tools run on the host.
Target (Execution Platform):
The target is the actual hardware device (STM32 or any MCU) that runs your firmware. It receives a binary image built on the host and executes it in real hardware conditions.
Now the Million-Dollar Question 🤔
Why Not Compile Directly on the MCU?
It is a question every beginner eventually asks: “If the firmware runs on the MCU, why can’t the MCU compile it?”
The answer lies in the hardware limitations of microcontrollers:
- Very limited RAM (compared to a PC)
- Restricted Flash memory
- Far lower processing capability
- No OS or filesystem for hosting toolchains
A compiler is a large and complex program that requires a lot of memory, storage, and processing power, things that simply are not available on a microcontroller. That is why the compiler, assembler, and linker all run on your PC, not on the MCU itself.
What Is Cross-Compilation?
When a toolchain running on the host generates machine code for a different CPU architecture (in this case, ARM Cortex-M33), the process is called cross-compilation.
Cross-compilation is the backbone of embedded development because:
- The host and target architectures are different
- The target cannot build programs for itself
- We need binaries formatted specifically for the MCU’s memory map and instruction set
Almost every MCU firmware project you write follows this model.
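One way to see cross-compilation in action is to ask the compiler which architecture it is generating code for. The sketch below relies on predefined macros that GCC, Clang, and IAR are known to set; the function name build_target is hypothetical and exists only for illustration.

```c
/* Report the architecture this translation unit is being compiled FOR,
   which is not necessarily the machine the compiler is running ON. */
const char *build_target(void)
{
#if defined(__ICCARM__)
    return "ARM (IAR iccarm)";        /* IAR compiler for ARM */
#elif defined(__arm__)
    return "ARM 32-bit";              /* e.g. arm-none-eabi-gcc */
#elif defined(__aarch64__)
    return "AArch64";
#elif defined(__x86_64__) || defined(_M_X64)
    return "x86-64";                  /* a typical host PC */
#else
    return "unknown";
#endif
}
```

Built with the PC's native compiler, this would normally report the host architecture; built with an ARM cross toolchain, the same source reports ARM even though the compiler itself ran on the PC.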
Embedded Build Pipeline and Build Process:
The build process for embedded systems, including the STM32 series, may seem straightforward at first glance: compile your code, link it, and flash it. Behind the scenes, however, it is a multi-step process that takes your human-readable source files and turns them into the exact instructions the microcontroller executes.
High-Level Build Flow:
This is the simple sequence that converts your human-written C/C++ code into actual machine instructions that run on an STM32 microcontroller.
+------------------+     +-------------------+     +---------------------+
|   Source Files   | --> |     Compiler      | --> |  Object Files (.o)  |
|  (.c, .cpp, .s)  |     |   (iccarm, GCC)   |     |   Relocatable ELF   |
+------------------+     +-------------------+     +---------------------+
                                                             |
                                                             v
                        +------------------------------------------------+
                        |                     Linker                     |
                        |   (ilinkarm, ld) combines all .o + libraries   |
                        +------------------------------------------------+
                                                |
                                                v
                        +------------------------------------------------+
                        |          Final Executable (ELF) c.out          |
                        |      + HEX / BIN depending on conversion       |
                        +------------------------------------------------+
                                                |
                                                v
                                      Flash to STM32 Target
Let’s walk through each step.
Compiler (Compilation stage):
When you build even a simple STM32 firmware project, you typically start with a few source files, such as:
- main.c
- delay.c
- startup_stm32h563xx.s
During the compilation stage, each of these files is translated independently: the compiler handles the C files, and the assembler handles the startup .s file. This is an important design choice in all modern embedded toolchains, whether you are using IAR (iccarm), GCC (arm-none-eabi-gcc), or LLVM.
The outputs of this stage are object files, for example:
- main.o
- delay.o
These object files use the ELF (Executable and Linkable Format) and contain:
- Relocatable Machine Instructions: Raw code that has not been assigned a final address yet.
- Symbol Tables: Lists of functions and variables defined or referenced by the file.
- Sections: Groupings for different types of data (.text, .rodata, .data, .bss).
- Relocation Records: Instructions for the linker on what addresses need to be fixed later.
Understanding ELF: What’s Inside Those .o and .out Files?
If you open any .o file from IAR’s Debug\Obj folder in a plain text editor, the file starts with one unprintable byte (0x7F), immediately followed by the characters:
ELF
This four-byte signature (0x7F, 'E', 'L', 'F') identifies the ELF (Executable and Linkable Format), one of the most important and widely used file formats in embedded development. ELF is the standard used by modern toolchains, including IAR, GCC, and many others.
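The signature check can also be done in code instead of a text editor. The following is a minimal sketch, assuming a little-endian ELF file whose first 20 header bytes have already been read into a buffer; the function names are hypothetical, while the header offsets and constants (e_type at byte 16, e_machine at byte 18, type 1 for relocatable, type 2 for executable, machine 40 for ARM) come from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

enum { ELF_TYPE_REL = 1, ELF_TYPE_EXEC = 2, ELF_MACHINE_ARM = 40 };

/* Returns 1 if the buffer starts with the ELF magic bytes 0x7F 'E' 'L' 'F'. */
int is_elf(const uint8_t *buf, size_t len)
{
    return len >= 4 && buf[0] == 0x7F &&
           buf[1] == 'E' && buf[2] == 'L' && buf[3] == 'F';
}

/* e_type: 1 for a relocatable object (.o), 2 for a linked executable.
   Returns -1 if the buffer is not ELF or is too short. */
int elf_type(const uint8_t *buf, size_t len)
{
    if (!is_elf(buf, len) || len < 20)
        return -1;
    return buf[16] | (buf[17] << 8);   /* little-endian 16-bit field */
}

/* e_machine: 40 identifies the 32-bit ARM architecture. */
int elf_machine(const uint8_t *buf, size_t len)
{
    if (!is_elf(buf, len) || len < 20)
        return -1;
    return buf[18] | (buf[19] << 8);
}
```

Fed the header of main.o, elf_type would be expected to return ELF_TYPE_REL, while the linked image should report ELF_TYPE_EXEC. That difference is the "relocatable versus final" distinction the rest of the article builds on.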
What an ELF File Contains:
An ELF file is much more than raw machine code. It includes several structured sections:
- .text – program code
- .data – initialized global/static variables
- .bss – zero-initialized globals/statics
- .rodata – constants and string literals
- .debug_* – debugging info for stepping through code
- Symbol tables – list of functions, variables, addresses
- Relocation tables – information the linker uses to fix addresses
Because an ELF file contains many additional sections, such as debug data, symbol tables, and relocation information, the size of a .o file is usually much larger than the actual machine code that will run on the MCU.
Your program’s real flash and RAM usage becomes accurate only after the linker finishes combining all object files. This information is captured in the linker map file, which shows the final memory layout as it will appear on the STM32 device.
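To connect the section names above to source code, here is a small sketch of how typical C declarations usually end up in each section. The variable and function names are hypothetical, and the exact placement can vary with toolchain and options, so treat the comments as the common case rather than a guarantee.

```c
#include <stdint.h>

/* .rodata: constant data, normally kept in Flash */
const uint32_t baud_table[3] = { 9600u, 57600u, 115200u };

/* .data: lives in RAM, but its initial value (10) is stored in Flash
   and copied to RAM by the startup code before main() runs */
uint32_t tick_period_ms = 10u;

/* .bss: lives in RAM and is zeroed by the startup code;
   it normally occupies no space in the Flash image */
uint32_t tick_count;

/* .text: the machine instructions of this function */
uint32_t advance_ticks(void)
{
    tick_count += tick_period_ms;
    return tick_count;
}
```

After linking, the map file should list these symbols under the corresponding sections, which is a quick way to confirm where each one actually landed.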
What “Relocatable Code” Really Means:
A very important part of the embedded build process, especially on MCUs like the STM32, is the idea of relocatable code. This is why the code inside a .o file is not the final machine code that runs on the chip. The final executable (c.out) always looks different.
Let’s understand this with a simple example.
Calling a Function: What Actually Happens
Suppose main.c contains the following function call:
delay_ms(100);
The compiler converts this into a BL (Branch with Link) instruction.
But at compile time, the compiler does not know:
- where delay_ms() will finally be placed in Flash
- what the exact address of the call instruction will be
- how the linker will arrange all functions
Because the final position of delay_ms() is unknown, the compiler cannot calculate the real branch offset yet.
So instead, it places a placeholder value in the .o file. For example, IAR often uses:
0x07FFFFFE
This means: “The linker will fill in the correct value later.”
The Linker Fixes the Instruction
During linking, all code sections are assigned their final addresses according to the linker script.
Now the linker knows exactly:
- where delay_ms() is located
- where the BL instruction in main() is located
So, it calculates the correct branch offset:
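offset = address(delay_ms) − address(current_PC)
Here current_PC is the PC value the core sees while executing the BL. In Thumb state this is the address of the BL instruction plus 4, so the offset is measured from 4 bytes past the call site.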
Using this, the linker rewrites the placeholder with a proper Thumb-2 encoded BL instruction.
This is why:
- The BL instruction in main.o looks incomplete or incorrect.
- The BL instruction in c.out is fully resolved and executable.
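The fix-up described above can be sketched in C. This is a simplified illustration of the arithmetic, not the code of any real linker: the function names are hypothetical, and the bit layout follows the Thumb-2 BL encoding (T1) from the ARM architecture manual, where the two halfwords carry a sign bit, two derived bits J1 and J2, and the 10-bit and 11-bit immediate fields.

```c
#include <stdint.h>

/* Branch offset: target minus the PC value seen by the BL, which in
   Thumb state is the address of the BL instruction plus 4. */
int32_t bl_offset(uint32_t bl_addr, uint32_t target_addr)
{
    return (int32_t)(target_addr - (bl_addr + 4u));
}

/* Encode a Thumb-2 BL (encoding T1) as two 16-bit halfwords.
   The offset must be even and within roughly +/-16 MB. */
void encode_thumb_bl(int32_t offset, uint16_t *hw1, uint16_t *hw2)
{
    uint32_t imm   = (uint32_t)offset;
    uint32_t s     = (imm >> 24) & 1u;       /* sign bit */
    uint32_t i1    = (imm >> 23) & 1u;
    uint32_t i2    = (imm >> 22) & 1u;
    uint32_t imm10 = (imm >> 12) & 0x3FFu;   /* upper offset bits */
    uint32_t imm11 = (imm >> 1)  & 0x7FFu;   /* lower offset bits */
    uint32_t j1    = (~(i1 ^ s)) & 1u;       /* J1 = NOT(I1 XOR S) */
    uint32_t j2    = (~(i2 ^ s)) & 1u;       /* J2 = NOT(I2 XOR S) */

    *hw1 = (uint16_t)(0xF000u | (s << 10) | imm10);
    *hw2 = (uint16_t)(0xD000u | (j1 << 13) | (j2 << 11) | imm11);
}
```

For example, with a BL at the hypothetical address 0x08000100 calling a function at 0x08000200, bl_offset gives 252 and the encoder produces the halfwords 0xF000 and 0xF87E. An offset of −4 encodes as 0xF7FF 0xFFFE.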
How the Linker Resolves Symbols:
When you build an STM32 project, the compiler generates multiple object files (.o). Each of these object files contains information about the symbols it defines and the symbols it needs from somewhere else.
To understand the linker, you only need to understand these two categories.
1. Exported Symbols
These are the symbols an object file defines. The linker can treat them as “available resources.”
- Functions: main, delay_ms, init_uart
- Global variables: adcBuffer, timerConfig
- Special runtime symbols: __vector_table
2. Imported Symbols
These are symbols that the file uses but does not define. The linker must find them elsewhere.
- Library functions: memcpy, printf
- Functions from other modules: delay_ms
- Startup/runtime symbols: __iar_program_start
How the Linker Resolves Everything
When linking begins, the linker maintains two lists:
- Defined (exported) symbols
- Undefined (imported) symbols
It then walks through all object files and libraries. For every undefined symbol, it tries to find a matching exported symbol from any other file.
This process continues until:
- all symbols are resolved, or
- a symbol remains unresolved → linker error
This is how the final executable becomes a fully connected program.
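The two-list idea can be modeled in a few lines of C. This is a toy sketch, not how a production linker is implemented: the struct layout, the limit of four symbols per list, and the function names are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4

struct object_file {
    const char *name;
    const char *defined[MAX_SYMS];    /* exported symbols, NULL-terminated */
    const char *undefined[MAX_SYMS];  /* imported symbols, NULL-terminated */
};

/* Returns 1 if any object file exports the given symbol. */
static int is_defined(const struct object_file *objs, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMS && objs[i].defined[j] != NULL; j++)
            if (strcmp(objs[i].defined[j], sym) == 0)
                return 1;
    return 0;
}

/* Walks every imported symbol and counts those with no matching export.
   A non-zero result corresponds to an "undefined symbol" linker error. */
size_t count_unresolved(const struct object_file *objs, size_t n)
{
    size_t unresolved = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < MAX_SYMS && objs[i].undefined[j] != NULL; j++)
            if (!is_defined(objs, n, objs[i].undefined[j]))
                unresolved++;
    return unresolved;
}
```

If main.o imports delay_ms and memcpy while delay.o exports only delay_ms, the count is 1: memcpy stays unresolved until a library that defines it is added to the link.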
Memory Placement:
Just writing correct code does not finish your work. The other important and often invisible task is deciding where that code and its data will reside in the microcontroller’s memory.
Don’t worry about it 🤔.
This responsibility is handled by the linker, which follows a configuration file called the linker script (*.icf in IAR, *.ld in GCC).
The linker is the Architect of Memory. While the compiler translates code into instructions, the linker determines the final physical memory map of your firmware inside the microcontroller. Without a correct linker script, even perfectly written code will fail to boot or silently corrupt memory.
Let’s understand what a linker script is.
What Is a Linker Script and What Does It Really Do:
A linker script is a configuration file that tells the linker how to arrange your program in memory. It defines where each part of your firmware—code, data, stack, and heap—will be placed inside the microcontroller.
The compiler only converts source code into machine instructions. It does not know where Flash or RAM is located on the device. The linker script fills this gap by describing the memory layout of the microcontroller and mapping program sections to the correct memory regions.
The linker script is responsible for the following:
- Placing program code (.text) in Flash memory
- Allocating data sections (.data, .bss) in RAM
- Defining the vector table location required during CPU startup
- Setting stack and heap boundaries
- Creating custom memory regions, such as bootloaders, external Flash, or retained memory
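The placement job in the list above can be approximated with a small model in C. This is only a sketch of the idea, not real linker behaviour: the struct, the function name, and the region sizes in the usage note are hypothetical; only the common STM32 base addresses (Flash at 0x08000000, SRAM at 0x20000000) reflect real devices.

```c
#include <stdint.h>

/* A memory region as a linker script would describe it. */
struct region {
    uint32_t origin;  /* start address */
    uint32_t length;  /* size in bytes */
    uint32_t used;    /* bytes already allocated */
};

/* Places a section of `size` bytes at the next free address that meets
   `align` (must be a power of two). On success stores the address in
   *addr and returns 1; returns 0 if the section does not fit. */
int place_section(struct region *r, uint32_t size, uint32_t align, uint32_t *addr)
{
    uint32_t start = r->origin + r->used;
    start = (start + align - 1u) & ~(align - 1u);    /* round up to alignment */

    uint32_t end_offset = (start - r->origin) + size;
    if (end_offset > r->length)
        return 0;                                    /* region overflow */

    *addr = start;
    r->used = end_offset;
    return 1;
}
```

With a Flash region defined as { 0x08000000, 0x200, 0 }, placing a 0x100-byte section succeeds at 0x08000000, while a following 0x101-byte section is rejected because it would overflow the region. A real linker reports that situation as an error instead of silently truncating the image.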
Inspecting Object Files and Final Images:
Up to this point, we have followed the firmware build journey from source code → object files → final ELF image. The next essential step is to inspect these build outputs and confirm that the toolchain has produced exactly what we intended.
In embedded systems, particularly on modern MCUs such as STM32, this step is not optional. It is often the quickest way to diagnose startup failures, missing interrupts, or unexpected memory growth.
You might be wondering why inspecting build outputs matters. Here is why.
Embedded firmware runs in a highly constrained environment:
- No operating system protection
- Fixed and limited memory
- Strict startup and security requirements
A small mistake in the startup code or linker script can lead to silent failures, such as boot issues or unpredictable behavior. By inspecting object files and ELF images, you can verify the memory layout, symbol placement, and generated machine code, instead of relying on assumptions.
If you are using IAR Embedded Workbench, it provides a utility called ielfdumparm that allows you to inspect both:
- Individual object files (.o)
- The final linked executable (.out)
Example command
ielfdumparm --all delay.o > delay.txt
This command generates a readable text dump of the object file, making it easier to understand how the compiler and linker have interpreted your source code. You can examine the following in the generated output:
- Disassembly: Verify the exact ARM instructions generated for each function.
- Symbol tables: Identify which symbols are defined locally, exported to other modules, or still unresolved.
- Sections: Confirm the placement of .text, .data, .bss, the vector table, and any custom sections.
- Relocation entries: Understand how symbol addresses are adjusted during the final linking stage.
- Constants and literal pools: Useful when investigating unexpected Flash or RAM usage.