Toolchains & Cross-Compilation

Quick Cap

A toolchain is the bundle of programs you need to turn source code into an executable for a target machine. For embedded work, that target is almost never the machine you're typing on — your build host is x86-64 Linux/macOS/Windows, but the target is an ARM Cortex-M or RISC-V MCU. Cross-compilation is the practice of running a build on one architecture to produce binaries for a different one. Interviewers test whether you understand which programs make up the toolchain (gcc, binutils, libc), how target triplets work, and which flags pin a build to specific hardware (-mcpu, -mfpu, -mfloat-abi).

Key Facts:

Cross-compilation: build host ≠ target architecture (e.g., x86-64 → ARMv7-M)
Three components: compiler (gcc), binary utilities (binutils: as, ld, objdump, ...), C library (newlib / newlib-nano / picolibc / glibc)
Target triplet: arch-vendor-os-abi, with fields often omitted, e.g., arm-none-eabi (ARM, vendor omitted, no OS, embedded ABI)
Sysroot: a directory tree of headers and libraries representing the target environment
Hardware pinning flags: -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard — must match across all TUs and libraries
Spec files (--specs=nano.specs): switch between newlib variants without changing the toolchain install

Deep Dive

At a Glance

Component	What it does	Common embedded form
Compiler	Translates C/C++ to target assembly	`arm-none-eabi-gcc`
Assembler	Assembly to object files	`arm-none-eabi-as` (part of binutils)
Linker	Links objects into executables	`arm-none-eabi-ld` (part of binutils)
C library	`printf`, `memcpy`, `malloc`, syscalls stubs	newlib / newlib-nano / picolibc
C++ runtime	Exception handling, RTTI, STL	libstdc++ (often left out for size)
Math library	`sin`, `cos`, IEEE float helpers	libm (often part of newlib)
Inspection tools	`objdump`, `nm`, `readelf`, `size`	binutils suite

A common pre-built toolchain like Arm GNU Toolchain (formerly GCC ARM Embedded) bundles all of these as arm-none-eabi-* executables.

What "Cross-Compilation" Actually Means

A native compiler runs on machine X and produces binaries for machine X. A cross-compiler runs on machine X (the host) and produces binaries for machine Y (the target). For embedded work this is universal — you never type code on a Cortex-M, so you cross-compile from your laptop.

The cross-compiler must know:

The target ISA — ARM Thumb-2? RISC-V RV32IMC? Xtensa?
The calling convention / ABI — how function arguments are passed, how the stack is laid out
The target's runtime environment — bare metal? Linux user space?
Which C library is in use — and where its headers and code live

The cross-compiler is itself just gcc with different default settings. The same gcc source code, configured for arm-none-eabi, builds an ARM cross-compiler. Configured for x86_64-linux-gnu, it builds a native compiler. The toolchain name encodes its target.

Target Triplets

A target triplet is a string like arm-none-eabi or riscv32-unknown-elf that describes the target environment. The format is:

Diagram: Target Triplet Anatomy

 arch  -  vendor  -  os  -  abi
 |        |          |       |
 |        |          |       └── ABI: eabi, gnu, gnueabi, gnueabihf, ...
 |        |          └── OS: linux, none (bare metal), elf, ...
 |        └── Vendor: typically 'none' or 'unknown' for embedded
 └── Architecture: arm, aarch64, riscv32, riscv64, xtensa, ...

arch-vendor-os-abi: e.g., arm-none-eabi means ARM, bare metal (no OS), embedded ABI; the vendor field is omitted.

Triplet	Meaning
`arm-none-eabi`	ARM Cortex-M / Cortex-R / Cortex-A bare metal — no OS, embedded ABI
`aarch64-none-elf`	ARM 64-bit bare metal
`arm-linux-gnueabihf`	ARM 32-bit Linux user space, hard-float ABI
`riscv32-unknown-elf`	32-bit RISC-V bare metal
`xtensa-esp32-elf`	Xtensa ESP32 bare metal
`x86_64-pc-linux-gnu`	x86-64 Linux user space (your laptop)

The triplet determines the prefix of every tool: arm-none-eabi-gcc, arm-none-eabi-objdump, etc. This lets you have multiple toolchains installed side-by-side without conflict.
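
To make the field layout concrete, here is a small Python sketch that splits a triplet into its parts and derives a tool name from it. The `KNOWN_OS` set and the three-field rule are simplifying assumptions for illustration; this is not how GNU `config.sub` canonicalizes triplets, and `parse_triplet` and `tool_name` are hypothetical helper names.

```python
from typing import NamedTuple, Optional

# Second-field values treated as an OS rather than a vendor.
# Assumption for illustration only; not GNU config.sub's real table.
KNOWN_OS = {"none", "linux"}


class Triplet(NamedTuple):
    arch: str
    vendor: Optional[str]
    os: Optional[str]
    abi: Optional[str]


def parse_triplet(text: str) -> Triplet:
    """Split a target triplet into arch/vendor/os/abi with a simple heuristic."""
    parts = text.split("-")
    if len(parts) == 4:
        return Triplet(*parts)
    if len(parts) == 3:
        arch, second, third = parts
        if second in KNOWN_OS:
            # Vendor omitted: arm-none-eabi, arm-linux-gnueabihf
            return Triplet(arch, None, second, third)
        # ABI omitted: riscv32-unknown-elf, xtensa-esp32-elf
        return Triplet(arch, second, third, None)
    raise ValueError(f"unsupported triplet: {text!r}")


def tool_name(triplet: str, tool: str) -> str:
    """Every tool in a cross toolchain carries the triplet as its prefix."""
    return f"{triplet}-{tool}"
```

For example, `parse_triplet("arm-none-eabi")` yields arch `arm`, no vendor, OS `none`, ABI `eabi`, and `tool_name("arm-none-eabi", "objdump")` gives the usual `arm-none-eabi-objdump`.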

libc Choices for Embedded

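The standard C library is bigger than most embedded engineers expect. With full newlib, a naive printf("hello\n") can pull in roughly 25-40 KB of code, a large bite out of a 256 KB Flash part and more than some small MCUs have in total. glibc starts above 100 KB and is not an option on bare metal. Embedded toolchains therefore ship with smaller alternatives: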

libc	Size for hello-world	Features	Typical use
newlib	~25-40 KB	Full printf, math, malloc, locale	Default in arm-none-eabi
newlib-nano	~5-10 KB	Stripped printf (no float by default), simpler malloc	Most embedded projects
picolibc	~3-8 KB	Modern fork; modular, no syscalls in libc, better C11/C17	New projects; preferred when supported
glibc	100+ KB minimum	Full POSIX	Embedded Linux only — never bare metal
musl	~20-30 KB	POSIX, MIT-licensed	Embedded Linux alternative to glibc

Switching newlib variants in arm-none-eabi only takes a spec file passed to the gcc driver:

```text
arm-none-eabi-gcc ... --specs=nano.specs ...      # newlib-nano
arm-none-eabi-gcc ... --specs=nosys.specs ...     # link libnosys: placeholder syscall stubs that simply fail
```

To enable float printf in newlib-nano (it's off for size by default):

```text
arm-none-eabi-gcc ... --specs=nano.specs -u _printf_float ...
```
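
A build script can keep this choice in one place. The sketch below is a hypothetical helper (`libc_flags` is not part of any toolchain) that returns the extra driver flags for the two newlib variants discussed above, using only the options shown in this section.

```python
from typing import List


def libc_flags(variant: str, float_printf: bool = False) -> List[str]:
    """Return extra gcc driver flags for a newlib variant (illustrative helper)."""
    if variant == "newlib":
        # Full newlib is the default and already formats floats in printf.
        return []
    if variant == "newlib-nano":
        flags = ["--specs=nano.specs"]
        if float_printf:
            # Force the float-formatting code into the link.
            flags += ["-u", "_printf_float"]
        return flags
    raise ValueError(f"unknown libc variant: {variant!r}")


link_flags = libc_flags("newlib-nano", float_printf=True)
```

Passing the same list to every link step, including third-party static libraries you build yourself, keeps the libc choice consistent across the project.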

⚠️ Pick your libc once and stick with it

You cannot mix object files compiled against different libc variants in the same link. Pick newlib-nano or picolibc up front and configure your build system to use it consistently — including for any third-party libraries you're statically linking.

Sysroot

A sysroot is a directory tree that mirrors the target's filesystem layout and contains the headers (include/) and libraries (lib/) the cross-compiler should use. For bare-metal targets the sysroot is small (little more than the libc headers and libraries); for embedded Linux (e.g., a Yocto SDK) it can mirror an entire root filesystem.

Diagram: Sysroot Layout

 sysroot/
   include/       ←  stdio.h, stdint.h, target-specific headers
   lib/           ←  libc.a, libm.a, libgcc.a
     thumb/v7e-m+fp/hard/    ←  multilib variants per ABI

Headers in include/, libraries in lib/, with multilib subdirectories per ABI.

The compiler finds the sysroot via --sysroot=/path/to/sysroot or the toolchain's default. Multilib is the trick that lets one toolchain support many CPU/ABI combinations: the sysroot has subdirectories for each variant (thumb/v7e-m/, thumb/v7e-m+fp/hard/, etc.), and the compiler picks the right one based on -mcpu/-mfpu/-mfloat-abi.
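
You can ask the compiler which multilib directory it would pick for a given flag set. The Python sketch below builds that query around GCC's `-print-multi-directory` option and runs it only if the compiler is actually installed. The helper names are hypothetical, and the directory string you get back depends on how your toolchain was built.

```python
import shutil
import subprocess
from typing import List, Optional


def multilib_query(cc: str, target_flags: List[str]) -> List[str]:
    """Build the argv that asks gcc which multilib directory the flags select."""
    return [cc, *target_flags, "-print-multi-directory"]


def multilib_directory(cc: str, target_flags: List[str]) -> Optional[str]:
    """Run the query, or return None when the compiler is not on PATH."""
    if shutil.which(cc) is None:
        return None
    result = subprocess.run(
        multilib_query(cc, target_flags),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


m4f_flags = ["-mcpu=cortex-m4", "-mthumb", "-mfpu=fpv4-sp-d16", "-mfloat-abi=hard"]
argv = multilib_query("arm-none-eabi-gcc", m4f_flags)
```

Running `multilib_directory("arm-none-eabi-gcc", m4f_flags)` on a machine with the toolchain installed should print a path in the style of the layout above; if it prints `.`, the toolchain has no dedicated multilib for that combination.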

Hardware-Pinning Flags

These flags tell the compiler exactly what hardware it's generating code for. They must match across every translation unit and every library you link — mixing them causes obscure failures (corrupted floats, wrong stack offsets, hard faults).

Flag	Purpose	Example values
`-mcpu=`	Target CPU model	`cortex-m0`, `cortex-m4`, `cortex-m7`, `cortex-a53`
`-march=`	Target ISA (alternative to -mcpu)	`armv7e-m`, `armv8-m.main`, `rv32imc`
`-mthumb`	Use Thumb instruction set (Cortex-M)	(boolean)
`-mfpu=`	FPU model	`fpv4-sp-d16`, `fpv5-sp-d16`, `fpv5-d16`, `auto`
`-mfloat-abi=`	How floats are passed	`soft`, `softfp`, `hard`

The float-ABI choice is the trickiest:

`-mfloat-abi=`	Float operations	Function arguments	Notes
`soft`	Software emulation	Integer registers	Works without FPU; slowest
`softfp`	Hardware FPU (if -mfpu set)	Integer registers	FPU used internally; ABI-compatible with soft
`hard`	Hardware FPU	Floating-point registers	Fastest; must match across the whole binary
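
The difference between the last two rows is purely about where an argument travels. The toy Python model below treats the register file as a dictionary and shows what a callee sees when it looks in the wrong place. It is a simulation for intuition only: the register names follow the AAPCS convention for the first float argument (`s0`) and first integer argument (`r0`), and the stale value in `r0` is a made-up address.

```python
import struct


def float_to_bits(value: float) -> int:
    """Encode a Python float as the 32-bit pattern of an IEEE 754 single."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE 754 single."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def pass_arg_hard(regs: dict, value: float) -> None:
    """hard ABI caller: the first float argument goes in FP register s0."""
    regs["s0"] = float_to_bits(value)


def read_arg_hard(regs: dict) -> float:
    """hard ABI callee: reads the argument from s0."""
    return bits_to_float(regs["s0"])


def read_arg_softfp(regs: dict) -> float:
    """soft/softfp ABI callee: expects the argument in integer register r0."""
    return bits_to_float(regs["r0"])


# r0 still holds whatever was there before (a hypothetical SRAM address).
regs = {"r0": 0x20000400, "s0": 0}
pass_arg_hard(regs, 1.5)

matched = read_arg_hard(regs)       # caller and callee agree on the ABI
mismatched = read_arg_softfp(regs)  # callee reinterprets stale r0 bits
```

With matching ABIs the callee gets back exactly what was passed; with a mismatch it gets whatever bit pattern happened to be in `r0`, which is why the symptom looks like a math bug rather than a build bug.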

A typical Cortex-M4F line:

```text
-mcpu=cortex-m4 -mthumb -mfpu=fpv4-sp-d16 -mfloat-abi=hard
```

A Cortex-M0+ line (no FPU):

```text
-mcpu=cortex-m0plus -mthumb -mfloat-abi=soft
```
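
Because these flags have to agree in every translation unit, it can be worth checking them mechanically. Here is a minimal Python sketch that scans per-file compile commands and reports any pinning flag with more than one value in use. The input shape (a dict of source file to argv list) and the file names are assumptions for illustration; a real project might feed it from a `compile_commands.json`.

```python
from collections import defaultdict
from typing import Dict, List

PINNING_PREFIXES = ("-mcpu=", "-mfpu=", "-mfloat-abi=")


def pinning_flags(command: List[str]) -> Dict[str, str]:
    """Extract the value of each hardware-pinning flag from one command line."""
    found: Dict[str, str] = {}
    for arg in command:
        for prefix in PINNING_PREFIXES:
            if arg.startswith(prefix):
                found[prefix] = arg[len(prefix):]
    return found


def find_mismatches(commands: Dict[str, List[str]]) -> Dict[str, Dict[str, List[str]]]:
    """Map each disagreeing flag to {value: [files using that value]}."""
    seen = defaultdict(lambda: defaultdict(list))
    for source, command in commands.items():
        flags = pinning_flags(command)
        for prefix in PINNING_PREFIXES:
            seen[prefix][flags.get(prefix, "<unset>")].append(source)
    return {p: dict(values) for p, values in seen.items() if len(values) > 1}


# Hypothetical project: dsp.c was accidentally built with softfp.
commands = {
    "main.c": ["arm-none-eabi-gcc", "-mcpu=cortex-m4", "-mthumb",
               "-mfpu=fpv4-sp-d16", "-mfloat-abi=hard", "-c", "main.c"],
    "dsp.c": ["arm-none-eabi-gcc", "-mcpu=cortex-m4", "-mthumb",
              "-mfpu=fpv4-sp-d16", "-mfloat-abi=softfp", "-c", "dsp.c"],
}
mismatches = find_mismatches(commands)
```

An empty result means every file agrees. This only covers code you compile yourself; prebuilt libraries need the attribute check described in the next section.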

Toolchain Mismatch Failures

When a build pulls in a precompiled library or a third-party object file with different -mcpu/-mfloat-abi/-mfpu flags than your project, you get one of these:

Symptom	Likely cause
Linker error: "X uses VFP register arguments, Y does not"	-mfloat-abi mismatch (hard vs soft/softfp)
Linker error about conflicting CPU architectures or ARM/Thumb attributes	-mcpu/-march/-mthumb mismatch
Runtime: corrupted floats, NaN where you expect numbers	softfp vs hard call across module boundaries
Runtime: HardFault on first FPU instruction	FPU not enabled at startup (CPACR CP10/CP11 access bits not set) but code uses FPU instructions
Runtime: stack corruption, weird local variables	Mixed thumb/arm code

The fix is almost always "rebuild the third-party blob with matching flags" or "demand a matching prebuilt from the vendor."

💡 Verify with readelf

readelf -A your.elf prints the ARM-specific attributes section, including float ABI, FPU, and ISA. The linker rejects many attribute conflicts between input objects, but not all of them; this command lets you check the final ELF and any input .o directly.
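
If you want to automate that check, the attribute dump is easy to parse. The Python sketch below reads `readelf -A` style text into a dictionary and answers one question: does this object pass floats in FP registers? The `SAMPLE` text is a trimmed, illustrative dump for a Cortex-M4F hard-float object, not captured from a real build; actual output has more tags and can vary between binutils versions.

```python
from typing import Dict

# Illustrative, trimmed attribute dump (assumed shape of `readelf -A` output).
SAMPLE = """\
Attribute Section: aeabi
File Attributes
  Tag_CPU_name: "Cortex-M4"
  Tag_CPU_arch: v7E-M
  Tag_THUMB_ISA_use: Thumb-2
  Tag_FP_arch: VFPv4-D16
  Tag_ABI_VFP_args: VFP registers
"""


def parse_attributes(text: str) -> Dict[str, str]:
    """Collect 'Tag_name: value' lines into a dict, stripping quotes."""
    attrs: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Tag_") and ":" in line:
            key, value = line.split(":", 1)
            attrs[key] = value.strip().strip('"')
    return attrs


def passes_floats_in_fp_registers(attrs: Dict[str, str]) -> bool:
    """True for hard-float objects; soft and softfp objects omit this tag."""
    return attrs.get("Tag_ABI_VFP_args") == "VFP registers"


attrs = parse_attributes(SAMPLE)
is_hard_float = passes_floats_in_fp_registers(attrs)
```

Running this over every object in a link and flagging any file whose answer differs from the rest catches the silent cases the linker misses.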

LLVM/Clang and Other Toolchains

GCC dominates embedded but is not the only option:

LLVM/Clang (clang --target=arm-none-eabi) is increasingly viable for ARM. Code size is competitive with gcc and sometimes smaller, diagnostics are better, and the lld linker is typically much faster than ld. CMSIS works fine.
IAR Compiler — proprietary, popular in safety-critical and automotive (formal certifications); very different command-line.
Keil ArmCC — was Arm's proprietary compiler; deprecated in favor of armclang (LLVM-based).
SEGGER Compiler — bundled with Embedded Studio.

For interview purposes, gcc is the assumed default. Mention LLVM if asked about alternatives.

Debugging Story: The Mystery NaN

A team's sensor processing started producing NaN outputs after they linked in a third-party DSP library. The math was correct in unit tests but broken on hardware. Investigation showed the library was prebuilt with -mfloat-abi=softfp while the rest of the firmware was -mfloat-abi=hard. The linker hadn't complained because the library's outer interface was integer-only, but internally the library made calls with float arguments that crossed into hard-float code. One side of each call put the floats in FP registers (hard ABI) while the other side looked for them in integer registers (softfp ABI), so the callee read garbage and produced NaN.

The fix was a rebuild of the library with -mfloat-abi=hard. The detection method was readelf -A on each .o in the link — it showed the float-ABI tag mismatch immediately.

The lesson: Hardware-pinning flags must match everywhere. When the linker stays silent about a mismatch, runtime behavior gets weird in ways that look like math bugs but are actually ABI bugs.

What Interviewers Want to Hear

You can define cross-compilation and explain why embedded requires it
You know the three pieces of a toolchain: compiler, binutils, libc
You can decode a target triplet
You can compare newlib, newlib-nano, picolibc and pick one for a target
You know the role of -mcpu, -mfpu, -mfloat-abi and that they must match across the whole binary
You can name at least one detection technique for ABI mismatches (readelf -A)

Interview Focus

Classic Interview Questions

Q1: "What is cross-compilation and why is it required for embedded development?"

Model Answer Starter: "Cross-compilation is producing binaries for a target architecture different from the build host. Embedded requires it because you almost never have a development environment running on the target MCU — you write code on x86-64 Linux/macOS/Windows and produce ARM, RISC-V, or Xtensa binaries. The cross-compiler is just gcc (or clang) configured to default to a different ISA, ABI, and runtime environment. The tools are usually prefixed by the target triplet, like arm-none-eabi-gcc for ARM bare metal."

Q2: "Walk me through the components of an embedded toolchain."

Model Answer Starter: "Three things: the compiler — gcc — which produces target assembly. Binutils, the suite of binary tools: the assembler as, linker ld, and inspection tools objdump, nm, readelf, size. And a C library — for embedded ARM that's typically newlib, newlib-nano, or picolibc, each making different size/feature trade-offs. Optionally a C++ runtime (libstdc++) and math library (libm). The arm-none-eabi-gcc package bundles all of these as a single install."

Q3: "What does the target triplet arm-none-eabi mean?"

Model Answer Starter: "It describes the target environment using the arch-vendor-os-abi scheme with the vendor field left out: architecture (ARM), OS (none, meaning bare metal, so there's no OS to call), and ABI (eabi, the Embedded Application Binary Interface). Compare with arm-linux-gnueabihf, which is ARM 32-bit Linux user space with hard-float, or aarch64-none-elf for ARM 64-bit bare metal. The triplet determines the prefix of every tool in the toolchain and lets multiple toolchains coexist."

Q4: "Compare newlib, newlib-nano, and picolibc. When would you pick each?"

Model Answer Starter: "All three are libc implementations targeted at embedded. Newlib is the default in arm-none-eabi but quite large — a hello-world is 25-40 KB. Newlib-nano is a smaller variant with stripped-down printf (no float by default) and simpler malloc, typically 5-10 KB; this is what most embedded projects actually use, switched in via --specs=nano.specs. Picolibc is a newer, modern fork, even smaller (3-8 KB), more modular, and it doesn't ship syscalls in libc itself — you provide them. I'd pick newlib-nano as a safe default, picolibc for new projects targeting very tight Flash, and full newlib only when I genuinely need its features and have the Flash for it."

Q5: "What does -mfloat-abi=hard mean and why does it have to match across the whole project?"

Model Answer Starter: "-mfloat-abi=hard says: pass floating-point arguments and return values in the FPU's floating-point registers. Compare with soft which emulates floats entirely in integer code, and softfp which uses the FPU for math but still passes floats through integer registers for compatibility. The reason it must match everywhere is that it's a calling-convention choice. If a caller compiled hard passes a float in S0 but the callee compiled softfp reads it from R0, the callee gets garbage. The linker catches some mismatches but not all; runtime symptoms are corrupted floats and NaN. readelf -A on each .o confirms the ABI tag."

Trap Alerts

Don't say: "I just use the toolchain that came with my IDE." Interviewers want engineering reasoning, not button-pressing.
Don't forget: glibc is for Linux — never bare metal. Confusing the two reveals lack of embedded experience.
Don't ignore: Float-ABI mismatches don't always show as linker errors — they sometimes silently corrupt floats at runtime

Follow-up Questions

"How do you build a cross-compiler from source? (crosstool-ng, buildroot)"
"What is --specs=nosys.specs and when would you use it?"
"How does _sbrk get implemented in a bare-metal newlib environment?"
"What is multilib?"
"Why might Clang produce smaller code than gcc for an embedded target?"
"How do you switch a project from newlib to picolibc?"

Practice

❓ What does the 'eabi' in 'arm-none-eabi' indicate?

❓ Which libc variant is the typical default for size-constrained embedded ARM projects?

❓ A project uses `-mfloat-abi=hard` but links a precompiled library built with `-mfloat-abi=softfp`. What's the most likely failure?

❓ What is a sysroot?

❓ To enable float support in newlib-nano's printf, what extra option is required?

Real-World Tie-In

Multi-Architecture CI Build — A project supporting both Cortex-M0+ (sensor node) and Cortex-M7 (gateway) maintains separate build configurations: M0+ uses newlib-nano + -mfloat-abi=soft, M7 uses newlib-nano + -mfpu=fpv5-d16 -mfloat-abi=hard. CMake toolchain files encode the right combination per target so a single cmake --preset=m0 or cmake --preset=m7 invocation gets it right.
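
The per-target split described above can be sketched as a small preset table. This Python version is only an illustration of the idea (the project in the example uses CMake toolchain files); the preset names mirror `m0` and `m7`, and `-mcpu=cortex-m7` is an assumption inferred from the target name.

```python
from typing import Dict, List

# Flags shared by both targets in this example: Thumb code, newlib-nano.
COMMON_FLAGS: List[str] = ["-mthumb", "--specs=nano.specs"]

# Per-target hardware-pinning flags (the -mcpu value for m7 is assumed).
PRESETS: Dict[str, List[str]] = {
    "m0": ["-mcpu=cortex-m0plus", "-mfloat-abi=soft"],
    "m7": ["-mcpu=cortex-m7", "-mfpu=fpv5-d16", "-mfloat-abi=hard"],
}


def flags_for(preset: str) -> List[str]:
    """Return the full flag list for one target preset."""
    if preset not in PRESETS:
        raise KeyError(f"unknown preset: {preset!r}")
    return PRESETS[preset] + COMMON_FLAGS


m0_flags = flags_for("m0")
m7_flags = flags_for("m7")
```

Keeping the table in one place means a new translation unit or library inherits the right combination automatically instead of copying flags by hand.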

Binary Compatibility Audit Before Vendor SDK Integration — Before integrating a vendor's BLE stack as a precompiled .a, an engineer extracted its objects with ar -x and ran arm-none-eabi-readelf -A on each one. All objects showed Tag_FP_arch: VFPv4-D16 and Tag_ABI_VFP_args: VFP registers, matching the project's -mcpu=cortex-m4 -mfpu=fpv4-sp-d16 -mfloat-abi=hard settings. The integration succeeded on the first try.

Toolchain Migration to Picolibc — A team migrated from newlib-nano to picolibc on a heavily Flash-constrained sensor. Hello-world dropped from 7 KB to 4 KB. The migration required implementing _sbrk, _write, and _exit syscall stubs themselves (picolibc doesn't ship default ones) and updating linker invocation to use --oslib=semihost for debug-time printf via the JTAG.
