How would you detect a stack overflow on a bare-metal system?

Stack overflow detection on a bare-metal system is challenging because most Cortex-M processors (M0, M3, M4) have no hardware stack limit checking; dedicated stack limit registers (MSPLIM/PSPLIM) only arrived with Armv8-M cores such as the Cortex-M33. The stack grows downward, so on overflow the stack pointer simply decrements past the allocated region and silently corrupts whatever memory lies below it (often the heap, .bss, or .data sections). The corruption may not cause an immediate fault, instead manifesting as intermittent data corruption, seemingly random variable changes, or crashes that occur long after the actual overflow. This makes stack overflows among the hardest embedded bugs to diagnose.

Stack painting (also called stack watermarking) is the simplest and most widely used technique. At startup, fill the entire stack region with a known sentinel pattern, typically 0xDEADBEEF or 0xCCCCCCCC. Periodically (or at a debug breakpoint), scan upward from the lowest address of the stack region to find the first non-sentinel value: this is the high-water mark, showing the maximum stack depth ever reached. The distance between the high-water mark and the lowest stack address is the remaining margin. If the margin is zero, the stack has been filled completely and has almost certainly overflowed; the scan cannot report by how much, because writes below the region are not tracked. Apart from the one-time fill at startup, this technique adds no runtime overhead during normal operation (the scan only runs when you ask for it), but it is retrospective: it tells you the stack overflowed after the fact, not at the instant it happens.
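The paint-and-scan idea can be sketched in a few lines of C. This is a minimal illustration, not a drop-in implementation: the function names `stack_paint` and `stack_unused_bytes` are hypothetical, and on a real target the `limit` and `top` pointers would come from linker-script symbols whose names depend on your toolchain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STACK_SENTINEL 0xDEADBEEFu

/* Fill [limit, end) with the sentinel. On a real target `limit` would be the
 * lowest stack address and `end` the current stack pointer, so that frames
 * already in use above `end` are not overwritten. */
static void stack_paint(volatile uint32_t *limit, volatile uint32_t *end)
{
    assert(limit <= end);
    for (volatile uint32_t *p = limit; p < end; ++p) {
        *p = STACK_SENTINEL;
    }
}

/* Scan upward from the lowest address and count the words that still hold
 * the sentinel. The result is the never-used margin in bytes; 0 means the
 * whole region was written at some point. */
static size_t stack_unused_bytes(const volatile uint32_t *limit,
                                 const volatile uint32_t *top)
{
    const volatile uint32_t *p = limit;
    while (p < top && *p == STACK_SENTINEL) {
        ++p;
    }
    return (size_t)(p - limit) * sizeof(uint32_t);
}
```

A typical use would be to call `stack_paint` once early in startup and `stack_unused_bytes` from an idle loop or a debug command. One caveat: a function that reserves a large local buffer without writing to it can move the stack pointer past sentinel words that stay intact, so the reported margin is an upper bound on what was really left.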

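For real-time detection, use the MPU (Memory Protection Unit) on Cortex-M3 and above; the MPU is an optional feature, so check that your specific part includes one. Configure an MPU region at the bottom of the stack as a guard region: a small area (32 to 256 bytes) marked as no-access. On Armv7-M the region size must be a power of two of at least 32 bytes, and the base address must be aligned to that size. When the stack grows into this guard region, the MPU immediately triggers a MemManage fault, catching the overflow at the instruction that caused it, so the fault handler can capture the faulting program counter and register state for debugging. On Cortex-M0 (which lacks an MPU), you can set a hardware data watchpoint via the debugger on the lowest stack address (the sentinel word); this halts the CPU as soon as that address is written, but only works during debug sessions. FreeRTOS uses stack painting for its uxTaskGetStackHighWaterMark() API. Its configCHECK_FOR_STACK_OVERFLOW option is also software-based rather than MPU-based: at each context switch, method 1 checks that the task's stack pointer is still inside its stack, method 2 additionally checks that the last bytes of the stack still hold the fill pattern, and on failure the kernel calls vApplicationStackOverflowHook().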
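The guard-region setup mostly comes down to encoding two register values. The sketch below computes them for the Armv7-M MPU (Cortex-M3/M4/M7) as plain integers so that the encoding can be checked off-target. The helper name `mpu_guard_config` and the `mpu_guard_cfg` struct are hypothetical; the bit positions follow the Armv7-M MPU_RBAR and MPU_RASR layouts, and the Armv8-M MPU uses a different register format that this does not cover.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Field positions in the Armv7-M MPU_RBAR and MPU_RASR registers. */
#define MPU_RBAR_VALID       (1u << 4)   /* use the REGION field in RBAR  */
#define MPU_RASR_ENABLE      (1u << 0)
#define MPU_RASR_SIZE_SHIFT  1u          /* region size = 2^(SIZE + 1)    */
#define MPU_RASR_AP_SHIFT    24u
#define MPU_RASR_AP_NONE     0u          /* no access at any privilege    */
#define MPU_RASR_XN          (1u << 28)  /* execute never                 */

typedef struct {
    uint32_t rbar;
    uint32_t rasr;
} mpu_guard_cfg;

/* Integer log2 for a value that is already known to be a power of two. */
static uint32_t log2_pow2(uint32_t v)
{
    uint32_t n = 0;
    while (v > 1u) {
        v >>= 1;
        ++n;
    }
    return n;
}

/* Compute RBAR/RASR for a no-access guard region.
 * Returns 0 on success, -1 if the arguments violate the MPU constraints. */
static int mpu_guard_config(uint32_t base, uint32_t size, uint32_t region,
                            mpu_guard_cfg *out)
{
    assert(out != NULL);
    if (size < 32u || (size & (size - 1u)) != 0u) {
        return -1;                        /* power of two, at least 32 B  */
    }
    if ((base & (size - 1u)) != 0u) {
        return -1;                        /* base must be size-aligned    */
    }
    if (region > 15u) {
        return -1;                        /* REGION field is 4 bits wide  */
    }
    out->rbar = base | MPU_RBAR_VALID | region;
    out->rasr = MPU_RASR_XN
              | (MPU_RASR_AP_NONE << MPU_RASR_AP_SHIFT)
              | ((log2_pow2(size) - 1u) << MPU_RASR_SIZE_SHIFT)
              | MPU_RASR_ENABLE;
    return 0;
}
```

On a target using the CMSIS headers, the two values would typically be written to `MPU->RBAR` and `MPU->RASR`, followed by enabling the MPU in `MPU->CTRL` (usually with `PRIVDEFENA` set so the rest of memory keeps its default map) and enabling the MemManage exception through `SCB->SHCSR`. Because higher-numbered regions take priority where Armv7-M regions overlap, the guard should use a higher region number than any region that covers the surrounding RAM.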

Source: Boot & Startup Q&A