Quick Cap
Every C variable lives in a specific region of memory -- stack, heap, .data, .bss, or .rodata -- and knowing which region it occupies tells you whether it survives a function return, whether it is initialized at startup, and how much of your limited embedded RAM it consumes. This is one of the most fundamental concepts in embedded C because memory bugs (stack overflows, use-after-free, uninitialized reads) are among the most common causes of production failures.
Interviewers test whether you can point at any variable declaration and immediately say where it lives, when it's initialized, and what happens to it when the scope ends.
Key Facts:
- Six memory regions: .text (code), .rodata (constants), .data (initialized globals/statics), .bss (zero-initialized globals/statics), stack (locals), heap (malloc)
- Only .data and .bss are auto-initialized: .bss is zeroed; .data is copied from flash. Local variables are NOT initialized -- they contain garbage.
- Stack is small: Typically 1-8 KB on Cortex-M MCUs. Large local arrays and deep recursion are a leading cause of embedded crashes.
- Use stdint.h types: uint8_t, int32_t, etc. guarantee size. Plain int and long vary by platform.
- static changes everything: A static local moves from stack to .bss/.data, surviving between calls but introducing reentrancy issues.
- Heap is often avoided entirely: Many safety-critical embedded systems ban malloc/free to eliminate fragmentation and non-deterministic timing.
Deep Dive
At a Glance
| Concept | Detail |
|---|---|
| Memory sections | .text, .rodata, .data, .bss, stack, heap |
| Storage durations | Automatic (stack), static (.data/.bss), allocated (heap), thread-local |
| Initialization | .bss zeroed by startup code; .data copied from flash; stack locals uninitialized |
| Stack size | 1-8 KB typical on Cortex-M; configurable in linker script |
| Integer types | Use stdint.h (uint8_t, int32_t) for portability; avoid int, long |
| Key linker symbols | _estack, _sdata, _edata, _sbss, _ebss -- defined in linker script |
Memory Map
A typical Cortex-M program uses two physical memories — flash (read-only, holds code and constants) and RAM (read-write, holds variables and stack). The linker script places each section:
            FLASH                                  RAM
    ┌─────────────────────┐              ┌─────────────────────┐ ← _estack
    │                     │              │  Stack ↓            │   (initial SP)
    │  .text              │              │  local vars, args,  │
    │  (machine code:     │              │  return addresses   │
    │   main, ISRs)       │              │  ...                │
    │                     │              │                     │
    ├─────────────────────┤              │  (free space)       │
    │  .rodata            │              │                     │
    │  (const globals,    │              │  ...                │
    │   string literals,  │              │                     │
    │   lookup tables)    │              │  Heap ↑             │
    ├─────────────────────┤              │  malloc'd memory    │
    │  .data init values  │ ──copy at──▶ ├─────────────────────┤
    │  (source for RAM    │     boot     │  .bss               │
    │   .data section)    │              │  (zeroed at boot)   │
    └─────────────────────┘              ├─────────────────────┤
    0x0800_0000 (STM32)                  │  .data              │
                                         │  (copied from flash)│
                                         └─────────────────────┘
                                         0x2000_0000 (STM32)
The stack starts at the top of RAM and grows downward. The heap (if used) starts above .bss and grows upward. If they collide, each silently overwrites the other's data and the program eventually crashes: this is the classic stack-heap collision.
At startup, the C runtime (Reset_Handler / crt0) does two things before main():
- Copies .data from flash to RAM (initialized globals/statics need their initial values)
- Zeros .bss in RAM (uninitialized globals/statics are guaranteed to start at zero)
Local variables on the stack are NOT touched by startup code -- they contain whatever garbage was in that memory location.
Where Does Each Variable Live?
This is the core interview skill: given a variable declaration, identify its memory section.
    #include <stdint.h>

    // .rodata (flash) -- const + global scope = read-only data in flash
    const uint32_t FIRMWARE_VERSION = 0x0102;

    // .data (RAM, copied from flash at boot) -- initialized global
    uint32_t sensor_count = 5;

    // .bss (RAM, zeroed at boot) -- uninitialized global (zero by default)
    uint32_t error_count;

    // .data -- static global, same as non-static for memory purposes
    static uint32_t module_id = 42;

    // .bss -- uninitialized static global
    static uint32_t call_count;

    void read_sensor(uint16_t channel) {
        // Stack -- automatic storage, NOT initialized, destroyed on return
        uint8_t buffer[64];

        // .bss -- static LOCAL, persists between calls, zero-initialized once
        static uint32_t invocation_count;
        invocation_count++;

        // .rodata -- static const local, stored in flash (no RAM cost)
        static const uint16_t lookup[] = {0, 100, 200, 400, 800};
    }
The key insight: the static keyword on a local variable moves it from the stack to .bss or .data. This means it persists between function calls (useful for counters, state machines) but also means the function is no longer reentrant -- if two threads call it simultaneously, they share the same static variable.
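To make the persistence and the reentrancy cost concrete, here is a minimal sketch. The function names are hypothetical and not from any library. The first version keeps its state in a static local; the second takes the state from the caller, which is one common way to stay reentrant.

```c
#include <stdint.h>

/* Hypothetical helper: counts how many times it has been called.
 * `calls` has static storage duration, so it lives in .bss (no
 * initializer) and keeps its value between invocations. */
uint32_t next_call_number(void)
{
    static uint32_t calls;   /* zeroed once by startup code, not per call */
    calls++;
    return calls;
}

/* Reentrant alternative: the caller owns the state, so each task or
 * ISR can pass its own counter and nothing is shared implicitly. */
uint32_t next_call_number_r(uint32_t *calls)
{
    *calls += 1u;
    return *calls;
}
```

Calling next_call_number() three times returns 1, 2, 3 regardless of who calls it, because every caller shares the one hidden counter. With next_call_number_r(), each caller that passes its own zero-initialized counter sees an independent sequence.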
Local (automatic) variables are NOT initialized. They contain whatever happened to be on the stack. Code like int sum; for(...) sum += x; is a bug -- sum starts with a garbage value. Always initialize local variables at declaration.
Quick Reference — Variable → Memory Section:
| Declaration Pattern | Section | Initialized? | Survives Return? |
|---|---|---|---|
| const global/static | .rodata (flash) | Yes (value in flash) | Yes |
| Global/static with initializer | .data (RAM) | Yes (copied from flash) | Yes |
| Global/static without initializer | .bss (RAM) | Zeroed at boot | Yes |
| static local with initializer | .data (RAM) | Yes (once) | Yes |
| static local without initializer | .bss (RAM) | Zeroed (once) | Yes |
| Local variable | Stack | NO — garbage | No |
| malloc() result | Heap | NO — garbage | Until free() |
Fixed-Width Integer Types (stdint.h)
In embedded systems, the size of int and long depends on the platform: int is 16 bits on MSP430 and 32 bits on Cortex-M and Linux, while long is 32 bits on Cortex-M but 64 bits on 64-bit Linux. Code that assumes int is 32 bits will silently break when ported to a 16-bit MCU.
The solution: always use stdint.h types for data that crosses module boundaries, is stored in structs, or is written to hardware registers:
    #include <stdint.h>

    uint8_t  reg_value;    // Exactly 8 bits, unsigned (0 to 255)
    int16_t  temperature;  // Exactly 16 bits, signed (-32768 to 32767)
    uint32_t timestamp;    // Exactly 32 bits, unsigned (0 to 4,294,967,295)
    int32_t  position;     // Exactly 32 bits, signed
When is plain int acceptable? For loop counters and temporary arithmetic where exact size doesn't matter and you want the CPU's natural word size for best performance. But for anything stored in a struct, sent over a protocol, or written to a register -- use fixed-width types.
| Types (unsigned / signed) | Bits | Range (unsigned) | Range (signed) | Use Case |
|---|---|---|---|---|
| uint8_t / int8_t | 8 | 0 to 255 | -128 to 127 | Register bytes, flags, small counters |
| uint16_t / int16_t | 16 | 0 to 65,535 | -32,768 to 32,767 | ADC values, sensor data, PWM |
| uint32_t / int32_t | 32 | 0 to 4,294,967,295 | -2,147,483,648 to 2,147,483,647 | Timestamps, addresses, large counters |
| uint64_t / int64_t | 64 | 0 to about 1.8 × 10^19 | about ±9.2 × 10^18 | Cryptography, precise time (avoid on 8/16-bit MCUs) |
If an interviewer asks "what is sizeof(int)?", the correct answer is "it depends on the platform." On Cortex-M it is 4 bytes, on MSP430 it is 2 bytes, on x86-64 Linux it is 4 bytes. This is exactly why stdint.h exists -- to eliminate this ambiguity.
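As an illustration of using fixed-width types for data that leaves the MCU, the sketch below serializes a 32-bit value into bytes and checks its size assumptions at compile time. The helper names put_u32_le and get_u32_le are hypothetical, and little-endian byte order is an assumption chosen for the example. _Static_assert requires C11 or later.

```c
#include <limits.h>
#include <stdint.h>

/* Compile-time checks: the build fails if an assumption is wrong. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert(sizeof(uint16_t) == 2, "uint16_t must be 2 bytes");
_Static_assert(sizeof(uint32_t) == 4, "uint32_t must be 4 bytes");

/* Hypothetical helper: store a 32-bit value into a byte buffer in
 * little-endian order, independent of the width of int or long. */
void put_u32_le(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value & 0xFFu);
    out[1] = (uint8_t)((value >> 8) & 0xFFu);
    out[2] = (uint8_t)((value >> 16) & 0xFFu);
    out[3] = (uint8_t)((value >> 24) & 0xFFu);
}

/* Hypothetical helper: the inverse of put_u32_le(). Each byte is widened
 * to uint32_t before shifting so the shift never happens in a 16-bit int. */
uint32_t get_u32_le(const uint8_t in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Because every width is spelled out, the same source should behave identically on a 16-bit MSP430 and a 32-bit Cortex-M. With plain int, the shifts by 16 and 24 would be undefined on the 16-bit target.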
Integer Promotion and Signedness Traps
C's integer promotion rules are a frequent source of embedded bugs. When you perform arithmetic on types smaller than int, the compiler silently promotes them to int before the operation:
    uint8_t a = 200;
    uint8_t b = 100;
    uint8_t result = a + b; // a and b promoted to int, sum is 300,
                            // then truncated to uint8_t: result = 44 (300 - 256)
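If wrapping is not what you want, one option is to inspect the promoted sum before narrowing it. The sketch below clamps at 255. The helper name add_u8_saturating is hypothetical, and saturation is only one possible policy (returning an error flag is another).

```c
#include <stdint.h>

/* Hypothetical helper: add two bytes and clamp at 255 instead of wrapping.
 * Both operands are promoted to int, so the sum (at most 510) cannot
 * overflow; only the conversion back to uint8_t would lose information. */
uint8_t add_u8_saturating(uint8_t a, uint8_t b)
{
    int sum = a + b;   /* computed in int: range 0..510 */
    return (sum > UINT8_MAX) ? (uint8_t)UINT8_MAX : (uint8_t)sum;
}
```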
The more dangerous case is mixed signed/unsigned comparison:
    int x = -1;
    unsigned int y = 1;
    if (x < y) {
        // You expect this branch -- but it is NOT taken!
        // x is converted to unsigned: (unsigned)-1 = 4294967295
        // (with a 32-bit unsigned int), which is > 1
    }
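This is a classic interview trap. The rule: when an int and an unsigned int meet in an expression, the signed value is implicitly converted to unsigned (the "usual arithmetic conversions"). -1 becomes UINT_MAX, which is greater than any small positive number. Types narrower than int behave differently: a uint8_t is first promoted to a signed int, so a comparison against an int stays signed.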
Safe practices:
- Compare signed with signed, unsigned with unsigned
- Cast explicitly when mixing: if ((int)y > x), which is only safe when y is known to fit in an int
- Enable compiler warnings: -Wsign-compare, -Wconversion
- MISRA C restricts implicit conversions between signed and unsigned types, so mixed expressions need explicit casts
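When a mixed comparison cannot be avoided, a small helper can make it mathematically correct for every input, including unsigned values too large to cast to int. This is a sketch, and the name int_less_than_uint is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper: mathematically correct "a < b" for mixed signs.
 * A negative a is smaller than every unsigned value; otherwise a is
 * non-negative and converting it to unsigned preserves its value. */
bool int_less_than_uint(int a, unsigned int b)
{
    if (a < 0) {
        return true;
    }
    return (unsigned int)a < b;
}
```

With this helper, int_less_than_uint(-1, 1u) is true, whereas the raw expression -1 < 1u evaluates to false.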
Stack vs Static vs Heap: When to Use Each
| Criteria | Stack (auto) | Static / Global | Heap (malloc) |
|---|---|---|---|
| Lifetime | Function scope only | Entire program | Until free() |
| Initialization | NOT initialized | .bss zeroed, .data from flash | NOT initialized |
| Size limit | Small (1-8 KB total) | Limited by RAM | Limited by RAM |
| Allocation speed | Instant (just move SP) | Zero (placed at link time) | Slow (search free list) |
| Deterministic | Yes | Yes | No (fragmentation) |
| Reentrant | Yes | No (shared state) | Yes (if per-thread) |
| Best for | Small temps, loop vars | Config, LUTs, buffers, state | Rarely used in embedded |
Embedded rule of thumb: Prefer static allocation for anything larger than ~100 bytes. Reserve the stack for small local variables and function call overhead. Avoid heap unless you have a compelling reason and a strategy for fragmentation (memory pools, fixed-size allocators).
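A fixed-size memory pool of the kind mentioned above can be sketched in a few lines. The names (pool_alloc, pool_free) and the sizes are illustrative assumptions. This version is not safe to call from both an ISR and main code without a critical section, and the blocks are only byte-aligned, so it suits byte buffers rather than arbitrary structs.

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BLOCK_SIZE  32u   /* bytes per block (illustrative) */
#define POOL_BLOCK_COUNT 4u    /* number of blocks (illustrative) */

/* All storage is static: it lands in .bss and its size is known at link time. */
static uint8_t pool_storage[POOL_BLOCK_COUNT][POOL_BLOCK_SIZE];
static uint8_t pool_in_use[POOL_BLOCK_COUNT];

/* Returns a free block, or NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        if (!pool_in_use[i]) {
            pool_in_use[i] = 1u;
            return pool_storage[i];
        }
    }
    return NULL;
}

/* Releases a block previously returned by pool_alloc(); ignores anything else. */
void pool_free(void *block)
{
    for (size_t i = 0; i < POOL_BLOCK_COUNT; i++) {
        if (block == (void *)pool_storage[i]) {
            pool_in_use[i] = 0u;
            return;
        }
    }
}
```

Every block has the same size, so freeing and reallocating in any order cannot fragment the pool, and the worst-case allocation time is bounded by POOL_BLOCK_COUNT iterations.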
The Startup Sequence and .bss/.data
Understanding the startup sequence explains WHY globals are initialized but locals are not:
1. Power-on/reset: The Cortex-M core loads the initial stack pointer and the address of Reset_Handler from the first two words of the vector table (at 0x00000000, which STM32 parts alias to flash at 0x08000000 when booting from flash)
2. Reset_Handler: Runs the C runtime initialization (many startup files also set the stack pointer again here)
3. Copy .data: Copies initialized global/static values from flash (where they were stored by the linker) to their RAM addresses
4. Zero .bss: Fills the .bss section with zeros (this is why uninitialized globals are zero)
5. Call main(): Your application starts
Local variables skip steps 3-4 because they don't exist yet -- they're created when the function is called, using whatever memory happens to be on the stack at that moment.
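The copy and zero steps can be sketched as two small loops. This is a simplified illustration, not the startup file of any particular vendor. Real startup code is often written in assembly, and the symbol _sidata (the flash address of the .data initializers) is an assumed name that depends on your linker script.

```c
#include <stdint.h>

/* Copy the .data initializers from their load address (flash) to their
 * run address (RAM), one 32-bit word at a time. */
void copy_data(const uint32_t *load, uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = *load++;
    }
}

/* Fill .bss with zeros. */
void zero_bss(uint32_t *start, const uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}

/* On a real target the arguments would come from linker-script symbols.
 * _sidata is an assumed name; the others are the symbols listed in the
 * "At a Glance" table.
 *
 *   extern uint32_t _sidata, _sdata, _edata, _sbss, _ebss;
 *
 *   void Reset_Handler(void)
 *   {
 *       copy_data(&_sidata, &_sdata, &_edata);
 *       zero_bss(&_sbss, &_ebss);
 *       main();
 *   }
 */
```

Nothing in either loop touches the stack region, which is why a local variable sees whatever bytes were left there.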
This sequence also explains a common embedded issue: large .data sections slow boot time because every initialized global must be copied from flash to RAM. If you have a 10 KB lookup table with initializers, that's 10 KB copied at every boot. If the table is const, it stays in flash (.rodata) and costs zero boot time and zero RAM.
Debugging Story: The 47-Minute Crash
A team was debugging an automotive sensor node that crashed after running for exactly 47 minutes. The crash occurred in different functions each time, making it look random. Weeks of investigation followed: power supply was stable, clocks were correct, communication was working fine right up until the crash.
The root cause: a data-logging function allocated a 2 KB local buffer on a 4 KB stack. The function worked fine when called from main() (call depth of 3), but a new feature added a timer callback that called the same function from within a deeply nested ISR chain (call depth of 12). The additional stack frames pushed total usage past 4 KB, and the stack silently overwrote the .bss section below it -- corrupting global variables. The 47-minute timing corresponded to when the timer callback's execution path first reached the logging function during a particular sensor state.
The fix was twofold: move the buffer to static allocation, and add a stack canary (a known pattern at the bottom of the stack that's checked periodically -- if it's been overwritten, you know the stack has overflowed).
    // Before: 2 KB on stack -- overflow risk
    void log_sensor_data(const sensor_t *data) {
        uint8_t buffer[2048];
        format_log_entry(buffer, data);
        write_to_flash(buffer, sizeof(buffer));
    }

    // After: static buffer -- safe, but not reentrant
    void log_sensor_data(const sensor_t *data) {
        static uint8_t buffer[2048];
        format_log_entry(buffer, data);
        write_to_flash(buffer, sizeof(buffer));
    }
Lesson: Always check total stack usage across all call paths, including ISRs. Use the linker map file and static analysis tools (like GCC's -fstack-usage flag) to verify that your deepest call chain fits within the allocated stack.
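The canary idea extends naturally to "stack painting": fill the whole stack region with a known pattern at boot, then count how much of it is still intact to get a high-water mark. The sketch below works on any word array, so it can be tried on a host. On a target you would pass the stack's lowest address and size from your linker script. The function names and the fill pattern are assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL_PATTERN 0xA5A5A5A5u  /* arbitrary, unlikely in real data */

/* Fill a stack region with the pattern. On a target this would run
 * early in startup, before the region is in use. */
void stack_paint(uint32_t *lowest, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        lowest[i] = STACK_FILL_PATTERN;
    }
}

/* Count words still holding the pattern, scanning up from the lowest
 * address. The stack grows downward, so untouched words sit at the bottom. */
size_t stack_unused_words(const uint32_t *lowest, size_t words)
{
    size_t unused = 0;
    while (unused < words && lowest[unused] == STACK_FILL_PATTERN) {
        unused++;
    }
    return unused;
}

/* Canary check: the lowest word changes only when usage has reached
 * the very end of the stack region. */
int stack_canary_intact(const uint32_t *lowest)
{
    return lowest[0] == STACK_FILL_PATTERN;
}
```

Checking stack_canary_intact() periodically (for example, from the main loop or a watchdog task) turns a silent corruption into a detectable fault, while stack_unused_words() indicates how much margin remains after a long test run.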
What interviewers want to hear: You can map any variable to its memory section instantly -- "that's .bss", "that's on the stack", "that const lives in .rodata in flash." You understand why locals are uninitialized (startup code doesn't touch the stack) and why static changes a local's lifetime and reentrancy. You know the practical implications: stack overflow from large locals, boot time cost of .data, and why embedded systems often ban malloc. You use stdint.h by default and can explain the integer promotion trap. You've debugged real memory issues and know the tools (linker map, -fstack-usage, stack canaries).
Interview Focus
Classic Data Types & Memory Interview Questions
Q1: "Where does each type of variable live in memory?"
Model Answer Starter: "Global and static variables go in .data if initialized to a nonzero value, or .bss if uninitialized or zero-initialized. const globals go in .rodata, which is typically in flash. Local variables go on the stack and are NOT initialized -- they contain garbage. malloc'd memory goes on the heap. The key thing is that only .data and .bss are initialized by the C runtime at startup. .data is copied from flash and .bss is zeroed. Stack memory is never touched until the function is called."
Q2: "What is the difference between .data and .bss, and why does it matter?"
Model Answer Starter: ".data holds globals/statics with nonzero initializers -- their values are stored in flash and copied to RAM at boot. .bss holds globals/statics that are zero-initialized -- only the section size needs to be stored, not the actual data, because the startup code just memsets it to zero. The practical implication: a large array initialized to zeros costs zero flash, but the same array initialized to {1, 2, 3, ...} costs N bytes of flash for the initializers plus N bytes of RAM for the copy. This matters when flash and boot time are constrained."
Q3: "Why should you use uint32_t instead of unsigned int in embedded code?"
Model Answer Starter: "unsigned int is platform-dependent -- it's 16 bits on MSP430, 32 bits on Cortex-M, and could be different on other architectures. If your code assumes unsigned int is 32 bits and you port to a 16-bit MCU, you get silent truncation bugs. uint32_t from stdint.h guarantees exactly 32 bits on every platform. I use fixed-width types for anything that goes into a struct, is sent over a protocol, is written to a register, or needs to hold a specific range of values."
Q4: "What happens if you declare a large local array in an embedded function?"
Model Answer Starter: "It goes on the stack, which is typically 1-8 KB on small MCUs. If the array is large relative to the stack size, you risk a stack overflow -- the stack grows into adjacent memory (often .bss) and silently corrupts global variables, causing unpredictable crashes that are extremely hard to debug. The fix is to use static allocation for large buffers, which places them in .bss instead of the stack. The tradeoff is that static makes the function non-reentrant -- if two threads or an ISR and main both call it, they share the same buffer."
Q5: "Explain the integer promotion trap with signed and unsigned comparison."
Model Answer Starter: "When you compare a signed int with an unsigned int, the C standard promotes the signed value to unsigned. So int x = -1; unsigned y = 0; if (x < y) evaluates as false because -1 becomes UINT_MAX when converted to unsigned, which is greater than 0. This is a classic source of bugs. I avoid it by never mixing signed and unsigned in comparisons, using explicit casts when necessary, and enabling -Wsign-compare in my compiler flags."
Trap Alerts
- Don't say: "Local variables are initialized to zero" -- only globals and statics are. Locals contain whatever garbage was on the stack.
- Don't forget: The difference between .data and .bss -- interviewers love asking this because it reveals whether you understand the linker and startup code.
- Don't ignore: The integer promotion rules -- mixing signed and unsigned types is one of the most common embedded C bugs and a favorite interview trap.
Follow-up Questions
- "How would you measure actual stack usage at runtime on a Cortex-M?"
- "What is the difference between const int x = 5; at file scope vs inside a function?"
- "Why might a const variable still consume RAM on some compilers?"
- "How does the linker script control the memory layout?"
Practice
❓ Where does an uninitialized global variable live in memory?
❓ What is the value of a local variable that was declared but not initialized?
❓ Why do embedded systems often avoid malloc() and free()?
❓ What does `sizeof(int)` return on an ARM Cortex-M microcontroller?
❓ What happens when you compare `int x = -1` with `unsigned int y = 0` using `x < y`?
❓ What is the likely `sizeof(struct { uint8_t a; uint32_t b; })` on a 32-bit ARM Cortex-M?
Real-World Tie-In
Automotive ECU Boot Time Optimization -- A powertrain ECU had a 200 ms boot time budget but was taking 350 ms. Analysis showed that a 64 KB calibration table was declared as an initialized array in .data, causing a 64 KB flash-to-RAM copy at every boot. Changing the table to const (moving it to .rodata in flash) eliminated the copy, cutting boot time to 180 ms and freeing 64 KB of RAM.
IoT Sensor Node Stack Overflow -- A battery-powered sensor node crashed after a firmware update added a 512-byte local buffer for JSON formatting on a 2 KB stack. Moving the buffer to static allocation fixed the crash, and adding -fstack-usage to the build revealed three other functions dangerously close to the limit.
Medical Device Certification -- During IEC 62304 certification, a code reviewer flagged all uses of malloc() in the firmware. The team replaced dynamic allocation with a fixed-size memory pool (array of pre-allocated buffers), eliminating fragmentation risk and satisfying the safety assessor's requirement for bounded memory usage.