What are the basic concepts of what happens before main() is called in C?

On a bare-metal/embedded target, between reset and the first instruction of main() the startup code (often crt0 / a Reset_Handler plus the C runtime) must put the machine into the state the C language assumes. The sequence is roughly:

  1. Reset vector. On power-up/reset the CPU either fetches the address of the reset handler from a fixed vector-table entry (Cortex-M) or starts executing at a fixed reset address (classic ARM, AVR), and so enters the reset/startup routine. On Cortex-M, the hardware also loads the initial stack pointer from the first vector-table entry automatically.

  2. Set up the stack pointer. Establish SP to a valid RAM region so that function calls and locals work. (On Cortex-M this is the loaded MSP; on other architectures startup code writes SP explicitly.)

  3. Initialize the .data section. Variables with non-zero initializers occupy RAM at run time, but their initial values are stored in flash (the "load address", LMA). Startup copies that block from flash to the section's RAM location (the "run address", called the virtual address or VMA in linker terminology) so globals have their starting values.

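  4. Zero the .bss section. Objects with static storage duration and no explicit initializer (or an explicit zero initializer, which toolchains typically also place in .bss) must read as 0 per the C standard. Startup fills the .bss region in RAM with zeros, usually with a memset or an equivalent loop.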

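  5. Basic hardware / runtime bring-up. Often: configure clocks/PLL, enable the FPU if present, set up the memory controller, and initialize the heap (the allocator's break/arena) and possibly stack-overflow guards. How much is done here, and whether it runs before or after the .data/.bss steps, varies by platform; anything that runs before those steps must not rely on initialized or zeroed globals.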

  6. Run C++ constructors / __attribute__((constructor)) / .init_array. Walk the .preinit_array and .init_array tables (and the legacy .init / .ctors sections on older toolchains) to run global/static C++ constructors and any C constructor functions before user code.

  7. Call main(). Finally the runtime calls main(), either with no arguments or as main(argc, argv) with a typically empty argv on embedded targets. If main ever returns, a hosted-style runtime calls exit(), which runs atexit handlers and static destructors; minimal bare-metal startup code usually just loops forever.
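
Steps 3, 4, 6 and 7 can be sketched in plain C. This is a minimal hosted simulation, not a real Reset_Handler: ordinary arrays stand in for flash and RAM, and every name here (copy_data, zero_bss, run_init_array, the fake_* objects, simulate_startup) is hypothetical. On a real target the start/end pointers would come from symbols defined by the linker script, and the code would run before any C runtime exists.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for one entry of an .init_array-style table. */
typedef void (*init_fn)(void);

/* Step 3: copy initial values from the load image (flash) to the run address (RAM). */
void copy_data(const uint32_t *load, uint32_t *start, uint32_t *end)
{
    while (start < end) {
        *start++ = *load++;
    }
}

/* Step 4: write zero to every word between the .bss bounds. */
void zero_bss(uint32_t *start, uint32_t *end)
{
    while (start < end) {
        *start++ = 0u;
    }
}

/* Step 6: call each constructor pointer in table order. */
void run_init_array(const init_fn *start, const init_fn *end)
{
    for (; start < end; ++start) {
        (*start)();
    }
}

/* Stand-ins for memory regions (assumptions for this sketch only). */
const uint32_t fake_flash_data[3] = { 0x11u, 0x22u, 0x33u };     /* .data load image */
uint32_t fake_ram_data[3] = { 0xDEADu, 0xDEADu, 0xDEADu };       /* garbage RAM at reset */
uint32_t fake_ram_bss[4] = { 0xBEEFu, 0xBEEFu, 0xBEEFu, 0xBEEFu };

int ctor_calls;                          /* counts constructor invocations */
void fake_ctor(void) { ctor_calls++; }
const init_fn fake_init_array[2] = { fake_ctor, fake_ctor };

/* Stand-in for the user's main(): reads an "initialized global". */
int fake_main(void) { return (int)fake_ram_data[1]; }

/* Steps 3, 4, 6 and 7 in the order described above. */
int simulate_startup(void)
{
    copy_data(fake_flash_data, fake_ram_data, fake_ram_data + 3);
    zero_bss(fake_ram_bss, fake_ram_bss + 4);
    run_init_array(fake_init_array, fake_init_array + 2);
    return fake_main();
}
```

Calling simulate_startup() returns 0x22, the value that fake_main reads from the copied .data image; afterwards fake_ram_bss holds only zeros and ctor_calls is 2. Word-sized loops are used instead of memcpy/memset because some startup code avoids library calls this early, but that is a design choice rather than a requirement.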

Key idea: the linker script defines the section addresses (.data load vs. run address, .bss bounds, stack/heap regions), and the startup code uses those symbols to realize the C abstract machine — initialized globals, zeroed globals, a working stack, and constructed statics — before any of your code runs.
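
The same guarantees can be observed from the program's side. The sketch below assumes GCC or Clang, since __attribute__((constructor)) is a compiler extension rather than standard C, and all identifiers in it are hypothetical. It checks that an uninitialized global reads as zero, an initialized global holds its value, and a constructor function has already run by the time ordinary code executes.

```c
/* No initializer: typically placed in .bss and zeroed by startup. */
int zeroed_counter;

/* Non-zero initializer: typically placed in .data and copied in by startup. */
int initialized_value = 42;

/* Set by the constructor below; starts at 0 because it also lives in .bss. */
int ctor_ran;

/* GCC/Clang extension: the runtime calls this before main(). */
__attribute__((constructor))
void early_hook(void)
{
    ctor_ran = 1;
}

/* Returns 1 when all three startup guarantees are visible to user code. */
int startup_guarantees_hold(void)
{
    return zeroed_counter == 0 && initialized_value == 42 && ctor_ran == 1;
}
```

Called from main() or anything main() calls, startup_guarantees_hold() returns 1 on a toolchain that supports the attribute. Note that early_hook relies on ctor_ran already being zeroed, which is why the .bss step has to come before the constructor step.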