What is tail-chaining and how does it improve interrupt throughput?
Tail-chaining is a hardware optimization in the Cortex-M NVIC that eliminates redundant stack operations when another interrupt is already pending as an ISR completes. Normally, returning from an ISR requires unstacking eight registers (6-12 cycles) and then, if another interrupt is pending, re-stacking the same eight registers (6-12 cycles) to enter the next ISR: a total of 12-24 cycles spent reading values from the stack only to write the same values straight back. With tail-chaining, the processor detects the pending interrupt during the exception return sequence, skips both the unstack and the re-stack, and directly begins fetching the next ISR's vector. On Cortex-M3/M4, a tail-chained ISR entry takes only 6 cycles compared to 12 cycles for a fresh entry.
This optimization is critical for high-frequency interrupt systems. Consider DMA half-transfer and transfer-complete interrupts firing back-to-back on a system running at 72 MHz. Without tail-chaining, the gap between ISRs is up to about 24 cycles (333 ns). With tail-chaining, the gap is 6 cycles (83 ns), a 4x improvement in transition speed. For systems handling dozens or hundreds of interrupts per millisecond (high-speed communication, motor control with multiple sensors), the cumulative cycle savings are substantial.
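The arithmetic behind those figures can be checked with a small host-side sketch. It simply takes the 24-cycle and 6-cycle transition costs quoted above as given. The function names `cycles_to_ns` and `cycles_saved_per_sec` are illustrative and not part of any vendor library.

```c
#include <stdint.h>

/* Transition costs taken from the text above (Cortex-M3/M4 figures). */
enum {
    FULL_TRANSITION_CYCLES = 24, /* unstack + re-stack, upper bound */
    TAIL_CHAIN_CYCLES      = 6   /* tail-chained transition */
};

/* Convert a cycle count to nanoseconds at the given core clock,
 * rounded to the nearest nanosecond. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t core_hz)
{
    return (uint32_t)(((uint64_t)cycles * 1000000000ull + core_hz / 2u) / core_hz);
}

/* Cycles saved per second when the given number of ISR-to-ISR
 * transitions per second tail-chain instead of unstacking and
 * re-stacking. */
uint32_t cycles_saved_per_sec(uint32_t transitions_per_sec)
{
    return transitions_per_sec * (uint32_t)(FULL_TRANSITION_CYCLES - TAIL_CHAIN_CYCLES);
}
```

At 72 MHz, `cycles_to_ns(24, 72000000)` gives 333 and `cycles_to_ns(6, 72000000)` gives 83, matching the numbers above. Under the same assumptions, 100,000 tail-chained transitions per second save 1,800,000 cycles, or 2.5% of the 72 MHz cycle budget.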
A related optimization is late-arriving: if a higher-priority interrupt arrives during the stacking phase of a lower-priority interrupt, the processor completes stacking but branches to the higher-priority ISR instead. When the higher-priority ISR finishes, it tail-chains into the original lower-priority ISR without any additional stacking. This means the processor always services the most urgent interrupt first, even if a lower-priority one triggered the initial context save. Both tail-chaining and late-arriving are automatic hardware behaviors — the programmer does not enable or configure them, but understanding them explains why measured ISR latencies are often shorter than the theoretical maximum.
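The decision the hardware makes in these cases can be summarized with a simplified host-side model. This is only a sketch: the 12-cycle stacking window is the fresh-entry figure quoted earlier, the exact cut-off for late arrival is implementation-specific, and the second interrupt is assumed to arrive before the first ISR has finished. The type and function names are hypothetical.

```c
#include <stdint.h>

/* Assumed length of the stacking phase, in cycles (fresh-entry figure
 * from the text; the real late-arrival window depends on the core). */
enum { STACKING_CYCLES = 12 };

typedef enum {
    TAIL_CHAIN_AFTER_FIRST, /* second IRQ stays pending, then tail-chains */
    LATE_ARRIVAL,           /* second IRQ takes over the stacked frame */
    PREEMPT_NESTED          /* second IRQ preempts with a new stack frame */
} irq_outcome_t;

/* Classify what happens to a second interrupt that arrives
 * `arrival_cycle` cycles after stacking began for the first one.
 * Lower numeric priority means more urgent, as in the NVIC. */
irq_outcome_t classify_second_irq(uint8_t first_prio,
                                  uint8_t second_prio,
                                  uint32_t arrival_cycle)
{
    if (second_prio >= first_prio) {
        /* Not more urgent: it waits, then tail-chains on return. */
        return TAIL_CHAIN_AFTER_FIRST;
    }
    if (arrival_cycle < STACKING_CYCLES) {
        /* More urgent and still within the stacking phase. */
        return LATE_ARRIVAL;
    }
    /* More urgent but the first ISR is already running. */
    return PREEMPT_NESTED;
}
```

In the `LATE_ARRIVAL` case the first interrupt is serviced afterwards by tail-chaining, so only one stacking operation is paid for both ISRs. Only the `PREEMPT_NESTED` case pays for a second full stack frame.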
Source: Interrupts & Priorities Q&A
