What causes interrupt latency and how do you minimize it?

Question

Accepted Answer

Interrupt latency is the time from when a peripheral asserts an interrupt request to when the first instruction of the ISR executes. On Cortex-M3/M4, the minimum is 12 clock cycles (register stacking overlapped with the vector fetch) when running from zero-wait-state memory, but real-world latency is usually higher due to several factors. The most significant is an ISR of equal or higher priority that is already running: the pending interrupt cannot execute until that ISR returns, at which point the core tail-chains straight into the pending handler without unstacking and restacking. Worst-case latency therefore includes the entire execution time of every ISR that can block the one you care about. This is why keeping all ISRs short is a system-wide discipline, not just a local optimization.

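Other contributors: Critical sections where interrupts are globally disabled via PRIMASK add directly to latency: a 10-microsecond critical section adds up to 10 microseconds of worst-case latency to every interrupt. Flash memory wait states increase the vector fetch time; at 168 MHz on STM32F4 with 5 wait states, a flash read takes 6 cycles instead of 1. This is mitigated by the ART accelerator (prefetch and instruction cache), but a cache miss during vector lookup adds measurable delay. Bus contention from DMA transfers or other bus masters can stall the stacking operation by a few cycles. The instruction executing when the interrupt fires matters less than is often assumed: on Cortex-M3/M4, multi-register transfers (LDM/STM, PUSH/POP, such as an LDMIA loading 8 registers) are interrupted partway and resumed afterward, and divides are abandoned and restarted, so they do not normally delay exception entry. A single load or store that is stalled on the bus still has to complete first, and setting the DISMCYCINT bit in the Auxiliary Control Register forces multi-cycle instructions to finish before stacking, which increases latency.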
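As a rough illustration of how these contributions add up, the sketch below sums them into a first-order worst-case estimate. The struct, its field names, and the two helper functions are hypothetical. The 12-cycle entry, 5 wait states, and 10-microsecond critical section come from the discussion above, while the 3-cycle bus stall and the 5-microsecond higher-priority ISR in the usage note are assumed figures chosen only for the example.

```c
#include <stdint.h>

/* Hypothetical inputs for a first-order worst-case latency estimate.
   All values are in CPU clock cycles. */
typedef struct {
    uint32_t cpu_hz;                  /* core clock, e.g. 168000000 */
    uint32_t entry_cycles;            /* exception entry: 12 on Cortex-M3/M4 */
    uint32_t fetch_penalty_cycles;    /* extra cycles when the vector fetch hits flash wait states */
    uint32_t bus_stall_cycles;        /* assumed allowance for DMA contention during stacking */
    uint32_t critical_section_cycles; /* longest region with interrupts masked */
    uint32_t blocking_isr_cycles;     /* summed run time of ISRs that can block this one */
} latency_budget_t;

/* Simple additive bound: the request arrives at the start of the longest
   critical section, then every blocking ISR runs before ours is entered. */
uint64_t latency_worst_case_cycles(const latency_budget_t *b)
{
    return (uint64_t)b->entry_cycles
         + b->fetch_penalty_cycles
         + b->bus_stall_cycles
         + b->critical_section_cycles
         + b->blocking_isr_cycles;
}

/* Convert cycles to nanoseconds, rounded to the nearest nanosecond. */
uint64_t cycles_to_ns(uint64_t cycles, uint32_t cpu_hz)
{
    return (cycles * 1000000000ull + cpu_hz / 2u) / cpu_hz;
}
```

With a 168 MHz clock, 12 entry cycles, 5 penalty cycles, 3 stall cycles, a 1680-cycle (10 microsecond) critical section, and 840 cycles (5 microseconds) of blocking ISRs, the total is 2540 cycles, or roughly 15.1 microseconds. The bare 12-cycle entry is about 71 nanoseconds, so in this example software masking and blocking ISRs account for almost all of the worst case.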

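To minimize latency: (1) Assign correct preemption priorities: the most time-critical interrupt should have the highest (numerically lowest) priority so it preempts everything else. (2) Keep all ISRs as short as possible, especially higher-priority ones. (3) Minimize critical section duration: where PRIMASK has to be used, use the save/restore pattern rather than unconditionally re-enabling interrupts, and keep the protected region to just the data copy. (4) Place the interrupt vector table and critical ISR code in SRAM or TCM rather than flash to eliminate wait-state penalties (on Cortex-M7, placing code in ITCM gives single-cycle access). A relocated vector table takes effect only once VTOR points at the copy. (5) Avoid __disable_irq() where possible and use BASEPRI instead, which masks only the less urgent interrupts while leaving the more urgent ones enabled. __set_BASEPRI(priority_threshold) blocks every interrupt whose priority value is numerically equal to or greater than the threshold (that is, equally or less urgent) and lets more urgent ones through, preserving responsiveness for the most critical handlers. The threshold uses the same left-aligned encoding as the NVIC priority registers, so a priority level must be shifted into the implemented bits (level << (8 - __NVIC_PRIO_BITS)), and writing 0 disables the masking. BASEPRI exists on Cortex-M3/M4/M7 but not on Cortex-M0/M0+.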
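Points (3) and (5) can be sketched roughly as follows. This is a minimal illustration and not a drop-in driver: the wrapper names (irq_save, irq_restore, basepri_raise, basepri_restore, read_sample), the shared_sample buffer, and the priority level 5 are all hypothetical, and the device header stm32f4xx.h is an assumption to be replaced with the CMSIS header for your part. The host branch only simulates the two registers so the save/restore logic can be exercised off-target.

```c
#include <stdint.h>

#if defined(__ARM_ARCH_7M__) || defined(__ARM_ARCH_7EM__)
/* Target build. Assumed device header; it supplies the CMSIS intrinsics
   and __NVIC_PRIO_BITS for the part. */
#include "stm32f4xx.h"
#define PRIO_BITS        __NVIC_PRIO_BITS
#define read_primask()   __get_PRIMASK()
#define write_primask(v) __set_PRIMASK(v)
#define mask_all_irqs()  __disable_irq()
#define read_basepri()   __get_BASEPRI()
#define write_basepri(v) __set_BASEPRI(v)
#else
/* Host build: simulated registers, not real CMSIS. */
#define PRIO_BITS 4u
static uint32_t sim_primask, sim_basepri;
static uint32_t read_primask(void)    { return sim_primask; }
static void write_primask(uint32_t v) { sim_primask = v & 1u; }
static void mask_all_irqs(void)       { sim_primask = 1u; }
static uint32_t read_basepri(void)    { return sim_basepri; }
static void write_basepri(uint32_t v) { sim_basepri = v & 0xFFu; }
#endif

/* Point (3): save PRIMASK, mask everything, later restore the saved state.
   Restoring (instead of always enabling) keeps nested use correct. */
uint32_t irq_save(void)
{
    uint32_t state = read_primask();
    mask_all_irqs();
    return state;
}

void irq_restore(uint32_t state)
{
    write_primask(state);
}

/* Point (5): mask interrupts whose priority value is >= level, leaving
   numerically lower (more urgent) ones enabled. Level 0 cannot be masked
   this way, and an already stricter mask is never loosened. */
uint32_t basepri_raise(uint32_t level)
{
    uint32_t old = read_basepri();
    uint32_t val = (level << (8u - PRIO_BITS)) & 0xFFu;
    if (val != 0u && (old == 0u || val < old)) {
        write_basepri(val);
    }
    return old;
}

void basepri_restore(uint32_t old)
{
    write_basepri(old);
}

/* Hypothetical data shared with an ISR running at priority level 5 or lower urgency. */
volatile uint32_t shared_sample[2];

/* Keep the protected region to just the data copy. */
void read_sample(uint32_t out[2])
{
    uint32_t old = basepri_raise(5u);
    out[0] = shared_sample[0];
    out[1] = shared_sample[1];
    basepri_restore(old);
}
```

While read_sample() holds the mask, handlers at priority levels 0 to 4 can still preempt, so the copy adds nothing to their latency. The PRIMASK pair remains useful for data shared with the most urgent handlers, where BASEPRI cannot provide exclusion.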