MCU & System Architecture
intermediate
Weight: 5/10

Interrupts and priorities

Master interrupt service routine design, NVIC priority schemes, nested interrupts, and latency optimization for real-time embedded systems.

mcu
interrupts
isr
priorities
nvic
latency
real-time

Quick Cap

Interrupts let hardware events (timers, communication peripherals, GPIO pins) break into normal program flow so the CPU can respond immediately. On ARM Cortex-M, the Nested Vectored Interrupt Controller (NVIC) manages all external interrupts with configurable priority levels, while the core saves context automatically on entry. Together they are central to real-time behavior on Cortex-M devices.

Key Facts:

  • NVIC handles up to 240 external interrupts on Cortex-M3/M4/M7, with 3-8 bits of configurable priority (Cortex-M0/M0+ support up to 32 interrupts and 2 priority bits)
  • Lower number = higher priority: priority 0 preempts priority 1 (a constant source of confusion)
  • Hardware context save: Cortex-M automatically pushes R0-R3, R12, LR, PC, xPSR on interrupt entry — no manual save needed
  • Tail-chaining: back-to-back interrupts skip the context restore/save cycle, reducing latency to 6 cycles
  • ISR golden rule: keep it short, set a flag, do heavy work in the main loop
  • Shared data protection: volatile for visibility, critical sections or atomics for integrity

Deep Dive

At a Glance

| Characteristic | Detail |
|---|---|
| Controller | NVIC (Cortex-M), PLIC (RISC-V) |
| Max IRQs | 240 external + 16 system vector slots (Cortex-M3/M4) |
| Priority bits | 3-8 (vendor-configured, typically 4 on STM32 Cortex-M3/M4 parts) |
| Context save | Automatic (8 registers pushed to stack) |
| Latency | 12 cycles typical (Cortex-M3/M4), 16 cycles (M0), 15 cycles (M0+) |
| Tail-chaining | 6 cycles between back-to-back ISRs (Cortex-M3/M4) |
| Stack usage | ~32 bytes per nested interrupt level (hardware frame only) |

How an Interrupt Works

When an interrupt fires on Cortex-M, the hardware performs this sequence automatically:

1. Finish current instruction
2. Push 8 registers to current stack (R0-R3, R12, LR, PC, xPSR)
3. Load handler address from vector table
4. Switch to Handler mode (privileged)
5. Execute ISR code
6. Pop 8 registers (detected via special EXC_RETURN value in LR)
7. Resume interrupted code

The entire entry sequence takes 12 clock cycles on Cortex-M3/M4 when code and stack sit in zero-wait-state memory. This is the minimum possible interrupt latency, and it applies only if no interrupt of equal or higher priority is running.
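The eight stacked registers can be pictured as a C struct, which is how fault handlers commonly inspect them. This is a minimal sketch for the basic (no-FPU) frame only; the type name `exception_frame_t` and the helper `frame_return_address` are illustrative names, not part of any vendor library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Basic (no-FPU) exception frame, in the order it appears in memory.
 * The lowest address (where SP points on handler entry) holds R0. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* LR of the interrupted code, not EXC_RETURN */
    uint32_t pc;    /* address to resume at after the handler returns */
    uint32_t xpsr;
} exception_frame_t;

/* 8 registers x 4 bytes = 32 bytes per stacked frame. */
static_assert(sizeof(exception_frame_t) == 32, "basic frame must be 8 words");

/* Hypothetical helper: where will the interrupted code resume? */
static inline uint32_t frame_return_address(const exception_frame_t *frame)
{
    return frame->pc;
}
```

On a real target the frame pointer would come from MSP or PSP as captured at handler entry; here it is just an ordinary struct so the layout can be checked on a host.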

The Vector Table

The vector table is an array of 32-bit entries in Flash (or RAM if relocated via VTOR). Counting from zero, entry 0 holds the initial stack pointer and entry 1 the Reset handler; entries 2-15 are the remaining system exceptions (NMI, HardFault, SVCall, PendSV, SysTick, and others); entries 16 and up are peripheral IRQs.

| Vector # | Exception | Typical Use |
|---|---|---|
| 0 | Initial SP | Stack pointer loaded on reset |
| 1 | Reset | Entry point after power-on / reset |
| 2 | NMI | Non-maskable interrupt (cannot be disabled) |
| 3 | HardFault | Unrecoverable error catch-all |
| 4-6 | MemManage, BusFault, UsageFault | Configurable fault handlers |
| 7-10 | Reserved | Unused |
| 11 | SVCall | Supervisor call (used by RTOS) |
| 12-13 | DebugMonitor, reserved | Debug support |
| 14 | PendSV | Deferred context switch (used by RTOS) |
| 15 | SysTick | System timer tick |
| 16+ | IRQ0, IRQ1, ... | Peripheral interrupts (UART, Timer, GPIO...) |
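The numbering in the table translates directly into arithmetic. The sketch below assumes the CMSIS convention that peripheral IRQ numbers start at 0 (so system exceptions come out negative) and that each vector is one 32-bit word; the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define FIRST_IRQ_EXCEPTION 16   /* vectors 0-15 are the initial SP and system exceptions */

/* CMSIS-style IRQ number: IRQ0 is 0, system exceptions are negative. */
static inline int32_t irqn_from_exception(uint32_t exception_number)
{
    return (int32_t)exception_number - FIRST_IRQ_EXCEPTION;
}

/* Address of a vector-table entry: one 32-bit word per exception number. */
static inline uint32_t vector_entry_address(uint32_t table_base, uint32_t exception_number)
{
    assert(exception_number < 256u);   /* 16 system slots + up to 240 IRQs */
    return table_base + 4u * exception_number;
}
```

For example, with the table at 0x08000000, the Reset vector sits at 0x08000004 and IRQ0 at 0x08000040, and SysTick (exception 15) maps to IRQ number -1.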

NVIC Priority System

This is the single most confusing aspect of Cortex-M interrupts, and interviewers love testing it.

Priority bits are split into two fields:

| Field | Controls | Effect |
|---|---|---|
| Preemption priority (group priority) | Whether an ISR can interrupt another ISR | A running ISR with preemption priority 2 will be preempted by an IRQ with preemption priority 1 |
| Sub-priority | Tie-breaking when two IRQs pend simultaneously | Among IRQs with the same preemption priority, the one with lower sub-priority runs first, but it does NOT preempt |

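The split is configured globally via AIRCR.PRIGROUP. With 4 implemented priority bits (common on STM32), the raw field values map as shown below. Note that the STM32 HAL constants count preemption bits instead, so NVIC_PRIORITYGROUP_4 means "4 preemption bits" and writes the raw value 3:

| AIRCR.PRIGROUP (raw) | STM32 HAL constant | Preemption bits | Sub-priority bits | Preemption levels | Sub-priority levels |
|---|---|---|---|---|---|
| 0-3 | NVIC_PRIORITYGROUP_4 | 4 | 0 | 16 | 1 |
| 4 | NVIC_PRIORITYGROUP_3 | 3 | 1 | 8 | 2 |
| 5 | NVIC_PRIORITYGROUP_2 | 2 | 2 | 4 | 4 |
| 6 | NVIC_PRIORITYGROUP_1 | 1 | 3 | 2 | 8 |
| 7 | NVIC_PRIORITYGROUP_0 | 0 | 4 | 1 (no preemption) | 16 |

⚠️ Lower Number = Higher Priority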

Priority value 0 is the HIGHEST priority. Priority 15 is the LOWEST. This is the opposite of what most people intuit. When an interviewer asks "which interrupt runs first?", always check the numeric value — lower wins.
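To make the field split concrete, the sketch below decodes an 8-bit priority register value into its preemption and sub-priority parts and applies the "lower preemption value wins" rule. It assumes the implemented bits occupy the top of the byte, as on Cortex-M3/M4; `decode_priority` and `can_preempt` are illustrative names, not CMSIS functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint8_t preempt;   /* decides whether preemption happens */
    uint8_t sub;       /* tie-breaker only */
} irq_priority_t;

/* impl_bits: priority bits the vendor implemented (top bits of the byte).
 * sub_bits:  how many of those the priority grouping assigns to sub-priority. */
static inline irq_priority_t decode_priority(uint8_t reg, unsigned impl_bits, unsigned sub_bits)
{
    assert(impl_bits >= 1u && impl_bits <= 8u && sub_bits <= impl_bits);
    uint8_t value = (uint8_t)(reg >> (8u - impl_bits));   /* drop unimplemented low bits */
    irq_priority_t p;
    p.sub     = (uint8_t)(value & ((1u << sub_bits) - 1u));
    p.preempt = (uint8_t)(value >> sub_bits);
    return p;
}

/* A pending IRQ preempts a running ISR only if its preemption field is numerically lower. */
static inline bool can_preempt(uint8_t pending_reg, uint8_t running_reg,
                               unsigned impl_bits, unsigned sub_bits)
{
    return decode_priority(pending_reg, impl_bits, sub_bits).preempt <
           decode_priority(running_reg, impl_bits, sub_bits).preempt;
}
```

With 4 implemented bits split 2/2, register value 0x40 decodes to preemption 1 and 0x80 to preemption 2, so the first can preempt the second. Values 0x40 and 0x50 share preemption 1 and differ only in sub-priority, so neither preempts the other.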

ISR Design Rules

The golden rules for writing interrupt service routines:

| Do | Don't |
|---|---|
| Clear the interrupt flag first | Call printf(), malloc(), or any blocking function |
| Set a flag and return | Use delay_ms() or busy-wait loops |
| Use volatile for shared variables | Access complex data structures without protection |
| Keep execution under 10 us for most ISRs | Call non-reentrant library functions |
| Use ring buffers for ISR-to-main data transfer | Perform heavy computation |

A typical well-designed ISR follows this pattern:

c
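    #include <stdint.h>
    // USART1 and USART_SR_RXNE come from the vendor device header
    // (STM32F1/F4-style register names are assumed here)

    volatile uint8_t uart_rx_flag = 0;   // Set by the ISR, cleared by the main loop
    volatile uint8_t uart_rx_byte;       // Most recently received byte

    void USART1_IRQHandler(void) {
        if (USART1->SR & USART_SR_RXNE) {    // Receive data register not empty
            uart_rx_byte = USART1->DR;       // Reading DR clears the RXNE flag
            uart_rx_flag = 1;                // Signal main loop
        }
    }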

The main loop checks uart_rx_flag, processes the byte, and clears the flag. The ISR does almost nothing — just captures the data and gets out.

Nested Interrupts

Nesting happens automatically on Cortex-M when a higher-preemption-priority interrupt fires while a lower-priority ISR is running. The hardware pushes the running ISR's context onto the stack and starts the higher-priority handler.

Time ──────────────────────────────────────────────────────►
Main code      ████████░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░██████████
IRQ_Low (P3)           ██████░░░░░░░░░░░░░░██████████
IRQ_High (P1)                ██████████████
                       ^     ^             ^         ^
                       │     │             │         │
               Low fires  High fires      High      Low
                          (preempts)     returns   returns

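Each nesting level consumes at least 32 bytes of stack for the hardware frame (8 registers x 4 bytes), plus whatever the handler itself pushes for locals and callee-saved registers. With 3 levels of nesting, you need at least 96 extra bytes of stack space beyond normal usage. On parts with an FPU, a frame that also saves floating-point state grows to 104 bytes. This is why stack sizing must account for worst-case nesting depth.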
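A rough stack budget can be computed from the nesting depth. The sketch below assumes the basic 32-byte frame, adds 4 bytes per level for the padding the hardware may insert to keep the stack 8-byte aligned, and takes each handler's own stack use as an input (for example from a compiler stack-usage report). The function name and constants are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define BASIC_FRAME_BYTES 32u   /* R0-R3, R12, LR, PC, xPSR */
#define ALIGN_PAD_BYTES    4u   /* possible padding for 8-byte stack alignment */

/* isr_local_bytes[i]: stack the handler at nesting level i uses itself
 * (locals, callee-saved registers). */
static inline uint32_t worst_case_irq_stack(const uint32_t *isr_local_bytes, uint32_t levels)
{
    assert(isr_local_bytes != 0 || levels == 0u);
    uint32_t total = 0u;
    for (uint32_t i = 0u; i < levels; i++) {
        total += BASIC_FRAME_BYTES + ALIGN_PAD_BYTES + isr_local_bytes[i];
    }
    return total;
}
```

Three nested handlers using 16, 24, and 40 bytes of their own stack would need 3 x 36 + 80 = 188 bytes on top of the main program's usage under these assumptions.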

Tail-Chaining and Late Arrival

Tail-chaining: When an ISR finishes and another interrupt is already pending, the NVIC skips the full pop-then-push cycle. Instead, it takes only 6 cycles to switch directly to the next handler. This dramatically improves throughput when multiple interrupts fire in rapid succession.

Late arrival: If a higher-priority interrupt arrives during the context-save phase of a lower-priority interrupt (within the 12-cycle entry window), the NVIC diverts to the higher-priority handler instead. When that handler finishes, it tail-chains to the original lower-priority handler. No cycles are wasted.
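The throughput gain from tail-chaining is simple arithmetic. The sketch below compares context-switch overhead for a burst of back-to-back ISRs with and without it. The 12-cycle entry and 6-cycle tail-chain figures come from the text above; the exit cost is a caller-supplied parameter, and the 12 cycles used in the example is an assumption for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Overhead if every ISR in the burst needs a full push and a full pop. */
static inline uint32_t switch_cycles_plain(uint32_t n, uint32_t entry_cycles,
                                           uint32_t exit_cycles)
{
    return n * (entry_cycles + exit_cycles);
}

/* Overhead with tail-chaining: one push, n-1 tail-chains, one pop. */
static inline uint32_t switch_cycles_tail_chained(uint32_t n, uint32_t entry_cycles,
                                                  uint32_t exit_cycles, uint32_t tail_cycles)
{
    assert(n >= 1u);
    return entry_cycles + (n - 1u) * tail_cycles + exit_cycles;
}
```

For three pending ISRs with 12-cycle entry, an assumed 12-cycle exit, and 6-cycle tail-chains, the overhead drops from 72 cycles to 36.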

Sharing Data Between ISR and Main Loop

This is where most bugs hide. Three rules:

Rule 1: volatile — The compiler cannot know that an ISR modifies a variable. Without volatile, the compiler may cache the variable in a register and never re-read it from memory, causing the main loop to spin forever on a stale value.

Rule 2: Atomicity. On a 32-bit Cortex-M, a single aligned 32-bit read or write is atomic. But a 64-bit variable, a struct, or a read-modify-write sequence is NOT atomic. If the ISR fires between a read and a write in the main loop, or between the two halves of a 64-bit access, data corruption occurs.
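One way to read a 64-bit value that only an ISR writes, without disabling interrupts, is to read the upper half twice and retry if it changed. This is a sketch of that retry technique, not something the text above prescribes; the variable and function names are hypothetical, and `tick_isr` stands in for a real timer handler.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit tick counter kept as two 32-bit halves; only the ISR writes them. */
static volatile uint32_t tick_hi;
static volatile uint32_t tick_lo;

/* Body of a hypothetical timer ISR: increment with carry. */
static inline void tick_isr(void)
{
    uint32_t lo = tick_lo + 1u;
    tick_lo = lo;
    if (lo == 0u) {
        tick_hi = tick_hi + 1u;   /* carry into the upper half */
    }
}

/* Main-loop read: retry if the ISR changed the upper half in between. */
static inline uint64_t tick_read(void)
{
    uint32_t hi;
    uint32_t lo;
    do {
        hi = tick_hi;
        lo = tick_lo;
    } while (hi != tick_hi);
    return ((uint64_t)hi << 32) | lo;
}
```

The retry works because the ISR runs to completion relative to the main loop, so each individual 32-bit read sees either the old or the new state, and a changed upper half reveals that an update slipped in between.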

Rule 3: Critical sections — For non-atomic operations on shared data, temporarily disable interrupts:

c
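    // Assumes interrupts were enabled on entry; __enable_irq() turns them on unconditionally
    __disable_irq();                    // Enter critical section: mask all configurable interrupts
    // Update both fields together so an ISR never sees a half-written struct
    shared_struct.field1 = new_val1;
    shared_struct.field2 = new_val2;
    __enable_irq();                     // Leave critical section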

Keep critical sections as short as possible — every cycle with interrupts disabled adds to worst-case latency. For producer-consumer patterns (ISR produces, main loop consumes), a lock-free ring buffer is often better than disabling interrupts.
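When a critical section can be entered with interrupts already masked, or from inside another critical section, a common refinement is to save the mask state on entry and restore it on exit. The sketch below runs on a host by simulating PRIMASK with a plain variable; the `sim_*` helpers are stand-ins, and on a real target they would map to the CMSIS intrinsics `__get_PRIMASK()`, `__disable_irq()`, and `__set_PRIMASK()`.

```c
#include <assert.h>
#include <stdint.h>

/* Host-side stand-in for the PRIMASK register: 1 = interrupts masked. */
static uint32_t sim_primask;
static inline uint32_t sim_get_primask(void)       { return sim_primask; }
static inline void     sim_disable_irq(void)       { sim_primask = 1u; }
static inline void     sim_set_primask(uint32_t v) { sim_primask = v & 1u; }

/* Enter: remember the previous mask state, then mask interrupts. */
static inline uint32_t critical_enter(void)
{
    uint32_t saved = sim_get_primask();
    sim_disable_irq();
    return saved;
}

/* Exit: restore the state the caller found, rather than force-enabling. */
static inline void critical_exit(uint32_t saved)
{
    sim_set_primask(saved);
}
```

With this pairing, leaving an inner critical section keeps interrupts masked until the outermost one exits.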

💡 Ring Buffer Pattern

A ring buffer with separate read and write indices (each written by only one side) can be completely lock-free on a single-core 32-bit MCU with one producer and one consumer. The ISR writes data and then advances the write index; the main loop reads data and then advances the read index. No critical section is needed, as long as both indices are volatile, their updates are single aligned 32-bit writes, and each side finishes touching the data before it publishes its index.
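A minimal single-producer, single-consumer version might look like the following. It assumes a single core, a power-of-two buffer size, and free-running 32-bit indices that are masked on access; the type and function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 16u   /* must be a power of two */
static_assert((RB_SIZE & (RB_SIZE - 1u)) == 0u, "RB_SIZE must be a power of two");

typedef struct {
    volatile uint8_t  data[RB_SIZE];
    volatile uint32_t head;   /* written only by the producer (ISR) */
    volatile uint32_t tail;   /* written only by the consumer (main loop) */
} ring_buffer_t;

/* Producer side, called from the ISR. Returns false if the buffer is full. */
static inline bool rb_push(ring_buffer_t *rb, uint8_t byte)
{
    uint32_t head = rb->head;
    if (head - rb->tail == RB_SIZE) {
        return false;                          /* full: the byte is dropped */
    }
    rb->data[head & (RB_SIZE - 1u)] = byte;    /* store the data first... */
    rb->head = head + 1u;                      /* ...then publish it with one 32-bit write */
    return true;
}

/* Consumer side, called from the main loop. Returns false if the buffer is empty. */
static inline bool rb_pop(ring_buffer_t *rb, uint8_t *out)
{
    uint32_t tail = rb->tail;
    if (tail == rb->head) {
        return false;
    }
    *out = rb->data[tail & (RB_SIZE - 1u)];    /* read the data first... */
    rb->tail = tail + 1u;                      /* ...then release the slot */
    return true;
}
```

Because the indices run freely and wrap with unsigned arithmetic, `head - tail` is always the number of stored bytes, which lets the buffer use all 16 slots without a separate count variable.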

Edge vs Level Triggered

| Trigger Type | Fires When | Missed-Interrupt Risk | Best For |
|---|---|---|---|
| Edge | Signal transitions (rising, falling, or both) | Yes: a second edge that arrives before the first has been acknowledged merges with it and is lost | Button presses, encoder pulses, one-shot events |
| Level | Signal is held at a given state | No: interrupt re-fires until source is cleared | Peripheral status flags (UART RX ready, DMA complete) |

Most internal MCU peripheral interrupts are effectively level-triggered — the interrupt stays pending as long as the status flag is set. Clearing the flag in the ISR deasserts the interrupt. If you forget to clear the flag, the ISR will fire again immediately upon return, creating an infinite loop.

External GPIO interrupts are configurable as edge or level on many MCUs, though some (STM32 EXTI lines, for example) support edges only. For mechanical buttons, use edge-triggered with debouncing to avoid multiple spurious interrupts from contact bounce.
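One simple software debounce is a time lockout: accept an edge, then ignore further edges for a fixed window. The sketch below assumes a free-running millisecond tick is available to the GPIO ISR; the struct, function name, and window length are illustrative choices.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t last_accept_ms;   /* timestamp of the last accepted edge */
    bool     seen_any;         /* false until the first edge has been accepted */
} debounce_t;

/* Call from the edge-triggered GPIO ISR with the current millisecond tick.
 * Returns true if the edge should be treated as a real press. */
static inline bool debounce_accept(debounce_t *d, uint32_t now_ms, uint32_t lockout_ms)
{
    assert(d != 0);
    if (d->seen_any && (uint32_t)(now_ms - d->last_accept_ms) < lockout_ms) {
        return false;          /* still inside the bounce window: ignore */
    }
    d->last_accept_ms = now_ms;
    d->seen_any = true;
    return true;
}
```

The unsigned subtraction keeps the comparison correct when the tick counter wraps around.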

Interrupt Latency Breakdown

Total interrupt response time = hardware latency + software overhead:

| Component | Cycles (Cortex-M4) | Notes |
|---|---|---|
| Finish current instruction | Varies | Cortex-M3/M4 can interrupt load/store-multiple and abandon a divide rather than wait for it, so long instructions add little |
| Context save (push 8 regs) | 12 | Fixed by hardware, assuming zero-wait-state memory |
| Fetch ISR address from vector table | Included | Part of the 12-cycle entry |
| ISR prologue (compiler-generated) | 0-4 | Stack frame setup if needed |
| Total minimum | 12 | Best case, no higher-priority ISR running |

Factors that increase latency:

  • A higher- or equal-priority ISR already running (the new IRQ stays pending until that ISR returns)
  • Flash wait states (ISR code in Flash with wait states adds stalls)
  • Bus contention (DMA competing for bus access)
  • Critical sections in main code (__disable_irq() blocks all maskable interrupts)

⚠️ Common Trap: Long Critical Sections

Every microsecond spent with interrupts disabled adds directly to worst-case interrupt latency. If your main loop disables interrupts for 50 us to update a data structure, then no interrupt can respond faster than 50 us — even if the NVIC hardware latency is only 12 cycles. Minimize critical section duration or use lock-free data structures.
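The worst-case figure can be estimated by adding the contributions up. The sketch below is deliberately simplified: it assumes each higher-priority ISR runs at most once before the IRQ of interest is served, and all names and inputs are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a cycle count to nanoseconds at the given core clock, rounding up. */
static inline uint64_t cycles_to_ns(uint32_t cycles, uint32_t cpu_hz)
{
    assert(cpu_hz > 0u);
    return ((uint64_t)cycles * 1000000000ull + cpu_hz - 1u) / cpu_hz;
}

/* Worst-case response time of one IRQ: hardware entry, plus the longest
 * interrupts-disabled window, plus time spent in higher-priority ISRs. */
static inline uint64_t worst_case_latency_ns(uint32_t entry_cycles, uint32_t cpu_hz,
                                             uint64_t longest_critical_ns,
                                             uint64_t higher_priority_isr_ns)
{
    return cycles_to_ns(entry_cycles, cpu_hz) + longest_critical_ns + higher_priority_isr_ns;
}
```

At 100 MHz the 12-cycle entry is 120 ns, so a 50 us critical section dominates completely: the total is 50,120 ns.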

Debugging Story: The Disappearing UART Bytes

A team was developing a motor controller that received commands over UART at 115200 baud. During light motor loads, communication worked perfectly. Under heavy load, bytes were randomly dropped — about 1 in 50 commands was corrupted or lost.

The root cause: the motor control ISR (running at a higher priority than UART) took 120 us to execute in worst case. At 115200 baud, a new byte arrives every ~87 us. When the motor ISR ran for 120 us, it blocked the UART ISR long enough for the UART hardware FIFO (only 1 byte deep on this MCU) to overflow. The second byte overwrote the first before the UART ISR could read it.

The fix had two parts: (1) reduce the motor ISR execution time to under 20 us by moving the heavy computation (PID calculations, sine table lookups) to the main loop and only doing the time-critical PWM register update in the ISR, and (2) switch to DMA-based UART reception to decouple reception from ISR timing entirely.

The lesson: ISR execution time does not just affect that interrupt — it affects the latency of every lower-priority interrupt in the system. Always analyze worst-case timing across all priority levels.
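The timing in this story can be checked with a few lines of arithmetic. The sketch assumes 8N1 framing (10 bits per byte on the wire) and a receiver that can hold only one completed byte, as described above; the function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Time for one UART frame in nanoseconds (8N1 = 10 bits per frame). */
static inline uint64_t uart_frame_time_ns(uint32_t baud, uint32_t bits_per_frame)
{
    assert(baud > 0u);
    return ((uint64_t)bits_per_frame * 1000000000ull) / baud;
}

/* With room for only one received byte, the RX ISR must run within one
 * frame time of a byte arriving, or the next byte causes an overrun. */
static inline bool rx_overrun_possible(uint64_t worst_block_ns, uint32_t baud,
                                       uint32_t bits_per_frame)
{
    return worst_block_ns > uart_frame_time_ns(baud, bits_per_frame);
}
```

At 115200 baud a frame lasts about 86.8 us, so a 120 us motor ISR can cause an overrun while a 20 us one cannot.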

What Interviewers Want to Hear

  • You understand the NVIC priority system (preemption vs sub-priority, lower number = higher priority)
  • You can explain the full interrupt lifecycle: trigger, context save, vector lookup, ISR, context restore
  • You know the ISR golden rules: short, no blocking, volatile for shared data, clear the flag
  • You can analyze interrupt latency and identify what affects it
  • You understand the difference between edge and level triggered
  • You can describe how to safely share data between ISR and main loop (volatile + critical sections or lock-free structures)

Interview Focus

Classic Interview Questions

Q1: "What happens when an interrupt fires on a Cortex-M processor?"

Model Answer Starter: "The hardware finishes the current instruction, pushes eight registers (R0-R3, R12, LR, PC, xPSR) onto the active stack, loads the handler address from the vector table, and begins executing in Handler mode. The entire entry takes 12 cycles on Cortex-M3/M4. When the ISR returns via a special EXC_RETURN value in LR, the hardware pops the saved registers and resumes the interrupted code. If another interrupt is already pending, it tail-chains in 6 cycles instead of doing a full pop-and-push."

Q2: "How do you safely share data between an ISR and the main loop?"

Model Answer Starter: "Three rules: volatile so the compiler does not optimize away re-reads, atomicity for multi-byte data, and critical sections or lock-free structures for compound operations. A single 32-bit read or write is atomic on Cortex-M, but anything larger needs protection. For simple flags, a volatile uint32_t is sufficient. For streaming data, I use a lock-free ring buffer with separate read and write indices. For complex structures, I briefly disable interrupts with __disable_irq() / __enable_irq(), keeping the critical section as short as possible to minimize latency impact."

Q3: "Explain preemption priority vs sub-priority on NVIC."

Model Answer Starter: "Preemption priority determines whether one ISR can interrupt another — a lower preemption value will preempt a running ISR with a higher value. Sub-priority only matters as a tie-breaker when two interrupts with the same preemption priority pend simultaneously — the lower sub-priority runs first, but it cannot preempt the other. The split between preemption and sub-priority bits is configured globally via AIRCR.PRIGROUP. A common mistake is assuming sub-priority enables preemption — it does not."

Q4: "Why should ISRs be kept short?"

Model Answer Starter: "A long ISR blocks all lower-priority interrupts from running, increasing their worst-case latency. In a system with a motor control ISR taking 100 us, a lower-priority UART ISR can be delayed by up to 100 us even though the hardware latency is 12 cycles. I keep ISRs short by doing the minimum necessary: clear the flag, capture the data, set a flag for the main loop. Heavy processing like PID calculations, protocol parsing, or data logging goes in the main loop or a deferred task."

Q5: "How would you debug a system where interrupts seem to be lost?"

Model Answer Starter: "I would check several things systematically: (1) Is the interrupt flag being cleared in the ISR? If not, a level-triggered interrupt re-fires immediately and appears to work but may mask a second event. (2) Is a higher-priority ISR blocking for too long? Use a scope or GPIO toggle to measure ISR execution times. (3) Is the main loop disabling interrupts for extended periods? (4) For edge-triggered interrupts, is a second edge arriving before the pending flag from the first has been cleared, so that two events collapse into one? I would toggle a GPIO at ISR entry/exit and use a logic analyzer to correlate interrupt signals with ISR timing."

Trap Alerts

  • Don't say: "All interrupts have the same priority" — Cortex-M has a configurable priority system; understanding it is fundamental
  • Don't forget: volatile on every variable shared between ISR and main code — the most common real-world ISR bug
  • Don't ignore: The impact of ISR execution time on lower-priority interrupt latency — this is a system-level concern, not per-interrupt

Follow-up Questions

  • "What is tail-chaining and how does it improve interrupt throughput?"
  • "How would you handle priority inversion between two ISRs that share a resource?"
  • "What is the difference between __disable_irq() and masking a specific interrupt with NVIC?"
  • "How do you determine the right stack size when using nested interrupts?"
  • "What happens if you forget to clear an interrupt flag in a level-triggered ISR?"

Practice

On Cortex-M, which priority value represents the HIGHEST priority?

What is tail-chaining in the NVIC?

Why must variables shared between an ISR and the main loop be declared volatile?

What happens if you forget to clear the interrupt flag in a level-triggered ISR?

A Cortex-M4 system has 4 priority bits and a priority grouping of 2 preemption bits and 2 sub-priority bits. IRQ_A has priority register value 0x40 and IRQ_B has 0x80. Can IRQ_A preempt a running IRQ_B?

Real-World Tie-In

Automotive Engine Control — An engine ECU uses interrupts at three priority levels: crankshaft position sensor (highest, sub-microsecond deadline for ignition timing), fuel injector control (medium, millisecond-level), and CAN bus communication (lowest). The crankshaft ISR does nothing but capture a timer value and set a flag; the actual ignition angle calculation runs in a high-priority RTOS task. This separation ensures the capture timestamp is always accurate regardless of system load.

Industrial Sensor Hub — A process monitoring system reads 8 analog sensors via DMA (triggered by timer interrupt) while simultaneously receiving Modbus commands over UART. The DMA-complete ISR simply swaps buffer pointers (double-buffering); the UART ISR feeds bytes into a ring buffer. All protocol parsing and sensor averaging happens in the main loop. Under worst-case load, the maximum interrupt-disabled time is 2 us (a single pointer swap in a critical section), ensuring no UART bytes are lost even at 460800 baud.