How does dynamic frequency scaling work on an MCU, and what are the pitfalls?
Dynamic frequency scaling (DFS) means changing the CPU clock frequency at runtime to match the current workload: running at full speed during computationally intensive phases (signal processing, protocol handling) and dropping to a lower frequency during idle or light-load phases to save power. Dynamic power scales linearly with frequency (P = C * V^2 * f), so at a fixed core voltage, halving the clock frequency roughly halves the dynamic power consumption; static (leakage) power is unaffected. On STM32, DFS is implemented by changing the PLL multiplier/divider values or the AHB/APB prescalers while the system is running.
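To make the power relation concrete, here is a minimal sketch that evaluates P = C * V^2 * f on a host machine. The function name dfs_dynamic_power_w and the 100 pF effective capacitance used in the note below are illustrative assumptions, not datasheet values.

```c
#include <assert.h>

/* Dynamic power in watts from P = C * V^2 * f.
 * c_eff_farads: effective switched capacitance (illustrative figure only),
 * v_volts:      core supply voltage,
 * f_hz:         clock frequency. */
double dfs_dynamic_power_w(double c_eff_farads, double v_volts, double f_hz)
{
    assert(c_eff_farads >= 0.0 && v_volts >= 0.0 && f_hz >= 0.0);
    return c_eff_farads * v_volts * v_volts * f_hz;
}
```

With an assumed 100 pF at 1.2 V, the estimate is about 24 mW at 168 MHz and about 12 mW at 84 MHz, which is the halving described above. Only the dynamic term is modelled here.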
The implementation sequence is critical: raising the frequency follows the same order as the initial clock configuration, and lowering it reverses that order. To reduce frequency: first change the PLL configuration or prescalers to the lower frequency, wait for the PLL to re-lock (if PLL settings changed), then reduce Flash wait states to match the new frequency (fewer wait states improve performance at lower speeds). To increase frequency: first increase Flash wait states to accommodate the higher speed, then reconfigure the PLL/prescalers, and wait for PLL lock. On most STM32 families the PLL cannot be reprogrammed while it drives SYSCLK, so a PLL change also means switching SYSCLK temporarily to HSI or HSE, disabling the PLL, writing the new settings, re-enabling it, waiting for lock, and switching back; a prescaler-only change needs none of this. The ordering of Flash wait states relative to the frequency change is essential: too few wait states at a higher frequency causes instruction fetch corruption and hard faults, while too many wait states at a lower frequency is safe but wastes performance.
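The ordering rule can be sketched as follows. This is a host-runnable model under stated assumptions, not vendor code: the hardware is simulated with plain variables, the helper names (flash_set_wait_states, clock_switch_and_wait_lock, dfs_set_hclk) are hypothetical, and the threshold of one wait state per 30 MHz is illustrative, so take the real table from the reference manual for your part and supply voltage.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated hardware state so the sketch runs on a host. On a real part
 * these would be FLASH and RCC register accesses. */
static uint32_t sim_hclk_hz = 16000000u;
static uint32_t sim_flash_ws = 0u;
static int sim_underprovisioned = 0; /* set if Flash was ever too slow for HCLK */

/* ASSUMPTION: one extra wait state per 30 MHz step (illustrative only). */
static uint32_t required_wait_states(uint32_t hclk_hz)
{
    return (hclk_hz - 1u) / 30000000u;
}

/* Records whether the current combination would be unsafe on real silicon. */
static void sim_check(void)
{
    if (sim_flash_ws < required_wait_states(sim_hclk_hz)) {
        sim_underprovisioned = 1;
    }
}

/* Hypothetical helper: on hardware, write the latency field and read it back. */
static void flash_set_wait_states(uint32_t ws)
{
    sim_flash_ws = ws;
    sim_check();
}

/* Hypothetical helper: on hardware, reprogram PLL/prescalers and wait for lock. */
static void clock_switch_and_wait_lock(uint32_t hz)
{
    sim_hclk_hz = hz;
    sim_check();
}

/* Change HCLK while keeping Flash wait states sufficient at every step. */
void dfs_set_hclk(uint32_t new_hz)
{
    assert(new_hz > 0u);
    uint32_t ws = required_wait_states(new_hz);

    if (new_hz > sim_hclk_hz) {
        flash_set_wait_states(ws);          /* going up: wait states first */
        clock_switch_and_wait_lock(new_hz); /* then raise the clock */
    } else {
        clock_switch_and_wait_lock(new_hz); /* going down: clock first */
        flash_set_wait_states(ws);          /* then trim wait states */
    }
}
```

Calling dfs_set_hclk(168000000u) and then dfs_set_hclk(24000000u) leaves sim_underprovisioned at 0 in this model; swapping the two calls in either branch sets it, which is the failure the text describes.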
The major pitfall is that every peripheral whose timing depends on the clock frequency must be reconfigured after a frequency change. The UART baud rate is derived from the APB clock: if you halve PCLK1 without updating the USART BRR register, the baud rate halves and communication fails. SPI clock dividers, timer prescaler/ARR values, ADC sampling time, I2C timing registers, and SysTick reload values all depend on their bus clock frequency. Missing even one reconfiguration produces a subtle bug: after a small frequency step a UART may keep working at a slightly wrong baud rate with occasional framing errors, or after a halving a timer interrupt fires at half the expected rate. A robust DFS implementation maintains a table of peripheral reconfiguration callbacks that are invoked automatically whenever the clock changes. Alternatively, some STM32H7 designs avoid full PLL reconfiguration by keeping the PLL fixed and only changing the AHB prescaler, which is simpler but limits the available frequencies to power-of-two divisions of the PLL output.
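A callback table of the kind described above might look like the sketch below. All names (dfs_register, dfs_notify, uart_sim, systick_sim) are hypothetical, and the peripherals are modelled as plain structs so the code runs on a host. The UART listener assumes 16x oversampling, where the divider is the peripheral clock divided by the baud rate, and the SysTick listener assumes a 1 ms tick clocked from HCLK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DFS_MAX_LISTENERS 8u

/* Called after every clock change with the new bus frequencies. */
typedef void (*dfs_clock_cb)(uint32_t hclk_hz, uint32_t pclk_hz, void *ctx);

typedef struct {
    dfs_clock_cb fn;
    void *ctx;
} dfs_listener;

static dfs_listener dfs_listeners[DFS_MAX_LISTENERS];
static size_t dfs_listener_count;

/* Adds a listener; returns 0 on success, -1 if the table is full. */
int dfs_register(dfs_clock_cb fn, void *ctx)
{
    assert(fn != NULL);
    if (dfs_listener_count >= DFS_MAX_LISTENERS) {
        return -1;
    }
    dfs_listeners[dfs_listener_count].fn = fn;
    dfs_listeners[dfs_listener_count].ctx = ctx;
    dfs_listener_count++;
    return 0;
}

/* Invokes every registered listener in registration order. */
void dfs_notify(uint32_t hclk_hz, uint32_t pclk_hz)
{
    for (size_t i = 0; i < dfs_listener_count; i++) {
        dfs_listeners[i].fn(hclk_hz, pclk_hz, dfs_listeners[i].ctx);
    }
}

/* Simulated UART: brr stands in for the USART BRR register. */
typedef struct {
    uint32_t baud;
    uint32_t brr;
} uart_sim;

/* ASSUMPTION: 16x oversampling, divider = pclk / baud rounded to nearest. */
void uart_on_clock_change(uint32_t hclk_hz, uint32_t pclk_hz, void *ctx)
{
    uart_sim *u = ctx;
    (void)hclk_hz;
    u->brr = (pclk_hz + u->baud / 2u) / u->baud;
}

/* Simulated SysTick: reload stands in for the reload value register. */
typedef struct {
    uint32_t reload;
} systick_sim;

/* ASSUMPTION: 1 ms tick clocked directly from HCLK. */
void systick_on_clock_change(uint32_t hclk_hz, uint32_t pclk_hz, void *ctx)
{
    systick_sim *s = ctx;
    (void)pclk_hz;
    s->reload = hclk_hz / 1000u - 1u;
}
```

After registering both listeners, dfs_notify(168000000u, 42000000u) gives a divider of 365 for 115200 baud, and dfs_notify(84000000u, 21000000u) gives 182. On real hardware each listener would probably also need to wait for the peripheral to go idle before rewriting its registers; that part is left out of the sketch.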
Source: MCU Cores & Clocking Q&A
