What is the overhead of setting up a DMA transfer, and when does it outweigh the benefit?

Setting up a DMA transfer means configuring several parameters: source address, destination address, transfer count (number of data items), data width (byte, half-word, or word), increment modes for source and destination, circular vs. normal mode, and channel priority, and then enabling the channel. On STM32, many of these settings are bit fields in a single channel control register, but once flag clearing and read-modify-write sequences are counted, a full setup still comes to roughly 10-20 register accesses. Including function call overhead when using the HAL (HAL_DMA_Start() or HAL_DMA_Start_IT()), the total setup cost is roughly 50-200 CPU cycles, depending on HAL version and compiler optimization level.
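
To make the setup sequence concrete, here is a minimal bare-metal sketch. The `dma_channel_regs` layout, the `DMA_CTRL_*` bit positions, and the `dma_setup_rx` function are hypothetical and chosen for illustration; they are not the STM32 register map or the HAL API. The function returns the number of register writes so the cost is visible.

```c
#include <stdint.h>

/* Hypothetical register block for one DMA channel. Field names and bit
 * positions are illustrative only and do not match a specific device. */
typedef struct {
    volatile uint32_t ctrl;        /* enable, width, increment, mode, priority */
    volatile uint32_t count;       /* number of data items to move */
    volatile uint32_t periph_addr; /* address of the peripheral data register */
    volatile uint32_t mem_addr;    /* address of the memory buffer */
} dma_channel_regs;

#define DMA_CTRL_ENABLE    (1u << 0)
#define DMA_CTRL_MEM_INC   (1u << 1) /* advance the memory address per item */
#define DMA_CTRL_CIRCULAR  (1u << 2)
#define DMA_CTRL_WIDTH_8   (0u << 4)
#define DMA_CTRL_PRIO_HIGH (2u << 8)

/* Full setup for a peripheral-to-memory transfer of n_items bytes.
 * Returns how many register writes were performed. */
unsigned dma_setup_rx(dma_channel_regs *ch, uint32_t periph_addr,
                      uint32_t mem_addr, uint16_t n_items)
{
    unsigned writes = 0;

    ch->ctrl = 0;                   writes++; /* disable before reconfiguring */
    ch->periph_addr = periph_addr;  writes++; /* source: peripheral register */
    ch->mem_addr = mem_addr;        writes++; /* destination: RAM buffer */
    ch->count = n_items;            writes++; /* transfer length */
    ch->ctrl = DMA_CTRL_MEM_INC | DMA_CTRL_WIDTH_8 | DMA_CTRL_PRIO_HIGH;
    writes++;                                 /* normal mode, byte width */
    ch->ctrl |= DMA_CTRL_ENABLE;    writes++; /* enable last */

    return writes;
}
```

This sketch comes to six writes because it skips the steps a production driver also needs, such as clearing stale status flags, enabling interrupts, and turning on the peripheral's own DMA request. Those extra steps, plus HAL bookkeeping, are what push a real setup toward the higher figures quoted above.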

This setup cost is amortized over the entire transfer. For a 1000-byte SPI transfer, about 100 cycles of setup versus at least 1000 cycles of the CPU moving every byte itself is a clear win, especially since the CPU is free to do other work while the DMA transfer runs. But for a 2-byte I2C register read, the DMA setup alone costs more CPU cycles than simply polling for the two bytes. The crossover is typically around 8-16 bytes, depending on the peripheral clock speed and the CPU clock.
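
The crossover can be estimated with a simple cost model: DMA wins once the CPU cycles saved by not polling exceed the fixed setup cost. The sketch below assumes a constant per-byte polling cost and ignores completion-interrupt overhead; the function names and the example figures (100 setup cycles, 8-10 cycles per polled byte) are illustrative assumptions, not measurements.

```c
#include <stdint.h>

/* Smallest transfer size, in bytes, at which polling costs the CPU at
 * least as many cycles as the one-time DMA setup.
 * Model (an assumption): polling cost = n * poll_cycles_per_byte,
 *                        DMA cost     = setup_cycles. */
uint32_t dma_break_even_bytes(uint32_t setup_cycles,
                              uint32_t poll_cycles_per_byte)
{
    if (poll_cycles_per_byte == 0) {
        return UINT32_MAX; /* polling is free in this model: DMA never wins */
    }
    /* Ceiling division, done in 64 bits to avoid overflow. */
    uint64_t n = ((uint64_t)setup_cycles + poll_cycles_per_byte - 1)
                 / poll_cycles_per_byte;
    return (uint32_t)n;
}

/* Returns 1 if a transfer of n_bytes is at or past the break-even point. */
int dma_is_worthwhile(uint32_t n_bytes, uint32_t setup_cycles,
                      uint32_t poll_cycles_per_byte)
{
    return n_bytes >= dma_break_even_bytes(setup_cycles, poll_cycles_per_byte);
}
```

With the assumed 100 setup cycles and 10 cycles per polled byte, the break-even is 10 bytes, so a 2-byte read falls well below it and a 1000-byte transfer far above it. Measuring both numbers on the actual target is the only way to place the crossover precisely.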

For applications that perform repeated small transfers to the same peripheral (e.g., periodic 8-byte SPI sensor reads), the solution is to configure DMA once and re-trigger it for each transfer by reloading the transfer count register and re-enabling the channel, which reduces per-transfer overhead to 2-3 register writes. Circular mode eliminates re-configuration entirely for continuous transfers, making DMA effectively zero-overhead after the initial setup. The key insight: DMA is an investment that pays off in throughput and CPU freedom, not in latency for individual small operations.
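
The re-trigger path might look like the sketch below. It uses a hypothetical `dma_channel_regs` layout and a hypothetical `dma_retrigger` function, not a real device header or the HAL. It assumes the addresses, width, and mode were configured earlier and stay valid, and that the count register may only be rewritten while the channel is disabled, as is common on STM32-style controllers.

```c
#include <stdint.h>

/* Hypothetical register block for one DMA channel (illustrative only). */
typedef struct {
    volatile uint32_t ctrl;        /* bit 0 = channel enable in this sketch */
    volatile uint32_t count;       /* number of data items to move */
    volatile uint32_t periph_addr; /* left untouched on re-trigger */
    volatile uint32_t mem_addr;    /* left untouched on re-trigger */
} dma_channel_regs;

#define DMA_CTRL_ENABLE (1u << 0)

/* Restart a previously configured channel for another n_items transfer.
 * Only the enable bit and the count are touched; every other setting is
 * reused from the initial setup. Returns the number of register writes. */
unsigned dma_retrigger(dma_channel_regs *ch, uint16_t n_items)
{
    unsigned writes = 0;

    ch->ctrl &= ~DMA_CTRL_ENABLE;  writes++; /* disable so count is writable */
    ch->count = n_items;           writes++; /* reload the transfer length */
    ch->ctrl |= DMA_CTRL_ENABLE;   writes++; /* start the next transfer */

    return writes;
}
```

A periodic sensor read would call `dma_retrigger(ch, 8)` from its timer callback once the previous transfer has completed. On real hardware the completion flag usually needs clearing as well, which adds one more write.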

Source: DMA Q&A