
What is the difference between Flash, SRAM, TCM, and CCM on Cortex-M7?

Flash, SRAM, and TCM are the main memory types in a Cortex-M7 system, and CCM is the closely related memory found on older Cortex-M4 parts. Each has different characteristics that dictate how firmware should use it:

Flash (typically 512 KB to 2 MB) stores the program binary and constant data. It is non-volatile: contents survive power cycling. The key limitation is wait states, because Flash access is slower than the CPU clock. At 400 MHz, a Cortex-M7 may need 4-7 wait states per Flash read (the exact count depends on the part, the clock, and the supply voltage), meaning an instruction fetch from Flash takes 5-8 clock cycles. On STM32 parts that include it, the ART (Adaptive Real-Time) accelerator, together with the instruction cache, mitigates this by caching recently accessed Flash lines, but cache misses still incur the full wait-state penalty. Flash is also read-only during normal execution: writes require an erase-then-program sequence that takes milliseconds, during which reads from the affected Flash bank stall.
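The wait-state arithmetic above can be sketched in a few lines of C. This is a deliberately simplified model, not a description of any specific part: the function names are made up for illustration, a cache hit is assumed to cost one cycle, a miss is assumed to cost one full Flash read, and the hit rate is a workload figure you would have to measure.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles for one uncached Flash read: the access itself plus the wait states. */
uint32_t flash_read_cycles(uint32_t wait_states)
{
    return 1u + wait_states;
}

/* Average cycles per fetch, scaled by 1000 to avoid floating point.
 * hit_permille: cache hits per 1000 fetches (assumed workload figure).
 * Simplified model: a hit costs 1 cycle, a miss costs a full Flash read. */
uint32_t avg_fetch_millicycles(uint32_t wait_states, uint32_t hit_permille)
{
    assert(hit_permille <= 1000u);
    uint32_t miss_permille = 1000u - hit_permille;
    return hit_permille * 1u + miss_permille * flash_read_cycles(wait_states);
}
```

With 7 wait states and an assumed 95% hit rate, the model gives 1350 millicycles, or 1.35 cycles per fetch on average, while the worst case for any single fetch is still 8 cycles. That gap between average and worst case is why hard real-time code is often moved out of Flash.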

SRAM (typically 256-512 KB, split across multiple banks) holds runtime data: global variables, heap, and general-purpose buffers. SRAM access through the bus matrix takes 1-2 clock cycles at full speed, but this can increase when a DMA controller is accessing the same SRAM bank, because bus contention adds wait cycles. On Cortex-M7, SRAM is cacheable by default once the data cache is enabled, which improves average access time for frequently used data but introduces the cache coherency issues between the CPU and DMA discussed earlier.
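One practical consequence of cacheable SRAM is that DMA buffers should start on a cache-line boundary and span whole lines, so that cache maintenance on the buffer cannot affect neighbouring variables. The sketch below checks that property on the host. The helper names are hypothetical, and the 32-byte value is the Cortex-M7 data cache line size. On the target, the same address and length would typically be handed to the CMSIS-Core routines `SCB_CleanDCache_by_Addr` (before a DMA read from memory) and `SCB_InvalidateDCache_by_Addr` (after a DMA write to memory).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DCACHE_LINE_SIZE 32u  /* Cortex-M7 data cache line, in bytes */

/* Round a byte count up to a whole number of cache lines. */
size_t cache_line_round_up(size_t len)
{
    return (len + DCACHE_LINE_SIZE - 1u) & ~(size_t)(DCACHE_LINE_SIZE - 1u);
}

/* Nonzero if the buffer starts on a line boundary and covers whole lines,
 * so cache maintenance on it cannot touch neighbouring variables. */
int dma_buffer_is_line_safe(uintptr_t addr, size_t len)
{
    assert(len > 0u);
    return (addr % DCACHE_LINE_SIZE == 0u) && (len % DCACHE_LINE_SIZE == 0u);
}
```

A buffer declared with 32-byte alignment and a size passed through `cache_line_round_up` satisfies the check by construction.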

DTCM/ITCM (on many STM32 parts, roughly 16-64 KB of ITCM and 64-128 KB of DTCM) provide zero wait-state, deterministic access directly coupled to the CPU core. DTCM is for data: stack, ISR variables, and, where the DMA controller can reach it, DMA buffers that need cache-free access. ITCM is for code: critical ISRs and real-time control functions. TCM bypasses both the cache and the bus matrix, so access is single-cycle and deterministic regardless of DMA activity or cache state. The trade-off is size (much smaller than general SRAM) and DMA limitations (some STM32 variants cannot DMA directly to/from TCM, or have limited DMA connectivity).
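With GCC-style toolchains, placement in TCM is usually expressed with section attributes. The following is a minimal sketch under stated assumptions: the section names `.itcm_text` and `.dtcm_data` are hypothetical and must match output sections that the project's linker script maps onto the ITCM and DTCM address ranges, and `control_step` is an invented example function. The macros expand to nothing on a non-ARM host so the logic can be compiled and tested off-target.

```c
#include <assert.h>
#include <stdint.h>

/* Section names are hypothetical; they must match output sections that the
 * project's linker script maps onto the ITCM and DTCM address ranges. */
#if defined(__GNUC__) && defined(__arm__)
#define IN_ITCM __attribute__((section(".itcm_text")))
#define IN_DTCM __attribute__((section(".dtcm_data")))
#else
#define IN_ITCM /* host build: no special placement */
#define IN_DTCM
#endif

/* State touched by the control ISR: kept in DTCM for deterministic access. */
IN_DTCM static volatile int32_t g_integrator;

/* Time-critical step of a control loop: kept in ITCM so instruction fetches
 * never wait on Flash. Accumulates the error and clamps to +/- limit. */
IN_ITCM int32_t control_step(int32_t error, int32_t limit)
{
    assert(limit >= 0);
    int32_t acc = g_integrator + error;
    if (acc > limit)  acc = limit;
    if (acc < -limit) acc = -limit;
    g_integrator = acc;
    return acc;
}
```

Custom sections are not initialized automatically. The startup code has to copy ITCM code from Flash and copy or zero DTCM data before first use, which a default startup file may not do.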

CCM (Core-Coupled Memory) is ST's earlier counterpart to TCM, found on Cortex-M4 parts like the STM32F4. It provides zero wait-state access on a dedicated bus (D-bus), bypassing the main bus matrix. Unlike TCM, CCM on F4 is not accessible by DMA, and this is a hard hardware limitation. CCM is ideal for stack memory and CPU-only data structures (RTOS task stacks, computation buffers) but unusable for any buffer involved in DMA transfers. On M7 parts, CCM is replaced by DTCM/ITCM, which have broader DMA connectivity.
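Because a DMA transfer that targets CCM on an F4 simply does not work, a cheap address-range check in a driver's setup path can catch the mistake early. The sketch below assumes the CCM layout of an STM32F405/407-class part (64 KB at 0x10000000). The function names are hypothetical and the constants should be verified against the reference manual of the actual device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for an STM32F405/407-class part; verify against the
 * reference manual of the actual device. */
#define CCM_BASE 0x10000000u
#define CCM_SIZE (64u * 1024u)

/* Nonzero if any byte of [addr, addr + len) lies inside the CCM range. */
int buffer_overlaps_ccm(uint32_t addr, size_t len)
{
    assert(len > 0u);
    uint64_t start = addr;
    uint64_t end = start + len;                 /* one past the last byte */
    uint64_t ccm_end = (uint64_t)CCM_BASE + CCM_SIZE;
    return (start < ccm_end) && (end > CCM_BASE);
}

/* A DMA buffer is acceptable only if it stays entirely out of CCM. */
int dma_buffer_allowed(uint32_t addr, size_t len)
{
    return !buffer_overlaps_ccm(addr, len);
}
```

A driver could assert `dma_buffer_allowed(...)` on every buffer it is given in debug builds, which turns a silent transfer failure into an immediate, local error.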

Source: CPU Fundamentals Q&A