What is a deadlock and how do you prevent it?

Question

Accepted Answer

A deadlock occurs when two or more tasks are each waiting for a resource held by another, creating a circular dependency where none can make progress. The classic example: Task A holds Mutex 1 and waits for Mutex 2, while Task B holds Mutex 2 and waits for Mutex 1. Neither can proceed.

Four conditions must all be true for deadlock to occur (the Coffman conditions): (1) mutual exclusion — resources are non-shareable; (2) hold and wait — a task holds at least one resource while waiting for another; (3) no preemption — resources cannot be forcibly taken away; (4) circular wait — a circular chain of tasks each waiting for a resource held by the next.
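The circular-wait condition can be made concrete as a cycle in a "wait-for" graph. The following is a minimal sketch, not a kernel facility: the function name has_circular_wait, the NO_TASK sentinel, and the array representation are all assumptions made for illustration. It assumes each task blocks on at most one resource at a time and that each resource has a single holder, so the graph reduces to one outgoing edge per task.

```c
#include <stdbool.h>
#include <stddef.h>

#define NO_TASK (-1) /* hypothetical sentinel: this task is not blocked */

/*
 * waits_for[i] is the index of the task that holds the resource task i is
 * blocked on, or NO_TASK if task i is runnable. Every non-sentinel entry is
 * assumed to be a valid index in [0, n_tasks).
 *
 * Returns true if some task can reach itself by following these edges,
 * which is exactly the circular-wait condition.
 */
bool has_circular_wait(const int *waits_for, size_t n_tasks)
{
    for (size_t start = 0; start < n_tasks; start++) {
        int cur = waits_for[start];

        /* Follow at most n_tasks edges from this starting task; if that
         * never leads back to it, it is not part of a cycle. */
        for (size_t hops = 0; cur != NO_TASK && hops < n_tasks; hops++) {
            if ((size_t)cur == start) {
                return true; /* chain led back to where it began */
            }
            cur = waits_for[cur];
        }
    }
    return false;
}
```

With the classic example, Task A (index 0) waits on Task B (index 1) and vice versa, so the array {1, 0} contains a cycle. If Task B is not blocked, the array {1, NO_TASK} has none and Task A will eventually proceed.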

Prevention strategies break one or more of these conditions. The most practical approach in embedded systems is lock ordering: always acquire multiple mutexes in the same global order (e.g., always lock Mutex 1 before Mutex 2), which eliminates circular wait. Timeout-based locking targets hold and wait instead: try to acquire with a timeout and, if that fails, release every lock already held and retry, preferably after a short randomized delay so that two tasks do not keep colliding in lockstep (livelock). Avoiding nested locks entirely, when possible, removes hold and wait altogether, and minimizing lock scope (holding locks for the shortest possible time) shrinks the window in which a deadlock can form. In a small RTOS with a handful of mutexes, a disciplined lock ordering convention documented in the code is usually sufficient. Some RTOS kernels also offer debug-build diagnostics that help catch locking errors during development (Zephyr's optional spinlock validation is one example), but check what your kernel actually detects before relying on it.
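As a rough illustration of the first two strategies, here is a sketch using POSIX threads in place of an RTOS mutex API. The helper names (lock_pair_ordered, unlock_pair, try_lock_pair) are hypothetical, and ordering by address is just one way to get a global order; a documented, fixed ranking of named mutexes works equally well. The second helper uses pthread_mutex_trylock, a zero-timeout stand-in for the timeout-based locking described above; on an RTOS you would more likely use the kernel's take-with-timeout call.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Lock ordering: acquire two distinct mutexes in one global order (here,
 * ascending address), whatever order the caller passes them in. Two tasks
 * calling this with swapped arguments still lock in the same sequence,
 * so a circular wait cannot form between them.
 */
void lock_pair_ordered(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release both mutexes; unlock order does not affect deadlock safety. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

/*
 * Try-lock with back-out: never block while holding a lock. If the second
 * mutex is unavailable, release the first and report failure so the caller
 * can back off and retry later.
 */
bool try_lock_pair(pthread_mutex_t *first, pthread_mutex_t *second)
{
    if (pthread_mutex_trylock(first) != 0) {
        return false;
    }
    if (pthread_mutex_trylock(second) != 0) {
        pthread_mutex_unlock(first); /* give up what we hold */
        return false;
    }
    return true;
}
```

A caller of try_lock_pair would typically loop: on failure, sleep or yield for a short, preferably randomized interval, then try again. The ordered version needs no retry logic, which is part of why it tends to be the simpler choice when the set of mutexes is small and known in advance.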