Quick Cap
Embedded Linux systems often split functionality across multiple processes for isolation, security, and reliability — one process handles the sensor, another runs the network stack, a third manages the UI. These processes need to communicate, and Linux provides a rich set of inter-process communication (IPC) mechanisms, each optimized for different patterns. The interview question is always: "which IPC would you use for X, and why?"
Key Facts:
- Pipes: Simplest IPC. Unidirectional, parent-child only (anonymous) or named (FIFO). Good for streaming data.
- Shared memory: Fastest IPC (zero-copy). Requires explicit synchronization (mutexes/semaphores). Best for large data buffers.
- Unix domain sockets: Most flexible. Bidirectional, stream or datagram, supports file descriptor passing. Used by systemd, D-Bus.
- Message queues (POSIX): Structured messages with priority. Good for command/event passing between processes.
- Signals: Lightweight notifications (no payload). Limited to predefined signal numbers. Not for data transfer.
- D-Bus: High-level message bus built on Unix sockets. Standard for Linux system services (BlueZ, NetworkManager, systemd).
Deep Dive
At a Glance
| Mechanism | Direction | Data Type | Latency | Synchronization | Best For |
|---|---|---|---|---|---|
| Pipe | Unidirectional | Byte stream | Low | Built-in (blocking read/write) | Parent-child streaming |
| Named pipe (FIFO) | Unidirectional | Byte stream | Low | Built-in | Unrelated processes, streaming |
| Shared memory | Bidirectional | Any (raw bytes) | Lowest (zero-copy) | Manual (mutex/semaphore) | Large data buffers, sensor frames |
| Unix socket | Bidirectional | Stream or datagram | Low | Built-in | General-purpose, most flexible |
| Message queue | Bidirectional | Structured messages | Low | Built-in, with priority | Command/event passing |
| Signal | Unidirectional | Signal number only | Very low | Async (interrupts process) | Notifications, no payload |
| D-Bus | Bidirectional | Typed messages | Higher (~100 us) | Built-in | System service APIs |
Pipes and FIFOs
Anonymous pipes are the simplest IPC — created with pipe(), they provide a unidirectional byte channel between a parent and child process. The shell's | operator uses pipes: cat file | grep pattern creates a pipe between cat and grep.
Named pipes (FIFOs) appear as files in the filesystem (created with mkfifo). Any process that can access the file can open it for reading or writing. This allows unrelated processes to communicate without a parent-child relationship.
Limitations: Pipes are unidirectional (you need two for bidirectional), have a kernel buffer (typically 64 KB on Linux), and block when the buffer is full (writer) or empty (reader). For bidirectional communication between unrelated processes, Unix sockets are almost always a better choice.
Shared Memory
Shared memory is the fastest IPC because data is not copied between processes — both processes map the same physical memory pages into their address space. The kernel is not involved in the data transfer (only in setup and teardown).
Two APIs:
- POSIX (shm_open + mmap): Creates a named shared memory object in /dev/shm/. Preferred for new code.
- System V (shmget + shmat): Older API, still widely used. Uses integer keys for identification.
The catch: Shared memory has no built-in synchronization. If two processes write to the same memory region simultaneously, data corruption occurs. You must pair shared memory with:
- POSIX mutexes (pthread_mutex_t with PTHREAD_PROCESS_SHARED) for mutual exclusion
- POSIX semaphores (sem_open) for signaling between processes
- Lock-free data structures (ring buffers with separate read/write indices) for high-performance paths
| Shared Memory Pattern | Synchronization | Use Case |
|---|---|---|
| Single writer, single reader | Lock-free ring buffer | Sensor data streaming |
| Multiple writers, single reader | Mutex-protected queue | Event aggregation |
| Multiple readers, single writer | Read-write lock or RCU-like pattern | Configuration broadcast |
The most common IPC bug in embedded Linux: two processes using shared memory without any locking. It works in testing (low load, deterministic scheduling) but corrupts data under production load when processes run on different CPU cores simultaneously. Always pair shared memory with explicit synchronization.
Unix Domain Sockets
Unix domain sockets are the most versatile IPC mechanism. They use the socket API (socket, bind, connect, send, recv) but communicate locally through a filesystem path instead of a network address.
Why Unix sockets are preferred over TCP for local IPC:
| Feature | Unix Domain Socket | TCP Loopback |
|---|---|---|
| Latency | Lower (no TCP/IP stack) | Higher (full protocol processing) |
| Overhead | No checksums, no sequence numbers | Full TCP overhead |
| File descriptor passing | Yes (SCM_RIGHTS) | No |
| Credential passing | Yes (SO_PEERCRED) | No |
| Datagram mode | Yes (reliable, unlike UDP) | No |
Unix sockets support both stream mode (connection-oriented, like TCP) and datagram mode (connectionless like UDP, but reliable and order-preserving). systemd uses Unix sockets for socket activation, D-Bus is built on Unix sockets, and most Linux system daemons use them for local communication.
D-Bus
D-Bus is a high-level message bus that provides typed, structured inter-process communication with service discovery, method calls, signals (events), and property access. It is built on Unix domain sockets but adds a protocol layer.
Two buses:
- System bus: System-wide services (NetworkManager, BlueZ Bluetooth, UPower). One instance for the whole machine; access is controlled by D-Bus policy files.
- Session bus: Per-user services (desktop applications). Rarely used on headless embedded systems.
When to use D-Bus in embedded:
- When you need a standard interface that other Linux services already use (Bluetooth via BlueZ API)
- When you need service discovery ("which services are available on the bus?")
- When you need a strongly-typed API between processes written in different languages
When NOT to use D-Bus:
- High-frequency data streaming (D-Bus marshaling adds ~100 us per message)
- Resource-constrained devices (the D-Bus daemon uses 1-2 MB RAM)
- Simple point-to-point communication (Unix sockets are simpler and faster)
Signals
Signals are the most primitive IPC — they deliver an integer (signal number) to a process asynchronously. The process can catch the signal with a handler, ignore it, or let the default action occur (usually terminate).
Commonly used signals in embedded:
| Signal | Default | Embedded Use |
|---|---|---|
| SIGTERM | Terminate | Graceful shutdown (flush data, close connections) |
| SIGKILL | Kill (uncatchable) | Force-kill unresponsive process |
| SIGHUP | Terminate | Reload configuration (by convention) |
| SIGUSR1 | Terminate | Application-defined (toggle debug mode) |
| SIGCHLD | Ignore | Reap child processes (avoid zombies) |
Limitations: Signals cannot carry data (only the signal number), are not queued (multiple same signals may be merged), and signal handlers run asynchronously in the process context — only async-signal-safe functions can be called in handlers.
Choosing the Right IPC
| Scenario | Best IPC | Why |
|---|---|---|
| Streaming sensor data between processes | Shared memory + ring buffer | Zero-copy, lowest latency |
| Command/response between services | Unix domain socket (stream) | Bidirectional, reliable, simple |
| Event notifications across system | D-Bus signals | Service discovery, typed events |
| Parent launches child, pipes output | Anonymous pipe | Simplest, built-in to fork/exec |
| "Reload config" signal to daemon | SIGHUP | Convention, no data needed |
| Passing a file descriptor to another process | Unix socket + SCM_RIGHTS | Only mechanism that supports this |
Debugging Story: Shared Memory Corruption in a Camera System
An embedded Linux camera system had two processes: a capture process writing frames to shared memory and a compression process reading them. During development with a single-core CPU, it worked perfectly. When deployed on a dual-core SoC, the compression process occasionally produced garbled images.
The root cause: both processes accessed the shared memory buffer without synchronization. On a single core, the scheduler ensured only one ran at a time. On dual-core, both ran simultaneously — the capture process was writing a new frame while the compression process was reading the previous one, resulting in a "torn" frame (half old, half new).
The fix: implement a double-buffer scheme with a lock-free swap mechanism. The capture process writes to buffer A while the compression process reads buffer B. When capture completes, an atomic pointer swap makes buffer A the "read" buffer. No mutex needed, no latency added.
The lesson: IPC bugs that depend on timing are the hardest to find. They often hide on single-core systems and only appear on multi-core. Always design for concurrent access from the start, even if your current hardware is single-core.
What Interviewers Want to Hear
- You can compare IPC mechanisms by latency, complexity, and use case — not just list them
- You understand that shared memory requires explicit synchronization
- You know Unix sockets are preferred over TCP loopback for local IPC (lower overhead, FD passing)
- You can recommend the right IPC for a specific architecture
- You know when D-Bus is appropriate (service APIs) vs overkill (simple data streaming)
- You understand signal limitations (no payload, async-unsafe handler context)
Interview Focus
Classic Interview Questions
Q1: "Compare shared memory and Unix domain sockets for IPC. When would you use each?"
Model Answer Starter: "Shared memory is the fastest — zero-copy, both processes access the same physical pages. But it requires explicit synchronization (mutexes or lock-free structures) and has no built-in flow control. I use it for high-bandwidth data like camera frames or audio buffers where latency matters. Unix domain sockets are slightly slower (data is copied through the kernel) but provide built-in flow control, connection management, and sequencing. I use them for command/response communication between services. For most embedded IPC, I default to Unix sockets unless profiling shows the copy overhead is a bottleneck."
Q2: "How do you synchronize access to shared memory between two processes?"
Model Answer Starter: "Three approaches depending on the access pattern. For single-writer, single-reader streaming data: a lock-free ring buffer with separate read and write indices, both stored in the shared region. For multiple writers: a POSIX mutex initialized with PTHREAD_PROCESS_SHARED attribute, stored in the shared memory itself. For a producer-consumer pattern: a POSIX semaphore (sem_open) to signal data availability. The key is matching the synchronization to the access pattern — a ring buffer is fastest for streaming, but a mutex is needed for random-access shared state."
Q3: "What is D-Bus and when would you use it vs a Unix socket?"
Model Answer Starter: "D-Bus is a message bus protocol built on Unix sockets that adds service discovery, typed method calls, property access, and broadcast signals. I use it when interfacing with existing Linux system services (BlueZ for Bluetooth, NetworkManager for networking) because they already expose D-Bus APIs. For custom application-level IPC where I control both sides, I use raw Unix sockets — simpler, faster, no daemon dependency. D-Bus adds about 100 us of marshaling overhead per message and requires the dbus-daemon process, so it is not suitable for high-frequency data transfer."
Q4: "What are the limitations of using signals for IPC?"
Model Answer Starter: "Signals are notifications only — they carry no data beyond the signal number. Standard signals are not queued: if SIGUSR1 is sent twice while the handler for the first is running, the second may be lost. Signal handlers run asynchronously and can only call async-signal-safe functions — no malloc, no printf, no mutex operations. The safe pattern is to set a volatile flag in the handler and check it in the main loop. For anything beyond simple notifications, use a proper IPC mechanism."
Q5: "You need to stream 30 FPS camera frames (1 MB each) between two processes. Which IPC would you choose?"
Model Answer Starter: "Shared memory with a double-buffer or ring buffer scheme. At 30 MB/s, copying data through pipes or sockets would consume significant CPU and add latency. With shared memory, the capture process writes directly to a buffer, then atomically swaps the buffer pointer. The consumer reads the previous buffer. Zero copy, sub-millisecond latency. I would use POSIX shared memory (shm_open + mmap) with cache-line-aligned buffers, and a lock-free swap mechanism using atomic operations for the buffer index."
Trap Alerts
- Don't say: "Just use shared memory, it's the fastest" — without mentioning synchronization requirements
- Don't forget: Unix domain socket datagrams ARE reliable (unlike UDP) — a common misconception
- Don't ignore: D-Bus overhead — it is inappropriate for high-frequency data but perfect for service APIs
Follow-up Questions
- "How would you implement a watchdog that monitors multiple processes using IPC?"
- "What is socket activation in systemd and how does it relate to IPC?"
- "How do you pass a file descriptor from one process to another?"
- "What is the difference between POSIX and System V shared memory?"
Practice
❓ Which IPC mechanism provides zero-copy data transfer between processes?
❓ Why are Unix domain sockets preferred over TCP loopback (127.0.0.1) for local IPC?
❓ What happens if two processes write to shared memory simultaneously without synchronization?
❓ A signal handler calls printf() to log the signal. What is wrong with this?
Real-World Tie-In
Automotive Sensor Fusion — An ADAS system runs camera capture, radar processing, and fusion algorithm as separate processes for fault isolation (if camera crashes, radar continues). Camera frames (2 MB, 30 FPS) go through shared memory with double buffering. Radar data (1 KB commands) goes through Unix domain sockets. The fusion process reads from both. This architecture survives individual process crashes without losing the other sensor feeds.
Smart Home Hub — A home automation gateway uses D-Bus as its central message bus. The Zigbee process publishes device events on D-Bus, the automation engine subscribes to events and sends commands back, and the web UI process queries device state via D-Bus properties. D-Bus service discovery means new protocol handlers (Z-Wave, Matter) can be added as drop-in services without modifying existing code.