Embedded Linux Essentials
intermediate
Weight: 4/10

Linux IPC mechanisms

Compare Linux inter-process communication mechanisms — pipes, shared memory, Unix sockets, message queues, signals, and D-Bus — and know when to use each in embedded systems.

embedded-linux
ipc
pipes
shared-memory
sockets
dbus
signals

Quick Recap

Embedded Linux systems often split functionality across multiple processes for isolation, security, and reliability — one process handles the sensor, another runs the network stack, a third manages the UI. These processes need to communicate, and Linux provides a rich set of inter-process communication (IPC) mechanisms, each optimized for different patterns. The interview question is always: "which IPC would you use for X, and why?"

Key Facts:

  • Pipes: Simplest IPC. Unidirectional, parent-child only (anonymous) or named (FIFO). Good for streaming data.
  • Shared memory: Fastest IPC (zero-copy). Requires explicit synchronization (mutexes/semaphores). Best for large data buffers.
  • Unix domain sockets: Most flexible. Bidirectional, stream or datagram, supports file descriptor passing. Used by systemd, D-Bus.
  • Message queues (POSIX): Structured messages with priority. Good for command/event passing between processes.
  • Signals: Lightweight notifications (no payload). Limited to predefined signal numbers. Not for data transfer.
  • D-Bus: High-level message bus built on Unix sockets. Standard for Linux system services (BlueZ, NetworkManager, systemd).

Deep Dive

At a Glance

| Mechanism | Direction | Data Type | Latency | Synchronization | Best For |
|---|---|---|---|---|---|
| Pipe | Unidirectional | Byte stream | Low | Built-in (blocking read/write) | Parent-child streaming |
| Named pipe (FIFO) | Unidirectional | Byte stream | Low | Built-in | Unrelated processes, streaming |
| Shared memory | Bidirectional | Any (raw bytes) | Lowest (zero-copy) | Manual (mutex/semaphore) | Large data buffers, sensor frames |
| Unix socket | Bidirectional | Stream or datagram | Low | Built-in | General-purpose, most flexible |
| Message queue | Bidirectional | Structured messages | Low | Built-in, with priority | Command/event passing |
| Signal | Unidirectional | Signal number only | Very low | Async (interrupts process) | Notifications, no payload |
| D-Bus | Bidirectional | Typed messages | Higher (~100 µs) | Built-in | System service APIs |

Pipes and FIFOs

Anonymous pipes are the simplest IPC — created with pipe(), they provide a unidirectional byte channel between a parent and child process. The shell's | operator uses pipes: cat file | grep pattern creates a pipe between cat and grep.

Named pipes (FIFOs) appear as files in the filesystem (created with mkfifo). Any process that can access the file can open it for reading or writing. This allows unrelated processes to communicate without a parent-child relationship.

Limitations: Pipes are unidirectional (you need two for bidirectional), have a kernel buffer (typically 64 KB on Linux), and block when the buffer is full (writer) or empty (reader). For bidirectional communication between unrelated processes, Unix sockets are almost always a better choice.
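The parent-child pattern above can be sketched in a few lines. This uses Python's thin stdlib wrappers over the same pipe() and fork() syscalls described in the text (a minimal sketch with error handling omitted; `pipe_demo` is a name invented for illustration):

```python
import os

def pipe_demo(message: bytes) -> bytes:
    """Parent reads what the child writes through an anonymous pipe."""
    r, w = os.pipe()           # two FDs into one kernel buffer
    pid = os.fork()
    if pid == 0:               # child: keep only the write end
        os.close(r)
        os.write(w, message)
        os.close(w)
        os._exit(0)
    os.close(w)                # parent: keep only the read end
    data = os.read(r, 4096)    # blocks until the child writes
    os.close(r)
    os.waitpid(pid, 0)         # reap the child (avoid a zombie)
    return data

print(pipe_demo(b"hello"))     # b'hello'
```

Closing the unused ends matters: the parent's read() only sees EOF once every write-end descriptor is closed.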

Shared Memory

Shared memory is the fastest IPC because data is not copied between processes — both processes map the same physical memory pages into their address space. The kernel is not involved in the data transfer (only in setup and teardown).

Two APIs:

  • POSIX (shm_open + mmap): Creates a named shared memory object in /dev/shm/. Preferred for new code.
  • System V (shmget + shmat): Older API, still widely used. Uses integer keys for identification.
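A minimal POSIX shared-memory round trip can be sketched with Python's multiprocessing.shared_memory, which calls shm_open() and mmap() under the hood, so the object appears in /dev/shm/ (the object name used here is an arbitrary choice for the sketch; a real system would have separate writer and reader processes):

```python
import os
from multiprocessing import shared_memory

def shm_demo() -> bytes:
    name = f"ipc_demo_{os.getpid()}"   # arbitrary object name for the sketch
    # "writer" side: create the named object (visible as /dev/shm/<name>)
    writer = shared_memory.SharedMemory(name=name, create=True, size=64)
    writer.buf[:5] = b"frame"
    # "reader" side: attach by name -- same physical pages, zero copy
    reader = shared_memory.SharedMemory(name=name)
    data = bytes(reader.buf[:5])
    reader.close()
    writer.close()
    writer.unlink()                    # removes /dev/shm/<name>
    return data

print(shm_demo())   # b'frame'
```

Note the explicit unlink(): like a file, a POSIX shared memory object persists until removed, even after every process has closed it.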

The catch: Shared memory has no built-in synchronization. If two processes write to the same memory region simultaneously, data corruption occurs. You must pair shared memory with:

  • POSIX mutexes (pthread_mutex_t with PTHREAD_PROCESS_SHARED) for mutual exclusion
  • POSIX semaphores (sem_open) for signaling between processes
  • Lock-free data structures (ring buffers with separate read/write indices) for high-performance paths
| Shared Memory Pattern | Synchronization | Use Case |
|---|---|---|
| Single writer, single reader | Lock-free ring buffer | Sensor data streaming |
| Multiple writers, single reader | Mutex-protected queue | Event aggregation |
| Multiple readers, single writer | Read-write lock or RCU-like pattern | Configuration broadcast |

⚠️ Common Trap: Shared Memory Without Synchronization

The most common IPC bug in embedded Linux: two processes using shared memory without any locking. It works in testing (low load, deterministic scheduling) but corrupts data under production load when processes run on different CPU cores simultaneously. Always pair shared memory with explicit synchronization.
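The fix is always to pair the shared state with a cross-process lock. A sketch using Python's multiprocessing (a cross-process lock guarding a raw shared counter; `bump` and `locked_counter_demo` are names invented for the example, and the fork start context is an assumption to keep it self-contained):

```python
import multiprocessing as mp

def bump(counter, lock, n):
    for _ in range(n):
        with lock:                    # mutual exclusion across processes
            counter.value += 1        # read-modify-write is NOT atomic

def locked_counter_demo(n=5000):
    ctx = mp.get_context("fork")      # explicit start method for portability
    counter = ctx.Value("i", 0, lock=False)   # raw shared int, no hidden lock
    lock = ctx.Lock()
    procs = [ctx.Process(target=bump, args=(counter, lock, n))
             for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value              # 2*n with the lock; often less without

print(locked_counter_demo())   # 10000
```

Remove the `with lock:` line and the demo reproduces the trap: increments from the two processes silently overwrite each other under concurrent scheduling.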

Unix Domain Sockets

Unix domain sockets are the most versatile IPC mechanism. They use the socket API (socket, bind, connect, send, recv) but communicate locally through a filesystem path instead of a network address.

Why Unix sockets are preferred over TCP for local IPC:

| Feature | Unix Domain Socket | TCP Loopback |
|---|---|---|
| Latency | Lower (no TCP/IP stack) | Higher (full protocol processing) |
| Overhead | No checksums, no sequence numbers | Full TCP overhead |
| File descriptor passing | Yes (SCM_RIGHTS) | No |
| Credential passing | Yes (SO_PEERCRED) | No |
| Datagram mode | Yes (reliable, unlike UDP) | No |

Unix sockets support both stream (connection-oriented, like TCP) and datagram (connectionless, like UDP but reliable on Unix sockets). systemd uses Unix sockets for socket activation, D-Bus is built on Unix sockets, and most Linux system daemons use them for local communication.
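The file-descriptor-passing feature from the table can be demonstrated directly. Python's socket.send_fds/recv_fds (3.9+) wrap sendmsg() with SCM_RIGHTS ancillary data; here one socket of a connected pair hands the read end of a pipe to the other (`fd_passing_demo` is an invented name for the sketch):

```python
import os
import socket

def fd_passing_demo() -> bytes:
    # a connected pair of Unix stream sockets (bidirectional by nature)
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    r, w = os.pipe()
    os.write(w, b"via passed fd")
    os.close(w)
    # SCM_RIGHTS: the kernel duplicates descriptor r into the receiver
    socket.send_fds(a, [b"fd"], [r])
    msg, fds, flags, addr = socket.recv_fds(b, 16, 1)
    os.close(r)                        # sender's copy no longer needed
    data = os.read(fds[0], 64)         # read through the received descriptor
    os.close(fds[0])
    a.close()
    b.close()
    return data

print(fd_passing_demo())   # b'via passed fd'
```

This is how a privileged launcher can open a device node or listening socket and hand it to an unprivileged worker process.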

D-Bus

D-Bus is a high-level message bus that provides typed, structured inter-process communication with service discovery, method calls, signals (events), and property access. It is built on Unix domain sockets but adds a protocol layer.

Two buses:

  • System bus: System-wide services (NetworkManager, BlueZ Bluetooth, UPower). One instance per machine, started at boot as a system daemon.
  • Session bus: Per-user services (desktop applications). Rarely used in headless embedded.

When to use D-Bus in embedded:

  • When you need a standard interface that other Linux services already use (Bluetooth via BlueZ API)
  • When you need service discovery ("which services are available on the bus?")
  • When you need a strongly-typed API between processes written in different languages

When NOT to use D-Bus:

  • High-frequency data streaming (D-Bus marshaling adds ~100 us per message)
  • Resource-constrained devices (the D-Bus daemon uses 1-2 MB RAM)
  • Simple point-to-point communication (Unix sockets are simpler and faster)

Signals

Signals are the most primitive IPC — they deliver an integer (signal number) to a process asynchronously. The process can catch the signal with a handler, ignore it, or let the default action occur (usually terminate).

Commonly used signals in embedded:

| Signal | Default | Embedded Use |
|---|---|---|
| SIGTERM | Terminate | Graceful shutdown (flush data, close connections) |
| SIGKILL | Kill (uncatchable) | Force-kill unresponsive process |
| SIGHUP | Terminate | Reload configuration (by convention) |
| SIGUSR1 | Terminate | Application-defined (toggle debug mode) |
| SIGCHLD | Ignore | Reap child processes (avoid zombies) |

Limitations: Signals cannot carry data (only the signal number), are not queued (multiple same signals may be merged), and signal handlers run asynchronously in the process context — only async-signal-safe functions can be called in handlers.
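The safe pattern that follows from these limitations — set a flag in the handler, do the real work in the main loop — can be sketched like this (Python's signal module wraps sigaction(); `sighup_flag_demo` is an invented name, and the process signals itself to stand in for an external `kill -HUP`):

```python
import os
import signal

def sighup_flag_demo() -> bool:
    seen = []                                # cell the handler may append to

    def on_sighup(signum, frame):
        seen.append(signum)                  # only record the event here

    old = signal.signal(signal.SIGHUP, on_sighup)
    os.kill(os.getpid(), signal.SIGHUP)      # like `kill -HUP <pid>`
    signal.signal(signal.SIGHUP, old)        # restore the previous handler
    return bool(seen)                        # a real main loop polls this flag

print(sighup_flag_demo())
```

In C the flag would be a `volatile sig_atomic_t` set by the handler; the handler itself touches nothing else — no allocation, no I/O, no locks.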

Choosing the Right IPC

| Scenario | Best IPC | Why |
|---|---|---|
| Streaming sensor data between processes | Shared memory + ring buffer | Zero-copy, lowest latency |
| Command/response between services | Unix domain socket (stream) | Bidirectional, reliable, simple |
| Event notifications across system | D-Bus signals | Service discovery, typed events |
| Parent launches child, pipes output | Anonymous pipe | Simplest, built-in to fork/exec |
| "Reload config" signal to daemon | SIGHUP | Convention, no data needed |
| Passing a file descriptor to another process | Unix socket + SCM_RIGHTS | Only mechanism that supports this |

Debugging Story: Shared Memory Corruption in a Camera System

An embedded Linux camera system had two processes: a capture process writing frames to shared memory and a compression process reading them. During development with a single-core CPU, it worked perfectly. When deployed on a dual-core SoC, the compression process occasionally produced garbled images.

The root cause: both processes accessed the shared memory buffer without synchronization. On a single core, the scheduler ensured only one ran at a time. On dual-core, both ran simultaneously — the capture process was writing a new frame while the compression process was reading the previous one, resulting in a "torn" frame (half old, half new).

The fix: implement a double-buffer scheme with a lock-free swap mechanism. The capture process writes to buffer A while the compression process reads buffer B. When capture completes, an atomic pointer swap makes buffer A the "read" buffer. No mutex needed, no latency added.
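The double-buffer bookkeeping can be sketched as follows. This is an in-process illustration of the logic only: in the real system the two buffers and the front index live in shared memory, and the flip is a single atomic store (the class and method names are invented for the sketch):

```python
class DoubleBuffer:
    """Single-writer / single-reader double buffer: the writer fills the
    back buffer, then publishes it by flipping one index (the 'swap')."""

    def __init__(self, size: int):
        self.bufs = [bytearray(size), bytearray(size)]
        self.front = 0                    # reader always uses bufs[front]

    def write_frame(self, data: bytes):
        back = 1 - self.front             # writer owns the back buffer
        self.bufs[back][:len(data)] = data
        self.front = back                 # publish: one atomic index store

    def read_frame(self, n: int) -> bytes:
        return bytes(self.bufs[self.front][:n])

db = DoubleBuffer(16)
db.write_frame(b"frameA")
print(db.read_frame(6))   # b'frameA'
```

Because the writer never touches the front buffer and the flip is a single store, the reader can never observe a half-written frame — exactly the torn-frame bug this scheme eliminates.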

The lesson: IPC bugs that depend on timing are the hardest to find. They often hide on single-core systems and only appear on multi-core. Always design for concurrent access from the start, even if your current hardware is single-core.

What Interviewers Want to Hear

  • You can compare IPC mechanisms by latency, complexity, and use case — not just list them
  • You understand that shared memory requires explicit synchronization
  • You know Unix sockets are preferred over TCP loopback for local IPC (lower overhead, FD passing)
  • You can recommend the right IPC for a specific architecture
  • You know when D-Bus is appropriate (service APIs) vs overkill (simple data streaming)
  • You understand signal limitations (no payload, async-unsafe handler context)

Interview Focus

Classic Interview Questions

Q1: "Compare shared memory and Unix domain sockets for IPC. When would you use each?"

Model Answer Starter: "Shared memory is the fastest — zero-copy, both processes access the same physical pages. But it requires explicit synchronization (mutexes or lock-free structures) and has no built-in flow control. I use it for high-bandwidth data like camera frames or audio buffers where latency matters. Unix domain sockets are slightly slower (data is copied through the kernel) but provide built-in flow control, connection management, and sequencing. I use them for command/response communication between services. For most embedded IPC, I default to Unix sockets unless profiling shows the copy overhead is a bottleneck."

Q2: "How do you synchronize access to shared memory between two processes?"

Model Answer Starter: "Three approaches depending on the access pattern. For single-writer, single-reader streaming data: a lock-free ring buffer with separate read and write indices, both stored in the shared region. For multiple writers: a POSIX mutex initialized with PTHREAD_PROCESS_SHARED attribute, stored in the shared memory itself. For a producer-consumer pattern: a POSIX semaphore (sem_open) to signal data availability. The key is matching the synchronization to the access pattern — a ring buffer is fastest for streaming, but a mutex is needed for random-access shared state."
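The single-writer, single-reader ring buffer from the answer can be sketched in a few lines. In the real IPC version the buffer and both indices would live in shared memory with atomic index updates; this in-process sketch (class name invented) shows why no lock is needed:

```python
class SpscRingBuffer:
    """Single-producer / single-consumer ring buffer: each side writes
    only its own index, so no lock is required."""

    def __init__(self, capacity: int):
        self.buf = [None] * capacity
        self.head = 0     # advanced only by the consumer
        self.tail = 0     # advanced only by the producer

    def push(self, item) -> bool:
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:              # full (one slot kept empty)
            return False
        self.buf[self.tail] = item
        self.tail = nxt                   # publish after the data is in place
        return True

    def pop(self):
        if self.head == self.tail:        # empty
            return None
        item = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return item
```

Ordering is the whole trick: the producer stores the item before advancing `tail`, so the consumer never sees an index pointing at unwritten data.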

Q3: "What is D-Bus and when would you use it vs a Unix socket?"

Model Answer Starter: "D-Bus is a message bus protocol built on Unix sockets that adds service discovery, typed method calls, property access, and broadcast signals. I use it when interfacing with existing Linux system services (BlueZ for Bluetooth, NetworkManager for networking) because they already expose D-Bus APIs. For custom application-level IPC where I control both sides, I use raw Unix sockets — simpler, faster, no daemon dependency. D-Bus adds about 100 us of marshaling overhead per message and requires the dbus-daemon process, so it is not suitable for high-frequency data transfer."

Q4: "What are the limitations of using signals for IPC?"

Model Answer Starter: "Signals are notifications only — they carry no data beyond the signal number. Standard signals are not queued: if SIGUSR1 is sent twice while the handler for the first is running, the second may be lost. Signal handlers run asynchronously and can only call async-signal-safe functions — no malloc, no printf, no mutex operations. The safe pattern is to set a volatile flag in the handler and check it in the main loop. For anything beyond simple notifications, use a proper IPC mechanism."

Q5: "You need to stream 30 FPS camera frames (1 MB each) between two processes. Which IPC would you choose?"

Model Answer Starter: "Shared memory with a double-buffer or ring buffer scheme. At 30 MB/s, copying data through pipes or sockets would consume significant CPU and add latency. With shared memory, the capture process writes directly to a buffer, then atomically swaps the buffer pointer. The consumer reads the previous buffer. Zero copy, sub-millisecond latency. I would use POSIX shared memory (shm_open + mmap) with cache-line-aligned buffers, and a lock-free swap mechanism using atomic operations for the buffer index."

Trap Alerts

  • Don't say: "Just use shared memory, it's the fastest" — without mentioning synchronization requirements
  • Don't forget: Unix domain socket datagrams ARE reliable (unlike UDP) — a common misconception
  • Don't ignore: D-Bus overhead — it is inappropriate for high-frequency data but perfect for service APIs

Follow-up Questions

  • "How would you implement a watchdog that monitors multiple processes using IPC?"
  • "What is socket activation in systemd and how does it relate to IPC?"
  • "How do you pass a file descriptor from one process to another?"
  • "What is the difference between POSIX and System V shared memory?"

Practice

Which IPC mechanism provides zero-copy data transfer between processes?

Why are Unix domain sockets preferred over TCP loopback (127.0.0.1) for local IPC?

What happens if two processes write to shared memory simultaneously without synchronization?

A signal handler calls printf() to log the signal. What is wrong with this?

Real-World Tie-In

Automotive Sensor Fusion — An ADAS system runs camera capture, radar processing, and fusion algorithm as separate processes for fault isolation (if camera crashes, radar continues). Camera frames (2 MB, 30 FPS) go through shared memory with double buffering. Radar data (1 KB commands) goes through Unix domain sockets. The fusion process reads from both. This architecture survives individual process crashes without losing the other sensor feeds.

Smart Home Hub — A home automation gateway uses D-Bus as its central message bus. The Zigbee process publishes device events on D-Bus, the automation engine subscribes to events and sends commands back, and the web UI process queries device state via D-Bus properties. D-Bus service discovery means new protocol handlers (Z-Wave, Matter) can be added as drop-in services without modifying existing code.