RTOS & Real-Time
intermediate
Weight: 5/10

RTOS fundamentals

Understand real-time operating systems: hard vs soft real-time, task states, RTOS vs bare-metal tradeoffs, memory model, and FreeRTOS vs Zephyr comparison.

rtos
real-time
freertos
zephyr
task-states
hard-real-time
preemptive

Quick Cap

An RTOS (Real-Time Operating System) provides deterministic task scheduling: the highest-priority ready task always runs, and the scheduler guarantees bounded response times. This is the foundation of every multi-tasking embedded system, from motor controllers to medical devices. "What is an RTOS and why would you use one?" is one of the most asked embedded interview questions.

Key Facts:

  • Hard real-time: Missing a deadline is a system failure (airbag deployment, motor commutation). No tolerance.
  • Soft real-time: Missing a deadline degrades quality but is not catastrophic (audio streaming, UI updates).
  • RTOS kernel is typically 5-20 KB Flash, 1-4 KB RAM (FreeRTOS). Small enough for Cortex-M0 with 32 KB Flash.
  • Task states: Ready, Running, Blocked, Suspended. Every task is always in exactly one state.
  • Key advantage over bare-metal: Preemptive scheduling ensures a high-priority task runs immediately, regardless of what lower-priority code is doing.
  • FreeRTOS dominates IoT/general embedded; Zephyr is growing fast in commercial products with its upstream driver model.

Deep Dive

At a Glance

| Characteristic | Bare-Metal Super-Loop | RTOS | Embedded Linux |
|---|---|---|---|
| Scheduling | Manual (one big loop) | Priority-based preemptive | Process-based, CFS scheduler |
| Response time | Depends on loop duration | Deterministic (bounded) | Non-deterministic (unless PREEMPT_RT) |
| Memory | No overhead | 5-20 KB Flash, 1-4 KB RAM | 4+ MB RAM minimum |
| Concurrency | Interrupts only | Tasks + interrupts | Processes + threads + interrupts |
| Memory protection | None | None by default (no MMU; MPU optional) | Full MMU isolation |
| Typical CPU | Cortex-M0, 8-bit | Cortex-M0 to M7 | Cortex-A, x86 |
| Best for | Simple, single-function | Multi-tasking, real-time | Complex, networked, UI |

Hard vs Soft Real-Time

This distinction is the first thing interviewers test:

| Type | Deadline Violation | Examples | Consequence |
|---|---|---|---|
| Hard real-time | System failure | Airbag deployment, fuel injection timing, pacemaker pulse | Physical damage, injury, death |
| Firm real-time | Result is useless but no damage | Video frame decode, radar pulse processing | Dropped frame/sample, degraded output |
| Soft real-time | Degraded quality | Audio playback, UI animation, telemetry reporting | Glitch, stutter, delayed data |

Interview insight: Most embedded systems are actually firm or soft real-time. True hard real-time is limited to safety-critical domains (automotive, medical, aerospace). However, even soft real-time benefits from an RTOS because the preemptive scheduler makes timing behavior predictable and easier to reason about.

Why RTOS vs Bare-Metal?

| Factor | Bare-Metal | RTOS |
|---|---|---|
| Multi-tasking | Manual state machines in main loop | Automatic preemptive task switching |
| Priority handling | Interrupts only; main loop has one priority | Multiple task priorities with guaranteed preemption |
| Timing | Depends on worst-case loop time | Deterministic: high-priority task runs immediately |
| Code structure | Monolithic loop, hard to maintain at scale | Modular tasks, clean separation of concerns |
| Power management | Manual idle detection | Built-in idle task with tickless sleep |
| Overhead | Zero | 5-20 KB Flash, 1-4 KB RAM, context switch cost |
| When to choose | Simple systems, ultra-low power, very tight RAM | Multiple concurrent activities with different priorities |

💡 When Bare-Metal is Better

If your system has a single function (read sensor, transmit, sleep), an RTOS adds complexity and RAM overhead for no benefit. A super-loop with interrupts is simpler and uses less power. RTOS shines when you have 3+ concurrent activities with different timing requirements.
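The "depends on worst-case loop time" row can be made concrete with a small calculation. The sketch below assumes a simple model: an event that arrives just after its step was polled must wait for every step in the loop to run once, so the worst-case response is roughly the sum of all worst-case step times (interrupt time is ignored). The function names and the step timings in the usage note are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Approximate worst-case response time of a super-loop, in microseconds.
 * Model: the event just missed its poll, so it waits for one full pass
 * through every step, including its own handler.
 */
uint32_t superloop_worst_case_us(const uint32_t *step_us, size_t n)
{
    assert(step_us != NULL || n == 0);
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += step_us[i];
    }
    return total;
}

/* Returns 1 if the loop can still meet the tightest deadline, 0 otherwise. */
int superloop_meets_deadline(const uint32_t *step_us, size_t n,
                             uint32_t deadline_us)
{
    return superloop_worst_case_us(step_us, n) <= deadline_us;
}
```

With hypothetical steps of 200, 1500, and 300 us, the worst case is 2000 us, so a 1 ms deadline cannot be met from the main loop. Every added step lengthens that sum, which is why a super-loop stops scaling once several activities have different timing requirements.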

Task States

Every RTOS task is always in exactly one of these states:

              create
                |
                v
          +-----------+        vTaskResume()
          |   Ready   |<-----------------------+
          +-----------+                        |
            |       ^                          |
 scheduler  |       | preempted by             |
 picks      |       | higher-priority          |
 highest    v       | task                     |
 priority +-----------+   vTaskSuspend()  +-----------+
          |  Running  |------------------>| Suspended |
          | (only one)|                   +-----------+
          +-----------+                        ^
                |                              |
 blocks on      |                              |
 mutex, queue,  v                              |
 delay    +-----------+   vTaskSuspend()       |
          |  Blocked  |------------------------+
          +-----------+
                |
                | event occurs (timeout, mutex released)
                v
              Ready

| State | Meaning | Example Trigger |
|---|---|---|
| Ready | Eligible to run but a higher-priority task is running | Task created, unblocked by event, resumed |
| Running | Currently executing on the CPU (only one at a time) | Scheduler picks highest-priority ready task |
| Blocked | Waiting for an event (timeout, mutex, queue, semaphore) | vTaskDelay(), xSemaphoreTake(), xQueueReceive() |
| Suspended | Removed from scheduling entirely until explicitly resumed | vTaskSuspend() (rarely used in practice) |

Task Priorities

FreeRTOS: Higher number = higher priority. Priority 0 (tskIDLE_PRIORITY) is the lowest level and is where the idle task runs. Application tasks typically start from 1 upward.

Zephyr: Lower number = higher priority. Negative priorities are cooperative threads and non-negative priorities are preemptible, so priority 0 is the highest preemptible priority.

This inconsistency is a common source of confusion and a frequent interview question. Always specify which RTOS you are discussing when talking about priority numbers.

Priority assignment guidelines:

  • Safety-critical control loops: highest priority
  • Communication handlers (UART, CAN, network): medium-high
  • Data processing, logging: medium
  • UI updates, status LED: low
  • Idle task: lowest (built-in, runs when nothing else is ready)
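
The guideline list and the two numbering conventions can be captured in a few lines of C. This is a sketch under assumptions: the role names, the five-level scheme, and the mapping function are hypothetical, and the mapping only illustrates the inverted ordering rather than any official conversion.

```c
#include <assert.h>

/* Roles from the guideline list, numbered the FreeRTOS way (higher wins). */
enum {
    PRIO_IDLE    = 0,  /* idle task: runs when nothing else is ready */
    PRIO_UI      = 1,  /* UI updates, status LED */
    PRIO_DATA    = 2,  /* data processing, logging */
    PRIO_COMMS   = 3,  /* UART, CAN, network handlers */
    PRIO_CONTROL = 4,  /* safety-critical control loop */
    PRIO_COUNT   = 5   /* number of levels in this hypothetical scheme */
};

/* FreeRTOS convention: the larger number is more urgent. */
int freertos_outranks(int a, int b) { return a > b; }

/* Zephyr convention: the smaller number is more urgent. */
int zephyr_outranks(int a, int b) { return a < b; }

/* Map a FreeRTOS-style level onto an inverted scale where 0 is the
 * most urgent of the levels used here. */
int to_zephyr_style(int freertos_prio)
{
    assert(freertos_prio >= 0 && freertos_prio < PRIO_COUNT);
    return (PRIO_COUNT - 1) - freertos_prio;
}
```

Naming the levels once and deriving the numbers keeps the intent readable when code moves between kernels, which is where the inverted ordering usually causes bugs.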

RTOS Comparison

| Feature | FreeRTOS | Zephyr | ThreadX (Azure RTOS) | VxWorks | QNX |
|---|---|---|---|---|---|
| License | MIT | Apache 2.0 | MIT (now open) | Commercial | Commercial |
| Kernel size | 5-10 KB | 8-20 KB | 5-10 KB | Large | Large |
| Typical use | IoT, general embedded | Commercial products, BLE, Thread | Azure IoT, medical | Aerospace, defense | Automotive (QNX Neutrino) |
| Supported boards | Most MCUs | 400+ boards | ARM, RISC-V | PowerPC, ARM, x86 | ARM, x86 |
| Networking | FreeRTOS+TCP, lwIP | Native (BSD sockets) | NetX Duo | Native | Native (POSIX) |
| Certification | SafeRTOS (separate product) | Planned (IEC 61508) | IEC 62304, IEC 61508 | DO-178C, IEC 61508 | ISO 26262, IEC 62304 |
| Learning curve | Low | Medium | Low | High | High |

FreeRTOS is the safe default for most embedded projects: smallest footprint, simplest API, widest MCU support, MIT license. Zephyr is the choice when you need an upstream driver model (similar to Linux), native BLE/Thread/Matter support, or a path to safety certification.

RTOS Memory Model

Unlike Linux (which uses an MMU to give each process its own virtual address space), RTOS tasks share a single flat memory space. Unless an MPU is configured to restrict access, every task can read and write any address. This means:

  • No memory protection between tasks: a buffer overflow in one task can corrupt another task's data
  • Each task gets its own stack, sized at creation time, which cannot grow dynamically
  • Shared globals require synchronization (mutexes, queues), the same as ISR shared data but with more tools available

Stack sizing is critical. Too small causes stack overflow (silent corruption or hard fault). Too large wastes RAM.

| Heap Strategy (FreeRTOS) | Description | Best For |
|---|---|---|
| heap_1 | Allocate-only, never free | Static systems, safety-critical |
| heap_2 | Simple free, no coalescence | Fixed-size allocations |
| heap_3 | Wraps standard malloc/free | When C library heap is available |
| heap_4 | First-fit with coalescence | General purpose (most common) |
| heap_5 | Like heap_4 but spans non-contiguous regions | Multiple RAM banks |
| Static allocation | xTaskCreateStatic(), no heap at all | Safety-critical, deterministic |
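
To see why an allocate-only heap is attractive for safety-critical code, here is a minimal bump allocator in the spirit of heap_1. It is a sketch, not the FreeRTOS implementation; the pool size, alignment, and function names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARENA_SIZE  1024u  /* hypothetical pool size in bytes */
#define ARENA_ALIGN 8u     /* hypothetical alignment, must be a power of two */

static _Alignas(8) uint8_t arena[ARENA_SIZE];
static size_t arena_used = 0;

/* Hands out the next aligned chunk of the pool. There is no free():
 * memory is claimed at start-up and kept for the life of the system,
 * so fragmentation cannot occur and allocation time is constant. */
void *arena_alloc(size_t size)
{
    /* Round the request up to the alignment boundary. */
    size_t rounded = (size + (ARENA_ALIGN - 1u)) & ~(size_t)(ARENA_ALIGN - 1u);

    assert(arena_used <= ARENA_SIZE);
    if (size == 0 || rounded < size || rounded > ARENA_SIZE - arena_used) {
        return NULL;  /* empty request, arithmetic overflow, or pool exhausted */
    }
    void *p = &arena[arena_used];
    arena_used += rounded;
    return p;
}

/* Bytes still available in the pool. */
size_t arena_free_bytes(void)
{
    return ARENA_SIZE - arena_used;
}
```

Because nothing is ever returned to the pool, the remaining free space after initialization is the final answer to "will this system run out of heap?", which is much easier to argue about in a safety review than a general-purpose allocator.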
⚠️Common Trap: Stack Overflow

RTOS stack overflow is one of the most common causes of mysterious crashes in embedded systems. FreeRTOS provides configCHECK_FOR_STACK_OVERFLOW (method 1: check the stack pointer against the stack limits on context switch; method 2: also fill the stack with a known pattern and verify that the end of it is intact). Always enable this during development. Size stacks generously at first (2-4 KB per task), then measure actual usage with uxTaskGetStackHighWaterMark() and trim.
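
A development-build configuration for this might look like the fragment below. It is a sketch of a FreeRTOSConfig.h excerpt, not a complete file, and it assumes the rest of the configuration already exists. When the overflow check is enabled, the application must also supply the vApplicationStackOverflowHook() callback, which the kernel calls when it detects a problem.

```c
/* FreeRTOSConfig.h excerpt (development build, assumed settings) */

/* 0 = off, 1 = stack pointer check, 2 = stack pointer check plus fill-pattern check */
#define configCHECK_FOR_STACK_OVERFLOW       2

/* Make uxTaskGetStackHighWaterMark() available for measuring usage */
#define INCLUDE_uxTaskGetStackHighWaterMark  1
```

The checks add a little work to every context switch, so some teams lower or disable them in release builds once stack sizes have been measured.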

Stack Overflow Detection

| Method | How It Works | Catches |
|---|---|---|
| Pattern fill (FreeRTOS method 2) | Fill stack with 0xA5A5A5A5; check on context switch | Gradual overflow |
| High-water mark | uxTaskGetStackHighWaterMark() returns the minimum free stack ever seen, in words (not bytes) | Measure worst-case usage |
| MPU guard region | Place a no-access MPU region at stack bottom | Immediate fault on overflow |
| Canary value | Place known value at stack bottom; check periodically | Gradual overflow |
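
The pattern-fill and high-water-mark ideas are easy to model on a plain byte array. The sketch below is a host-side illustration, not kernel code: it assumes a descending stack (as on Cortex-M), so the low-address end of the buffer is the overflow limit, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_FILL_BYTE 0xA5u

/* Paint the whole stack buffer with the fill pattern before the task starts. */
void stack_paint(uint8_t *stack, size_t size)
{
    memset(stack, STACK_FILL_BYTE, size);
}

/* Count untouched bytes from the low-address end. On a descending stack
 * this is the smallest amount of free stack the task has ever had. */
size_t stack_high_water_bytes(const uint8_t *stack, size_t size)
{
    size_t untouched = 0;
    while (untouched < size && stack[untouched] == STACK_FILL_BYTE) {
        untouched++;
    }
    return untouched;
}

/* Pattern check: are the last `guard` bytes before the limit still intact? */
int stack_guard_intact(const uint8_t *stack, size_t size, size_t guard)
{
    assert(guard <= size);
    return stack_high_water_bytes(stack, guard) == guard;
}
```

One limitation of this approach: if the task happens to write the fill value itself, those bytes look untouched, so the measurement can overestimate the free space slightly. It also only detects an overflow after the fact, which is why an MPU guard region is the stronger option when the hardware has one.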

Debugging Story: Task Starvation

A team built an IoT sensor hub with 5 tasks: sensor reading (priority 3), data processing (priority 3), WiFi transmission (priority 2), LED status (priority 1), and logging (priority 1). The LED and logging tasks never ran: the system appeared to work but produced no log files and the status LED was frozen.

The root cause: the three higher-priority tasks (sensor, processing, WiFi) never blocked long enough for the lower-priority tasks to run. The sensor task used vTaskDelay(10) (10 ms) but the processing task used a busy-wait polling loop to check for new data instead of blocking on a queue. This polling loop consumed 100% CPU whenever data was available, starving everything below it.

The fix: replace the polling loop with xQueueReceive() with a timeout. When no data is available, the processing task blocks, allowing lower-priority tasks to run. The system went from 100% CPU to 35% average utilization.

The lesson: Every RTOS task must eventually block (on a delay, queue, semaphore, or mutex). A task that busy-waits at any priority level starves all lower-priority tasks. This is the most common RTOS design mistake.
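
The effect is easy to reproduce with a toy fixed-priority scheduler on a desktop machine. The simulation below is a simplified model, not FreeRTOS: each task needs some CPU ticks per period and then blocks until the next period, and a busy-waiting task is modelled as one whose work fills its whole period. The struct, field names, and numbers are hypothetical, and equal priorities are resolved by array order rather than time slicing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      priority;      /* FreeRTOS convention: higher number wins */
    uint32_t busy_ticks;    /* CPU ticks needed each period before blocking */
    uint32_t period_ticks;  /* busy_ticks == period_ticks models a busy-wait */
    uint32_t remaining;     /* work left in the current period (internal) */
    uint32_t ran;           /* result: ticks this task actually got the CPU */
} sim_task_t;

void sim_run(sim_task_t *tasks, size_t n, uint32_t total_ticks)
{
    for (size_t i = 0; i < n; i++) {
        assert(tasks[i].period_ticks > 0);
        assert(tasks[i].busy_ticks <= tasks[i].period_ticks);
        tasks[i].remaining = 0;
        tasks[i].ran = 0;
    }
    for (uint32_t t = 0; t < total_ticks; t++) {
        sim_task_t *next = NULL;
        for (size_t i = 0; i < n; i++) {
            /* Start of a new period: the task becomes Ready again.
             * Any unfinished work from the previous period is dropped. */
            if (t % tasks[i].period_ticks == 0) {
                tasks[i].remaining = tasks[i].busy_ticks;
            }
            /* Pick the highest-priority task that still has work. */
            if (tasks[i].remaining > 0 &&
                (next == NULL || tasks[i].priority > next->priority)) {
                next = &tasks[i];
            }
        }
        if (next != NULL) {
            next->remaining--;
            next->ran++;
        }
        /* next == NULL means every task is Blocked: the idle task would run. */
    }
}
```

Running 100 ticks with a priority-3 task that needs 10 of every 10 ticks and a priority-1 task that needs 1 of every 10 gives the low-priority task zero ticks. Changing the high-priority task to 3 of every 10 ticks, which is what blocking on a queue achieves, lets the low-priority task run in every period.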

What Interviewers Want to Hear

  • You can define hard vs soft real-time with concrete examples (not just "hard = important")
  • You know when to use an RTOS vs bare-metal (not "RTOS is always better")
  • You can draw the task state diagram and explain transitions
  • You understand the memory model (shared address space, per-task stacks, no MMU)
  • You can compare FreeRTOS and Zephyr with specific tradeoffs
  • You know about stack overflow detection and sizing strategies

Interview Focus

Classic Interview Questions

Q1: "What is an RTOS and how does it differ from a general-purpose OS?"

Model Answer Starter: "An RTOS provides deterministic task scheduling: the highest-priority ready task always runs within a bounded time. The key property is predictability, not speed. A general-purpose OS like Linux maximizes throughput and fairness; an RTOS guarantees worst-case response time. RTOS kernels are small (5-20 KB) with no MMU, running all tasks in a shared address space. Linux requires megabytes of RAM and provides process isolation via virtual memory. I choose an RTOS for MCU-class applications with real-time constraints, and Linux for complex systems with networking, UI, and filesystem needs."

Q2: "Explain hard vs soft real-time with examples."

Model Answer Starter: "Hard real-time means a missed deadline is a system failure: not just poor performance, but potentially dangerous. Examples: airbag deployment must happen within 10 ms of crash detection; fuel injection timing must be accurate to microseconds. Soft real-time means missed deadlines degrade quality but the system continues functioning. Examples: an audio player drops a sample (audible click but no damage), a telemetry system reports data 100 ms late (acceptable). Most embedded systems are actually soft or firm real-time. True hard real-time requires formal timing analysis and often safety certification."

Q3: "When would you choose an RTOS over a bare-metal super-loop?"

Model Answer Starter: "When I have three or more concurrent activities with different timing requirements. A super-loop works well for simple systems (read sensor, process, transmit, repeat). But when I need to simultaneously handle a control loop at 1 kHz, a communication protocol at variable rates, and a UI update at 30 Hz, the super-loop becomes fragile. Adding a new feature requires re-analyzing the entire loop timing. With an RTOS, each activity is an independent task with its own priority. The scheduler guarantees the control loop runs on time regardless of what the communication code is doing."

Q4: "Draw the RTOS task state diagram and explain each transition."

Model Answer Starter: "Four states: Ready, Running, Blocked, and Suspended. A newly created task starts in Ready. The scheduler picks the highest-priority Ready task and moves it to Running (only one task runs at a time). When the running task calls a blocking function like xQueueReceive or vTaskDelay, it moves to Blocked and the next highest-priority Ready task runs. When the blocking condition is satisfied (queue receives data, delay expires, semaphore given), the task moves back to Ready. Suspended is a special state where the task is removed from scheduling entirely until explicitly resumed with vTaskResume; it is rarely used in practice."

Q5: "Compare FreeRTOS and Zephyr. When would you choose each?"

Model Answer Starter: "FreeRTOS is the safe default: smallest footprint (5-10 KB), simplest API, widest MCU support, MIT license, and the largest community. I choose it for straightforward IoT and general embedded projects. Zephyr is more opinionated: it has an upstream driver model similar to Linux, native Bluetooth LE/Thread/Matter support, device tree for hardware description, and a path to IEC 61508 certification. I choose Zephyr when I need out-of-the-box connectivity stack support, want a consistent driver API across MCU vendors, or need eventual safety certification. The tradeoff is Zephyr's steeper learning curve and larger kernel footprint."

Trap Alerts

  • Don't say: "RTOS is always better than bare-metal". For simple single-function devices, bare-metal is simpler, smaller, and lower power
  • Don't forget: Every RTOS task must eventually block. A busy-waiting task starves all lower-priority tasks
  • Don't ignore: Stack sizing. RTOS stack overflow is one of the most common causes of mysterious crashes in embedded systems

Follow-up Questions

  • "How do you determine the right stack size for an RTOS task?"
  • "What is the idle task and what happens when no application task is ready?"
  • "How does an RTOS handle tasks with the same priority?"
  • "What is the difference between vTaskDelay and vTaskDelayUntil?"
  • "What is configTICK_RATE_HZ and how do you choose it?"

Practice

❓ A pacemaker must deliver an electrical pulse within 1 ms of detecting a cardiac event. What type of real-time system is this?

❓ An RTOS task calls xQueueReceive() with a timeout of 100 ms but no data arrives. What state is the task in during those 100 ms?

❓ In FreeRTOS, task A has priority 3 and task B has priority 1. Which task has higher priority?

❓ Your RTOS system has a data processing task that polls a flag in a while loop instead of blocking on a queue. What problem does this cause?

❓ Which FreeRTOS heap strategy should you use for a safety-critical system that must never call free()?

Real-World Tie-In

Motor Control with RTOS: A brushless DC motor controller uses FreeRTOS with 3 tasks: FOC control loop (highest priority, 10 kHz, blocks on timer semaphore), CAN communication (medium, blocks on queue), and diagnostic logging (lowest, blocks on vTaskDelay). The control loop always meets its 100 us deadline because the scheduler guarantees it preempts anything else. Total RTOS overhead: 8 KB Flash, 2 KB RAM on a Cortex-M4.

IoT Environmental Monitor: A Zephyr-based air quality sensor uses BLE for data reporting and Thread mesh networking for multi-node communication. Zephyr was chosen over FreeRTOS because it provides native BLE and Thread stacks with a unified driver API. The system runs 6 tasks with priorities ranging from sensor sampling (highest) to BLE advertising (lowest). Stack sizes were measured with Zephyr's thread analyzer and trimmed from 2 KB default to 512-1024 bytes per task, saving 6 KB of RAM.