How Embedded System Design Interviews Work
Embedded system design interviews are fundamentally different from web or cloud system design rounds. While a backend engineer might discuss load balancers and database sharding, you will discuss interrupt latency budgets, power modes, and whether a task belongs in an ISR or an RTOS thread.
What the interviewer is looking for:
- A structured, top-down approach, not jumping straight to writing register configurations
- Real awareness of hardware constraints (memory, power, cost, thermals)
- Tradeoff reasoning at every decision point: not just "I would use X" but "I chose X over Y because..."
- Safety and reliability thinking: what happens when things go wrong
Typical session format:
The interviewer gives you an open-ended prompt like "Design a battery-powered environmental sensor node" or "Design a motor controller." You have 45-60 minutes. You drive the conversation through requirements, architecture, component selection, and analysis. The interviewer watches how you think, not whether you arrive at a specific answer.
Step 1: Clarify Requirements
Never start designing before you understand what you are building. Spend the first 5-10 minutes asking questions. This is the single most important differentiator between strong and weak candidates.
Functional requirements define what the system does — inputs, outputs, processing, communication.
Non-functional requirements define how well it does it — and these dominate embedded design decisions.
| Requirement Category | Example Questions to Ask |
|---|---|
| Real-time | "What is the maximum acceptable latency from input event to output response?" |
| Power | "Is this battery-powered? What is the target battery life? Is there energy harvesting?" |
| Cost | "What is the BOM cost target? Is this a 100-unit or 1M-unit product?" |
| Environment | "Operating temperature range? Vibration? Ingress protection (IP rating)?" |
| Safety | "What safety standard applies (IEC 61508, ISO 26262, DO-178C)? What is the target SIL/ASIL?" |
| Reliability | "What is the expected operating lifetime? What is the acceptable failure rate?" |
| Communication | "What external interfaces are needed? What protocols are already decided?" |
| Size | "Are there PCB area constraints? How many board layers are acceptable?" |
Tip: Write the requirements down on the whiteboard. Refer back to them as you make design decisions. This shows the interviewer that your choices are driven by requirements, not habit.
Step 2: Choose Architecture Pattern
Once you understand the requirements, select a software architecture pattern. This is the highest-impact decision you will make and it shapes everything downstream.
| Pattern | Description | When to Use | When to Avoid |
|---|---|---|---|
| Super-loop | Single while(1) with polled tasks | Simple systems with 1-3 tasks, no hard real-time deadlines | More than 5 concurrent activities, or any hard deadline under 1 ms |
| Super-loop + ISR | Main loop for background work, ISRs for time-critical events | Moderate complexity, 1-2 hard real-time tasks, rest is non-critical | Many tasks competing for CPU time, complex inter-task communication |
| RTOS-based | Preemptive scheduler with prioritized tasks, queues, semaphores | Multiple concurrent tasks with different priorities and deadlines | Very resource-constrained MCUs (under 8 KB RAM), or very simple systems |
| Event-driven | State machines responding to events from a queue | Systems with many states and transitions (UI, protocol stacks) | Heavy computation that blocks the event loop |
| Embedded Linux | Full OS with processes, virtual memory, networking stack | Complex systems needing filesystems, networking, UI, or rapid development | Hard real-time under 1 ms, low power (sleep currents matter), cost-sensitive |
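As a rough illustration of the event-driven row, the sketch below drives a table-based state machine from a small event queue. The states, events, and function names (`evq_push`, `evq_pop`, `fsm_run`) are hypothetical and picked for a sensor-node example. The queue shown is not interrupt-safe, so firmware that posts events from an ISR would need a critical section or a lock-free buffer instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical states and events for a battery-powered sensor node. */
typedef enum { ST_SLEEP, ST_SAMPLE, ST_TRANSMIT, ST_COUNT } state_t;
typedef enum { EV_TIMER, EV_SAMPLE_DONE, EV_TX_DONE, EV_COUNT } event_t;

/* Transition table: next_state[current][event].
   Events that are not meaningful in a state leave the state unchanged. */
static const state_t next_state[ST_COUNT][EV_COUNT] = {
    /*                 EV_TIMER     EV_SAMPLE_DONE  EV_TX_DONE */
    [ST_SLEEP]    = { ST_SAMPLE,   ST_SLEEP,       ST_SLEEP  },
    [ST_SAMPLE]   = { ST_SAMPLE,   ST_TRANSMIT,    ST_SAMPLE },
    [ST_TRANSMIT] = { ST_TRANSMIT, ST_TRANSMIT,    ST_SLEEP  },
};

#define EVQ_SIZE 8u

/* Fixed-size FIFO of pending events (no dynamic allocation). */
typedef struct {
    event_t buf[EVQ_SIZE];
    uint8_t head;   /* next write position */
    uint8_t tail;   /* next read position */
    uint8_t count;  /* number of queued events */
} event_queue_t;

/* Returns false when the queue is full so the caller can count dropped events. */
bool evq_push(event_queue_t *q, event_t ev) {
    if (q->count == EVQ_SIZE) return false;
    q->buf[q->head] = ev;
    q->head = (uint8_t)((q->head + 1u) % EVQ_SIZE);
    q->count++;
    return true;
}

/* Returns false when no event is pending. */
bool evq_pop(event_queue_t *q, event_t *out) {
    if (q->count == 0u) return false;
    *out = q->buf[q->tail];
    q->tail = (uint8_t)((q->tail + 1u) % EVQ_SIZE);
    q->count--;
    return true;
}

/* Drain the queue, applying one table lookup per event; returns the final state. */
state_t fsm_run(state_t s, event_queue_t *q) {
    event_t ev;
    while (evq_pop(q, &ev)) {
        if (ev < EV_COUNT) s = next_state[s][ev];
    }
    return s;
}
```

Keeping transitions in a table makes every state/event pair explicit, which helps when an interviewer asks what happens on an unexpected event.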
How to discuss this in an interview:
Do not just say "I would use FreeRTOS." Instead, say something like: "Given that we have a 20 kHz control loop alongside a 10 Hz reporting task and a CAN receive handler, a bare-metal super-loop would make the timing analysis fragile. An RTOS gives us clean priority separation — the control ISR runs at highest priority, reporting runs as a low-priority task, and we use a queue for CAN messages. The overhead is about 10 KB Flash and 2 KB RAM for the kernel, which fits within our STM32F4 budget."
Step 3: HW/SW Partitioning
Decide what belongs in hardware (dedicated IC, FPGA, analog circuit) versus firmware running on the MCU.
General principle: Push functionality into hardware when software cannot meet the timing, precision, or power requirement. Keep it in software when flexibility, cost, or development speed matters more.
| Decision | Hardware (FPGA / Dedicated IC) | Software (MCU Firmware) |
|---|---|---|
| Signal processing at more than 10 MSPS | Yes | No — MCU cannot keep up |
| PID control loop at 20 kHz | Possible but overkill | Yes — well within MCU capability |
| Protocol decoding (CAN, SPI) | Use MCU peripheral hardware | Driver code configures the peripheral |
| Safety-critical watchdog | External watchdog IC (independent failure domain) | Internal WDT as secondary layer only |
| Power regulation | Hardware (LDO, buck converter) | Firmware configures sleep modes |
MCU Selection Considerations
| Factor | Questions to Ask |
|---|---|
| Peripherals | Does it have the ADC channels, timers, CAN, SPI, I2C we need natively? |
| Performance | Clock speed and DSP instructions sufficient for control loop math? |
| Memory | Enough Flash for code + OTA image? Enough RAM for stacks + buffers? |
| Power | Sleep current, wake-up time, peripheral gating? |
| Cost | Unit price at target volume? Second source available? |
| Ecosystem | Mature HAL/SDK? RTOS port available? Debug tools? Community support? |
System Block Diagram Template
Use a block diagram like this as a starting point, then customize for the specific design prompt:

    +------------------+      +------------------+
    |    Sensors /     |      |   Actuators /    |
    |     Inputs       |      |     Outputs      |
    +--------+---------+      +--------+---------+
             |                         ^
             v                         |
    +--------+---------+      +--------+---------+
    |     Signal       |      |  Power Drive /   |
    |   Conditioning   |      |  Level Shifting  |
    +--------+---------+      +--------+---------+
             |                         ^
             v                         |
    +--------+-------------------------+---------+
    |                    MCU                     |
    |  +----------+  +---------+  +----------+   |
    |  | ADC/GPIO |  |  Core   |  | Timer/PWM|   |
    |  +----------+  +---------+  +----------+   |
    |  +----------+  +---------+  +----------+   |
    |  |   UART   |  |  RTOS   |  |   CAN    |   |
    |  +----------+  +---------+  +----------+   |
    +-----+-----------------+--------------------+
          |                 |
          v                 v
    +-----+------+    +-----+------+
    |   Debug    |    |  External  |
    |  Console   |    |   Comms    |
    +------------+    +------------+

Step 4: Design the Software Architecture
With the architecture pattern chosen and hardware defined, decompose the software into components.
Task Decomposition (RTOS Example)
For an RTOS-based design, identify each task, its priority, period, and stack size:
| Task | Priority | Period | Stack Size | Purpose |
|---|---|---|---|---|
| Control Loop | Highest | 50 us (20 kHz) | 512 B | ISR triggers ADC read, runs PID, updates PWM |
| Safety Monitor | High | 1 ms | 256 B | Checks overcurrent, temperature, watchdog kick |
| Communication | Medium | 10 ms | 1024 B | CAN/UART message processing |
| Logging | Low | 100 ms | 512 B | Write telemetry to Flash or transmit |
| Idle | Lowest | N/A | 256 B | Enter low-power sleep mode |
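One way to make a task table like this reviewable is to keep it as data and check it mechanically. The sketch below is an assumption-laden illustration, not a real RTOS API: `task_cfg_t`, the numeric priority values (larger means more urgent), and both helper functions are hypothetical. It sums the stack sizes and checks that shorter periods have been given higher priorities, which is the rate-monotonic convention.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task descriptor mirroring the columns of the table above. */
typedef struct {
    const char *name;
    uint8_t     priority;     /* assumed convention: larger = more urgent */
    uint32_t    period_us;    /* 0 = not periodic (idle task) */
    uint16_t    stack_bytes;
} task_cfg_t;

const task_cfg_t task_table[] = {
    { "control",  4,     50u,  512u },
    { "safety",   3,   1000u,  256u },
    { "comms",    2,  10000u, 1024u },
    { "logging",  1, 100000u,  512u },
    { "idle",     0,      0u,  256u },
};
const size_t task_count = sizeof task_table / sizeof task_table[0];

/* Total RAM reserved for task stacks, in bytes. */
uint32_t total_stack_bytes(const task_cfg_t *t, size_t n) {
    uint32_t sum = 0u;
    for (size_t i = 0; i < n; i++) sum += t[i].stack_bytes;
    return sum;
}

/* True if every periodic task with a shorter period has a strictly higher priority. */
bool priorities_are_rate_monotonic(const task_cfg_t *t, size_t n) {
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            if (t[i].period_us == 0u || t[j].period_us == 0u) continue;
            if (t[i].period_us < t[j].period_us && t[i].priority <= t[j].priority) {
                return false;
            }
        }
    }
    return true;
}
```

With the values from the table, the stacks sum to 2560 B, which is the figure the memory budget rounds up to 3 KB.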
ISR Design Rules
- Keep ISRs short: acknowledge the interrupt, capture data, post to a queue, return
- Never call blocking, non-reentrant, or unbounded-time functions (mutex lock, malloc, printf) inside an ISR
- Use deferred processing: ISR posts an event, a task handles the heavy work
- Assign interrupt priorities carefully — nesting must be analyzed for worst-case latency
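The "capture data, post to a queue, return" rule can be sketched with a single-producer, single-consumer ring buffer. This is a minimal sketch under stated assumptions: one ISR is the only writer, one task is the only reader, and the target is a single-core MCU where `volatile` index updates are sufficient (a multi-core part would need atomics or memory barriers). The type and function names are hypothetical; an RTOS queue with an ISR-safe send call would play the same role.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 16u  /* power of two so the index wrap is a cheap mask */

typedef struct {
    volatile uint16_t head;          /* written only by the producer (ISR) */
    volatile uint16_t tail;          /* written only by the consumer (task) */
    uint16_t          data[RB_SIZE]; /* one slot stays empty to tell full from empty */
} ring_buffer_t;

/* Producer side, intended for ISR context: constant time, never blocks.
   Returns false on overflow so the caller can count dropped samples. */
bool rb_push_from_isr(ring_buffer_t *rb, uint16_t sample) {
    uint16_t head = rb->head;
    uint16_t next = (uint16_t)((head + 1u) & (RB_SIZE - 1u));
    if (next == rb->tail) return false;  /* buffer full */
    rb->data[head] = sample;
    rb->head = next;                     /* publish only after the data is stored */
    return true;
}

/* Consumer side, intended for task context. Returns false when empty. */
bool rb_pop(ring_buffer_t *rb, uint16_t *out) {
    uint16_t tail = rb->tail;
    if (tail == rb->head) return false;  /* buffer empty */
    *out = rb->data[tail];
    rb->tail = (uint16_t)((tail + 1u) & (RB_SIZE - 1u));
    return true;
}
```

Because each index has exactly one writer, neither side needs to disable interrupts. The usable capacity is `RB_SIZE - 1` entries.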
Inter-Component Communication
| Mechanism | Use Case | Pitfall |
|---|---|---|
| Queue / Message Buffer | ISR to task data transfer | Overflow if consumer is slower than producer |
| Semaphore / Event Flag | Task synchronization, signaling | Priority inversion if a semaphore is used as a lock, since semaphores typically lack priority inheritance |
| Shared variable + critical section | Simple flag or counter | Forgetting to disable interrupts around access |
| Mutex | Protecting shared resources between tasks | Cannot use from ISR context |
| Mailbox | Single-item producer-consumer | Overwrite if not consumed before next write |
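The "shared variable + critical section" row is the easiest one to get wrong, so here is a small sketch of it. A 64-bit counter on a 32-bit core takes more than one load to read, so an ISR can change it between the two halves. The `irq_save` and `irq_restore` functions below are host-side stand-ins invented for this example so it compiles anywhere; on a Cortex-M target they would typically wrap the CMSIS intrinsics `__get_PRIMASK()`, `__disable_irq()`, and `__set_PRIMASK()`.

```c
#include <stdint.h>

/* Host-side stand-in for the interrupt mask: 0 = enabled, 1 = masked.
   Hypothetical; replace with the target's real mask register access. */
static uint32_t fake_irq_mask = 0u;

/* Mask interrupts and return the previous state so nested calls restore correctly. */
uint32_t irq_save(void) {
    uint32_t prev = fake_irq_mask;
    fake_irq_mask = 1u;
    return prev;
}

/* Restore whatever mask state was active before the matching irq_save(). */
void irq_restore(uint32_t prev) {
    fake_irq_mask = prev;
}

/* Shared between a timer ISR and tasks. 64-bit, so not atomic on a 32-bit MCU. */
static volatile uint64_t tick_count;

/* Would be called from the timer interrupt. */
void tick_isr(void) {
    tick_count++;
}

/* Task-side read: copy the value with interrupts masked, then use the copy. */
uint64_t ticks_now(void) {
    uint32_t key = irq_save();
    uint64_t snapshot = tick_count;
    irq_restore(key);
    return snapshot;
}
```

Saving and restoring the previous mask, rather than unconditionally re-enabling interrupts, keeps the function safe to call from code that is already inside a critical section.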
Error Handling and Watchdog Strategy
Every embedded design interview should address what happens when things go wrong:
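- Watchdog timer: kick it from the lowest-priority task, or from a supervisor that first confirms every task has checked in, so that a hang or starvation anywhere in the system forces a reset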
- Stack overflow detection: RTOS stack watermarking or MPU-guarded stack boundaries
- Fault handlers: HardFault / BusFault handlers that log the fault address and reset gracefully
- Graceful degradation: if a non-critical subsystem fails, the safety-critical path continues operating
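One common way to implement the watchdog item is a check-in scheme: each monitored task sets a flag when it completes a cycle, and the watchdog is refreshed only once all flags are set. The sketch below assumes three monitored tasks and uses `hw_watchdog_kick` as a hypothetical placeholder for the real hardware refresh; here it only counts calls so the logic can be exercised on a host. The flag update is a plain read-modify-write, so real firmware with preemptive tasks would need an atomic operation or a critical section around it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical set of tasks that must prove they are alive. */
enum { TASK_CONTROL = 0, TASK_COMMS, TASK_LOGGING, TASK_COUNT };

#define ALL_TASKS_MASK ((1u << TASK_COUNT) - 1u)

static volatile uint32_t checkin_flags;  /* one bit per monitored task */
static uint32_t wdt_kicks;               /* host-side stand-in: counts refreshes */

/* Placeholder for the hardware watchdog refresh (assumption, not a real HAL call). */
void hw_watchdog_kick(void) {
    wdt_kicks++;
}

/* Called by each task once per healthy loop iteration.
   Note: not atomic as written; protect it if tasks can preempt each other. */
void task_checkin(unsigned task_id) {
    checkin_flags |= (1u << task_id);
}

/* Called periodically by the supervisor. Refreshes the watchdog only when
   every task has checked in since the last refresh. */
bool watchdog_supervisor_poll(void) {
    if ((checkin_flags & ALL_TASKS_MASK) != ALL_TASKS_MASK) {
        return false;   /* at least one task is late: let the watchdog run down */
    }
    checkin_flags = 0u; /* start a new monitoring window */
    hw_watchdog_kick();
    return true;
}
```

If any single task stalls, its bit never sets, the refresh stops, and the hardware watchdog resets the system.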
Step 5: Analyze Constraints
Strong candidates back up design decisions with concrete numbers. This is where you demonstrate real engineering skill.
Timing Analysis
Calculate worst-case latency for your critical path:

    Worst-case latency = ISR latency + ISR execution + task scheduling + task execution

    Example (20 kHz control loop on Cortex-M4 at 168 MHz):
      ISR entry latency:       12 cycles = 0.07 us
      ADC read + conversion:   3.0 us
      PID calculation (float): 5.0 us
      PWM register update:     0.5 us
      ─────────────────────────────────────────────────
      Total:                   8.6 us out of 50 us budget (17% utilization)

A utilization under 50% leaves headroom for worst-case jitter and future feature growth.
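The same arithmetic can be kept in a small helper so the budget is recomputed whenever a number changes. This is only a sketch: the struct and function names are made up for illustration, and the per-step durations are the example figures from above, which on a real project would come from measurement or the datasheet.

```c
#include <stdint.h>

/* Convert a CPU cycle count to microseconds at the given core clock. */
double cycles_to_us(uint32_t cycles, double cpu_mhz) {
    return (double)cycles / cpu_mhz;
}

/* Period of a periodic activity in microseconds, given its rate in Hz. */
double period_us_from_hz(double hz) {
    return 1.0e6 / hz;
}

/* Hypothetical breakdown of the control path; all values in microseconds. */
typedef struct {
    double isr_entry_us;
    double adc_us;
    double pid_us;
    double pwm_us;
} control_path_t;

/* Sum of all steps on the critical path. */
double path_total_us(const control_path_t *p) {
    return p->isr_entry_us + p->adc_us + p->pid_us + p->pwm_us;
}

/* Share of the period consumed by the path, as a percentage. */
double utilization_pct(double busy_us, double period_us) {
    return 100.0 * busy_us / period_us;
}
```

Filling `control_path_t` with `cycles_to_us(12, 168.0)`, 3.0, 5.0, and 0.5 gives a total of about 8.57 us, or roughly 17% of the 50 us period returned by `period_us_from_hz(20000.0)`.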
Memory Budget
| Category | RAM | Flash |
|---|---|---|
| RTOS kernel | 2 KB | 10 KB |
| Task stacks (5 tasks) | 3 KB | — |
| Buffers (CAN, ADC, UART) | 2 KB | — |
| Application variables | 1 KB | — |
| Application code | — | 64 KB |
| Lookup tables / constants | — | 8 KB |
| OTA update slot | — | 64 KB |
| Total | 8 KB | 146 KB |
| MCU capacity | 128 KB | 512 KB |
| Margin | 94% | 71% |
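A budget like this is easy to keep honest by summing it in code. The sketch below copies the rows from the table into a hypothetical `budget_row_t` array and computes totals and remaining margin; the names are illustrative, and real figures would normally come from the linker map file rather than estimates.

```c
#include <stddef.h>
#include <stdint.h>

/* One line of the memory budget; sizes in KB. */
typedef struct {
    const char *item;
    uint32_t    ram_kb;
    uint32_t    flash_kb;
} budget_row_t;

const budget_row_t budget[] = {
    { "RTOS kernel",               2u, 10u },
    { "Task stacks (5 tasks)",     3u,  0u },
    { "Buffers (CAN, ADC, UART)",  2u,  0u },
    { "Application variables",     1u,  0u },
    { "Application code",          0u, 64u },
    { "Lookup tables / constants", 0u,  8u },
    { "OTA update slot",           0u, 64u },
};
const size_t budget_rows = sizeof budget / sizeof budget[0];

/* Sum of the RAM column. */
uint32_t total_ram_kb(const budget_row_t *b, size_t n) {
    uint32_t sum = 0u;
    for (size_t i = 0; i < n; i++) sum += b[i].ram_kb;
    return sum;
}

/* Sum of the Flash column. */
uint32_t total_flash_kb(const budget_row_t *b, size_t n) {
    uint32_t sum = 0u;
    for (size_t i = 0; i < n; i++) sum += b[i].flash_kb;
    return sum;
}

/* Unused share of a memory, as a percentage of its capacity. */
double margin_pct(uint32_t used_kb, uint32_t capacity_kb) {
    return 100.0 * (double)(capacity_kb - used_kb) / (double)capacity_kb;
}
```

For the rows above this yields 8 KB of RAM and 146 KB of Flash, leaving 93.75% of a 128 KB RAM and about 71.5% of a 512 KB Flash unused.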
Power Budget

    Active mode: 40 mA @ 3.3V for 10 ms every 100 ms (10% duty cycle)
    Sleep mode:  5 uA @ 3.3V for 90 ms every 100 ms (90% duty cycle)

    Average current = (40 mA * 0.10) + (0.005 mA * 0.90) = 4.0045 mA

    Battery life (2000 mAh CR123A):
      2000 mAh / 4.0 mA = 500 hours = ~21 days

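The duty-cycle calculation generalizes to a two-line helper, sketched below with hypothetical function names. It is a first-order estimate only: it assumes two operating modes and ignores battery self-discharge, regulator quiescent current, and capacity derating under pulsed load or low temperature, all of which shorten real battery life.

```c
/* Time-weighted average current for a system that alternates between
   an active mode and a sleep mode. active_duty is a fraction in [0, 1]. */
double average_current_ma(double active_ma, double sleep_ma, double active_duty) {
    return active_ma * active_duty + sleep_ma * (1.0 - active_duty);
}

/* Ideal battery life in hours: rated capacity divided by average draw. */
double battery_life_hours(double capacity_mah, double avg_ma) {
    return capacity_mah / avg_ma;
}
```

Calling `average_current_ma(40.0, 0.005, 0.10)` returns 4.0045 mA, and `battery_life_hours(2000.0, 4.0045)` returns a little under 500 hours, or about 21 days. The result makes it obvious that the active current dominates, so shortening the 10 ms active window is worth far more than trimming sleep current.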
Common Interview Mistakes
| Mistake | Why It Hurts You | What to Do Instead |
|---|---|---|
| Jumping to code or register details | Shows lack of system-level thinking | Start with requirements and architecture |
| Ignoring non-functional requirements | Real products fail on power, cost, or safety — not features | Ask about power, cost, environment, and safety standards upfront |
| Saying "I would use X" without tradeoff discussion | Sounds like you only know one approach | "I chose X over Y because Z. If the requirement changed to W, I would reconsider Y." |
| Over-engineering | Adding an RTOS + TCP/IP stack for a blinking LED wastes resources and adds risk | Match complexity to requirements. A super-loop is fine for simple systems. |
| Under-engineering | Using a bare-metal super-loop for 10 concurrent tasks with mixed deadlines | Recognize when task count and priority separation demand an RTOS |
| No concrete numbers | "Fast enough" and "small enough" are not engineering answers | Calculate timing budgets, memory usage, and power consumption |
| Ignoring failure modes | Real systems must handle faults, not just happy paths | Discuss watchdogs, fault handlers, graceful degradation |
What Strong Candidates Demonstrate
Systematic approach: Requirements, then architecture, then components, then analysis, finishing each step before starting the next.
Tradeoff discussion at every decision point: "We could use a Cortex-M0+ for lower power, but the M4 gives us hardware FPU which cuts our PID calculation time from 50 us to 5 us. Given the 50 us loop budget, the M4 is necessary."
Concrete numbers: Not "the loop needs to be fast" but "the control loop runs at 20 kHz, giving us a 50 us budget. The ISR uses 8.6 us (17% of the budget), leaving 41.4 us of margin."
Awareness of real-world constraints: BOM cost, supply chain (second source), certification requirements, manufacturing testability, field serviceability.
Safety-first thinking: Always mention what happens when things go wrong before the interviewer asks. Discuss failure modes, watchdog strategy, and safe states.
Communication skills: Narrate your thinking out loud. Use the whiteboard. Draw block diagrams. Label interfaces with data rates and protocols. The interviewer cannot evaluate reasoning they cannot see.