Embedded System Design — Interview Methodology

How to approach embedded system design interviews: requirements analysis, HW/SW partitioning, architecture patterns, and common mistakes to avoid.

How Embedded System Design Interviews Work

Embedded system design interviews are fundamentally different from web or cloud system design rounds. While a backend engineer might discuss load balancers and database sharding, you will discuss interrupt latency budgets, power modes, and whether a task belongs in an ISR or an RTOS thread.

What the interviewer is looking for:

  • A structured, top-down approach, not jumping straight to writing register configurations

  • Real awareness of hardware constraints (memory, power, cost, thermals)

  • Tradeoff reasoning at every decision point — not just "I would use X" but "I chose X over Y because..."

  • Safety and reliability thinking — what happens when things go wrong

Typical session format:

The interviewer gives you an open-ended prompt like "Design a battery-powered environmental sensor node" or "Design a motor controller." You have 45-60 minutes. You drive the conversation through requirements, architecture, component selection, and analysis. The interviewer watches how you think, not whether you arrive at a specific answer.


Step 1: Clarify Requirements

Never start designing before you understand what you are building. Spend the first 5-10 minutes asking questions. This is the single most important differentiator between strong and weak candidates.

Functional requirements define what the system does — inputs, outputs, processing, communication.

Non-functional requirements define how well it does it — and these dominate embedded design decisions.

| Requirement Category | Example Questions to Ask |
| --- | --- |
| Real-time | "What is the maximum acceptable latency from input event to output response?" |
| Power | "Is this battery-powered? What is the target battery life? Is there energy harvesting?" |
| Cost | "What is the BOM cost target? Is this a 100-unit or 1M-unit product?" |
| Environment | "Operating temperature range? Vibration? Ingress protection (IP rating)?" |
| Safety | "What safety standard applies (IEC 61508, ISO 26262, DO-178C)? What is the target SIL/ASIL?" |
| Reliability | "What is the expected operating lifetime? What is the acceptable failure rate?" |
| Communication | "What external interfaces are needed? What protocols are already decided?" |
| Size | "Are there PCB area constraints? How many board layers are acceptable?" |

Tip: Write the requirements down on the whiteboard. Refer back to them as you make design decisions. This shows the interviewer that your choices are driven by requirements, not habit.


Step 2: Choose Architecture Pattern

Once you understand the requirements, select a software architecture pattern. This is the highest-impact decision you will make and it shapes everything downstream.

| Pattern | Description | When to Use | When to Avoid |
| --- | --- | --- | --- |
| Super-loop | Single while(1) with polled tasks | Simple systems with 1-3 tasks, no hard real-time deadlines | More than 5 concurrent activities, or any hard deadline under 1 ms |
| Super-loop + ISR | Main loop for background work, ISRs for time-critical events | Moderate complexity, 1-2 hard real-time tasks, rest is non-critical | Many tasks competing for CPU time, complex inter-task communication |
| RTOS-based | Preemptive scheduler with prioritized tasks, queues, semaphores | Multiple concurrent tasks with different priorities and deadlines | Very resource-constrained MCUs (under 8 KB RAM), or very simple systems |
| Event-driven | State machines responding to events from a queue | Systems with many states and transitions (UI, protocol stacks) | Heavy computation that blocks the event loop |
| Embedded Linux | Full OS with processes, virtual memory, networking stack | Complex systems needing filesystems, networking, UI, or rapid development | Hard real-time under 1 ms, low power (sleep currents matter), cost-sensitive |

How to discuss this in an interview:

Do not just say "I would use FreeRTOS." Instead, say something like: "Given that we have a 20 kHz control loop alongside a 10 Hz reporting task and a CAN receive handler, a bare-metal super-loop would make the timing analysis fragile. An RTOS gives us clean priority separation — the control ISR runs at highest priority, reporting runs as a low-priority task, and we use a queue for CAN messages. The overhead is about 10 KB Flash and 2 KB RAM for the kernel, which fits within our STM32F4 budget."
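
For contrast with the RTOS argument above, the "Super-loop + ISR" row of the table can be sketched in a few lines of C. This is a minimal host-testable sketch, not a production driver: `adc_isr`, `super_loop_iteration`, and the global names are hypothetical, and the ISR takes the sample as a parameter only so the example can run without hardware.

```c
#include <stdbool.h>
#include <stdint.h>

/* State shared between the ISR and the main loop. volatile forces the
 * compiler to re-read these on every access instead of caching them. */
static volatile bool     g_sample_ready  = false;
static volatile uint16_t g_latest_sample = 0;

static uint32_t g_samples_processed = 0;
static uint32_t g_sample_sum        = 0;

/* Hypothetical ADC interrupt handler. A real handler takes no argument and
 * reads the peripheral data register; the parameter keeps this host-testable. */
void adc_isr(uint16_t raw_sample)
{
    g_latest_sample = raw_sample;  /* capture the data */
    g_sample_ready  = true;        /* signal the main loop, then return */
}

/* One pass of the super-loop body, i.e. the inside of while (1) { ... }. */
void super_loop_iteration(void)
{
    if (g_sample_ready) {
        /* On real hardware, mask the ADC interrupt around these two lines
         * so a new sample cannot arrive between them. */
        uint16_t sample = g_latest_sample;
        g_sample_ready  = false;

        g_sample_sum += sample;
        g_samples_processed++;
    }
    /* Other polled, non-critical work would run here. */
}

uint32_t samples_processed(void) { return g_samples_processed; }
uint32_t sample_sum(void)        { return g_sample_sum; }
```

The weakness of this pattern is visible in the structure: every extra piece of polled work in `super_loop_iteration` adds to the worst-case delay before a ready sample is handled, which is why the timing analysis becomes fragile as the task count grows.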


Step 3: HW/SW Partitioning

Decide what belongs in hardware (dedicated IC, FPGA, analog circuit) versus firmware running on the MCU.

General principle: Push functionality into hardware when software cannot meet the timing, precision, or power requirement. Keep it in software when flexibility, cost, or development speed matters more.

| Decision | Hardware (FPGA / Dedicated IC) | Software (MCU Firmware) |
| --- | --- | --- |
| Signal processing at more than 10 MSPS | Yes | No (MCU cannot keep up) |
| PID control loop at 20 kHz | Possible but overkill | Yes (well within MCU capability) |
| Protocol decoding (CAN, SPI) | Use MCU peripheral hardware | Driver code configures the peripheral |
| Safety-critical watchdog | External watchdog IC (independent failure domain) | Internal WDT as secondary layer only |
| Power regulation | Hardware (LDO, buck converter) | Firmware configures sleep modes |
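
To make the "PID control loop at 20 kHz" row concrete, here is a minimal sketch of one PID step as it might run in firmware. The `PidController` type, the `pid_step` function, and the clamping-based anti-windup are illustrative assumptions, not something the table prescribes; a real controller would also need tuned gains and usually a filtered derivative term.

```c
typedef struct {
    float kp, ki, kd;        /* gains */
    float dt;                /* sample period in seconds (50e-6 at 20 kHz) */
    float integral;          /* accumulated error * dt */
    float prev_error;        /* error from the previous step */
    float out_min, out_max;  /* actuator limits, e.g. PWM duty range */
} PidController;

/* Run one control step and return the clamped actuator command. */
float pid_step(PidController *pid, float setpoint, float measured)
{
    float error      = setpoint - measured;
    float integral   = pid->integral + error * pid->dt;
    float derivative = (error - pid->prev_error) / pid->dt;
    float out = pid->kp * error + pid->ki * integral + pid->kd * derivative;

    if (out > pid->out_max) {
        out = pid->out_max;        /* saturated: do not commit the integral */
    } else if (out < pid->out_min) {
        out = pid->out_min;        /* saturated: do not commit the integral */
    } else {
        pid->integral = integral;  /* commit only when not saturated */
    }
    pid->prev_error = error;
    return out;
}
```

The step is a handful of multiplies and adds, which is why it fits comfortably in software on an MCU with a hardware FPU and does not justify an FPGA on its own.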

MCU Selection Considerations

| Factor | Questions to Ask |
| --- | --- |
| Peripherals | Does it have the ADC channels, timers, CAN, SPI, I2C we need natively? |
| Performance | Clock speed and DSP instructions sufficient for control loop math? |
| Memory | Enough Flash for code + OTA image? Enough RAM for stacks + buffers? |
| Power | Sleep current, wake-up time, peripheral gating? |
| Cost | Unit price at target volume? Second source available? |
| Ecosystem | Mature HAL/SDK? RTOS port available? Debug tools? Community support? |

System Block Diagram Template

Use a block diagram like this as a starting point, then customize for the specific design prompt:

```text
+------------------+      +------------------+
|    Sensors /     |      |   Actuators /    |
|      Inputs      |      |     Outputs      |
+--------+---------+      +--------+---------+
         |                         ^
         v                         |
+--------+---------+      +--------+---------+
|      Signal      |      |  Power Drive /   |
|   Conditioning   |      |  Level Shifting  |
+--------+---------+      +--------+---------+
         |                         ^
         v                         |
+--------+-------------------------+---------+
|                    MCU                     |
|  +----------+  +---------+  +----------+   |
|  | ADC/GPIO |  |  Core   |  | Timer/PWM|   |
|  +----------+  +---------+  +----------+   |
|  +----------+  +---------+  +----------+   |
|  |   UART   |  |  RTOS   |  |   CAN    |   |
|  +----------+  +---------+  +----------+   |
+-----+-----------------+--------------------+
      |                 |
      v                 v
+-----+------+    +-----+------+
|   Debug    |    |  External  |
|  Console   |    |   Comms    |
+------------+    +------------+
```

Step 4: Design the Software Architecture

With the architecture pattern chosen and hardware defined, decompose the software into components.

Task Decomposition (RTOS Example)

For an RTOS-based design, identify each task, its priority, period, and stack size:

| Task | Priority | Period | Stack Size | Purpose |
| --- | --- | --- | --- | --- |
| Control Loop | Highest | 50 us (20 kHz) | 512 B | ISR triggers ADC read, runs PID, updates PWM |
| Safety Monitor | High | 1 ms | 256 B | Checks overcurrent, temperature, watchdog kick |
| Communication | Medium | 10 ms | 1024 B | CAN/UART message processing |
| Logging | Low | 100 ms | 512 B | Write telemetry to Flash or transmit |
| Idle | Lowest | N/A | 256 B | Enter low-power sleep mode |
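
One way to keep a task table like this honest is to encode it as data and check it. The sketch below is a hypothetical representation (the `TaskSpec` type, the numeric priority convention, and both helper functions are assumptions for illustration); it sums the stack sizes and verifies that shorter periods map to higher priorities, which is the rate-monotonic ordering.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint8_t     priority;     /* higher number = more urgent (assumed convention) */
    uint32_t    period_us;    /* 0 = not periodic */
    uint16_t    stack_bytes;
} TaskSpec;

/* Values copied from the task decomposition table. */
static const TaskSpec k_tasks[] = {
    { "control", 4,     50,  512 },
    { "safety",  3,   1000,  256 },
    { "comms",   2,  10000, 1024 },
    { "logging", 1, 100000,  512 },
    { "idle",    0,      0,  256 },
};
#define TASK_COUNT (sizeof k_tasks / sizeof k_tasks[0])

/* Total RAM reserved for task stacks, in bytes. */
uint32_t total_stack_bytes(void)
{
    uint32_t total = 0;
    for (size_t i = 0; i < TASK_COUNT; i++) {
        total += k_tasks[i].stack_bytes;
    }
    return total;
}

/* True if every periodic task with a shorter period has a higher priority. */
bool priorities_are_rate_monotonic(void)
{
    for (size_t i = 0; i < TASK_COUNT; i++) {
        for (size_t j = 0; j < TASK_COUNT; j++) {
            const TaskSpec *a = &k_tasks[i];
            const TaskSpec *b = &k_tasks[j];
            if (a->period_us == 0 || b->period_us == 0) {
                continue;  /* skip non-periodic tasks such as idle */
            }
            if (a->period_us < b->period_us && a->priority <= b->priority) {
                return false;
            }
        }
    }
    return true;
}
```

For the table above the stacks sum to 2560 bytes (2.5 KB), so a memory budget that reserves 3 KB for task stacks has a small amount of slack built in.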

ISR Design Rules

  • Keep ISRs short: acknowledge the interrupt, capture data, post to a queue, return
  • Never call blocking or non-reentrant functions (malloc, printf, mutex lock) inside an ISR
  • Use deferred processing: ISR posts an event, a task handles the heavy work
  • Assign interrupt priorities carefully — nesting must be analyzed for worst-case latency
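
The "capture data, post to a queue, return" rule can be sketched with a single-producer, single-consumer ring buffer. This is an illustrative sketch, not an RTOS API: `EventQueue` and both functions are hypothetical names, and relying on `volatile` alone assumes a single-core MCU where the ISR is the only producer and one task is the only consumer. A multi-core or cached system would need memory barriers or the RTOS queue primitives instead.

```c
#include <stdbool.h>
#include <stdint.h>

#define EVENT_QUEUE_SIZE 8u  /* power of two so index wrap is a cheap mask */

typedef struct {
    volatile uint32_t head;     /* written only by the ISR (producer) */
    volatile uint32_t tail;     /* written only by the task (consumer) */
    volatile uint32_t dropped;  /* count of items lost to overflow */
    uint16_t items[EVENT_QUEUE_SIZE];
} EventQueue;

/* Called from the ISR: constant time, never blocks. */
bool event_queue_push_from_isr(EventQueue *q, uint16_t item)
{
    uint32_t head = q->head;
    if (head - q->tail >= EVENT_QUEUE_SIZE) {
        q->dropped++;  /* consumer is too slow: record the loss, do not wait */
        return false;
    }
    q->items[head & (EVENT_QUEUE_SIZE - 1u)] = item;
    q->head = head + 1u;  /* publish only after the data is written */
    return true;
}

/* Called from the task that does the heavy processing. */
bool event_queue_pop(EventQueue *q, uint16_t *out)
{
    uint32_t tail = q->tail;
    if (tail == q->head) {
        return false;  /* queue is empty */
    }
    *out = q->items[tail & (EVENT_QUEUE_SIZE - 1u)];
    q->tail = tail + 1u;
    return true;
}
```

The `dropped` counter matters in an interview discussion: it turns a silent overflow into a measurable quantity that you can log and use to size the queue.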

Inter-Component Communication

| Mechanism | Use Case | Pitfall |
| --- | --- | --- |
| Queue / Message Buffer | ISR to task data transfer | Overflow if consumer is slower than producer |
| Semaphore / Event Flag | Task synchronization, signaling | Priority inversion without priority inheritance |
| Shared variable + critical section | Simple flag or counter | Forgetting to disable interrupts around access |
| Mutex | Protecting shared resources between tasks | Cannot use from ISR context |
| Mailbox | Single-item producer-consumer | Overwrite if not consumed before next write |

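
The "shared variable + critical section" row is easiest to explain with a 64-bit counter on a 32-bit MCU, where a read takes two loads and an interrupt between them can produce a torn value. The sketch below uses hypothetical port functions, `irq_save_and_disable` and `irq_restore`, implemented here as host-side stubs. On a Cortex-M they would typically wrap CMSIS intrinsics such as `__get_PRIMASK`, `__disable_irq`, and `__set_PRIMASK`.

```c
#include <stdint.h>

/* Host-side stand-ins for the port layer (assumptions for illustration).
 * The depth counter only lets the example verify balanced lock/unlock. */
static uint32_t g_irq_lock_depth = 0;

static uint32_t irq_save_and_disable(void)
{
    return g_irq_lock_depth++;  /* return the previous state */
}

static void irq_restore(uint32_t previous_state)
{
    g_irq_lock_depth = previous_state;
}

/* Incremented by a periodic timer interrupt. */
static volatile uint64_t g_tick_count = 0;

void tick_isr(void)
{
    g_tick_count++;
}

/* Task-side read: interrupts are masked so both halves come from the same tick. */
uint64_t tick_count_read(void)
{
    uint32_t state = irq_save_and_disable();
    uint64_t snapshot = g_tick_count;
    irq_restore(state);
    return snapshot;
}

uint32_t irq_lock_depth(void) { return g_irq_lock_depth; }
```

Saving and restoring the previous state, rather than unconditionally re-enabling interrupts, keeps the function safe to call from code that is already inside a critical section.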
Error Handling and Watchdog Strategy

Every embedded design interview should address what happens when things go wrong:

  • Watchdog timer: kick it from the lowest-priority task, so that a hang or CPU starvation anywhere above it stops the kicks and forces a reset
  • Stack overflow detection: RTOS stack watermarking or MPU-guarded stack boundaries
  • Fault handlers: HardFault / BusFault handlers that log the fault address and reset gracefully
  • Graceful degradation: if a non-critical subsystem fails, the safety-critical path continues operating
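
A common refinement of the watchdog strategy is to have every monitored task check in and to kick the hardware watchdog only when all of them have done so. The sketch below is one hypothetical way to structure that: the task bit names, `watchdog_checkin`, `watchdog_service`, and the `hw_watchdog_kick` stub are all assumptions, and on a real MCU the stub would write the watchdog reload register.

```c
#include <stdbool.h>
#include <stdint.h>

#define TASK_CONTROL   (1u << 0)
#define TASK_SAFETY    (1u << 1)
#define TASK_COMMS     (1u << 2)
#define TASK_LOGGING   (1u << 3)
#define ALL_TASKS_MASK (TASK_CONTROL | TASK_SAFETY | TASK_COMMS | TASK_LOGGING)

static volatile uint32_t g_checkins   = 0;
static uint32_t          g_kick_count = 0;

/* Stand-in for the hardware access; counts kicks so the logic is testable. */
static void hw_watchdog_kick(void)
{
    g_kick_count++;
}

/* Each task calls this once per cycle. In a preemptive system the
 * read-modify-write must be protected by a critical section or atomic OR. */
void watchdog_checkin(uint32_t task_bit)
{
    g_checkins |= task_bit;
}

/* Called periodically from the lowest-priority task. */
bool watchdog_service(void)
{
    if ((g_checkins & ALL_TASKS_MASK) != ALL_TASKS_MASK) {
        return false;  /* a task is missing: withhold the kick, let the WDT reset */
    }
    g_checkins = 0;    /* start a fresh supervision window */
    hw_watchdog_kick();
    return true;
}

uint32_t watchdog_kick_count(void) { return g_kick_count; }
```

With this structure a single stuck task is enough to stop the kicks, even if the scheduler and every other task keep running normally.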

Step 5: Analyze Constraints

Strong candidates back up design decisions with concrete numbers. This is where you demonstrate real engineering skill.

Timing Analysis

Calculate worst-case latency for your critical path:

```text
Worst-case latency = ISR latency + ISR execution + task scheduling + task execution

Example (20 kHz control loop on Cortex-M4 at 168 MHz):
  ISR entry latency:        12 cycles = 0.07 us
  ADC read + conversion:    3.0 us
  PID calculation (float):  5.0 us
  PWM register update:      0.5 us
  ─────────────────────────────────────────────────
  Total:                    8.6 us out of 50 us budget (17% utilization)
```

A utilization under 50% leaves headroom for worst-case jitter and future feature growth.
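
The arithmetic behind the timing example can be reproduced in a few lines, which is useful when the interviewer changes the clock or loop rate mid-discussion. This is a sketch using the figures from the example; the function names are hypothetical and the 3.0, 5.0, and 0.5 us step times are the example's estimates, not measured values.

```c
#include <stdint.h>

#define CPU_HZ  168000000.0  /* Cortex-M4 core clock from the example */
#define LOOP_HZ 20000.0      /* control loop rate */

/* Convert a cycle count to microseconds at CPU_HZ. */
double cycles_to_us(uint32_t cycles)
{
    return (double)cycles * 1e6 / CPU_HZ;
}

/* Sum of the critical-path steps, in microseconds. */
double control_path_us(void)
{
    double isr_entry = cycles_to_us(12);
    double adc_read  = 3.0;
    double pid_calc  = 5.0;
    double pwm_write = 0.5;
    return isr_entry + adc_read + pid_calc + pwm_write;
}

/* Time available per loop iteration, in microseconds. */
double loop_budget_us(void)
{
    return 1e6 / LOOP_HZ;
}

/* Fraction of the loop budget consumed, as a percentage. */
double utilization_percent(void)
{
    return 100.0 * control_path_us() / loop_budget_us();
}
```

With these inputs the path comes to roughly 8.57 us against a 50 us budget, which is the 17% utilization quoted above.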

Memory Budget

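| Category | RAM | Flash |
| --- | --- | --- |
| RTOS kernel | 2 KB | 10 KB |
| Task stacks (5 tasks) | 3 KB | - |
| Buffers (CAN, ADC, UART) | 2 KB | - |
| Application variables | 1 KB | - |
| Application code | - | 64 KB |
| Lookup tables / constants | - | 8 KB |
| OTA update slot | - | 64 KB |
| Total | 8 KB | 146 KB |
| MCU capacity | 128 KB | 512 KB |
| Margin | 93% | 71% |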
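
The totals and margins in the memory budget can be checked mechanically. The sketch below encodes the table rows as data (the `BudgetItem` type and function names are hypothetical) and computes the margin with integer division, which truncates rather than rounds.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const char *name;
    uint32_t    ram_kb;
    uint32_t    flash_kb;
} BudgetItem;

/* Rows copied from the memory budget table; 0 means "not applicable". */
static const BudgetItem k_budget[] = {
    { "RTOS kernel",               2, 10 },
    { "Task stacks (5 tasks)",     3,  0 },
    { "Buffers (CAN, ADC, UART)",  2,  0 },
    { "Application variables",     1,  0 },
    { "Application code",          0, 64 },
    { "Lookup tables / constants", 0,  8 },
    { "OTA update slot",           0, 64 },
};
#define BUDGET_COUNT (sizeof k_budget / sizeof k_budget[0])

uint32_t total_ram_kb(void)
{
    uint32_t total = 0;
    for (size_t i = 0; i < BUDGET_COUNT; i++) {
        total += k_budget[i].ram_kb;
    }
    return total;
}

uint32_t total_flash_kb(void)
{
    uint32_t total = 0;
    for (size_t i = 0; i < BUDGET_COUNT; i++) {
        total += k_budget[i].flash_kb;
    }
    return total;
}

/* Unused share of capacity in whole percent, truncated toward zero. */
uint32_t margin_percent(uint32_t used_kb, uint32_t capacity_kb)
{
    return 100u * (capacity_kb - used_kb) / capacity_kb;
}
```

For a 128 KB RAM / 512 KB Flash part this gives `margin_percent(total_ram_kb(), 128)` of 93 and `margin_percent(total_flash_kb(), 512)` of 71, matching the table.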

Power Budget

```text
Active mode: 40 mA @ 3.3V for 10 ms every 100 ms (10% duty cycle)
Sleep mode:  5 uA @ 3.3V for 90 ms every 100 ms (90% duty cycle)

Average current = (40 mA * 0.10) + (0.005 mA * 0.90) = 4.0045 mA

Battery life (2000 mAh CR123A):
  2000 mAh / 4.0 mA = 500 hours = ~21 days
```
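
The duty-cycle calculation generalizes to any active/sleep split, so it is worth having as a small helper. This sketch uses hypothetical function names and the numbers from the example; it is an idealized estimate that ignores battery self-discharge, regulator quiescent current, and capacity derating with temperature or pulse load.

```c
/* Time-weighted average current in mA for a two-state duty cycle.
 * active_duty is the fraction of time spent in active mode (0.0 to 1.0). */
double average_current_ma(double active_ma, double sleep_ma, double active_duty)
{
    return active_ma * active_duty + sleep_ma * (1.0 - active_duty);
}

/* Idealized battery life in hours for a given capacity and average draw. */
double battery_life_hours(double capacity_mah, double average_ma)
{
    return capacity_mah / average_ma;
}

/* Convenience conversion for reporting. */
double hours_to_days(double hours)
{
    return hours / 24.0;
}
```

Calling `average_current_ma(40.0, 0.005, 0.10)` reproduces the roughly 4.0 mA average, and it makes the key insight easy to show: at a 10% duty cycle the sleep current is negligible, so shortening the active window is the lever that extends battery life.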

Common Interview Mistakes

| Mistake | Why It Hurts You | What to Do Instead |
| --- | --- | --- |
| Jumping to code or register details | Shows lack of system-level thinking | Start with requirements and architecture |
| Ignoring non-functional requirements | Real products fail on power, cost, or safety, not features | Ask about power, cost, environment, and safety standards upfront |
| Saying "I would use X" without tradeoff discussion | Sounds like you only know one approach | "I chose X over Y because Z. If the requirement changed to W, I would reconsider Y." |
| Over-engineering | Adding an RTOS + TCP/IP stack for a blinking LED wastes resources and adds risk | Match complexity to requirements. A super-loop is fine for simple systems. |
| Under-engineering | Using a bare-metal super-loop for 10 concurrent tasks with mixed deadlines makes the timing fragile and hard to analyze | Recognize when task count and priority separation demand an RTOS |
| No concrete numbers | "Fast enough" and "small enough" are not engineering answers | Calculate timing budgets, memory usage, and power consumption |
| Ignoring failure modes | Real systems must handle faults, not just happy paths | Discuss watchdogs, fault handlers, graceful degradation |

What Strong Candidates Demonstrate

Systematic approach: Requirements, then architecture, then components, then analysis. Each step is finished before the next begins.

Tradeoff discussion at every decision point: "We could use a Cortex-M0+ for lower power, but the M4 gives us a hardware FPU, which cuts our PID calculation time from 50 us to 5 us. At 50 us the PID alone would consume the entire loop budget, so the M4 is necessary."

Concrete numbers: Not "the loop needs to be fast" but "the control loop runs at 20 kHz, giving us a 50 us budget. The ISR uses 8.6 us (17% of the budget), leaving 41.4 us of margin."

Awareness of real-world constraints: BOM cost, supply chain (second source), certification requirements, manufacturing testability, field serviceability.

Safety-first thinking: Always mention what happens when things go wrong before the interviewer asks. Discuss failure modes, watchdog strategy, and safe states.

Communication skills: Narrate your thinking out loud. Use the whiteboard. Draw block diagrams. Label interfaces with data rates and protocols. The interviewer cannot evaluate reasoning they cannot see.