Design a Data Logging System

System design walkthrough: embedded data logger recording 100 samples/sec for 7 days with Flash storage, circular buffering, and USB retrieval.

The Prompt

"Design a data logging system for an industrial vibration monitor. It must record 3-axis accelerometer data at 100 Hz continuously, store at least 7 days of data, and allow retrieval via USB. Power: always-on 5 V supply."


Requirements Clarification

| Category | Requirement | Detail |
|---|---|---|
| Functional | Sample rate | 100 Hz per axis |
| Functional | Axes | 3 (X, Y, Z accelerometer) |
| Functional | Storage duration | Minimum 7 days continuous |
| Functional | Retrieval | USB connection to host PC |
| Non-functional | Data integrity | No silent corruption on power loss |
| Non-functional | Continuity | No data gaps during normal operation |
| Non-functional | Timestamp accuracy | Better than 1 ms relative, RTC within 2 ppm |
| Non-functional | Storage cost | Under $10 BOM for storage medium |

Architecture Overview

    ┌──────────────┐    SPI     ┌────────────┐          ┌──────────────┐
    │ Accelerometer│───────────▶│    MCU     │─────────▶│   SD Card    │
    │  (100 Hz,    │            │ (Cortex-M4)│          │   (FAT32)    │
    │   3-axis)    │            │            │          │    1 GB+     │
    └──────────────┘            │            │          └──────────────┘
                                │  RAM buf   │
                      ┌────────▶│   512 B    │
                      │         │            │
                  ┌───┴────┐    │            │    ┌──────────────┐
                  │  RTC   │    │            │───▶│  USB (MSC)   │
                  │(2 ppm) │    │            │    │  Retrieval   │
                  └────────┘    │            │    └──────────────┘
                                │            │
                                │     UART───│───▶ Debug console
                                └────────────┘

Component Deep Dive

Storage Sizing

Start with the raw numbers:

  • Each sample: 3 axes x 2 bytes = 6 bytes/sample
  • At 100 Hz: 6 x 100 = 600 bytes/sec
  • Per minute: 600 x 60 = 36,000 bytes = ~35 KB/min
  • Per hour: 36,000 x 60 = 2,160,000 bytes = ~2.06 MB/hr
  • Per day: 2,160,000 x 24 = 51,840,000 bytes = ~49.4 MB/day
  • 7 days: 51,840,000 x 7 = 362,880,000 bytes = ~346 MB

Allowing for a 4-byte timestamp on every sample, each record grows to 10 bytes: 10 x 100 x 86,400 x 7 = ~576 MB for 7 days.

A 1 GB SD card provides comfortable headroom.
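The sizing arithmetic above can be reproduced in a few lines. This is a minimal sketch; the function and constant names are illustrative, and it assumes the "MB" figures in this section are binary megabytes (2^20 bytes), which is what the quoted values work out to.

```python
SAMPLE_RATE_HZ = 100
SECONDS_PER_DAY = 86_400
MIB = 1024 * 1024  # assumption: the "MB" figures above are binary megabytes


def storage_bytes(record_bytes: int, days: int, rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Total bytes written for `days` of continuous logging."""
    return record_bytes * rate_hz * SECONDS_PER_DAY * days


raw = storage_bytes(record_bytes=6, days=7)       # 3 axes x 2 bytes
stamped = storage_bytes(record_bytes=10, days=7)  # plus a 4-byte timestamp

print(f"raw samples:     {raw:,} bytes = {raw / MIB:.1f} MiB")
print(f"with timestamps: {stamped:,} bytes = {stamped / MIB:.1f} MiB")
```

Keeping the record size as a parameter makes it easy to re-run the estimate if the record layout changes later in the design.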

Storage Medium Comparison

| Medium | Capacity | Cost | Interface | Wear Concern | PC Readable | Verdict |
|---|---|---|---|---|---|---|
| SD card (SPI mode) | 1-32 GB | ~$2 | SPI | Moderate (wear leveling built-in) | Yes (FAT32) | Best choice |
| eMMC | 4-64 GB | ~$5 | SDIO/SPI | Low (internal wear leveling) | Needs driver | Over-engineered |
| External NOR Flash | 1-256 MB | ~$3 | SPI | Low (100K cycles) | No | Too small |
| External NAND Flash | 1-8 GB | ~$4 | SPI/parallel | High (needs FTL) | No | Complex |

Decision: SD card in SPI mode. Cheap, swappable, FAT32-formatted so any PC can read the files directly. No custom host software needed.


Circular Buffer Strategy

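Divide storage into daily files. When the retention window is full, delete the oldest day and continue. The diagram uses a seven-file window for simplicity, but note that seven files hold only six complete days plus the current partial day. To guarantee a full 7 days of history at all times, retain eight files, which still fits on a 1 GB card.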

    Day 1   Day 2   Day 3   Day 4   Day 5   Day 6   Day 7
    ┌────┐  ┌────┐  ┌────┐  ┌────┐  ┌────┐  ┌────┐  ┌────┐
    │ ██ │  │ ██ │  │ ██ │  │ ██ │  │ ██ │  │ ██ │  │ ▓▓ │ ← writing
    └────┘  └────┘  └────┘  └────┘  └────┘  └────┘  └────┘

    When Day 8 starts:
      1. Delete Day 1 file
      2. Create Day 8 file
      3. Continue writing

    Day 2   Day 3   Day 4   Day 5   Day 6   Day 7   Day 8
    ┌────┐  ┌────┐  ┌────┐  ┌────┐  ┌────┐  ┌────┐  ┌────┐
    │ ██ │  │ ██ │  │ ██ │  │ ██ │  │ ██ │  │ ██ │  │ ▓▓ │ ← writing
    └────┘  └────┘  └────┘  └────┘  └────┘  └────┘  └────┘

File naming: VIB_20250115.bin (date-stamped). This makes it obvious which files are present and simplifies deletion of the oldest.

For finer granularity (e.g., hourly files), use VIB_20250115_14.bin. This limits the blast radius if a file becomes corrupt — you lose one hour instead of one day.
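The rotation rule can be sketched as a pure function over the file names. This is only an illustration of the selection logic, written in Python for readability; the function name and the `keep` parameter are hypothetical, and on the device the equivalent would run against the SD card's directory listing.

```python
import re
from datetime import date

# Matches the daily naming scheme described above, e.g. VIB_20250115.bin
FILE_PATTERN = re.compile(r"^VIB_(\d{4})(\d{2})(\d{2})\.bin$")


def files_to_delete(filenames: list[str], keep: int) -> list[str]:
    """Return the oldest daily log files beyond the newest `keep` files."""
    dated = []
    for name in filenames:
        match = FILE_PATTERN.match(name)
        if match is None:
            continue  # ignore anything that is not a daily log file
        year, month, day = (int(part) for part in match.groups())
        dated.append((date(year, month, day), name))
    dated.sort()  # oldest first
    excess = len(dated) - keep
    return [name for _, name in dated[:max(excess, 0)]]
```

Because the date is embedded in the name, the oldest file can be found without opening any file or trusting FAT timestamps.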


Data Format

Binary Record Layout (10 bytes per record)

First draft, with a per-record checksum:

    Byte:   0    1    2    3    4    5    6    7    8    9    10
          ┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐
          │  Timestamp (ms)   │   Ax    │   Ay    │   Az    │ Chk│
          │   (32-bit LE)     │  (16b)  │  (16b)  │  (16b)  │(8b)│
          └────┴────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘

Wait — that is 11 bytes. Let us trim: drop the per-record checksum and rely on page-level CRC instead. Final: 4-byte timestamp + 6-byte sample = 10 bytes/record.
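The final 10-byte record maps directly onto a fixed `struct` layout, which is how a host-side script might encode or decode it. A minimal sketch: the little-endian timestamp comes from the layout above, while treating the axis values as signed 16-bit and little-endian is an assumption, and the helper names are illustrative.

```python
import struct

# "<" = little-endian with no padding
# I   = uint32 timestamp offset in ms
# 3h  = three int16 axis values (signedness is an assumption)
RECORD = struct.Struct("<I3h")


def pack_record(offset_ms: int, ax: int, ay: int, az: int) -> bytes:
    """Encode one sample as a 10-byte record."""
    return RECORD.pack(offset_ms, ax, ay, az)


def unpack_record(raw: bytes) -> tuple[int, int, int, int]:
    """Decode a 10-byte record into (offset_ms, ax, ay, az)."""
    return RECORD.unpack(raw)
```

`RECORD.size` evaluates to 10, which gives a quick check that the format string matches the intended layout.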

Format Comparison

| Format | Size/Record | 7-Day Total | Human Readable | Parse Speed |
|---|---|---|---|---|
| Binary (10 B) | 10 bytes | ~576 MB | No | Fast |
| CSV text | ~40 bytes | ~2.3 GB | Yes | Slow |
| Binary + per-page CRC | 10 + amortized ~0.06 B | ~580 MB | No | Fast |

Decision: Binary with per-page CRC. 4x smaller than CSV. A simple Python script on the host converts binary to CSV for analysis.
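One way to realize a per-page CRC is sketched below. The source does not specify the CRC width or where it is stored, so this assumes a 512-byte page made of 51 records (510 bytes) followed by a 2-byte CRC-16; the function names are hypothetical. It uses `binascii.crc_hqx` from the Python standard library, which computes a CRC-CCITT (polynomial 0x1021).

```python
import binascii
import struct

PAGE_SIZE = 512
RECORD_SIZE = 10
CRC_SIZE = 2  # assumption: CRC-16 stored in the last two bytes of each page
RECORDS_PER_PAGE = (PAGE_SIZE - CRC_SIZE) // RECORD_SIZE  # 51


def build_page(records: bytes) -> bytes:
    """Append a CRC-16 to 510 bytes of packed records to form one page."""
    if len(records) != RECORDS_PER_PAGE * RECORD_SIZE:
        raise ValueError("a page holds exactly 51 records")
    crc = binascii.crc_hqx(records, 0xFFFF)
    return records + struct.pack("<H", crc)


def page_is_valid(page: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the stored value."""
    if len(page) != PAGE_SIZE:
        return False
    payload = page[:-CRC_SIZE]
    (stored,) = struct.unpack("<H", page[-CRC_SIZE:])
    return binascii.crc_hqx(payload, 0xFFFF) == stored
```

A host-side converter could call `page_is_valid` on every page and skip or flag the ones that fail, rather than silently emitting corrupt samples.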

File Header (64 bytes, once per file)

    Offset  Field              Size   Example
    0x00    Magic number       4 B    "VLOG"
    0x04    Version            2 B    0x0001
    0x06    Sample rate (Hz)   2 B    100
    0x08    Start timestamp    8 B    Unix epoch (ms)
    0x10    Calibration X      4 B    float, mg/LSB
    0x14    Calibration Y      4 B    float, mg/LSB
    0x18    Calibration Z      4 B    float, mg/LSB
    0x1C    Firmware version   4 B    "1.02"
    0x20    Reserved           32 B   0xFF fill
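A host-side parser for this header could look like the following sketch. The field order and sizes come from the table above; little-endian byte order for the header fields and an ASCII firmware string are assumptions, and `LogHeader` and `parse_header` are illustrative names.

```python
import struct
from typing import NamedTuple

# 4s magic, H version, H sample rate, Q start time (ms),
# fff calibration X/Y/Z, 4s firmware version, 32s reserved = 64 bytes
HEADER = struct.Struct("<4sHHQfff4s32s")  # little-endian is an assumption


class LogHeader(NamedTuple):
    version: int
    sample_rate_hz: int
    start_ms: int
    cal_mg_per_lsb: tuple[float, float, float]
    firmware: str


def parse_header(raw: bytes) -> LogHeader:
    """Decode the 64-byte file header and reject files with the wrong magic."""
    fields = HEADER.unpack(raw[:HEADER.size])
    magic, version, rate, start_ms, cx, cy, cz, firmware, _reserved = fields
    if magic != b"VLOG":
        raise ValueError(f"bad magic {magic!r}")
    return LogHeader(version, rate, start_ms, (cx, cy, cz), firmware.decode("ascii"))
```

Checking the magic number first lets the script fail early on a file that is not a log, or on one whose header was never written.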

Timestamp and Synchronization

  • RTC with a 2 ppm crystal gives drift of approximately 2 x 86,400 / 1,000,000 = 0.17 seconds/day, or about 1.2 seconds over 7 days. Acceptable for industrial logging.
  • Each file header stores the absolute start time (Unix epoch in milliseconds).
  • Individual records store a 32-bit relative offset in milliseconds from the file start time. This keeps per-record timestamps compact.
  • At 100 Hz, the expected offset increment is 10 ms per sample. Any deviation flags a timing anomaly.
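Both the drift estimate and the 10 ms increment check are easy to express in code. A minimal sketch with illustrative function names; the anomaly check simply flags any record whose offset did not advance by exactly one sample period.

```python
def rtc_drift_seconds(ppm: float, days: float) -> float:
    """Worst-case accumulated drift for a crystal with the given tolerance."""
    return ppm * 1e-6 * 86_400 * days


def find_timing_anomalies(offsets_ms: list[int], period_ms: int = 10) -> list[int]:
    """Return indices of records whose offset did not advance by one period."""
    return [
        i
        for i in range(1, len(offsets_ms))
        if offsets_ms[i] - offsets_ms[i - 1] != period_ms
    ]
```

For example, `find_timing_anomalies([0, 10, 20, 40, 50])` reports index 3, the record that arrived 20 ms after its predecessor.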

If a network interface is available (optional), synchronize the RTC via NTP once per day to bound cumulative drift.


Power Loss Protection

The system is always-on, but power can drop unexpectedly (breaker trip, cable disconnect).

Strategy: RAM buffer with periodic flush.

    Accelerometer ISR (100 Hz)
    ┌───────────┐    buffer full    ┌────────────┐
    │ RAM ring  │ ─────────────────▶│ SD card    │
    │ buffer    │  (every 512 B     │ page write │
    │ 512 bytes │   = 51 samples    │ + f_sync() │
    │ ~510 ms   │   = ~510 ms)      └────────────┘
    └───────────┘
    power loss here = lose at most
    510 ms of data (~51 samples)

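  • Buffer size: 512 bytes aligns with the SD card sector size.
  • 512 / 10 bytes per record = 51 whole records (510 bytes), so at most 510 ms of data is at risk. One option is to use the 2 leftover bytes for the page CRC.
  • Use at least two such buffers (ping-pong) so the ISR can keep filling one while the other is being written. SD cards can occasionally stall a write for tens to hundreds of milliseconds, and a single buffer would drop samples during the stall.
  • After each 512-byte write, call f_sync() (FatFs) to flush cached data and update the file's directory entry.
  • On startup, check whether the last file was closed cleanly (for example, by looking for a valid end marker). If it was not, truncate the file to the last complete 512-byte page and discard the partial write.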
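The startup truncation step can be modeled as below. This is a Python sketch of the logic only, not firmware: it assumes the 64-byte file header is followed by 512-byte pages, and `recover_log` is a hypothetical name. On the device, the same arithmetic would be applied through the filesystem library's seek and truncate calls.

```python
import os

HEADER_SIZE = 64  # assumption: the file header precedes the first page
PAGE_SIZE = 512


def recover_log(path: str) -> int:
    """Trim a trailing partial page; return the number of bytes discarded."""
    size = os.path.getsize(path)
    if size < HEADER_SIZE:
        return 0  # header itself is incomplete; leave the file for inspection
    partial = (size - HEADER_SIZE) % PAGE_SIZE
    if partial:
        os.truncate(path, size - partial)
    return partial
```

Returning the discarded byte count gives the caller something concrete to write to the event log, so a recovery never happens silently.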

USB Retrieval

Option A: USB Mass Storage Class (MSC)

The MCU presents the SD card as a USB removable drive. The user plugs in a USB cable and sees the files in Windows Explorer or Finder.

  • Pros: Zero host-side software. Drag-and-drop files. Works on any OS.
  • Cons: Must stop logging while USB is connected (SD card cannot be shared between MCU and USB host safely). Requires USB peripheral on the MCU.

Option B: USB CDC (Virtual COM Port) with Custom Protocol

The MCU enumerates as a serial port. A Python script sends commands like LIST, GET VIB_20250115.bin, DELETE VIB_20250115.bin.

  • Pros: Can keep logging while retrieving data (MCU mediates access). More control.
  • Cons: Requires custom host software. Slower transfer.

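Decision: USB MSC for simplicity. While USB is connected, pause SD card writes and buffer up to 30 seconds of samples in RAM (3,000 records x 10 bytes = 30,000 bytes, about 30 KB). If the USB session exceeds 30 seconds, accept the data gap and log the event. Copying a full 7-day archive will normally exceed that window: full-speed USB moves roughly 1 MB/s in practice, so ~576 MB takes on the order of ten minutes. State this gap explicitly as a tradeoff against the continuity requirement.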
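The bounded pause buffer can be sketched as a small class that counts what it has to drop once the 30-second window is exhausted. The class name and the drop-newest policy are assumptions made for illustration; the source only says to accept the gap and log it.

```python
from collections import deque


class PauseBuffer:
    """Holds records in RAM while the SD card is lent to the USB host."""

    def __init__(self, capacity_records: int = 3_000):  # 30 s at 100 Hz
        self._records: deque[bytes] = deque()
        self._capacity = capacity_records
        self.dropped = 0  # number of records lost; report this in the event log

    def push(self, record: bytes) -> None:
        if len(self._records) >= self._capacity:
            self.dropped += 1  # window exhausted: count the gap, keep earlier data
            return
        self._records.append(record)

    def drain(self) -> list[bytes]:
        """Return buffered records in arrival order and empty the buffer."""
        out = list(self._records)
        self._records.clear()
        return out
```

Dropping the newest records keeps the buffered data contiguous with what is already on the card, so the file ends up with a single gap rather than a gap on each side of the buffered block.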


Key Design Decisions

| Decision | Options Considered | Choice | Rationale |
|---|---|---|---|
| Storage medium | SD, eMMC, NOR, NAND | SD card (SPI) | Cheapest, PC-readable, swappable |
| File format | Binary, CSV | Binary (10 B/record) | 4x smaller than CSV, page-level CRC for integrity |
| Buffer size | 64 B, 512 B, 4 KB | 512 B | Aligns with SD sector, limits data loss to ~510 ms |
| File granularity | Single file, daily, hourly | Daily files | Simple rotation, manageable file count |
| USB retrieval | MSC, CDC, Ethernet | USB MSC | No host software, drag-and-drop |
| Timestamp | Absolute per record, relative offset | Relative offset with absolute header | Saves 4 bytes/record vs. full 8-byte epoch |

What Interviewers Evaluate

  • Storage sizing with concrete math — showing you can work through bytes/sec to total capacity, not hand-waving "a few hundred MB."
  • Flash write patterns — understanding sector alignment, write amplification, and why you buffer in RAM before flushing.
  • Data integrity on power loss — identifying the window of vulnerability and quantifying worst-case data loss.
  • Practical retrieval — choosing USB MSC over a custom protocol shows engineering pragmatism.
  • Tradeoff articulation — every decision has alternatives. Name them, explain why you rejected them.