The Prompt
"Design a data logging system for an industrial vibration monitor. It must record 3-axis accelerometer data at 100 Hz continuously, store at least 7 days of data, and allow retrieval via USB. Power: always-on 5 V supply."
Requirements Clarification
| Category | Requirement | Detail |
|---|---|---|
| Functional | Sample rate | 100 Hz per axis |
| Axes | 3 (X, Y, Z accelerometer) | |
| Storage duration | Minimum 7 days continuous | |
| Retrieval | USB connection to host PC | |
| Non-functional | Data integrity | No silent corruption on power loss |
| Continuity | No data gaps during normal operation | |
| Timestamp accuracy | Better than 1 ms relative, RTC within 2 ppm | |
| Storage cost | Under $10 BOM for storage medium |
Architecture Overview
┌──────────────┐ SPI ┌───────────┐ ┌──────────────┐│ Accelerometer│───────────▶│ MCU │─────────▶│ SD Card ││ (100 Hz, │ │ (Cortex-M4)│ │ (FAT32) ││ 3-axis) │ │ │ │ 512 MB+ │└──────────────┘ │ │ └──────────────┘│ RAM buf │┌────────▶│ 512 B ││ │ │┌───┴───┐ │ │ ┌──────────────┐│ RTC │ │ │───▶│ USB (MSC) ││ (2 ppm)│ │ │ │ Retrieval │└────────┘ │ │ └──────────────┘│ ││ UART───│───▶ Debug console└───────────┘
Component Deep Dive
Storage Sizing
Start with the raw numbers:
- Each sample: 3 axes x 2 bytes = 6 bytes/sample
- At 100 Hz: 6 x 100 = 600 bytes/sec
- Per minute: 600 x 60 = 36,000 bytes = ~35 KB/min
- Per hour: 36,000 x 60 = 2,160,000 bytes = ~2.06 MB/hr
- Per day: 2,160,000 x 24 = 51,840,000 bytes = ~49.4 MB/day
- 7 days: 51,840,000 x 7 = 362,880,000 bytes = ~346 MB
Add 10-byte records (with timestamp overhead): 10 x 100 x 86,400 x 7 = ~576 MB for 7 days.
A 1 GB SD card provides comfortable headroom.
Storage Medium Comparison
| Medium | Capacity | Cost | Interface | Wear Concern | PC Readable | Verdict |
|---|---|---|---|---|---|---|
| SD card (SPI mode) | 1-32 GB | ~$2 | SPI | Moderate (wear leveling built-in) | Yes (FAT32) | Best choice |
| eMMC | 4-64 GB | ~$5 | SDIO/SPI | Low (internal wear leveling) | Needs driver | Over-engineered |
| External NOR Flash | 1-256 MB | ~$3 | SPI | Low (100K cycles) | No | Too small |
| External NAND Flash | 1-8 GB | ~$4 | SPI/parallel | High (needs FTL) | No | Complex |
Decision: SD card in SPI mode. Cheap, swappable, FAT32-formatted so any PC can read the files directly. No custom host software needed.
Circular Buffer Strategy
Divide storage into daily files. When storage fills, delete the oldest day and continue.
Day 1 Day 2 Day 3 Day 4 Day 5 Day 6 Day 7┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐│ ██ │ │ ██ │ │ ██ │ │ ██ │ │ ██ │ │ ██ │ │ ▓▓ │ ← writing└────┘ └────┘ └────┘ └────┘ └────┘ └────┘ └────┘When Day 8 starts:1. Delete Day 1 file2. Create Day 8 file3. Continue writingDay 2 Day 3 Day 4 Day 5 Day 6 Day 7 Day 8┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐ ┌────┐│ ██ │ │ ██ │ │ ██ │ │ ██ │ │ ██ │ │ ██ │ │ ▓▓ │ ← writing└────┘ └────┘ └────┘ └────┘ └────┘ └────┘ └────┘
File naming: VIB_20250115.bin (date-stamped). This makes it obvious which files are present and simplifies deletion of the oldest.
For finer granularity (e.g., hourly files), use VIB_20250115_14.bin. This limits the blast radius if a file becomes corrupt — you lose one hour instead of one day.
Data Format
Binary Record Layout (10 bytes per record)
Byte: 0 1 2 3 4 5 6 7 8 9┌────┬────┬────┬────┬────┬────┬────┬────┬────┬────┐│ Timestamp (ms) │ Ax │ Ay │ Az │ Chk││ (32-bit LE) │(16b) │(16b) │(16b) │(8b)│└────┴────┴────┴────┴────┴────┴────┴────┴────┴────┘
Wait — that is 11 bytes. Let us trim: drop the per-record checksum and rely on page-level CRC instead. Final: 4-byte timestamp + 6-byte sample = 10 bytes/record.
Format Comparison
| Format | Size/Record | 7-Day Total | Human Readable | Parse Speed |
|---|---|---|---|---|
| Binary (10 B) | 10 bytes | ~576 MB | No | Fast |
| CSV text | ~40 bytes | ~2.3 GB | Yes | Slow |
| Binary + per-page CRC | 10 + amortized ~0.06 B | ~580 MB | No | Fast |
Decision: Binary with per-page CRC. 4x smaller than CSV. A simple Python script on the host converts binary to CSV for analysis.
File Header (64 bytes, once per file)
Offset Field Size Example0x00 Magic number 4 B "VLOG"0x04 Version 2 B 0x00010x06 Sample rate (Hz) 2 B 1000x08 Start timestamp 8 B Unix epoch (ms)0x10 Calibration X 4 B float, mg/LSB0x14 Calibration Y 4 B float, mg/LSB0x18 Calibration Z 4 B float, mg/LSB0x1C Firmware version 4 B "1.02"0x20 Reserved 32 B 0xFF fill
Timestamp and Synchronization
- RTC with a 2 ppm crystal gives drift of approximately 2 x 86,400 / 1,000,000 = 0.17 seconds/day, or about 1.2 seconds over 7 days. Acceptable for industrial logging.
- Each file header stores the absolute start time (Unix epoch in milliseconds).
- Individual records store a 32-bit relative offset in milliseconds from the file start time. This keeps per-record timestamps compact.
- At 100 Hz, the expected offset increment is 10 ms per sample. Any deviation flags a timing anomaly.
If a network interface is available (optional), synchronize the RTC via NTP once per day to bound cumulative drift.
Power Loss Protection
The system is always-on, but power can drop unexpectedly (breaker trip, cable disconnect).
Strategy: RAM buffer with periodic flush.
Accelerometer ISR (100 Hz)│▼┌──────────┐ buffer full ┌────────────┐│ RAM ring │ ─────────────────▶│ SD card ││ buffer │ (every 512 B │ page write ││ 512 bytes │ = 51 samples │ + fsync) ││ ~510 ms │ = ~510 ms) └────────────┘└──────────┘│power loss here = lose at most510 ms of data (~51 samples)
- Buffer size: 512 bytes aligns with SD card sector size.
- 512 / 10 bytes per record = 51 samples = 510 ms of data at risk.
- After each 512-byte write, call
f_sync()(FatFs) to flush the FAT and directory entry. - On startup, check the last file for a valid end marker. If missing, truncate to the last complete 512-byte page (discard partial writes).
USB Retrieval
Option A: USB Mass Storage Class (MSC)
The MCU presents the SD card as a USB removable drive. The user plugs in a USB cable and sees the files in Windows Explorer or Finder.
- Pros: Zero host-side software. Drag-and-drop files. Works on any OS.
- Cons: Must stop logging while USB is connected (SD card cannot be shared between MCU and USB host safely). Requires USB peripheral on the MCU.
Option B: USB CDC (Virtual COM Port) with Custom Protocol
The MCU enumerates as a serial port. A Python script sends commands like LIST, GET VIB_20250115.bin, DELETE VIB_20250115.bin.
- Pros: Can keep logging while retrieving data (MCU mediates access). More control.
- Cons: Requires custom host software. Slower transfer.
Decision: USB MSC for simplicity. During USB connection, pause logging and buffer up to 30 seconds in RAM (3,000 samples = 30 KB). If USB session exceeds 30 seconds, accept the data gap and log the event.
Key Design Decisions
| Decision | Options Considered | Choice | Rationale |
|---|---|---|---|
| Storage medium | SD, eMMC, NOR, NAND | SD card (SPI) | Cheapest, PC-readable, swappable |
| File format | Binary, CSV | Binary (10 B/record) | 4x smaller than CSV, page-level CRC for integrity |
| Buffer size | 64 B, 512 B, 4 KB | 512 B | Aligns with SD sector, limits data loss to ~510 ms |
| File granularity | Single file, daily, hourly | Daily files | Simple rotation, manageable file count |
| USB retrieval | MSC, CDC, Ethernet | USB MSC | No host software, drag-and-drop |
| Timestamp | Absolute per record, relative offset | Relative offset with absolute header | Saves 4 bytes/record vs. full 8-byte epoch |
What Interviewers Evaluate
- Storage sizing with concrete math — showing you can work through bytes/sec to total capacity, not hand-waving "a few hundred MB."
- Flash write patterns — understanding sector alignment, write amplification, and why you buffer in RAM before flushing.
- Data integrity on power loss — identifying the window of vulnerability and quantifying worst-case data loss.
- Practical retrieval — choosing USB MSC over a custom protocol shows engineering pragmatism.
- Tradeoff articulation — every decision has alternatives. Name them, explain why you rejected them.