
Design a Battery-Powered IoT Sensor Node

System design walkthrough: battery-powered environmental sensor with 5-year lifetime, LoRa/BLE communication, duty cycling, and OTA firmware updates.

The Prompt

"Design a battery-powered environmental sensor that measures temperature, humidity, and air quality every 15 minutes and reports to a cloud backend. Target: 5-year lifetime on 2x AA batteries."


Requirements Clarification

| Category | Requirement | Detail |
|---|---|---|
| Functional | Sensors | Temperature, humidity, air quality (VOC/PM2.5) |
| Functional | Reporting interval | Every 15 minutes |
| Functional | Cloud connectivity | Data delivered to cloud backend via gateway |
| Functional | Local storage | Buffer readings when gateway is unreachable |
| Functional | Configuration | Adjustable sample rate, TX power, sensor thresholds |
| Non-functional | Battery life | 5 years on 2x AA (6000 mAh at 1.5 V) |
| Non-functional | Cost target | BOM under $15 at 10k volume |
| Non-functional | Operating temp | -20 C to +60 C (outdoor) |
| Non-functional | Enclosure | IP65 rated, wall or pole mountable |
| Non-functional | OTA updates | Remote firmware update without physical access |
| Non-functional | Range | 1-2 km to nearest gateway (urban/suburban) |

Architecture Overview

                  ┌────────────┐
                  │   Cloud    │
                  │  Backend   │
                  └─────┬──────┘
                        │ IP / MQTT
                  ┌─────┴──────┐
                  │    LoRa    │
                  │  Gateway   │
                  └─────┬──────┘
                        │ LoRaWAN (868/915 MHz)
┌───────────────────────┴────────────────────────┐
│                  Sensor Node                   │
│                                                │
│  ┌──────────┐    ┌──────────┐    ┌──────────┐  │
│  │ Temp/Hum │    │ AQ Sensor│    │ Battery  │  │
│  │  (I2C)   │    │  (I2C)   │    │ Monitor  │  │
│  └────┬─────┘    └────┬─────┘    └────┬─────┘  │
│       │               │               │        │
│  ┌────┴───────────────┴───────────────┴─────┐  │
│  │    MCU (Cortex-M4, ultra-low-power)      │  │
│  │    - RTC wake timer                      │  │
│  │    - Dual-bank Flash (OTA)               │  │
│  └────────────────────┬─────────────────────┘  │
│                       │ SPI                    │
│                  ┌────┴─────┐                  │
│                  │   LoRa   │                  │
│                  │  Radio   │                  │
│                  │ (SX1276) │                  │
│                  └──────────┘                  │
│                                                │
│        ┌──────────┐        ┌──────────┐        │
│        │  Power   │        │ External │        │
│        │   Mgmt   │        │  Flash   │        │
│        │  (LDO/   │        │  (OTA +  │        │
│        │  Boost)  │        │ config)  │        │
│        └──────────┘        └──────────┘        │
└────────────────────────────────────────────────┘

Component Deep Dive

Power Budget Analysis

This is the single most important calculation in the design. Work through it with real numbers.

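Battery capacity:

  • 2x AA alkaline = 2 x 3000 mAh = 6000 mAh at 1.5 V, which is about 9 Wh of stored energy
  • After boost/LDO losses to the 3.3 V rail (75% conversion efficiency): 9 Wh x 0.75 / 3.3 V = approximately 2045 mAh effective at 3.3 V. Converter efficiency applies to energy, so stepping up from 1.5 V to 3.3 V also reduces the usable mAh.

Time budget:

  • 5 years = 5 x 365 x 24 = 43,800 hours
  • Average current budget: 2045 mAh / 43,800 h = 46.7 uA average

Duty cycle breakdown (per 15-minute cycle = 900 seconds), using worst-case SF12 airtime:

| State | Duration | % of Cycle | Current | Contribution to Average |
|---|---|---|---|---|
| Deep sleep (RTC only) | 895.2 s | 99.47% | 2 uA | 1.99 uA |
| Wake + sensor warmup + ADC read | 2.8 s | 0.31% | 5 mA | 15.6 uA |
| LoRa TX (SF12 worst case, 14 dBm) | 1.5 s | 0.17% | 40 mA | 66.7 uA |
| LoRa RX window | 0.5 s | 0.06% | 12 mA | 6.7 uA |
| Total | 900 s | 100% | | 91.0 uA |

Margin check: 91 uA average exceeds the 46.7 uA budget, so a node that always transmits at SF12 lasts roughly 2.6 years, not 5. At SF7 the same uplink takes about 50 ms, which cuts the TX contribution to about 2.2 uA and the total to about 26 uA. That leaves roughly 43% margin for battery self-discharge, aging, and cold-temperature capacity loss. The 5-year target therefore depends on the node running at a low spreading factor most of the time.

Key insight: At SF12, radio TX dominates the budget (about 73% of the average), while sleep current is only about 2%. Cutting airtime (lower spreading factor, smaller payload) saves far more than reducing sleep current from 2 uA to 1 uA. Once TX is short, the 2.8 s sensor wake window becomes the largest term and the next thing to optimize.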
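
The duty-cycle table can be reproduced in a few lines, which makes it easy to re-run the numbers when a duration or current changes. This is a minimal sketch: the state names and helper functions are illustrative, the durations and currents are copied from the table, and the capacity passed to `lifetime_years` is whatever usable capacity at the rail you assume.

```python
# Per-cycle states: (name, duration in s, current in mA), copied from the table.
STATES = [
    ("deep sleep", 895.2, 0.002),
    ("wake + sense", 2.8, 5.0),
    ("LoRa TX", 1.5, 40.0),
    ("LoRa RX", 0.5, 12.0),
]


def average_current_ua(states):
    """Time-weighted average current over one cycle, in microamps."""
    period_s = sum(duration for _, duration, _ in states)
    charge_mas = sum(duration * current for _, duration, current in states)
    return charge_mas / period_s * 1000.0


def lifetime_years(capacity_mah, average_ua):
    """Ideal lifetime for a usable capacity, ignoring self-discharge."""
    hours = capacity_mah / (average_ua / 1000.0)
    return hours / (365 * 24)


period_s = sum(duration for _, duration, _ in STATES)
for name, duration, current in STATES:
    share_ua = duration * current / period_s * 1000.0
    print(f"{name:14s} {share_ua:6.2f} uA")
print(f"{'average':14s} {average_current_ua(STATES):6.2f} uA")
```

The computed average is about 90.9 uA; the table's 91.0 uA comes from summing the rounded per-state values.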


Sensor Selection

| Parameter | Sensor | Interface | Active Current | Accuracy | Cost |
|---|---|---|---|---|---|
| Temp + Humidity | SHT40 (Sensirion) | I2C | 1.7 mA (10 ms measurement) | +/- 0.2 C, +/- 1.8% RH | $1.50 |
| Air Quality (VOC) | SGP40 (Sensirion) | I2C | 3.2 mA (30 ms measurement) | VOC index 0-500 | $3.00 |
| Air Quality (PM2.5) | PMS5003 (Plantower) | UART | 100 mA (fan, 2 s warmup) | +/- 10 ug/m3 | $5.00 |

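Design decision: The PM2.5 sensor draws 100 mA and needs a 2-second fan warmup. Run every 15 minutes, that is 100 mA x 2 s / 900 s, or about 222 uA of average current from the fan alone, more than double the 91 uA total of the entire duty-cycle table. Options: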

  1. Drop PM2.5 and use VOC only (recommended for 5-year target)
  2. Sample PM2.5 hourly instead of every 15 minutes, reducing its average contribution
  3. Use a larger battery (C-cell or lithium thionyl chloride)

For the 5-year AA target, option 1 (VOC only) is the pragmatic choice. Mention the tradeoff to the interviewer.
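
To put numbers on options 1 and 2, the fan's average contribution can be computed the same way as any other duty-cycled load. This sketch counts only the 2 s warmup listed in the sensor table and ignores any measurement time after warmup, which is an assumption; the helper name is illustrative.

```python
def average_contribution_ua(current_ma, on_time_s, period_s):
    """Average current added by a load that is on for on_time_s every period_s."""
    return current_ma * on_time_s / period_s * 1000.0


# PMS5003 fan: 100 mA for a 2 s warmup (values from the sensor table).
every_15_min_ua = average_contribution_ua(100, 2, 15 * 60)
hourly_ua = average_contribution_ua(100, 2, 60 * 60)

print(f"every 15 min: {every_15_min_ua:.1f} uA")  # about 222.2 uA
print(f"hourly:       {hourly_ua:.1f} uA")        # about 55.6 uA
```

Even hourly sampling adds more than half of the 91 uA duty-cycle total, which is why dropping PM2.5 is the simplest way to protect the lifetime target.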


Wireless Technology Selection

| Criterion | LoRa (LoRaWAN) | BLE 5.0 | NB-IoT | WiFi |
|---|---|---|---|---|
| Range | 2-5 km (urban) | 100 m | 10 km (cellular) | 50 m |
| TX current | 40 mA | 8 mA | 220 mA | 300 mA |
| TX duration (12 bytes) | 50 ms (SF7) to 1.5 s (SF12) | 3 ms | 500 ms | 20 ms |
| Energy per TX (current x duration x 3.3 V) | 6.6-198 mJ | 0.08 mJ | 363 mJ | 20 mJ |
| Data rate | 0.3-11 kbps | 2 Mbps | 60 kbps | 54 Mbps |
| Infrastructure | LoRa gateway ($100-300) | Phone/hub | Cellular tower (carrier) | WiFi AP |
| Monthly cost | Free (private gateway) | Free | $0.50-2/device/month | Free |
| Max payload | 51-222 bytes (SF-dependent) | 244 bytes | 1600 bytes | MTU-dependent |

Why LoRa wins for this design:

  • 1-2 km range needed (eliminates BLE)
  • No recurring cellular cost (eliminates NB-IoT at scale)
  • TX energy is manageable with duty cycling
  • Small payload (12 bytes) fits within LoRaWAN constraints
  • Private gateway keeps data on-premise if needed
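
The airtime range in the comparison table can be sanity-checked with the time-on-air formula published in the Semtech SX1276 datasheet. The sketch below assumes 125 kHz bandwidth, coding rate 4/5, an 8-symbol preamble, explicit header, CRC enabled, and about 13 bytes of LoRaWAN framing on top of the 12-byte application payload. Those settings and the function name are assumptions for illustration, not values stated above.

```python
import math


def lora_airtime_s(payload_bytes, sf, bw_hz=125_000, cr=1,
                   preamble_symbols=8, explicit_header=True, crc=True):
    """LoRa time on air in seconds (SX1276 datasheet formula).

    cr is the coding-rate index: 1 means 4/5, 4 means 4/8.
    """
    t_sym = (2 ** sf) / bw_hz
    # Low data rate optimization is required when a symbol exceeds 16 ms.
    de = 1 if t_sym > 0.016 else 0
    ih = 0 if explicit_header else 1
    numerator = 8 * payload_bytes - 4 * sf + 28 + 16 * int(crc) - 20 * ih
    payload_symbols = 8 + max(
        math.ceil(numerator / (4 * (sf - 2 * de))) * (cr + 4), 0
    )
    return (preamble_symbols + 4.25 + payload_symbols) * t_sym


FRAME_BYTES = 12 + 13  # application payload + assumed LoRaWAN overhead

for sf in (7, 9, 12):
    print(f"SF{sf}: {lora_airtime_s(FRAME_BYTES, sf) * 1000:.0f} ms")
```

Under these assumptions the result is roughly 0.06 s at SF7 and about 1.5 s at SF12, in line with the range in the table.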

MCU Selection Criteria

Hard requirements:

  • Deep sleep current under 2 uA with RTC running
  • Fast wake-up: under 5 us from stop mode
  • Integrated 12-bit ADC (for battery voltage monitoring)
  • Dual-bank Flash: at least 256 KB for A/B firmware updates
  • I2C and SPI peripherals for sensors and radio
  • Operating range: 1.8 V to 3.6 V (direct battery operation possible)

Candidate comparison:

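| MCU | Sleep (RTC) | Wake-up | Flash | RAM | Price |
|---|---|---|---|---|---|
| STM32L412 | 0.7 uA | 3.5 us | 128 KB | 40 KB | $2.50 |
| STM32L476 | 1.2 uA | 3.5 us | 1 MB (dual-bank) | 128 KB | $4.00 |
| nRF52840 | 1.5 uA | 1.5 us | 1 MB | 256 KB | $3.80 |
| EFR32FG14 | 1.4 uA | 5 us | 256 KB | 32 KB | $3.50 |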

The STM32L476 is a strong choice: native dual-bank Flash, 1.2 uA sleep current with the RTC running, and a mature HAL/toolchain. The nRF52840 is preferable if BLE is also needed (e.g., for local commissioning).


OTA Firmware Update Strategy

Remote devices deployed for 5 years must support OTA updates. A bricked device in the field means a truck roll.

Dual-bank (A/B) update flow:

1. Device boots from Bank A (active)
2. Cloud pushes new firmware via LoRa (fragmented)
3. Fragments written to Bank B (inactive)
4. After all fragments received, verify CRC-32 of entire Bank B image
5. Set boot flag to Bank B, reset
6. Bootloader validates Bank B (header magic + CRC)
7. If valid → boot Bank B (now active)
8. If invalid → revert flag to Bank A, boot Bank A (rollback)

LoRa OTA constraints:

  • Max payload per message: approximately 51 bytes at SF12 (longest range)
  • 100 KB firmware / 51 bytes per fragment = approximately 2000 fragments
  • At one fragment per 15-minute cycle (piggybacked on RX window): 2000 x 15 min = 20.8 days
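  • Dedicated OTA mode (1 fragment per second): approximately 33 minutes, but higher power. This rate is only feasible at a low spreading factor, because a 51-byte frame at SF12 needs well over 1 s of airtime.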
  • Tradeoff: Slow background OTA preserves battery; fast OTA drains more but finishes sooner
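
The fragment arithmetic above can be sketched as follows. It assumes 100 KB means 100 x 1024 bytes, that every fragment carries a full 51 bytes of firmware, and that no fragment is lost or retransmitted; the function name is illustrative.

```python
import math


def ota_estimate(firmware_bytes, fragment_bytes, seconds_per_fragment):
    """Return (fragment count, total transfer time in seconds)."""
    fragments = math.ceil(firmware_bytes / fragment_bytes)
    return fragments, fragments * seconds_per_fragment


FIRMWARE_BYTES = 100 * 1024
FRAGMENT_BYTES = 51  # SF12 payload limit from the list above

fragments, background_s = ota_estimate(FIRMWARE_BYTES, FRAGMENT_BYTES, 15 * 60)
_, dedicated_s = ota_estimate(FIRMWARE_BYTES, FRAGMENT_BYTES, 1)

print(f"fragments:      {fragments}")
print(f"background OTA: {background_s / 86400:.1f} days")
print(f"dedicated OTA:  {dedicated_s / 60:.1f} minutes")
```

Any per-fragment header (index, session ID) reduces the usable bytes per fragment and lengthens both figures, so treat these as lower bounds.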

Rollback safety:

  • Bootloader counts consecutive boot failures (watchdog reset counter in backup registers)
  • After 3 failed boots, automatically revert to previous bank
  • Bootloader itself is never updated OTA (golden bootloader in protected Flash region)
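
The bank-selection and rollback rules can be modeled on a host machine before they are written for the target. This is a simulation sketch in Python, not bootloader code: the 12-byte header layout (magic, length, CRC-32), the `MAGIC` value, and all function names are hypothetical.

```python
import struct
import zlib

MAGIC = 0x46574131        # hypothetical header magic
HEADER = ">III"           # magic, payload length, CRC-32 of payload
HEADER_SIZE = struct.calcsize(HEADER)
MAX_BOOT_FAILURES = 3


def make_image(payload: bytes) -> bytes:
    """Build a test image: header followed by the firmware payload."""
    return struct.pack(HEADER, MAGIC, len(payload), zlib.crc32(payload)) + payload


def image_valid(image: bytes) -> bool:
    """Check header magic, declared length, and CRC-32 of the payload."""
    if len(image) < HEADER_SIZE:
        return False
    magic, length, crc = struct.unpack(HEADER, image[:HEADER_SIZE])
    payload = image[HEADER_SIZE:HEADER_SIZE + length]
    return magic == MAGIC and len(payload) == length and zlib.crc32(payload) == crc


def select_bank(requested: str, banks: dict, boot_failures: int) -> str:
    """Boot the requested bank unless it is invalid or keeps failing."""
    fallback = "B" if requested == "A" else "A"
    if boot_failures >= MAX_BOOT_FAILURES:
        return fallback
    if not image_valid(banks[requested]):
        return fallback
    return requested


good = make_image(b"firmware v2")
corrupt = good[:-1] + bytes([good[-1] ^ 0xFF])  # flip the last byte
print(select_bank("B", {"A": good, "B": good}, 0))     # B
print(select_bank("B", {"A": good, "B": corrupt}, 0))  # A
print(select_bank("B", {"A": good, "B": good}, 3))     # A
```

A real bootloader would also validate the fallback bank before jumping to it and would keep the failure counter in backup registers so it survives a watchdog reset.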

Data Format

Why binary, not JSON:

| Format | Payload for one reading | Notes |
|---|---|---|
| JSON | approximately 60 bytes | {"temp":23.5,"hum":45.2,"voc":127,"bat":3.21,"ts":1700000000} |
| Binary (packed struct) | 12 bytes | See below |
| MessagePack | 35 bytes | Smaller than JSON but still has overhead |

Binary payload layout (12 bytes):

Byte 0-1:  Temperature (int16, 0.01 C resolution, big-endian)
Byte 2-3:  Humidity (uint16, 0.01 %RH resolution)
Byte 4-5:  VOC index (uint16, raw sensor value)
Byte 6-7:  Battery mV (uint16)
Byte 8-11: Timestamp (uint32, seconds since epoch, or uptime counter)

At SF12, the LoRaWAN maximum application payload is 51 bytes. The 12-byte binary format fits with room for future fields, while the JSON example would not fit at all.
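
The layout can be prototyped with Python's `struct` module, which is also a convenient way to write the decoder on the cloud side. This sketch assumes every multi-byte field is big-endian (the layout states this only for temperature), and the function and field names are illustrative.

```python
import struct

# Big-endian: int16 temp, uint16 humidity, uint16 VOC, uint16 battery, uint32 time
PAYLOAD_FORMAT = ">hHHHI"


def encode_reading(temp_c, humidity_pct, voc_index, battery_mv, timestamp_s):
    """Pack one reading into the 12-byte uplink payload."""
    return struct.pack(
        PAYLOAD_FORMAT,
        round(temp_c * 100),        # 0.01 C resolution
        round(humidity_pct * 100),  # 0.01 %RH resolution
        voc_index,
        battery_mv,
        timestamp_s,
    )


def decode_reading(payload):
    """Unpack a 12-byte payload back into engineering units."""
    temp, hum, voc, mv, ts = struct.unpack(PAYLOAD_FORMAT, payload)
    return {
        "temp_c": temp / 100,
        "humidity_pct": hum / 100,
        "voc_index": voc,
        "battery_mv": mv,
        "timestamp_s": ts,
    }


payload = encode_reading(23.5, 45.2, 127, 3210, 1_700_000_000)
print(len(payload), payload.hex())
print(decode_reading(payload))
```

Using a signed int16 for temperature keeps sub-zero outdoor readings representable, and fixed-point scaling avoids sending floats over the air.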


Key Design Decisions

| Decision | Options Considered | Choice | Rationale |
|---|---|---|---|
| Wireless | LoRa, BLE, NB-IoT, WiFi | LoRa | Range, power, no subscription cost |
| Sensor duty cycle | Always-on, 15-min wake | 15-min duty cycle | 99.5% sleep is essential for 5-year life |
| PM2.5 sensor | Include, exclude, hourly | Exclude (VOC only) | 100 mA fan kills battery budget |
| Data format | JSON, binary, MessagePack | Binary (12 bytes) | LoRaWAN payload constraint, minimal TX time |
| OTA strategy | Single-bank with backup, dual-bank A/B | Dual-bank A/B | Atomic swap, safe rollback, firmware image needs no external Flash |
| Power source | 2x AA, CR123A, LiSOCl2, solar | 2x AA | Low cost, widely available, meets 5-year budget |
| MCU | STM32L4, nRF52, EFR32 | STM32L476 | Dual-bank Flash, 1.2 uA sleep, mature ecosystem |
| Clock source | Internal RC, 32.768 kHz crystal | 32.768 kHz crystal | RTC accuracy matters over 5 years (+/- 2 min/year) |

What Interviewers Evaluate

Power budget with real numbers: The strongest signal in this interview is a candidate who can work through the duty-cycle math on a whiteboard. Interviewers want to see actual current numbers, not vague claims like "it's low power." Show the sleep/wake/TX breakdown, compute the average, and verify it fits the battery.

Wireless technology justification: Don't just pick LoRa and move on. Explain why BLE is too short-range, why NB-IoT has recurring costs, and why WiFi is power-hungry. A comparison table demonstrates structured thinking.

OTA strategy for remote devices: Any device deployed for 5 years without physical access needs a robust update mechanism. Interviewers look for dual-bank awareness, CRC verification, rollback on failure, and understanding of the time-to-update tradeoff over constrained links.

Data format awareness: Choosing binary over JSON in a constrained environment shows you understand the system end-to-end, from sensor to radio to cloud. Bonus points for knowing LoRaWAN payload limits at different spreading factors.

Tradeoff articulation: The PM2.5 decision is a great example. There is no "right" answer, but the ability to quantify the cost (100 mA fan versus 2 uA sleep) and present options shows engineering judgment.