Search topics...
State MachinesImplementation & Designintermediate

Walk me through implementing a UART byte-stream parser as an FSM.

0 upvotes
Practice with AISoon
Study the fundamentals first — State Machines topic page

A binary protocol parser is the textbook FSM use case. Suppose the protocol has the frame structure:

text
[SOF][LEN][PAYLOAD bytes...][CRC16]

Where SOF is a fixed start-of-frame byte (e.g., 0x7E), LEN is the payload length (1 byte, max 255), payload is LEN bytes, and CRC16 is two trailing bytes.

The FSM has four states:

StateMeaningExpects
WAIT_SOFIdle / between framesSOF byte (0x7E)
READ_LENJust got SOFOne length byte
READ_PAYLOADReading payload bytesLEN bytes
READ_CRCPayload completeTwo CRC bytes

The single event is BYTE_RX(value). The transition logic:

c
typedef enum { WAIT_SOF, READ_LEN, READ_PAYLOAD, READ_CRC } State;
static State state = WAIT_SOF;
static uint8_t payload[256];
static uint8_t payload_len;
static uint8_t bytes_read;
static uint16_t crc_received;
void on_byte_rx(uint8_t b) {
switch (state) {
case WAIT_SOF:
if (b == 0x7E) state = READ_LEN;
break;
case READ_LEN:
payload_len = b;
bytes_read = 0;
state = (b == 0) ? READ_CRC : READ_PAYLOAD;
break;
case READ_PAYLOAD:
payload[bytes_read++] = b;
if (bytes_read == payload_len) state = READ_CRC;
break;
case READ_CRC:
crc_received = (crc_received << 8) | b;
if (++bytes_read == payload_len + 2) {
if (crc_received == compute_crc16(payload, payload_len)) {
deliver_message(payload, payload_len);
} else {
/* CRC mismatch — drop frame, log error */
}
state = WAIT_SOF;
}
break;
}
}

Things to discuss in an interview:

  • Run-to-completion: each byte is a discrete event, processed to completion before the next byte arrives
  • Buffer ownership: the payload[] buffer is owned by the FSM; deliver_message must copy it if it needs to retain the data, since the next frame will overwrite
  • Error recovery: on CRC mismatch, return to WAIT_SOF — the next 0x7E byte (whether it's a real SOF or a payload byte coincidentally equal to 0x7E in a corrupted frame) restarts parsing
  • Length validation: a malicious or corrupted LEN byte could be 255; the buffer must accommodate or you must clamp/reject early
  • Testing: drive the FSM with synthetic byte sequences (good frames, frames with bit flips, frames with wrong length, garbage between frames) and assert correct delivery / drop behavior

For a real protocol you'd often refactor to a table-driven FSM as it grows (escape characters, multi-byte length fields, variable header formats). The structure stays the same; only the dispatch encoding changes.

Source: State Machines Q&A