Quick Cap
CAN (Controller Area Network) is a multi-master, message-broadcast bus protocol built for reliability in electrically noisy environments. It uses differential signaling on two wires (CAN_H, CAN_L), non-destructive bit-wise arbitration for priority, and five independent error-detection mechanisms — making it the backbone of automotive, industrial, and medical device communication.
Interviewers probe CAN to test whether you understand differential bus physics, priority-based arbitration without a central controller, and progressive fault confinement. A real-world example: an automotive powertrain network where the engine ECU (ID 0x100), transmission ECU (ID 0x200), and ABS module (ID 0x300) all share one CAN bus — you need to assign IDs by criticality, calculate bus load to avoid saturation, and handle error-passive or bus-off nodes without bringing down the entire network.
Key Facts:
- Multi-master broadcast: Any node can transmit; all nodes receive every frame
- Differential signaling: CAN_H/CAN_L twisted pair rejects common-mode noise; standard transceivers keep working with common-mode voltages from about -2 V to +7 V
- Non-destructive arbitration: Lower ID wins without corrupting the winning message
- Five error mechanisms: Bit, stuff, CRC, form, and ACK errors detected in hardware
- Fault confinement: Faulty nodes progressively isolate themselves (Error Active, Error Passive, Bus Off)
- Two flavors: Classic CAN (8 bytes, 1 Mbit/s) and CAN-FD (64 bytes, 8 Mbit/s data phase)
Deep Dive
At a Glance
| Feature | Detail |
|---|---|
| Wires | 2 — CAN_H, CAN_L (differential pair) |
| Clock | Asynchronous (each node has its own oscillator; bit stuffing provides sync edges) |
| Topology | Multi-master broadcast bus |
| Max Nodes | ~110 (electrical limit of standard transceivers) |
| Speeds | 1 Mbit/s classic; up to 8 Mbit/s CAN-FD data phase |
| Data per Frame | 0-8 bytes (classic), 0-64 bytes (CAN-FD) |
| Error Detection | CRC, bit monitoring, stuff check, form check, ACK check |
| Termination | 120 ohm resistor at each end of the bus |
Bus Interface
Physical Layer
CAN uses differential signaling on a twisted pair. The bus has two logical states:
- Recessive state: Both CAN_H and CAN_L sit near 2.5 V (differential voltage ~ 0 V). This represents a logic 1.
- Dominant state: CAN_H is driven to ~3.5 V and CAN_L is pulled to ~1.5 V (differential voltage ~ 2 V). This represents a logic 0.
The key insight is that dominant overwrites recessive. If any single node drives a dominant bit, the entire bus sees dominant — regardless of how many other nodes are driving recessive. This is what makes CAN arbitration work: a node transmitting a dominant bit (0) will always "win" over a node transmitting a recessive bit (1) on the same clock cycle, without electrical collision or data corruption.
The differential approach provides excellent noise immunity because external interference affects both CAN_H and CAN_L equally (common-mode noise), and the receiver only looks at the voltage difference between the two lines. Standard CAN transceivers (e.g., TJA1050, MCP2551) can tolerate common-mode voltages from -2 V to +7 V, providing robust operation even in electrically harsh automotive and industrial environments.
Dominant is logic 0 (not 1). This is counterintuitive. On the bus, a dominant bit is "stronger": it overrides any number of recessive bits. In arbitration, a lower numeric ID (dominant zeros in the more significant bit positions) has higher priority. Getting this backwards in an interview is a common mistake.
Frame Structure
- SOF (red) — Start of frame, always dominant (0).
- Identifier + RTR (blue) — 11-bit message ID (determines priority) + Remote Transmission Request bit.
- Ctrl (purple) — Control field including DLC (Data Length Code).
- Data (orange) — Payload: 0-8 bytes (classic CAN), 0-64 bytes (CAN-FD).
- CRC (green) — 15-bit cyclic redundancy check for error detection.
- ACK (yellow) — Transmitter sends recessive; receivers drive dominant to acknowledge.
- EOF (gray) — 7 recessive bits marking end of frame.
Frame Types
CAN defines four frame types, each serving a different purpose:
- Data frame: Carries data from a transmitter to all receivers. This is the most common frame type on any CAN bus. The transmitter broadcasts data and all nodes receive it; each node decides whether to process or ignore it based on its acceptance filter.
- Remote frame: A node requests data by sending a frame with the RTR bit set to recessive and no data payload. The node that produces the requested ID responds with a data frame. Remote frames are rarely used in modern CAN systems because periodic broadcasting is simpler and more predictable.
- Error frame: Generated by any node that detects an error. It consists of an error flag (6 dominant or recessive bits depending on the node's error state) followed by an error delimiter (8 recessive bits). This destroys the current frame on the bus, forcing the transmitter to retry automatically. An Error Active node sends 6 dominant bits (active error flag), which violates the bit stuffing rule and guarantees all nodes detect the error. An Error Passive node sends 6 recessive bits (passive error flag), which may go unnoticed by other nodes. This is by design: it prevents a marginal node from disrupting healthy communication.
- Overload frame: Signals that a node needs extra time before the next frame. It is structurally similar to an error frame but occurs only during the inter-frame space. Overload frames are uncommon in modern CAN implementations because controllers are fast enough to process frames without needing extra time. Seeing overload frames on a bus usually indicates an underpowered or misconfigured node.
Arbitration
CAN arbitration is non-destructive and bit-wise. When two or more nodes begin transmitting at the same time, they each send their message ID bit by bit, starting with the most significant bit. After each bit, the transmitter reads the bus back:
- If the bus matches what it sent, it continues.
- If it sent a recessive (1) but reads back a dominant (0), it knows another node with a lower ID (higher priority) is also transmitting — so it immediately stops and becomes a receiver.
Concrete example: Suppose Node A wants to send ID 0b10110 and Node B wants to send ID 0b10100:
| Bit Position | Node A sends | Node B sends | Bus state | Result |
|---|---|---|---|---|
| MSB (bit 4) | 1 | 1 | 1 (recessive) | Both continue |
| bit 3 | 0 | 0 | 0 (dominant) | Both continue |
| bit 2 | 1 | 1 | 1 (recessive) | Both continue |
| bit 1 | 1 | 0 | 0 (dominant) | Node A reads back 0 but sent 1 — Node A loses, backs off |
| bit 0 | — | 0 | 0 | Node B continues alone |
Node B wins because its ID is numerically lower (higher priority). Node A's message is not corrupted — it simply retries in the next arbitration round. This is why lower CAN IDs must be assigned to higher-priority messages (e.g., safety-critical braking messages should get the lowest IDs).
Standard vs extended ID arbitration: When a standard (11-bit) and an extended (29-bit) frame share the same first 11 ID bits, the standard frame wins. In the bit slot where a standard data frame sends a dominant RTR bit, the extended frame sends its SRR bit (Substitute Remote Request), which is always recessive. A standard remote frame (recessive RTR) still wins one bit later, because its IDE bit is dominant while the extended frame's IDE bit is recessive. Standard frames therefore inherently have higher priority than extended frames with the same base ID, an important consideration when designing mixed-format networks.
Three or more nodes: Arbitration works identically with any number of competing nodes. Each node independently monitors the bus and backs off as soon as it loses. Only the node with the lowest ID survives the entire arbitration field and proceeds to transmit its data uncontested.
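The arbitration rules above can be sketched in a few lines of Python. This is a minimal model, not controller code: the bus is treated as a wired-AND of all transmitted bits, and the function name `arbitrate` and its `id_bits` parameter are made up for illustration.

```python
def arbitrate(ids, id_bits=11):
    """Return the ID that survives arbitration among simultaneous senders.

    Assumes all IDs are unique. The bus is modelled as a wired-AND:
    a dominant 0 from any node overrides recessive 1s from the others.
    """
    contenders = set(ids)
    if not contenders:
        raise ValueError("at least one transmitting node is required")
    for bit in range(id_bits - 1, -1, -1):           # MSB is sent first
        sent = {i: (i >> bit) & 1 for i in contenders}
        bus = min(sent.values())                      # any 0 makes the bus dominant
        # Nodes that sent recessive (1) but read back dominant (0) back off.
        contenders = {i for i in contenders if sent[i] == bus}
    return contenders.pop()                           # one node is left


# The two-node example from the table, then the three powertrain IDs.
print(bin(arbitrate([0b10110, 0b10100], id_bits=5)))  # Node B's ID wins
print(hex(arbitrate([0x300, 0x100, 0x200])))          # lowest ID wins
```

Because the surviving node never sees a mismatch between what it sent and what it read back, its frame continues unmodified, which is exactly what "non-destructive" means here.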
Assign message IDs based on urgency, not by node. Safety-critical messages (emergency braking, airbag deployment) get the lowest IDs. Diagnostic and configuration messages get the highest. A common scheme uses the upper bits for message class and the lower bits for specific signals within that class.
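One way to picture such a scheme is a small helper that packs a message class into the upper ID bits and a signal number into the lower bits. The 3-bit/8-bit split, the class names, and the function `make_id` are all hypothetical choices for illustration; real projects define their own layout.

```python
# Hypothetical layout for an 11-bit ID: 3-bit class (upper) + 8-bit signal (lower).
CLASS_BITS = 3
SIGNAL_BITS = 8

# Lower class number = more urgent = wins arbitration (illustrative names).
CLASS_SAFETY = 0
CLASS_CONTROL = 1
CLASS_STATUS = 4
CLASS_DIAGNOSTIC = 7


def make_id(msg_class, signal):
    """Build an 11-bit CAN ID from a message class and a signal number."""
    if not 0 <= msg_class < (1 << CLASS_BITS):
        raise ValueError("message class out of range")
    if not 0 <= signal < (1 << SIGNAL_BITS):
        raise ValueError("signal number out of range")
    return (msg_class << SIGNAL_BITS) | signal


brake_request = make_id(CLASS_SAFETY, 0x10)      # lands in the 0x000-0x0FF block
diag_response = make_id(CLASS_DIAGNOSTIC, 0xFF)  # lands at the top of the ID space
```

With this layout, every safety message outranks every control message regardless of the signal number, because the class occupies the bits that are arbitrated first.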
Five Error Detection Mechanisms
CAN implements five independent error checks in hardware. If any one fails, the current frame is destroyed and retransmitted:
- Bit error: A transmitter monitors the bus while sending. If it drives a dominant bit but reads back recessive (or vice versa, outside the arbitration field), a bit error is flagged. This detects driver/transceiver faults and bus contention issues.
- Stuff error: CAN uses bit stuffing (see below). If a receiver detects six or more consecutive bits of the same polarity, it signals a stuff error. This typically indicates a synchronization loss or severe bus disturbance.
- CRC error: The receiver computes the CRC over the received data and compares it with the transmitted CRC field. A mismatch triggers a CRC error. This catches any bit flips that the other mechanisms missed.
- Form error: Certain fields (CRC delimiter, ACK delimiter, EOF) must have fixed values. If a receiver sees an unexpected bit value in these fields, it flags a form error. This detects frame structure violations.
- ACK error: The transmitter sends a recessive bit in the ACK slot. If no receiver drives it dominant (meaning no node acknowledged), the transmitter detects an ACK error, indicating either that no receivers are on the bus or that all receivers detected an error. This is the only mechanism that requires at least two nodes: a single node on a bus will always see ACK errors.
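Together, these five mechanisms give CAN very strong error detection. The 15-bit CRC is designed for a Hamming distance of 6, meaning it can detect up to 5 randomly distributed bit errors in a frame (interaction with bit stuffing can weaken this guarantee in rare corner cases). The Bosch CAN specification puts the residual error probability, the chance that a corrupted frame goes undetected, at less than the message error rate multiplied by 4.7 x 10^-11, making CAN one of the most reliable serial protocols available.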
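To make the CRC check concrete, here is a minimal bit-serial sketch of the classic CAN CRC-15, which uses the generator polynomial 0x4599 and a zero initial value. The helper names `crc15` and `to_bits` are illustrative, and the input is assumed to be the destuffed bit stream from SOF through the end of the data field.

```python
CAN_CRC15_POLY = 0x4599  # x^15 + x^14 + x^10 + x^8 + x^7 + x^4 + x^3 + 1


def crc15(bits):
    """Compute the classic CAN CRC-15 over an iterable of 0/1 values."""
    crc = 0
    for bit in bits:
        feedback = bit ^ ((crc >> 14) & 1)   # incoming bit XOR register MSB
        crc = (crc << 1) & 0x7FFF            # shift left, keep 15 bits
        if feedback:
            crc ^= CAN_CRC15_POLY
    return crc


def to_bits(value, width):
    """Expand an integer into a list of bits, MSB first."""
    return [(value >> i) & 1 for i in range(width - 1, -1, -1)]


# Illustrative payload bits: an 11-bit ID followed by one data byte.
frame = to_bits(0x123, 11) + to_bits(0xA5, 8)
checksum = crc15(frame)

# A receiver that runs the same register over data + CRC ends at zero.
assert crc15(frame + to_bits(checksum, 15)) == 0
```

A receiver can either recompute the CRC and compare it with the received field, or run the register over the data plus CRC and check for zero as shown; any single flipped bit leaves a non-zero remainder.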
Error signaling flow: A node that detects a bit, stuff, form, or ACK error starts an error flag at the very next bit; for a CRC error, the flag starts after the ACK delimiter. An Error Active node sends 6 dominant bits, which intentionally violate the bit stuffing rule so that every other node on the bus also detects the error condition. An Error Passive node sends 6 recessive bits instead. Either flag is followed by an 8-bit recessive delimiter. The transmitter of the original (corrupted) frame then automatically retransmits it after the bus becomes idle.
Fault Confinement States
CAN implements progressive fault confinement to prevent a single malfunctioning node from taking down the entire bus. Each node maintains two counters:
- TEC (Transmit Error Counter) — incremented by 8 on transmit errors, decremented by 1 on successful transmit
- REC (Receive Error Counter) — incremented by 1 on receive errors (or 8 for certain types), decremented by 1 on successful receive
The asymmetric increment (error adds 8, success subtracts 1) means a node needs eight successful transmissions to cancel out a single transmit error. Persistent faults therefore drive the counters up quickly, while occasional errors are forgiven over time.
Based on these counters, a node is in one of three states:
| State | Condition | Behavior |
|---|---|---|
| Error Active | TEC below 128 AND REC below 128 | Normal operation. Node sends active error flags (6 dominant bits) when it detects errors. Active error flags immediately destroy the current frame on the bus. |
| Error Passive | TEC at or above 128 OR REC at or above 128 | Node can still transmit and receive, but sends passive error flags (6 recessive bits) that don't disturb other nodes. Must wait an additional 8-bit "suspend transmission" time before retransmitting. |
| Bus Off | TEC at or above 256 | Node is completely disconnected from the bus. It cannot transmit or receive until a recovery sequence (128 occurrences of 11 consecutive recessive bits) is observed, typically requiring a software reset or automatic recovery. |
This mechanism is elegant: a node that keeps causing errors gradually loses its ability to disrupt the bus. A healthy node that encounters occasional errors stays Error Active. A persistently failing node gets isolated. The progression is one-way under persistent faults — a node cannot jump from Bus Off back to Error Active without going through the full recovery sequence.
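The counter rules and state thresholds can be sketched as a small state tracker. This is a simplified model under stated assumptions: it applies only the +8/-1 transmit rule and the +1/-1 receive rule from above and ignores the special cases in the full specification; the class and method names are made up for illustration.

```python
class ErrorCounters:
    """Simplified TEC/REC bookkeeping for one CAN node (illustrative only)."""

    def __init__(self):
        self.tec = 0  # Transmit Error Counter
        self.rec = 0  # Receive Error Counter

    def tx_error(self):
        self.tec += 8

    def tx_ok(self):
        self.tec = max(0, self.tec - 1)

    def rx_error(self):
        self.rec += 1

    def rx_ok(self):
        self.rec = max(0, self.rec - 1)

    @property
    def state(self):
        if self.tec >= 256:
            return "bus_off"
        if self.tec >= 128 or self.rec >= 128:
            return "error_passive"
        return "error_active"


node = ErrorCounters()
for _ in range(16):          # 16 consecutive transmit errors: TEC = 128
    node.tx_error()
print(node.state)            # error_passive
for _ in range(16):          # 16 more: TEC = 256
    node.tx_error()
print(node.state)            # bus_off
```

The example shows how quickly a node that fails every transmission is isolated: 16 failed attempts reach Error Passive and 32 reach Bus Off.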
Some MCUs support automatic Bus Off recovery (the CAN peripheral automatically attempts recovery after detecting 128 x 11 recessive bits). Others require manual intervention — the software must explicitly reinitialize the CAN controller. Know which mode your hardware uses, and in safety-critical systems, always log Bus Off events and verify the node's state after recovery before resuming normal operation.
Bit Stuffing
CAN is asynchronous — each node has its own oscillator, and there is no shared clock line. Receivers synchronize by detecting edges (transitions from dominant to recessive or vice versa). If a long run of identical bits occurs, the receiver might lose synchronization.
Bit stuffing solves this: after five consecutive bits of the same polarity, the transmitter inserts one complementary bit (a "stuff bit"). The receiver knows to remove these extra bits. This guarantees at least one edge every six bit times, keeping receivers locked.
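For example, if the data to transmit is 11111 00000 11, the stuffed output becomes 11111[0] 0000[1] 0 11, where the bits in brackets are stuff bits. Note that a stuff bit counts toward the next run: the stuffed 0 plus the four data zeros that follow already make five identical bits, so the second stuff bit arrives after only four data zeros. The receiver strips the stuff bits out to recover the original data.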
If a receiver ever sees six consecutive bits of the same polarity, it knows something is wrong and raises a stuff error.
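Impact on throughput: Bit stuffing adds overhead. Because stuff bits count toward the next run, the worst case (a pattern such as 11111 0000 1111 0000 ...) inserts one stuff bit for every four data bits after the first five, so the stuffed portion of a frame can grow by almost 25%. This must be factored into bus load calculations. For typical data, the overhead is much lower, approximately 4-8%.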
Where stuffing applies: Bit stuffing is only applied from SOF through the CRC field. The CRC delimiter, ACK field, and EOF are fixed-form fields and are not stuffed.
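The stuffing rule is easy to get subtly wrong, so a short sketch helps. The functions `stuff` and `destuff` below are illustrative names operating on lists of 0/1 values; the key detail is that an inserted stuff bit itself counts as the first bit of the next run.

```python
def stuff(bits):
    """Insert a complementary bit after every run of five identical bits."""
    out = []
    run_bit, run_len = None, 0
    for b in bits:
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            stuff_bit = 1 - b
            out.append(stuff_bit)
            run_bit, run_len = stuff_bit, 1   # the stuff bit starts a new run
    return out


def destuff(bits):
    """Remove stuff bits; raise ValueError on six identical bits in a row."""
    out = []
    run_bit, run_len = None, 0
    expect_stuff = False
    for b in bits:
        if expect_stuff:
            if b == run_bit:
                raise ValueError("stuff error: six identical bits in a row")
            run_bit, run_len = b, 1           # drop the stuff bit, keep its run
            expect_stuff = False
            continue
        out.append(b)
        if b == run_bit:
            run_len += 1
        else:
            run_bit, run_len = b, 1
        if run_len == 5:
            expect_stuff = True
    return out


data = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
wire = stuff(data)            # two stuff bits are inserted
assert destuff(wire) == data
```

Running `stuff` on five ones followed by five zeros shows the second stuff bit appearing after only four data zeros, which is why the worst-case overhead is higher than a simple one-in-five estimate.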
Bit Timing and Synchronization
Although CAN has no shared clock, all nodes must agree on where each bit boundary falls. CAN divides each bit time into segments measured in time quanta (TQ):
| Segment | Purpose | Typical Length |
|---|---|---|
| Sync Segment | Synchronization edge expected here | 1 TQ (fixed) |
| Propagation Segment | Compensates for physical signal propagation delay | 1-8 TQ |
| Phase Segment 1 | Can be lengthened for resynchronization | 1-8 TQ |
| Phase Segment 2 | Can be shortened for resynchronization | 1-8 TQ |
The sample point — the instant when the receiver reads the bus level — falls at the boundary between Phase Segment 1 and Phase Segment 2. A typical sample point is at 75-87.5% of the bit time.
The Synchronization Jump Width (SJW) defines how much Phase Segment 1 and Phase Segment 2 can be lengthened or shortened to resynchronize with incoming edges. A larger SJW provides more tolerance for oscillator drift but reduces the margin for setup/hold times.
Practical rule of thumb: For a given bit rate, calculate the time quantum duration as TQ = Prescaler / CAN_clock, then distribute typically 8-25 TQ per bit with the sample point near 87.5% for maximum tolerance to propagation delay.
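Oscillator tolerance: The theoretical maximum oscillator deviation CAN can tolerate is about 1.58%, and only with a large SJW and generous phase segments; with SJW = 1 the limit is considerably tighter. Using a lower-quality oscillator? Increase SJW, but note that the phase segments must grow with it, which leaves less of the bit time for the propagation segment and therefore reduces the maximum bus length. This is a common trade-off in cost-sensitive designs.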
Worked example — 500 kbit/s with a 48 MHz clock:
- Target: 500 kbit/s = 2 us per bit
- Choose 16 TQ per bit: TQ = 2 us / 16 = 125 ns
- Prescaler = CAN_clock * TQ = 48 MHz * 125 ns = 6
- Sync Segment = 1 TQ
- Propagation Segment = 3 TQ (375 ns of delay budget; check this against the round-trip cable delay, roughly 10 ns per metre of bus, plus transceiver delays)
- Phase Segment 1 = 10 TQ (sample point at TQ 14 = 87.5%)
- Phase Segment 2 = 2 TQ
- SJW = 1 TQ (sufficient for crystal oscillators with under 0.5% drift)
- Verify: 1 + 3 + 10 + 2 = 16 TQ per bit at 125 ns = 2 us = 500 kbit/s
In practice, most MCU vendors provide bit timing calculators (e.g., STM32CubeMX, NXP S32 Design Studio) that compute prescaler and segment values from your target bit rate and clock. However, understanding the manual calculation is essential for interviews and for debugging when auto-generated settings don't work — typically due to clock source inaccuracies or unexpected bus lengths.
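The manual calculation can be double-checked with a few lines of Python. This is a sketch of the arithmetic only; the function name `bit_timing` and its parameters are hypothetical and do not correspond to any vendor tool or register layout.

```python
def bit_timing(can_clock_hz, bitrate, prop_seg, phase_seg1, phase_seg2, sync_seg=1):
    """Derive prescaler, TQ duration, and sample point from segment lengths (in TQ)."""
    tq_per_bit = sync_seg + prop_seg + phase_seg1 + phase_seg2
    prescaler, remainder = divmod(can_clock_hz, bitrate * tq_per_bit)
    if prescaler < 1 or remainder:
        raise ValueError("clock cannot produce this bit rate exactly with this TQ count")
    return {
        "tq_per_bit": tq_per_bit,
        "prescaler": prescaler,
        "tq_ns": prescaler * 1e9 / can_clock_hz,           # TQ = prescaler / clock
        "sample_point": (sync_seg + prop_seg + phase_seg1) / tq_per_bit,
    }


# The worked example: 500 kbit/s from a 48 MHz CAN clock, 1 + 3 + 10 + 2 TQ.
timing = bit_timing(48_000_000, 500_000, prop_seg=3, phase_seg1=10, phase_seg2=2)
print(timing)
```

For the worked example this yields a prescaler of 6, a 125 ns time quantum, and a sample point of 0.875. The function rejects combinations where the clock does not divide evenly, which is a quick way to spot settings that would produce an off-nominal bit rate.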
Message Filtering and Acceptance Masks
A CAN bus is a broadcast network — every node sees every frame. To avoid processing irrelevant messages, CAN controllers implement hardware acceptance filters. A filter consists of two registers:
- Filter ID register — the expected message ID pattern
- Filter Mask register — which bits of the ID to compare (1 = must match, 0 = don't care)
For example, to accept only messages with IDs 0x100 through 0x10F:
- Filter ID: 0x100 (binary: 0001 0000 0000)
- Filter Mask: 0x7F0 (binary: 0111 1111 0000), so the upper 7 bits must match and the lower 4 bits are don't-care
This allows the hardware to reject unwanted messages before they trigger an interrupt, significantly reducing CPU load on busy networks. Most CAN controllers support multiple filter banks, allowing a node to accept several non-contiguous ID ranges.
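The mask comparison that the filter hardware performs is a one-liner, sketched here in Python. The function name `accepts` is illustrative; it follows the convention described above, where a mask bit of 1 means "must match" (check your controller's manual, since some parts use the opposite polarity).

```python
def accepts(frame_id, filter_id, filter_mask):
    """Return True if frame_id matches filter_id on every bit set in filter_mask."""
    return (frame_id & filter_mask) == (filter_id & filter_mask)


# The example above: accept 0x100 through 0x10F, reject everything else.
FILTER_ID = 0x100
FILTER_MASK = 0x7F0

accepted = [hex(i) for i in range(0x0F8, 0x118) if accepts(i, FILTER_ID, FILTER_MASK)]
print(accepted[0], accepted[-1], len(accepted))
```

Scanning the neighbouring IDs confirms that exactly the 16 IDs from 0x100 to 0x10F pass, which is a useful sanity check before committing filter values to hardware registers.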
Filter modes on STM32 bxCAN (representative example):
| Mode | Description | Use Case |
|---|---|---|
| Mask mode | Filter ID + Mask: accepts a range of IDs | Accept IDs 0x100-0x10F (mask out lower 4 bits) |
| List mode | Exact match against 2 or 4 specific IDs per bank | Accept only IDs 0x100, 0x200, 0x300 |
| 32-bit filter | Matches full 29-bit extended ID + IDE + RTR | Extended ID filtering |
| 16-bit filter | Matches 11-bit standard ID + RTR (2 filters per bank) | Standard ID filtering, saves filter banks |
On a busy automotive bus with hundreds of message IDs, configuring filters correctly can mean the difference between a node running at 5% CPU utilization versus 80%. Always set up filters before enabling the CAN peripheral to avoid processing a burst of unwanted messages during startup.
CAN 2.0 vs CAN-FD
| Feature | CAN 2.0 (Classic) | CAN-FD |
|---|---|---|
| Max data length | 8 bytes | 64 bytes |
| Arbitration bit rate | Up to 1 Mbit/s | Up to 1 Mbit/s (same as classic) |
| Data phase bit rate | Same as arbitration | Up to 8 Mbit/s (bit rate switching) |
| CRC | 15-bit | 17-bit (up to 16 bytes) or 21-bit (over 16 bytes) |
| Backward compatible? | — | Arbitration phase is compatible; data phase is not |
| Key new fields | — | FDF (FD format), BRS (bit rate switch), ESI (error state indicator) |
CAN-FD maintains backward compatibility during the arbitration phase — classic CAN nodes can coexist on the same bus as long as they tolerate (ignore) FD frames. However, a classic CAN node will flag an error if it tries to decode an FD data phase, so mixed-mode buses require careful planning or gateway nodes that bridge classic and FD segments.
Bit Rate Switching (BRS): CAN-FD can transmit the arbitration phase at a slower rate (e.g., 500 kbit/s for compatibility and propagation tolerance) and then switch to a higher rate (e.g., 4 Mbit/s) for the data phase. The switch happens after the BRS bit and reverts after the CRC delimiter. This gives you the best of both worlds: reliable arbitration over long buses and high throughput for data.
DLC mapping in CAN-FD: Classic CAN DLC values 0-8 map directly to byte counts. CAN-FD extends this: DLC values 9-15 map to 12, 16, 20, 24, 32, 48, and 64 bytes respectively. There are no intermediate sizes — you cannot send exactly 10 bytes in a single CAN-FD frame.
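Because only certain lengths exist, a sender has to round its payload up to the next valid size and pad the remainder. A minimal sketch of that lookup, with the illustrative names `DLC_TO_LENGTH` and `dlc_for_payload`:

```python
# Index = DLC value, entry = payload bytes carried by a CAN-FD frame.
DLC_TO_LENGTH = (0, 1, 2, 3, 4, 5, 6, 7, 8, 12, 16, 20, 24, 32, 48, 64)


def dlc_for_payload(n_bytes):
    """Return (dlc, padding_bytes) for the smallest frame that fits n_bytes."""
    for dlc, length in enumerate(DLC_TO_LENGTH):
        if length >= n_bytes:
            return dlc, length - n_bytes
    raise ValueError("payload exceeds 64 bytes; a transport protocol is needed")


print(dlc_for_payload(10))   # a 10-byte payload needs the 12-byte frame
print(dlc_for_payload(24))   # 24 bytes fits exactly
```

A 10-byte payload maps to DLC 9 with 2 padding bytes. The receiver cannot tell padding from data, so the application-level message definition has to carry the true length or use fixed layouts.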
Bus Load Calculation
Bus load determines how much of the available bandwidth is consumed. Exceeding ~70% bus load risks message latency spikes and missed deadlines in real-time systems.
Formula:
Bus Load (%) = (Sum of all message bit-times per second) / (Bit rate) * 100
Example: A 500 kbit/s bus carries three periodic messages:
- Message A: 108 bits (including stuff bits), sent every 10 ms = 100 Hz
- Message B: 108 bits, sent every 20 ms = 50 Hz
- Message C: 76 bits, sent every 5 ms = 200 Hz
Total bits/second = (108 * 100) + (108 * 50) + (76 * 200) = 10,800 + 5,400 + 15,200 = 31,400 bits/s
Bus load = 31,400 / 500,000 * 100 = 6.28% — well within safe limits.
When estimating frame sizes, remember to include: SOF, arbitration field, control field, data field, CRC field, ACK field, EOF, and inter-frame space (3 bits minimum). Add approximately 10-20% for bit stuffing overhead depending on data patterns.
If bus load approaches 70-80%, consider splitting the network into multiple buses with a gateway, reducing message frequency, combining multiple signals into fewer frames, or migrating to CAN-FD for higher throughput.
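The bus load formula is simple enough to script, which makes it easy to rerun as the message list changes. In the sketch below, `bus_load_percent` reproduces the example above, and `worst_case_frame_bits` applies a bound commonly used in CAN schedulability analysis for classic frames with 11-bit IDs (44 fixed bits, 3 bits of inter-frame space, and one stuff bit per four bits of the 34 + 8n stuffable bits). Both function names are illustrative.

```python
def bus_load_percent(messages, bitrate):
    """messages: iterable of (bits_per_frame, period_ms) tuples."""
    bits_per_second = sum(bits * 1000.0 / period_ms for bits, period_ms in messages)
    return bits_per_second / bitrate * 100.0


def worst_case_frame_bits(n_bytes):
    """Upper bound on bus time (in bits) for a classic 11-bit-ID data frame."""
    unstuffed = 47 + 8 * n_bytes       # 44 frame bits + data + 3-bit inter-frame space
    stuffable = 34 + 8 * n_bytes       # SOF through CRC sequence
    return unstuffed + (stuffable - 1) // 4


# The three periodic messages from the example, on a 500 kbit/s bus.
load = bus_load_percent([(108, 10), (108, 20), (76, 5)], 500_000)
print(f"{load:.2f}%")

# Conservative size of a full 8-byte frame, for worst-case budgeting.
print(worst_case_frame_bits(8))
```

Using the worst-case frame size instead of a nominal one gives a pessimistic load figure, which is usually the right choice when the goal is to prove that deadlines hold.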
Design Considerations
Termination: A CAN bus must have a 120 ohm resistor at each physical end of the bus. Without termination, signal reflections corrupt data. You can verify correct termination by measuring resistance between CAN_H and CAN_L with the bus powered off — it should read ~60 ohms (two 120 ohm resistors in parallel). A reading significantly higher than 60 ohms suggests a missing terminator; significantly lower suggests a short or extra terminators.
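The 60 ohm figure is just two terminators in parallel, and the same arithmetic explains the failure readings. The sketch below is illustrative only: the function names are made up, and the 15% tolerance band is an assumed value, not a figure from any standard.

```python
def parallel(*resistors_ohm):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors_ohm)


def diagnose_termination(measured_ohm, tolerance=0.15):
    """Classify a powered-off CAN_H to CAN_L reading (assumed 15% tolerance band)."""
    if measured_ohm < 60.0 * (1 - tolerance):
        return "too low: short or extra terminator"
    if measured_ohm <= 60.0 * (1 + tolerance):
        return "ok: two terminators present"
    if measured_ohm <= 120.0 * (1 + tolerance):
        return "too high: likely one terminator missing"
    return "open: no termination or broken wiring"


print(parallel(120, 120))                            # nominal bus, about 60 ohms
print(diagnose_termination(parallel(120, 120, 120))) # a third terminator reads about 40 ohms
print(diagnose_termination(120))
```

A third terminator accidentally left enabled on a node in the middle of the bus is a frequent cause of the "too low" reading.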
Bus length vs speed: Maximum bus length decreases as bit rate increases. At 1 Mbit/s, the practical limit is about 40 meters. At 125 kbit/s, you can reach 500 meters. This is because the arbitration mechanism requires all nodes to see the same bit value within one bit time — longer buses mean more propagation delay, and at higher speeds each bit time is shorter.
| Bit Rate | Approximate Max Bus Length |
|---|---|
| 1 Mbit/s | 40 m |
| 500 kbit/s | 100 m |
| 250 kbit/s | 250 m |
| 125 kbit/s | 500 m |
Stub length: Nodes connect to the main bus via short stubs. Keep stubs as short as possible (ideally under 0.3 m at 1 Mbit/s). Long stubs act as unterminated transmission lines and cause reflections that can corrupt data at high speeds.
Proper grounding: All CAN nodes should share a common ground reference. Ground offset between nodes appears as common-mode voltage and reduces noise margin. In automotive systems, a dedicated CAN_GND wire in the harness is recommended. Ground offsets of more than about 2 V can push the bus signals outside a standard receiver's common-mode range and cause communication errors.
Transceiver selection: Choose a transceiver that matches your voltage levels (3.3 V or 5 V MCU), supports the required bit rate, and provides adequate common-mode range. For automotive applications, use AEC-Q100 qualified parts with ESD protection and fault-tolerant features.
Common CAN transceivers and their characteristics:
| Transceiver | Voltage | Max Speed | Features | Typical Use |
|---|---|---|---|---|
| TJA1050 | 5 V | 1 Mbit/s | High-speed, basic | General automotive |
| TJA1051 | 3.3/5 V | 5 Mbit/s | CAN-FD capable, low power standby | Modern automotive |
| MCP2551 | 5 V | 1 Mbit/s | Fault-tolerant, slope control | Industrial |
| MCP2562FD | 3.3/5 V | 8 Mbit/s | CAN-FD, low EMI | CAN-FD applications |
| SN65HVD230 | 3.3 V | 1 Mbit/s | Low power, slope control | 3.3 V systems |
Slope control: Some transceivers offer a slope control pin (Rs) to limit the slew rate of the CAN_H/CAN_L signals. Slower slew rates reduce EMI emissions but limit the maximum bit rate. In EMC-sensitive environments, enabling slope control and reducing the bit rate is often preferable to fighting EMI issues with shielding.
Isolated transceivers: In applications where ground potential differences between nodes can be large (industrial automation, medical devices), use galvanically isolated CAN transceivers (e.g., ISO1050, ADM3053). These provide electrical isolation between the MCU side and the bus side, protecting against ground loops and high-voltage transients.
CAN vs Other Embedded Buses
| Criteria | CAN | UART | SPI | I2C | LIN |
|---|---|---|---|---|---|
| Topology | Multi-master bus | Point-to-point | Master-slave | Multi-master bus | Single master |
| Wires | 2 (differential) | 2 (TX/RX) | 4+ (SCK, MOSI, MISO, CS) | 2 (SDA, SCL) | 1 (+ ground) |
| Max speed | 1 Mbit/s (8 Mbit/s FD) | ~1 Mbit/s | 50+ MHz | 3.4 MHz | 20 kbit/s |
| Max distance | 40 m at 1 Mbit/s | ~15 m | under 1 m typical | under 1 m typical | 40 m |
| Error detection | 5 mechanisms (incl. CRC) | Parity only | None built-in | ACK/NACK | Checksum |
| Arbitration | Non-destructive, ID-based | N/A | N/A (CS select) | Bit-wise on SDA (multi-master) | Master polls |
| Best for | Automotive, industrial | Debug, simple links | High-speed peripherals | On-board sensors | Automotive sub-bus |
Choose CAN when you need: multiple nodes communicating over distance in noisy environments with deterministic priority and built-in fault tolerance. Choose simpler protocols when you have point-to-point links, short distances, or need higher raw throughput without arbitration.
Higher-Layer Protocols on CAN
CAN defines only the data link and physical layers. Real-world applications need higher-layer protocols for node management, data interpretation, and diagnostics:
| Protocol | Domain | Key Features |
|---|---|---|
| CANopen | Industrial automation | Object dictionary, PDO/SDO communication, NMT node management, standardized device profiles |
| SAE J1939 | Heavy-duty vehicles | 29-bit extended IDs, PGN-based addressing, transport protocol for multi-frame messages |
| DeviceNet | Factory automation | Predefined device profiles, producer/consumer model, explicit and I/O messaging |
| AUTOSAR COM | Passenger vehicles | Signal-level abstraction, PDU routing, deadline monitoring, integrated with AUTOSAR BSW stack |
| UDS / ISO 14229 | Diagnostics | Standardized diagnostic services (read DTCs, flash programming, security access) over CAN transport (ISO 15765) |
In interviews, knowing that CAN itself does not define how to interpret data bytes, assign node addresses, or handle multi-frame transfers shows depth. When asked "how do you send more than 8 bytes over CAN?", the answer involves a transport protocol (e.g., ISO-TP / ISO 15765-2) that segments long messages across multiple CAN frames with flow control.
Note that CAN-FD reduces the need for transport protocols in many cases because a single 64-byte frame can carry what previously required 8+ classic CAN frames. However, for truly large payloads (firmware updates, diagnostic data dumps), transport protocols remain necessary even with CAN-FD.
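To give a feel for what segmentation looks like, here is a simplified sender-side sketch in the style of ISO-TP over classic 8-byte frames. It covers only the Single Frame, First Frame, and Consecutive Frame layouts with normal addressing; flow control, padding, timing, and payloads above 4095 bytes are deliberately left out, and the function name `isotp_segment` is made up for illustration.

```python
def isotp_segment(payload):
    """Split a bytes payload into ISO-TP style frames for classic CAN (sketch)."""
    n = len(payload)
    if n <= 7:
        # Single Frame: PCI byte 0x0N, where N is the payload length.
        return [bytes([n]) + payload]
    if n > 4095:
        raise ValueError("this sketch only handles the 12-bit length field")
    # First Frame: PCI 0x1 plus 12-bit total length, then the first 6 data bytes.
    frames = [bytes([0x10 | (n >> 8), n & 0xFF]) + payload[:6]]
    sequence = 1
    for offset in range(6, n, 7):
        # Consecutive Frame: PCI 0x2 plus a 4-bit sequence number, then up to 7 bytes.
        frames.append(bytes([0x20 | sequence]) + payload[offset:offset + 7])
        sequence = (sequence + 1) & 0x0F     # wraps from 15 back to 0
    return frames


for frame in isotp_segment(bytes(range(20))):
    print(frame.hex(" "))
```

In a real exchange the receiver answers the First Frame with a Flow Control frame that tells the sender how many Consecutive Frames it may send and how fast, which is the part this sketch omits.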
Debugging Story: The Phantom Bus-Off
A team was bringing up a four-node CAN network on an automotive prototype. Three nodes communicated normally, but the fourth node kept entering Bus Off within minutes of power-up. The TEC counter would climb rapidly and hit 256. Swapping the node's MCU board made no difference. Replacing the CAN transceiver made no difference. The waveform on the oscilloscope looked clean at the problem node's connector.
The root cause turned out to be missing termination. The fourth node was at the far end of a 3-meter stub, and the bus only had one 120 ohm terminator (at the opposite end). Signal reflections on the unterminated stub caused the transceiver at node four to misread bits, generating bit errors. Because only node four saw the reflections, only its error counters climbed — the other three nodes remained Error Active and appeared healthy.
Adding a 120 ohm resistor at the stub end immediately fixed the problem.
Lessons for CAN debugging:
- When one node repeatedly goes Bus Off while others stay healthy, suspect the physical layer at that node's location — termination, connector quality, stub length, or ground offset
- Always check TEC/REC values (available in most CAN controller status registers) to distinguish transmit-side from receive-side errors
- Use an oscilloscope at the failing node's transceiver pins (not just the bus connector) to inspect the actual waveform
- Measure the bus resistance with everything powered off: ~60 ohms = correct dual termination, ~120 ohms = one terminator missing, over 120 ohms = both missing or open connection
- Check for ground offset between nodes — measure voltage between the ground pins of different nodes while the system is running
- A CAN bus analyzer (e.g., PCAN, Vector CANalyzer, Kvaser) is invaluable for capturing error frames, identifying the error type, and correlating errors with specific message IDs
What interviewers want to hear: You understand differential signaling and why dominant overwrites recessive, you can walk through arbitration with a concrete bit-by-bit example, you know the five error detection mechanisms and the three fault confinement states, and you can reason about bus load, termination, and physical layer issues.
Interview Focus
Classic CAN Interview Questions
Q1: "How does CAN arbitration work, and what makes it non-destructive?"
Model Answer Starter: "CAN arbitration is non-destructive because it uses bit-wise priority comparison during message transmission. When multiple nodes start transmitting simultaneously, they compare their message IDs bit by bit. Lower ID values have higher priority, so if one node transmits a dominant bit (0) while another transmits recessive (1), the dominant bit wins and the recessive transmitter immediately stops transmitting. This allows the highest priority message to complete transmission without any data loss or corruption, making the arbitration process deterministic and efficient."
Q2: "What's the difference between CAN classic and CAN FD, and when would you use each?"
Model Answer Starter: "CAN classic supports up to 8 bytes of data per frame at speeds up to 1 Mbit/s, while CAN FD supports up to 64 bytes of data with variable bit rates up to 8 Mbit/s in the data phase. CAN FD maintains backward compatibility with classic CAN in the arbitration phase. I use CAN classic for simple control messages with small payloads and moderate speed requirements, and CAN FD for applications requiring larger data transfers, higher throughput, or when migrating existing systems that need more bandwidth while maintaining the robust CAN protocol benefits."
Q3: "How do you configure CAN bit timing and what factors influence the calculation?"
Model Answer Starter: "CAN bit timing configuration involves setting the baud rate prescaler, propagation segment, and phase segments to achieve the desired bit rate while accounting for signal propagation delays. Key factors include the oscillator frequency, bus length, node count, and transceiver characteristics. I calculate the total propagation delay, set the propagation segment to accommodate round-trip delays, and configure phase segments for proper synchronization. The goal is to ensure all nodes can sample bits at the correct time while providing tolerance for clock variations and signal propagation."
Q4: "How does CAN error handling work, and what are the different error states?"
Model Answer Starter: "CAN implements a sophisticated error handling mechanism with error detection, signaling, and recovery. Error types include bit errors, stuff errors, form errors, ACK errors, and CRC errors. When an error is detected, the node sends an error frame and increments its error counters. Nodes can be in Error Active (normal operation), Error Passive (limited error signaling), or Bus-off (transmission disabled) states. Temporary errors are handled by automatic retransmission, while bus-off recovery requires the node to observe 128 occurrences of 11 recessive bits and, depending on the controller, either happens automatically or must be triggered by software, ensuring system reliability and fault tolerance."
Q5: "What are the advantages of CAN over other communication protocols in automotive applications?"
Model Answer Starter: "CAN's advantages include deterministic bus access through priority-based arbitration, fault tolerance with automatic error detection and recovery, real-time performance with bounded worst-case latency for high-priority messages, and cost-effectiveness through simple wiring and transceiver requirements. CAN's differential signaling provides excellent noise immunity, making it ideal for the electrically noisy automotive environment. The multi-master architecture means no single master node is a point of failure, and the built-in error handling ensures system reliability for safety-critical applications."
Trap Alerts
- Don't say: "CAN is just like UART" or "All CAN messages have equal priority" — CAN has unique arbitration and error handling that set it apart from point-to-point serial protocols
- Don't forget: Bit timing configuration and synchronization requirements for proper CAN operation across all nodes
- Don't ignore: Error handling mechanisms and bus-off recovery procedures — interviewers often ask follow-up questions about fault confinement
Follow-up Questions
- "How would you implement CAN message filtering and acceptance masks?"
- "What's your strategy for handling CAN bus-off conditions in safety-critical systems?"
- "How do you ensure message integrity and prevent data corruption in CAN networks?"
- "How would you design a CAN ID allocation scheme for a 20-node automotive network?"
- "Explain the relationship between bus length and maximum bit rate in CAN."
- "What is the role of bit stuffing and how does it affect bus load calculations?"
Practice
❓ How does CAN arbitration work when multiple nodes try to transmit simultaneously?
❓ What is the main advantage of CAN FD over classic CAN?
❓ What happens when a CAN node enters the bus-off state?
❓ What is the purpose of bit stuffing in CAN?
❓ What voltage levels indicate a dominant bit on a CAN bus?
❓ Why does a lower CAN ID indicate higher message priority?
Real-World Tie-In
Automotive Powertrain Network
In the field, I designed a CAN network connecting an engine ECU, transmission controller, and ABS module. Safety-critical messages (engine shutdown, brake intervention) were assigned the lowest IDs for guaranteed priority. Bus load analysis showed 35% utilization at 500 kbit/s, well within the safe 70% threshold. Proper 120 ohm termination at both harness ends and a dedicated CAN_GND wire eliminated the intermittent CRC errors we initially saw during vehicle-level EMC testing.
Industrial Process Control
On the job, we deployed a CAN-FD network across a chemical plant to coordinate 40+ sensor and actuator nodes. Classic CAN was insufficient — the 8-byte payload could not carry our 24-byte sensor packets without fragmentation. Migrating to CAN-FD with bit rate switching (500 kbit/s arbitration, 4 Mbit/s data phase) gave us the throughput we needed while keeping bus load under 50%. We implemented acceptance filtering aggressively so each node only processed relevant messages, reducing interrupt load by 80%.
Medical Device Monitoring
A patient monitoring system used CAN to connect pulse oximeter, ECG, and blood pressure modules to a central display unit. The critical design choice was ID assignment: alarm messages (patient distress) got the lowest IDs to guarantee sub-millisecond delivery even under full bus load. We ran bus load calculations to prove that even under worst-case conditions (all sensors reporting simultaneously), the highest-priority alarm message would never be delayed by more than one frame time. We implemented dual-redundant CAN buses — if one bus went Bus Off due to a cable fault, the system seamlessly switched to the backup bus and raised a maintenance alert.