Peripherals
intermediate
Weight: 3/10

USB

USB (Universal Serial Bus) fundamentals including device architecture, transfer types, enumeration, descriptors, and common device classes for embedded systems.

Quick Cap

USB (Universal Serial Bus) is a host-controlled serial bus protocol that provides standardized connectivity, automatic device detection (plug-and-play), and power delivery. In embedded systems, USB is used for programming/debugging (USB-CDC serial), human interfaces (HID), mass storage, and custom device communication.

Interviewers test understanding of USB architecture (the host controls everything), the four transfer types and when to use each, the enumeration sequence, and the descriptor hierarchy that tells the host what a device is and how to talk to it.

Key Facts:

  • Host-controlled: The host initiates ALL communication -- devices can never send data unsolicited
  • Four transfer types: Control, bulk, interrupt, and isochronous -- each optimized for different use cases
  • Plug-and-play: Automatic device detection via enumeration (reset, address assignment, descriptor reads)
  • Descriptor hierarchy: Device, configuration, interface, and endpoint descriptors form a tree that defines the device
  • Standard device classes: HID (keyboard/mouse), CDC (virtual serial port), MSC (flash drive) eliminate the need for custom drivers
  • Power delivery: 5V bus power, 100 mA default, up to 500 mA after enumeration (USB 2.0)

Deep Dive

At a Glance

Feature | Detail
Topology | Tiered star (host -> hubs -> devices)
Wires | 4 (VBUS, D+, D-, GND) -- USB 2.0
Speeds | Low (1.5 Mbps), Full (12 Mbps), High (480 Mbps), SuperSpeed (5 Gbps)
Power | 5V, up to 500 mA (USB 2.0), up to 100W (USB PD)
Communication | Host-initiated, packet-based
Error Detection | CRC5 on token packets, CRC16 on data packets; handshake packets carry no CRC and rely on the PID check bits

USB Architecture

USB uses a host-centric model: the host initiates ALL communication. Devices can never send data unsolicited -- they respond to host polls. This is fundamentally different from a multi-master bus like CAN, where any node can start a transmission. (I2C is closer to USB in this respect: only a master can start a transfer.) If a device has data to send, it must wait for the host to ask for it.

Host: A PC, SBC, or MCU in host mode. The host controls bus timing, assigns addresses to devices, manages bandwidth allocation, and schedules all transfers. There is exactly one host per USB bus.

Device: A peripheral that responds to host requests. Each device contains one or more endpoints (data pipes). A device cannot communicate with another device directly -- all traffic flows through the host. Even device-to-device communication (e.g., copying files from one USB drive to another) is mediated entirely by the host.

Hub: Expands one host port into multiple downstream ports. Hubs are transparent to devices -- a device does not know whether it is plugged directly into the host or through five levels of hubs. USB supports up to 127 devices on a single bus via hubs (7-bit address space), with a maximum of five hub tiers between the host and any device.

Endpoints and Pipes

An endpoint is a data source or sink inside a USB device. Each endpoint is identified by two attributes: its number (0 through 15) and its direction (IN = device-to-host, OUT = host-to-device). This means a device can have up to 32 endpoints: 16 IN and 16 OUT. In practice, most simple devices use far fewer.

Endpoint 0 is mandatory and special. Every USB device must implement EP0 in both directions (IN and OUT). EP0 is used exclusively for control transfers -- the host uses it during enumeration to read descriptors, assign addresses, and configure the device. EP0 is the only endpoint that exists before the device is configured.

All other endpoints (EP1 through EP15, in either direction) are optional. Their existence, direction, transfer type, and maximum packet size are declared in the device's endpoint descriptors, which the host reads during enumeration.
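
The number-plus-direction pair described above is packed into a single byte, bEndpointAddress, in each endpoint descriptor: bit 7 carries the direction (1 = IN) and bits 3..0 carry the endpoint number. A minimal sketch of that encoding follows; the helper names are illustrative and do not come from any particular USB stack.

```python
# Sketch: encode and decode the bEndpointAddress byte of an endpoint descriptor.
# Bit 7 = direction (1 = IN, device-to-host); bits 3..0 = endpoint number.
# Helper names are hypothetical, chosen for illustration only.

USB_DIR_IN = 0x80


def make_endpoint_address(number, is_in):
    """Return the bEndpointAddress byte for an endpoint number and direction."""
    if not 0 <= number <= 15:
        raise ValueError("endpoint number must be 0..15")
    return (USB_DIR_IN if is_in else 0x00) | number


def split_endpoint_address(address):
    """Return (endpoint number, 'IN' or 'OUT') for a bEndpointAddress byte."""
    number = address & 0x0F
    direction = "IN" if address & USB_DIR_IN else "OUT"
    return number, direction
```

With this encoding, EP1 IN is 0x81 and EP1 OUT is 0x01, which is why the same endpoint number can appear twice in a descriptor dump.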

A pipe is the logical connection between host software and a device endpoint. When the host's USB driver opens a connection to a specific endpoint, a pipe is established. The pipe carries data using the transfer type defined by that endpoint.

Transfer Types

This is the core of USB interviews. USB defines four transfer types, each designed for a specific category of data:

Transfer Type | Guaranteed Bandwidth | Error Recovery | Latency | Max Packet Size (FS) | Use Case
Control | No | Yes (retransmit) | Variable | 64 bytes | Enumeration, configuration
Bulk | No | Yes (retransmit) | Variable | 64 bytes | File transfer, printing
Interrupt | Yes (reserved bandwidth) | Yes (retransmit) | Bounded | 64 bytes | Mouse, keyboard, sensors
Isochronous | Yes (reserved bandwidth) | No (no retransmit) | Bounded | 1023 bytes | Audio, video streaming

Control transfers are the only type used on EP0. They follow a defined three-stage transaction: SETUP stage (host sends a request), DATA stage (optional data in either direction), and STATUS stage (handshake confirming completion). Control transfers are used for device management -- enumeration, configuration, and vendor-specific commands.

Bulk transfers provide reliable delivery with error detection and recovery (CRC check + automatic retransmit on failure) but have no guaranteed bandwidth or latency. Bulk transfers use whatever bandwidth remains after isochronous and interrupt transfers are served. This makes them ideal for large, non-time-critical data transfers like file copies to a USB flash drive or print jobs.

Interrupt transfers guarantee a bounded latency by reserving a polling interval. The host polls the device at a regular interval (1 ms to 255 ms for full-speed). Despite the name, interrupt transfers are NOT interrupt-driven -- the host still polls; the "interrupt" refers to the guaranteed polling schedule. This type is used for devices that need periodic, low-latency data delivery: mice, keyboards, gamepads, and sensor readings.
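
Because the host issues at most one transaction per polling interval, the polling interval and the maximum packet size together cap what a full-speed interrupt endpoint can carry. The following is a rough sketch of that arithmetic, assuming one transaction per interval and ignoring NAKs and retries; the function name is hypothetical.

```python
# Sketch: upper bound on payload throughput of a full-speed interrupt endpoint.
# Assumes one transaction per polling interval; the function name is illustrative.

def fs_interrupt_max_throughput(max_packet_size, interval_ms):
    """Return the best-case payload rate in bytes per second."""
    if not 1 <= max_packet_size <= 64:
        raise ValueError("full-speed interrupt packets are 1..64 bytes")
    if not 1 <= interval_ms <= 255:
        raise ValueError("full-speed polling interval is 1..255 ms")
    return max_packet_size * 1000 / interval_ms


# A 64-byte endpoint polled every 1 ms tops out at 64000 bytes/s;
# the same endpoint polled every 10 ms tops out at 6400 bytes/s.
best_case = fs_interrupt_max_throughput(64, 1)
relaxed = fs_interrupt_max_throughput(64, 10)
```

The polling interval is also the worst-case wait before the host next asks for new data, so it sets the latency bound as well as the throughput ceiling.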

Isochronous transfers trade reliability for guaranteed timing. They reserve bandwidth and maintain a fixed delivery schedule, but corrupted packets are dropped, not retransmitted. This is correct for real-time audio and video: a late packet is worse than a dropped one. If an audio sample arrives late, the playback buffer underruns and you hear a pop; if it is simply dropped, the receiver can conceal the gap (for example by interpolating or repeating the previous sample) and the glitch is usually far less noticeable.

ℹ️Key Insight: Bandwidth Allocation Priority

The USB host allocates bandwidth in this order: periodic transfers first (isochronous and interrupt together may reserve up to 90% of each full-speed frame, or 80% of each high-speed microframe), then control (the remaining 10% or 20% is reserved for it), and finally bulk (gets whatever is left). On a bus with active isochronous and interrupt transfers, bulk throughput can drop dramatically. This is why copying files to a USB drive can slow down when a USB webcam on the same bus starts streaming.
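
To make the priority order concrete, here is a back-of-the-envelope budget for one full-speed frame. It is a simplified sketch: it counts only payload bytes and ignores per-transaction protocol overhead and bit stuffing, which a real host scheduler also has to account for. The function names are illustrative.

```python
# Sketch: payload-only bandwidth budget for one 1 ms full-speed frame.
# Real schedulers also charge token/handshake overhead and bit stuffing.

FS_BITS_PER_SECOND = 12_000_000
FS_FRAME_BYTES = FS_BITS_PER_SECOND // 8 // 1000   # 1500 byte times per frame
PERIODIC_PERCENT = 90                               # iso + interrupt share at FS


def periodic_budget_bytes():
    """Bytes per frame that periodic (iso + interrupt) endpoints may reserve."""
    return FS_FRAME_BYTES * PERIODIC_PERCENT // 100


def admit_periodic(reserved, requested):
    """Return True if a new periodic reservation still fits in the budget."""
    return sum(reserved) + requested <= periodic_budget_bytes()


def leftover_bytes(reserved):
    """Byte times per frame left for control and bulk after periodic traffic."""
    return FS_FRAME_BYTES - sum(reserved)


# Hypothetical bus: a 1023-byte isochronous stream plus an 8-byte mouse report.
reservations = [1023, 8]
second_stream_fits = admit_periodic(reservations, 1023)
remaining = leftover_bytes(reservations)
```

In this hypothetical case a second 1023-byte stream is refused, and only 469 of the 1500 byte times in each frame remain for control and bulk traffic, which is where the bulk slowdown comes from.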

Enumeration Process

When a device is plugged in, the host goes through a specific sequence to identify and configure it. This sequence is called enumeration and is one of the most commonly asked USB topics in interviews.

Device plugged in
Attach ──→ Reset ──→ GET_DESCRIPTOR (addr 0) ──→ SET_ADDRESS
       ──→ GET_DESCRIPTOR (device, config, interface, endpoint)
       ──→ SET_CONFIGURATION ──→ Device Ready (normal operation)

Step by step:

  1. Attachment detection: The host detects a voltage change on D+/D- caused by a pull-up resistor on the device side. A full-speed device pulls D+ high through a 1.5 kOhm resistor; a low-speed device pulls D- high. The host uses this to detect both device presence and speed.

  2. Bus reset: The host drives both D+ and D- low for at least 10 ms. This forces the device into the default state, clearing any previous configuration and resetting its address to 0.

  3. Default address: The device now responds on address 0. The host sends an initial GET_DESCRIPTOR request to address 0 to read the first 8 bytes of the device descriptor (which includes the maximum packet size for EP0).

  4. SET_ADDRESS: The host assigns a unique address (1-127) to the device. From this point on, the device responds only on its assigned address, freeing address 0 for the next newly attached device.

  5. GET_DESCRIPTOR: The host reads the full device descriptor, configuration descriptor(s), interface descriptor(s), endpoint descriptor(s), and string descriptors. This tells the host everything it needs to know: what the device is (VID/PID, device class), how many configurations and interfaces it has, and what endpoints are available.

  6. SET_CONFIGURATION: The host selects a configuration (usually configuration 1 -- most devices have only one). The operating system matches the device's class or VID/PID to a driver and loads it.

  7. Device ready: Normal communication begins on the configured endpoints.
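
Each request in the sequence above travels as an 8-byte SETUP packet on EP0 with the fields bmRequestType, bRequest, wValue, wIndex, and wLength (little-endian). The sketch below builds the packets a host might send during a typical enumeration. The request codes are the standard ones; the exact order and lengths a real host uses vary by operating system, and the function names are illustrative.

```python
# Sketch: the 8-byte SETUP packets behind a typical enumeration sequence.
# Real hosts differ in the exact order and wLength values they use.
import struct

GET_DESCRIPTOR = 0x06
SET_ADDRESS = 0x05
SET_CONFIGURATION = 0x09
DESC_TYPE_DEVICE = 0x01
DESC_TYPE_CONFIGURATION = 0x02
DEVICE_TO_HOST = 0x80   # bmRequestType: standard request, device recipient, IN
HOST_TO_DEVICE = 0x00   # bmRequestType: standard request, device recipient, OUT


def setup_packet(bm_request_type, b_request, w_value, w_index, w_length):
    """Pack the five SETUP fields into their 8-byte wire format."""
    return struct.pack("<BBHHH", bm_request_type, b_request,
                       w_value, w_index, w_length)


def get_descriptor(desc_type, length, index=0):
    # wValue = descriptor type in the high byte, descriptor index in the low byte
    return setup_packet(DEVICE_TO_HOST, GET_DESCRIPTOR,
                        (desc_type << 8) | index, 0, length)


def enumeration_requests(new_address, configuration_value=1):
    """Return the SETUP packets in the order the steps above describe."""
    return [
        get_descriptor(DESC_TYPE_DEVICE, 8),            # first 8 bytes, address 0
        setup_packet(HOST_TO_DEVICE, SET_ADDRESS, new_address, 0, 0),
        get_descriptor(DESC_TYPE_DEVICE, 18),           # full device descriptor
        get_descriptor(DESC_TYPE_CONFIGURATION, 9),     # header with wTotalLength
        setup_packet(HOST_TO_DEVICE, SET_CONFIGURATION,
                     configuration_value, 0, 0),
    ]
```

After the 9-byte configuration header, a host normally repeats the request with wLength set to the reported wTotalLength to fetch the interface and endpoint descriptors in one read.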

Descriptor Hierarchy

USB descriptors form a tree structure. The host reads these during enumeration to understand what the device is and how to communicate with it. Each descriptor is a packed binary structure with standardized fields.

Device Descriptor (exactly 1 per device)

  • Contains: USB version, device class/subclass/protocol, VID (Vendor ID), PID (Product ID), max packet size for EP0, number of configurations
  • The VID/PID pair uniquely identifies a device model and is used by the OS to select the appropriate driver
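
As a concrete picture of the "packed binary structure", the standard device descriptor is 18 bytes with little-endian 16-bit fields. The sketch below packs and parses one using the standard field names; the VID, PID, and string indices are made-up placeholder values, not a real product.

```python
# Sketch: pack and parse the 18-byte standard device descriptor.
# Field names follow the USB specification; example values are placeholders.
import struct
from collections import namedtuple

DEVICE_DESCRIPTOR_FORMAT = "<BBHBBBBHHHBBBB"   # 18 bytes, little-endian

DeviceDescriptor = namedtuple("DeviceDescriptor", [
    "bLength", "bDescriptorType", "bcdUSB", "bDeviceClass",
    "bDeviceSubClass", "bDeviceProtocol", "bMaxPacketSize0",
    "idVendor", "idProduct", "bcdDevice",
    "iManufacturer", "iProduct", "iSerialNumber", "bNumConfigurations",
])


def parse_device_descriptor(raw):
    """Decode 18 raw bytes into named fields, checking the header bytes."""
    if len(raw) != struct.calcsize(DEVICE_DESCRIPTOR_FORMAT):
        raise ValueError("device descriptor must be exactly 18 bytes")
    desc = DeviceDescriptor._make(struct.unpack(DEVICE_DESCRIPTOR_FORMAT, raw))
    if desc.bLength != 18 or desc.bDescriptorType != 0x01:
        raise ValueError("not a device descriptor")
    return desc


# Hypothetical device: USB 2.0, class defined per interface, 64-byte EP0.
EXAMPLE_DEVICE_DESCRIPTOR = struct.pack(
    DEVICE_DESCRIPTOR_FORMAT,
    18, 0x01, 0x0200, 0x00, 0x00, 0x00, 64,
    0x1234, 0x5678, 0x0100, 1, 2, 3, 1,
)
```

bMaxPacketSize0 sits at byte offset 7, which is why reading only the first 8 bytes of this descriptor is enough for the host to learn the EP0 packet size.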

Configuration Descriptor (usually 1, sometimes more)

  • Contains: number of interfaces, power requirements (self-powered or bus-powered, max current draw), configuration value
  • A device can have multiple configurations (e.g., a phone might expose different interfaces when charging vs data transfer), but in practice most devices have exactly one

Interface Descriptor (1 or more per configuration)

  • Contains: interface number, class/subclass/protocol, number of endpoints
  • Each interface represents a logical function. A composite device (e.g., a USB headset with audio + volume buttons) has multiple interfaces, one per function

Endpoint Descriptor (0 or more per interface)

  • Contains: endpoint address (number + direction), transfer type (control/bulk/interrupt/isochronous), max packet size, polling interval (for interrupt and isochronous endpoints)
  • EP0 does not have an explicit endpoint descriptor -- it is implicitly defined

The hierarchy looks like this:

Device Descriptor (1 per device)
└─ Configuration Descriptor (usually 1)
   └─ Interface Descriptor (1+ per config)
      └─ Endpoint Descriptor (0+ per interface)
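
On the wire, the configuration, interface, and endpoint levels of this tree arrive as one flat byte string: every descriptor starts with bLength and bDescriptorType, so a parser can walk the blob by hopping bLength bytes at a time. A minimal sketch, using a made-up vendor-specific device with two bulk endpoints as sample data:

```python
# Sketch: walk a concatenated configuration descriptor blob and print the tree.
# The example blob describes a hypothetical vendor-specific device.
import struct

TYPE_NAMES = {0x02: "Configuration", 0x04: "Interface", 0x05: "Endpoint"}
INDENT = {0x02: 0, 0x04: 1, 0x05: 2}
TRANSFER_TYPES = ["control", "isochronous", "bulk", "interrupt"]


def walk_descriptors(blob):
    """Return (bDescriptorType, raw bytes) for each descriptor in the blob."""
    found, offset = [], 0
    while offset < len(blob):
        if offset + 2 > len(blob):
            raise ValueError("truncated descriptor header")
        length, dtype = blob[offset], blob[offset + 1]
        if length < 2 or offset + length > len(blob):
            raise ValueError("bad bLength")
        found.append((dtype, blob[offset:offset + length]))
        offset += length
    return found


def describe(blob):
    """Return indented text lines mirroring the descriptor hierarchy."""
    lines = []
    for dtype, raw in walk_descriptors(blob):
        detail = ""
        if dtype == 0x05:
            address, attributes = raw[2], raw[3]
            direction = "IN" if address & 0x80 else "OUT"
            kind = TRANSFER_TYPES[attributes & 0x03]
            detail = f" EP{address & 0x0F} {direction} {kind}"
        name = TYPE_NAMES.get(dtype, f"Type 0x{dtype:02X}")
        lines.append("  " * INDENT.get(dtype, 1) + name + detail)
    return lines


EXAMPLE_CONFIG = (
    struct.pack("<BBHBBBBB", 9, 0x02, 32, 1, 1, 0, 0x80, 50)   # configuration
    + bytes([9, 0x04, 0, 0, 2, 0xFF, 0x00, 0x00, 0])           # interface
    + struct.pack("<BBBBHB", 7, 0x05, 0x81, 0x02, 64, 0)       # EP1 IN bulk
    + struct.pack("<BBBBHB", 7, 0x05, 0x01, 0x02, 64, 0)       # EP1 OUT bulk
)
```

Calling `describe(EXAMPLE_CONFIG)` yields one line per descriptor, indented by level. The wTotalLength field in the configuration header (32 here) is the length of the whole blob, not just the 9-byte header.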

Common Device Classes

USB defines standard device classes so that a single driver can handle all devices of the same type. This is what makes plug-and-play work -- you do not need vendor-specific drivers for standard device types.

Class | Code | Description | Example Devices
HID | 0x03 | Human Interface Device | Mouse, keyboard, gamepad
CDC | 0x02 | Communication Device Class | Virtual COM port (USB-serial)
MSC | 0x08 | Mass Storage Class | USB flash drive
Audio | 0x01 | Audio streaming | USB microphone, speaker
Video | 0x0E | Video streaming | USB webcam
Vendor | 0xFF | Custom protocol | Proprietary devices

CDC-ACM (Communications Device Class, Abstract Control Model) is the most important class for embedded development. It creates a virtual serial port over USB, replacing UART for debug output and console access. The host OS presents it as a standard COM port (Windows) or /dev/ttyACM0 (Linux) with no custom driver needed on current systems (Windows 10 and later bind the built-in usbser.sys driver automatically; older Windows versions needed an INF file). Most embedded development boards (STM32 Nucleo, ESP32-S2, nRF52840) use USB-CDC for their serial console.

HID is the second most common class in embedded systems. HID devices are polled by the host using interrupt transfers and exchange data via fixed-format "reports" described in a report descriptor. HID is not limited to keyboards and mice -- any device that sends periodic, small data packets can use HID (e.g., sensor dongles, custom controllers).
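
As one example of a fixed-format report, the HID boot-protocol keyboard uses an 8-byte input report: one byte of modifier bits, one reserved byte, and up to six key usage codes. The sketch below packs such a report. The constants are standard HID usage values; the function name is illustrative.

```python
# Sketch: pack the 8-byte boot-protocol keyboard input report.
# Layout: [modifiers, reserved, key1..key6]; unused key slots are 0.

MOD_LEFT_SHIFT = 0x02   # bit 1 of the modifier byte
KEY_A = 0x04            # HID keyboard usage ID for the 'a' key


def boot_keyboard_report(modifiers, keys):
    """Return the 8-byte report for the given modifier bits and pressed keys."""
    if len(keys) > 6:
        raise ValueError("boot keyboard report holds at most 6 keys")
    padded = list(keys) + [0] * (6 - len(keys))
    return bytes([modifiers & 0xFF, 0x00] + padded)


shift_a = boot_keyboard_report(MOD_LEFT_SHIFT, [KEY_A])   # types a capital 'A'
all_released = boot_keyboard_report(0, [])                # no keys pressed
```

The device returns a report like this each time the host polls its interrupt IN endpoint. A custom HID device would define its own layout in the report descriptor instead of reusing this one.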

Composite devices combine multiple classes in a single USB device. For example, a development board might expose both a CDC interface (serial console) and an MSC interface (drag-and-drop firmware upload) simultaneously. The host OS loads separate drivers for each interface.

USB Speed Negotiation

Speed is determined during device attachment by the placement of the pull-up resistor:

Pull-up Location | Speed
1.5 kOhm on D- | Low Speed (1.5 Mbps)
1.5 kOhm on D+ | Full Speed (12 Mbps)
D+ pull-up + chirp protocol | High Speed (480 Mbps)

For Low Speed and Full Speed, the pull-up resistor alone determines the speed. The host detects which data line is pulled high and configures accordingly.

High Speed negotiation is more involved. The device initially connects as Full Speed (D+ pull-up). During the bus reset, the device and host exchange a series of rapid signal chirps (K-J signaling sequences). If both sides support High Speed, they switch to 480 Mbps operation. If either side does not support High Speed, the connection remains at Full Speed.

Most embedded MCUs (STM32F4, nRF52840, RP2040) support Full Speed (12 Mbps) using the integrated USB PHY. High Speed (480 Mbps) typically requires either an MCU with an integrated HS PHY (for example i.MX RT, or selected STM32 parts such as the STM32F723) or an external ULPI PHY connected to a dedicated high-speed USB peripheral, which is the usual route on many STM32F4, F7, and H7 parts.

USB Power Delivery

USB provides both data and power over the same cable. The power rules are strict and commonly misunderstood:

Before enumeration: A device may draw up to 100 mA from VBUS. This is the "unconfigured" power limit.

After enumeration: The host reads the device's configuration descriptor, which declares the device's maximum power consumption in the bMaxPower field (2 mA units for USB 2.0; 8 mA units when operating at SuperSpeed). The host may then grant up to 500 mA (USB 2.0) or 900 mA (USB 3.0). The device must not exceed the declared amount.
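
The 2 mA unit is easy to get wrong when filling in a descriptor by hand, so here is a small sketch of the conversion for a USB 2.0 configuration descriptor. The function names are illustrative; rounding up on encode is a design choice so the declared value never understates the real draw.

```python
# Sketch: convert between milliamps and the USB 2.0 bMaxPower field (2 mA units).

USB2_UNIT_MA = 2
USB2_LIMIT_MA = 500


def max_power_ma(b_max_power):
    """Decode a USB 2.0 bMaxPower byte into milliamps."""
    return b_max_power * USB2_UNIT_MA


def encode_max_power(milliamps):
    """Encode a current requirement, rounding up to the next 2 mA step."""
    if not 0 <= milliamps <= USB2_LIMIT_MA:
        raise ValueError("USB 2.0 bus power is limited to 500 mA")
    return -(-milliamps // USB2_UNIT_MA)   # ceiling division


full_budget = max_power_ma(250)        # bMaxPower = 0xFA declares 500 mA
low_power = encode_max_power(100)      # 100 mA is stored as 50
```

A device that needs the full 500 mA therefore declares bMaxPower = 250 (0xFA), not 500, which would not fit in the one-byte field anyway.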

Self-powered vs bus-powered: A self-powered device has its own power supply (wall adapter, battery) but may still draw a small amount of current from USB VBUS (e.g., for pull-up resistors or wake-up detection). A bus-powered device relies entirely on USB for power and must declare its full current requirement in the configuration descriptor.

⚠️Common Trap: Drawing Too Much Current Before Enumeration

A device MUST NOT draw more than 100 mA before enumeration completes. Drawing excessive current before SET_CONFIGURATION can cause the host to disable the port or trigger an overcurrent protection shutdown. This is a common cause of "device not recognized" issues during development -- if your device has power-hungry peripherals (LEDs, motors, radios), they must remain off until the host grants the higher current allowance via SET_CONFIGURATION.

USB Packet Structure

All USB communication is packet-based. A transaction consists of two or three packets:

Packet Type | Purpose | Sent By
Token | Identifies the transaction type, device address, and endpoint | Host
Data | Carries the payload (DATA0 or DATA1, toggled for error detection) | Host or Device
Handshake | Reports the outcome (ACK, NAK, STALL) | Receiver of the data; for an IN token, the device may send NAK or STALL in place of data

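The data toggle mechanism (alternating DATA0 and DATA1 packet IDs) is a simple but effective way to keep sender and receiver in step. Both the host and device maintain a toggle bit per endpoint, and each side flips its bit only after a successful transfer. Corrupted data is already caught by the CRC; the toggle covers the case the CRC cannot, which is a lost or corrupted handshake. If the ACK never reaches the sender, the sender retransmits the packet with the same toggle value. The receiver sees a toggle it is not expecting, recognizes the packet as a duplicate, discards it, and ACKs again so both sides are back in sync.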
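
A small simulation can show what the toggle bit protects against. In this sketch the sender retransmits whenever its ACK is lost, and the receiver uses the toggle to tell a retransmission from new data. It is a simplified model (no CRC, no NAK, no timeouts), and the class and function names are hypothetical.

```python
# Sketch: data toggle handling when a handshake (ACK) is lost on the wire.
# Simplified model: no CRC errors, no NAKs, one endpoint, one direction.

class ToggleReceiver:
    def __init__(self):
        self.expected = 0      # toggle value the next new packet must carry
        self.accepted = []     # payloads delivered to the application

    def receive(self, toggle, payload):
        if toggle == self.expected:
            self.accepted.append(payload)
            self.expected ^= 1
        # A mismatched toggle means this is a retransmission of data that was
        # already accepted: drop it, but still ACK so the sender can move on.
        return "ACK"


def send_all(payloads, receiver, lose_ack_for=()):
    """Send payloads in order; return the number of packets put on the wire."""
    toggle, transmissions = 0, 0
    for index, payload in enumerate(payloads):
        ack_lost = index in lose_ack_for
        while True:
            transmissions += 1
            receiver.receive(toggle, payload)
            if ack_lost:
                ack_lost = False   # sender never saw the ACK: resend, same toggle
                continue
            break
        toggle ^= 1                # flip only after an ACK actually arrives
    return transmissions
```

Sending three payloads with the ACK for the second one lost puts four packets on the wire, yet the receiver delivers each payload exactly once. Without the toggle, the retransmitted packet would be accepted as new data.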

NAK does not indicate an error -- it means "no data available right now" (for IN) or "not ready to accept data" (for OUT). The host will retry later. STALL indicates a permanent error -- the endpoint is halted and requires host intervention (a CLEAR_FEATURE request) to recover.

Debugging Story: The Descriptor Mismatch

A custom USB-CDC device worked perfectly on Linux but failed to enumerate on Windows. The host kept resetting the device in an infinite loop. Investigation with a USB protocol analyzer showed that enumeration proceeded normally through GET_DESCRIPTOR, but the host immediately issued another bus reset after reading the descriptors.

The root cause: the device descriptor declared bMaxPacketSize0 = 32 bytes, but the actual EP0 buffer in the firmware was configured for 64 bytes. The device was sending 64-byte responses to control requests, exceeding the declared maximum packet size. Linux was forgiving of this mismatch and accepted the oversized packets. Windows was strict -- upon receiving a packet larger than the declared maximum, it treated it as a protocol violation, reset the device, and tried again, creating the infinite reset loop.

Fixing the device descriptor to declare bMaxPacketSize0 = 64 (matching the actual buffer size) resolved the issue immediately.

Lesson: USB descriptors must exactly match the device's actual capabilities. The descriptor values are not suggestions -- they are a contract between the device and the host. Always test on multiple host operating systems (Windows, macOS, Linux) during development, as they enforce the USB specification with varying degrees of strictness.

What interviewers want to hear: You understand that USB is host-controlled (devices never initiate), you can walk through the enumeration sequence step by step, you know the four transfer types and can match each to its use case, you can describe the descriptor hierarchy and explain what information each level carries, and you appreciate the practical considerations around power limits, speed negotiation, and cross-platform compatibility.

Interview Focus

Classic USB Interview Questions

Q1: "What are the four USB transfer types and when would you use each?"

Model Answer Starter: "USB defines four transfer types, each optimized for different data characteristics. Control transfers are used for device management -- enumeration, configuration, and vendor-specific commands. Bulk transfers provide reliable delivery for large, non-time-critical data like file transfers and printing -- they use whatever bandwidth is left after other transfer types are served. Interrupt transfers guarantee bounded latency by reserving polling slots, making them ideal for HID devices like keyboards and mice. Isochronous transfers reserve bandwidth with guaranteed timing but sacrifice reliability -- corrupted packets are dropped, not retransmitted -- which is correct for audio and video where a late packet is worse than a lost one."

Q2: "Walk me through the USB enumeration process."

Model Answer Starter: "When a device is plugged in, the host detects it via a voltage change on the D+/D- lines caused by the device's pull-up resistor. The host then issues a bus reset by driving both data lines low for 10 ms. After reset, the device responds on the default address 0. The host reads the first 8 bytes of the device descriptor to learn the EP0 max packet size, then assigns a unique address using SET_ADDRESS. Next, the host reads the full set of descriptors -- device, configuration, interface, and endpoint -- to learn the device's capabilities. Finally, the host selects a configuration with SET_CONFIGURATION, loads the appropriate driver, and the device is ready for normal communication."

Q3: "What is the difference between a USB endpoint and a pipe?"

Model Answer Starter: "An endpoint is a physical data buffer in the USB device, identified by its number (0-15) and direction (IN or OUT). It has a fixed transfer type and maximum packet size declared in its endpoint descriptor. A pipe is the logical connection between host software and a specific endpoint -- it is the software abstraction that carries data between the host driver and the device endpoint. EP0 always exists and is used for control transfers. All other endpoints are optional and defined by the device's descriptors."

Q4: "Why would you choose USB CDC over UART for embedded communication?"

Model Answer Starter: "USB CDC creates a virtual serial port that appears as a standard COM port on the host with no custom drivers needed. Compared to UART, USB CDC offers higher throughput (12 Mbps signaling at full speed, which works out to roughly 1 MB/s of bulk payload at best, vs typical UART rates of 115200 baud), built-in error detection via CRC, plug-and-play detection, and it eliminates the need for a separate USB-to-UART bridge chip like FTDI or CP2102 -- the MCU connects directly to the host. USB CDC also provides flow control automatically, preventing buffer overruns that are common with bare UART."

Q5: "How does USB handle errors differently for bulk vs isochronous transfers?"

Model Answer Starter: "Bulk transfers prioritize reliability: every data packet includes a CRC, and if the CRC check fails the packet is not acknowledged and is sent again; if the receiver sends a NAK, the host simply tries again later. NAKs are retried indefinitely, while a host controller typically fails the transfer after three consecutive transaction errors, so data either arrives correctly or the error is reported to the driver, though latency is unpredictable. Isochronous transfers prioritize timing: packets still have CRC, but if a packet is corrupted, it is simply dropped -- no retransmission occurs. This is by design for real-time streams like audio and video, where maintaining a steady delivery rate matters more than perfect data integrity."

Trap Alerts

  • Don't say: "USB devices can send data whenever they want" -- the host initiates ALL communication; devices only respond to host polls
  • Don't forget: The enumeration sequence is specific and ordered -- attachment detection, bus reset, address assignment, descriptor reads, configuration selection. Skipping or misordering steps shows a superficial understanding
  • Don't ignore: The difference between interrupt and isochronous transfers -- interrupt has error recovery (retransmit), isochronous does not. Both reserve bandwidth, but they handle errors oppositely

Follow-up Questions

  • "How does a composite USB device expose multiple functions (e.g., CDC + MSC) to the host?"
  • "What happens if two USB devices on the same bus request more combined bandwidth than is available?"
  • "How does USB High Speed negotiation work during the chirp protocol?"
  • "What is the USB data toggle mechanism and why is it necessary?"

Practice

Who initiates ALL communication on a USB bus?

Which USB transfer type drops corrupted packets instead of retransmitting them?

What is the maximum current a USB 2.0 device may draw BEFORE enumeration completes?

During USB enumeration, what address does a newly attached device respond on before SET_ADDRESS?

Which USB device class creates a virtual serial port for embedded debugging?

Real-World Tie-In

USB-CDC Debug Console on a Sensor Node -- Replaced a dedicated FTDI USB-to-UART bridge chip with the MCU's integrated USB peripheral running CDC-ACM. This saved board space, reduced BOM cost, and increased debug throughput from 115200 baud to nearly 1 Mbps. The same USB port also exposed an MSC interface for drag-and-drop firmware updates via a composite device configuration.

Custom HID Sensor Dongle -- Built a wireless sensor receiver that reports temperature, humidity, and air quality data to a host PC using the HID class. By using HID instead of CDC, the device worked on Windows, macOS, and Linux with zero driver installation -- HID drivers are built into every OS. The 64-byte HID report carried all sensor readings in a single interrupt transfer, polled every 10 ms.

Automotive Diagnostics Interface -- Developed a USB device that bridges CAN bus traffic to a laptop for vehicle diagnostics. The device used bulk transfers for high-throughput CAN frame streaming (up to 8000 frames/second) and a control endpoint for configuration commands (bit rate, filter settings). Careful descriptor design and cross-platform testing (Windows, Linux, macOS) ensured reliable enumeration across all diagnostic tool host platforms.