What is a CRC algorithm? Why would you use it? What are some CRC algorithms? What issues do you need to worry about when using CRC algorithms that might cause problems? Have you ever written a CRC algorithm from scratch?

Question

Accepted Answer

What it is: A CRC (Cyclic Redundancy Check) is an error-detecting checksum. The message is treated as a large binary polynomial and divided, using polynomial arithmetic in GF(2) (carry-less, i.e., XOR-based long division), by a fixed generator polynomial. The remainder of that division is the CRC, appended to the data. The receiver repeats the division (or divides message+CRC and checks for a known constant/zero) — a mismatch indicates corruption. It is excellent at catching the burst errors and bit flips typical of communication channels and storage. Why use it: It is cheap to compute (table or shift-register based), far stronger than a simple additive checksum or parity, and detects common error patterns deterministically — all single-bit errors, all double-bit errors (with a good polynomial), any odd number of bit errors, and all burst errors shorter than the CRC width. It's ubiquitous in UART/CAN/USB/Ethernet frames, flash/EEPROM integrity, firmware images, and bootloader validation. Common CRC algorithms: - CRC-8 — short messages, small sensor/comm packets (e.g., SMBus). - CRC-16 — variants include CRC-16-CCITT (poly 0x1021) and CRC-16-IBM/ANSI (poly 0x8005, used by Modbus); common in serial protocols. - CRC-32 — poly 0x04C11DB7, used by Ethernet, ZIP, PNG, and zlib; the workhorse for larger blocks. Issues to worry about — CRCs are notoriously parameterized, and two correct implementations will disagree unless every parameter matches: - Polynomial choice — must be agreed by both ends; some polynomials have better error-detection (Hamming distance) for a given data length than others. - Initial value — e.g., CRC-32 starts at 0xFFFFFFFF; CRC-16-CCITT variants start at 0xFFFF or 0x0000. - Reflection / bit order (refin/refout) — whether input bytes and/or the output are bit-reversed (LSB-first vs MSB-first). This is the single most common source of "my CRC doesn't match the reference" bugs. - Final XOR value — many CRCs XOR the result with a constant (e.g., 0xFFFFFFFF for CRC-32). - Endianness / byte order — how the multi-byte CRC is placed on the wire and how multi-byte data is fed in. - Table vs. bitwise implementation — bitwise is small/slow; a 256-entry lookup table is fast but costs memory; both must produce identical results given identical parameters. - Fundamental limitations — a CRC is not cryptographic: it does not detect all errors (some multi-bit patterns are missed), and it provides no protection against intentional tampering (an attacker can forge data with a matching CRC). For authenticity use an HMAC/MAC, not a CRC. A from-scratch bitwise CRC-16/CCITT-FALSE (init 0xFFFF, poly 0x1021, no input/output reflection, no final XOR — check value 0x29B1) illustrates the mechanics: For production, the standard practice is to precompute a lookup table (process a byte per iteration) and to validate against a known test vector — the classic check value for CRC-32 over the ASCII string "123456789" is 0xCBF43926, and matching it confirms all parameters are correct. ---