How do you optimize embedded Linux boot time?
Boot-time optimization starts with measurement: you cannot optimize what you have not profiled. Use grabserial to timestamp serial console output, bootchart or systemd-analyze to profile userspace, and the kernel command-line parameter printk.time=1 (or CONFIG_PRINTK_TIME=y at build time) to add timestamps to kernel messages. Identify which stage dominates (bootloader, kernel, or userspace) and work on that stage first.
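As a rough illustration of reading timestamped kernel output, the sketch below finds the longest silences between consecutive messages, which is often where boot time is being spent. The function name largest_gaps and the sample log are invented for this example; only the "[ seconds.microseconds ]" prefix comes from printk.time=1.

```python
import re

# Matches the "[    1.234567] message" prefix that printk.time=1 adds.
TS_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def largest_gaps(log_text, top=3):
    """Return (delta_seconds, timestamp, message) tuples for the messages
    that were preceded by the longest gaps, largest gap first."""
    gaps = []
    prev = None
    for line in log_text.splitlines():
        m = TS_LINE.match(line)
        if not m:
            continue  # ignore lines that carry no timestamp
        ts = float(m.group(1))
        if prev is not None:
            gaps.append((round(ts - prev, 6), ts, m.group(2)))
        prev = ts
    gaps.sort(key=lambda g: g[0], reverse=True)
    return gaps[:top]


# Invented sample for illustration only, not output from a real board.
SAMPLE_LOG = """\
[    0.000000] Booting Linux on physical CPU 0x0
[    0.120000] Memory: 250000K/262144K available
[    0.900000] mmc0: new high speed SDHC card
[    1.000000] Run /sbin/init as init process
"""

if __name__ == "__main__":
    for delta, ts, msg in largest_gaps(SAMPLE_LOG):
        print(f"{delta:8.3f}s before [{ts:.6f}] {msg}")
```

A long gap only tells you that something slow happened before the message was printed, so treat the result as a pointer to where to look rather than as a diagnosis.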
Bootloader optimizations: reduce or eliminate the U-Boot autoboot delay (bootdelay=0, or bootdelay=-2 to also skip the check for an abort keypress), skip hardware initialization that booting does not need (for example USB and networking), use Falcon mode so that the SPL loads the kernel directly and U-Boot proper is skipped, and, where the SoC and DRAM controller support it, reuse precomputed DRAM timing parameters instead of running calibration at every boot.
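Kernel optimizations: build a minimal kernel with only the drivers and subsystems you need, compile drivers on the critical path as built-in rather than as modules to avoid module-loading overhead, and keep non-critical drivers off the critical path by building them as modules that are loaded after the application has started. (The kernel's deferred probe mechanism is not a tuning knob for this: it only retries drivers whose dependencies were not yet available.) For the image itself, either use a fast decompressor such as LZ4, which decompresses faster than gzip at the cost of a slightly larger image, or, on systems with NOR flash, use an XIP (execute-in-place) kernel, which runs uncompressed directly from flash and so eliminates decompression time.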
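To decide which drivers are worth removing or moving off the critical path, one option not mentioned above is the kernel's initcall_debug command-line parameter, which logs how long each initcall took. The sketch below ranks those log lines by duration. The function name slowest_initcalls and the sample lines are invented for illustration, and the regular expression assumes the usual "initcall NAME+0x.../0x... returned N after M usecs" wording, which may differ between kernel versions.

```python
import re

# Assumed line shape: "initcall foo_init+0x0/0x90 returned 0 after 1234 usecs"
INITCALL = re.compile(
    r"initcall\s+([^\s+]+)\S*.*?returned\s+(-?\d+)\s+after\s+(\d+)\s+usecs"
)


def slowest_initcalls(log_text, top=5):
    """Return (microseconds, function_name) pairs, slowest first."""
    calls = []
    for line in log_text.splitlines():
        m = INITCALL.search(line)
        if m:
            calls.append((int(m.group(3)), m.group(1)))
    calls.sort(reverse=True)
    return calls[:top]


# Invented sample for illustration only, not output from a real board.
SAMPLE_LOG = """\
[    0.310000] initcall pl011_init+0x0/0x60 returned 0 after 1200 usecs
[    0.760000] initcall mmc_blk_init+0x0/0x90 returned 0 after 412000 usecs
[    0.940000] initcall usb_init+0x0/0x150 returned 0 after 175000 usecs
"""

if __name__ == "__main__":
    for usecs, name in slowest_initcalls(SAMPLE_LOG):
        print(f"{usecs / 1000:10.1f} ms  {name}")
```

Initcalls near the top of this list that the application does not need at startup are the natural candidates for removal or for conversion to late-loaded modules.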
Userspace optimizations: replace systemd with a simpler init (BusyBox init or a custom init script), start only essential services, parallelize independent service startup, use a readahead tool to preload files from storage where measurement shows it helps, and launch the application as early as possible, ideally as the init process itself. With aggressive optimization, sub-one-second boot from power-on to application is achievable on modern SoCs, although the figure you actually reach depends heavily on the hardware, the storage medium, and the application.
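To see what parallelizing independent services can buy, the sketch below compares a strictly serial startup with one in which each service starts as soon as its dependencies have finished. The service names, durations, and the function startup_times are all hypothetical, and the model assumes unlimited CPU and I/O parallelism, so real gains will be smaller.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def startup_times(durations, deps):
    """Return (serial_total, parallel_total, finish_times).

    durations maps service name -> seconds to start.
    deps maps service name -> services that must finish first;
    every dependency is assumed to appear in durations as well.
    """
    graph = {name: set(deps.get(name, ())) for name in durations}
    finish = {}
    # static_order() yields each service after all of its dependencies.
    for name in TopologicalSorter(graph).static_order():
        start = max((finish[d] for d in graph[name]), default=0.0)
        finish[name] = start + durations[name]
    return sum(durations.values()), max(finish.values()), finish


# Hypothetical numbers for illustration only.
DURATIONS = {"mount": 0.2, "network": 0.5, "logging": 0.1, "app": 0.3}
DEPS = {"network": ["mount"], "logging": ["mount"], "app": ["mount"]}

if __name__ == "__main__":
    serial, parallel, finish = startup_times(DURATIONS, DEPS)
    print(f"serial: {serial:.2f}s, parallel: {parallel:.2f}s")
    print(f"app ready at {finish['app']:.2f}s")
```

In this toy model the application is ready once its own dependency chain completes, regardless of slower services it does not depend on, which is the same reasoning behind launching the application as early as possible.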
Source: Embedded Linux Q&A
