How do you choose the watchdog timeout value?

The timeout must be long enough for the software to complete its longest legitimate execution cycle, but short enough to limit the duration of undetected failure. Choosing this value requires understanding your system's timing characteristics under worst-case conditions, not just typical operation.

Too short: Legitimate code paths that take longer than normal — a flash sector erase (which can block for 20-400 ms depending on the flash technology), a computation-heavy DSP filter pass, or a task waiting for a slow external peripheral response — exceed the timeout, causing spurious resets. These are maddening to debug because the system works most of the time but randomly resets under heavy load, low temperature (flash writes are slower), or specific operational sequences. The symptom looks like a hardware defect, not a timeout misconfiguration.
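One way to catch a too-short timeout before it shows up as random resets is to compare it against the longest blocking operations that can run between two watchdog refreshes. The sketch below is illustrative only: the function name `wdt_timeout_covers` and the durations in the usage note are assumptions, not part of any vendor API, and real values should come from datasheet maxima or measurements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true only if the watchdog timeout is strictly
 * longer than every listed blocking duration (all values in milliseconds).
 * Each entry should be the worst-case time of one code path that runs
 * between two watchdog refreshes, e.g. a flash sector erase. */
bool wdt_timeout_covers(uint32_t timeout_ms,
                        const uint32_t *blocking_ms,
                        size_t count)
{
    assert(blocking_ms != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (blocking_ms[i] >= timeout_ms) {
            return false; /* this path alone can outlast the watchdog */
        }
    }
    return true;
}
```

For example, with assumed worst cases of 400 ms (flash sector erase), 15 ms (DSP filter pass), and 50 ms (slow peripheral response), a 1000 ms timeout passes the check, while a 100 ms timeout fails it because of the erase.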

Too long: The system remains in a failed state for an unacceptably long time before the watchdog fires. A 30-second watchdog timeout on a motor controller means the motor could run uncontrolled for 30 seconds after the software hangs — potentially causing physical damage, injury, or product destruction. A 10-second timeout on a communication gateway means 10 seconds of lost data before the system recovers.

Practical approach: Measure the worst-case main loop or task cycle time under maximum load, with all peripheral interactions active and all error-handling paths exercised. Set the watchdog timeout to 2-3x that value to provide margin for timing variability. For a super-loop that normally completes in 5 ms and takes 20 ms in the worst case, that gives a timeout of 40-60 ms. For an RTOS watchdog manager whose tasks check in every 100 ms, a 200-300 ms timeout applies the same margin. Also consider the reset recovery time: if the system takes 500 ms to boot and reinitialize after a watchdog reset, the total downtime per watchdog event is timeout + recovery, and this total must be acceptable for your application's availability requirements.
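The measurement and arithmetic above can be sketched in C. This is a minimal sketch under stated assumptions: the type `cycle_stats_t` and all three function names are hypothetical, the timestamps are assumed to come from a free-running 32-bit millisecond tick counter, and the margin factor is a plain integer multiplier in the 2-3x range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the longest observed loop or task cycle. */
typedef struct {
    uint32_t worst_cycle_ms;
} cycle_stats_t;

/* Record one cycle and keep the maximum. Timestamps are assumed to come
 * from a free-running 32-bit millisecond counter; unsigned subtraction
 * gives the right elapsed time even if the counter wraps once. */
void cycle_stats_record(cycle_stats_t *stats, uint32_t start_ms, uint32_t end_ms)
{
    uint32_t elapsed_ms = end_ms - start_ms;

    if (elapsed_ms > stats->worst_cycle_ms) {
        stats->worst_cycle_ms = elapsed_ms;
    }
}

/* Timeout = measured worst case times a margin factor (2 or 3 per the
 * rule of thumb). No overflow check: inputs are assumed to be small. */
uint32_t wdt_timeout_from_worst_case(uint32_t worst_cycle_ms, uint32_t margin_factor)
{
    assert(margin_factor >= 1u);
    return worst_cycle_ms * margin_factor;
}

/* Total outage per watchdog event: time until the watchdog fires plus
 * the time the system needs to boot and reinitialize afterwards. */
uint32_t wdt_total_downtime_ms(uint32_t timeout_ms, uint32_t recovery_ms)
{
    return timeout_ms + recovery_ms;
}
```

With a measured worst case of 20 ms and a factor of 3, `wdt_timeout_from_worst_case` returns 60 ms, and with a 500 ms recovery time `wdt_total_downtime_ms` gives 560 ms per watchdog event. In practice the result usually has to be rounded up to the nearest value the watchdog hardware can actually be configured for.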

Source: Watchdog Q&A