|
Description
|
I found two nits in the deadman() code:
1) deadman() is not able to detect a 1-second hang.
2) In the panic message, deadman() claims that it timed out after 1 second of clock inactivity,
while the clock had actually been stopped for 2 seconds.
Take a look at this snippet of deadman():

    if (lbolt != CPU->cpu_deadman_lbolt) {
        CPU->cpu_deadman_lbolt = lbolt;
        CPU->cpu_deadman_countdown = deadman_seconds;
        return;
    }
^^^ This block tests whether lbolt is moving from the current CPU's point of view. If lbolt has moved, it resets the countdown, stores the current lbolt, and returns from the function.

    if (CPU->cpu_deadman_countdown-- > 0)
        return;

^^^ When lbolt is stale, the countdown is decremented, and the function returns
if the countdown was positive before the decrement. So with deadman_seconds=1, we need to
reach this place twice before we can get past it and trigger the panic.
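
The off-by-one can be reproduced outside the kernel with a small user-space model. This is only a sketch: struct cpu_model, deadman_model() and stale_calls_until_panic() are hypothetical names invented for illustration, the per-CPU fields are reduced to a plain struct, and the panic path is replaced by a return value. It assumes deadman() is invoked once per second.

```c
/* Hypothetical stand-in for the per-CPU fields that deadman() uses. */
struct cpu_model {
    long deadman_lbolt;      /* last lbolt value seen by this CPU */
    int deadman_countdown;   /* remaining stale invocations */
};

/*
 * Simplified model of the quoted snippet.
 * Returns 1 if this invocation would fall through to the panic path,
 * 0 if it returns early.
 */
static int
deadman_model(struct cpu_model *cpu, long lbolt, int deadman_seconds)
{
    if (lbolt != cpu->deadman_lbolt) {
        /* Clock is moving: remember lbolt and re-arm the countdown. */
        cpu->deadman_lbolt = lbolt;
        cpu->deadman_countdown = deadman_seconds;
        return (0);
    }

    /* Clock is stale: post-decrement, so the old value is tested. */
    if (cpu->deadman_countdown-- > 0)
        return (0);

    return (1);
}

/*
 * Counts how many consecutive invocations with a stale lbolt are needed
 * before the model reaches the panic path (deadman_seconds must be >= 0).
 */
int
stale_calls_until_panic(int deadman_seconds)
{
    struct cpu_model cpu = { 0, 0 };
    int calls = 0;

    /* One invocation with a moving lbolt arms the countdown. */
    (void) deadman_model(&cpu, 100, deadman_seconds);

    /* From here on lbolt stays at 100, i.e. the clock is stopped. */
    for (;;) {
        calls++;
        if (deadman_model(&cpu, 100, deadman_seconds))
            return (calls);
    }
}
```

In this model, stale_calls_until_panic(1) returns 2: the panic path is reached only on stale invocation number deadman_seconds + 1, which is consistent with both nits described above.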
|