Several subsystems are running into stack overflow problems on sun4u. The call chains look reasonable: no recursion, no outrageous stack allocations. They are just deep, with a stack depth of around 50 frames.
xxxxx@xxxxx.com 2003-09-12
Two other project teams have already contacted me about the same problem,
arising in completely different contexts. My response:
Subject: Re: More kernel stack overflows
> Jeff, maybe you can provide a wise viewpoint, or suggestions on how
> we might get out of this mess.
Yep. Grow the kernel stack. Memory is cheap. Panics are expensive.
I've been through several kernel stack crises before. They always unfold
the same way. Some particular workload goes too deep. We prune a few
stack frames to fix the offending code path. Then another one comes up.
And another. (Right about now someone suggests that instead of growing
the stack for every thread, we should finally bite the bullet and make
the kernel stack growable. I dig up my mail archive from the last time
we contemplated this, and explain why it's much harder than it sounds.)
Eventually the panic rate becomes so high that we have to act. Nobody
can figure out how to make dynamic stack growth work reliably, so after
one more jurassic outage we accept physics and grow the stack.
So my proposal is that this time, we dispense with all the hand-wringing
and just grow the damn thing.
Jeff