|
Description
|
[johansen, 15Nov2007]
Jurassic, the NFS and e-mail server for ON developers, had a ton of spew in
its logfiles yesterday evening. We discovered that as a result of running
a debug kernel, and having dtlogin touch various devices, we were loading
the ohci driver about once a minute. Profiling the kernel at 997Hz, showed
a ton of stacks like this:
dtlogin
unix`tsc_read+0x3
genunix`gethrtime+0xd
unix`drv_usecwait+0x98
ohci`ohci_take_control+0xbd
ohci`ohci_init_ctlr+0x3b
ohci`ohci_attach+0x122
genunix`devi_attach+0x87
genunix`attach_node+0x123
genunix`i_ndi_config_node+0xe1
genunix`i_ddi_attachchild+0x67
genunix`devi_attach_node+0xfd
genunix`devi_config_one+0x2bd
genunix`ndi_devi_config_one+0xd8
devfs`dv_find+0x20f
devfs`devfs_lookup+0x69
genunix`fop_lookup+0xf2
genunix`lookuppnvp+0x351
genunix`lookuppnat+0x125
genunix`lookupnameat+0xb9
genunix`cstatat_getvp+0x160
1653
Looking at the paths passed to dv_find, we're able to see that we're trying
to connect USB floppy devices:
/floppy@5/disk@0,0:g 1
/floppy@5/disk@0,0:h 1
/floppy@5/disk@0,0:i 1
/floppy@5/disk@0,0:j 1
/floppy@5/disk@0,0:k 1
/pci108e,cb84@2/floppy@5/disk@0,0:g 1
/pci108e,cb84@2/floppy@5/disk@0,0:h 1
/pci108e,cb84@2/floppy@5/disk@0,0:i 1
/pci108e,cb84@2/floppy@5/disk@0,0:j 1
/pci108e,cb84@2/floppy@5/disk@0,0:k 1
(Jurassic has no floppy disk drives)
Each one of these calls results in invocations of drv_usecwait. As the
system runs around continuously binding these interrupts to cpus, it issues
many messages like this to the log:
Nov 9 08:26:16 jurassic-x4600 pcplusmp: [ID 803547 kern.info] pcplusmp: pciclas
s,0c0310 (ohci) instance 0 vector 0x15 ioapic 0x10 intin 0x15 is bound to cpu 4
The cpu number changes, depending upon which cpu is up next. We also manage
to keep one cpu in the kernel competely busy (100% sys in mpstat) as we
run around busy waiting for the interrupt to bind to each CPU. A quick
dtrace of a binding from CPUs 0-15 on jurassic shows us spending 105565000
microseconds busy waiting. That's 105 seconds of system CPU time burned
as we assign an interrupt to each CPU!
Ed Pilatowicz also took a look at this problem and offered the following comments:
looking at the ohci driver, it seems like it's a bug that
ohci_take_control() uses drv_usecwait(). here's the code:
---8<---
/* now wait for 5 seconds for InterruptRouting to go away */
for (wait = 0; wait < 5000; wait++) {
if ((Get_OpReg(hcr_control) & HCR_CONTROL_IR) == 0)
break;
drv_usecwait(1000);
}
---8<---
given the level of granularity that this code is going for it seems
like it shouldn't be using drv_usecwait() and busy waiting. the
the only reasons for using drv_usecwait() are if the caller is in
interrupt context or if the caller needs very fine granularity.
neither appear to be the case here. looking at the ohci driver, it
appears to use drv_usecwait() all over the place with large timeout
values. it seems like it could easily use some other mechanism.
(delay(), cv_timedwait(), or a perhaps even a custom periodic timer
via ddi_periodic_add().)
It looks like this loop of 5 second waiting is a big part of the problem, and
is wasting a terrific amount of kernel CPU time.
|