OpenSolaris

Printable Version Enter a New Search
Bug ID 6280143
Synopsis lufs_enable() strikes again
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:ufs
Keywords BAD | TRAP | dras_proactive | lufs_enable | lufs_snarf | panic | rtiq_regression | s10u1-req | ufs_fiologenable | ufs_fiosdio | ufs_flush | ufs_scan_inodes | ufs_sync | ufs_update | vfs_lock_wait
Responsible Engineer Frank Batschulat
Reported Against snv_01 , snv_15 , s10_74l1 , s10u1_01 , s10_b74l2a
Duplicate Of
Introduced In solaris_7
Commit to Fix snv_29
Fixed In snv_29
Release Fixed solaris_nevada(snv_29) , solaris_10u2(s10u2_03) (Bug ID:2128140)
Related Bugs 4994518 , 4997862 , 5032177 , 6206039 , 6215065 , 6226551 , 6226585 , 6231241
Submit Date 3-June-2005
Last Update Date 7-July-2009
Description
while running fs-pit/tsufs on an amd box for the fix to:

6215065 Booting off single disk from mirrored root pair causes panic reset

the following panic happened 3 times in a row, always with the same picture:

NOTICE: alloc: /tmp/TSufs/ufstest: file system full
WARNING: ufs is forcing a ufs log error
WARNING: ufs log for /tmp/TSufs/ufstest changed state to Error
WARNING: Please umount(1M) /tmp/TSufs/ufstest and run fsck(1M)
WARNING: ufs is forcing a ufs log error
WARNING: ufs log for /tmp/TSufs/ufstest2 changed state to Error
WARNING: Please umount(1M) /tmp/TSufs/ufstest2 and run fsck(1M)
NOTICE: alloc: /tmp/TSufs/ufstest: file system full
NOTICE: alloc: /tmp/TSufs/ufstest2: file system full
NOTICE: realloccg /tmp/TSufs/ufstest: file system full
WARNING: ufs is forcing a ufs log error
WARNING: ufs log for /tmp/TSufs/ufstest changed state to Error
WARNING: Please umount(1M) /tmp/TSufs/ufstest and run fsck(1M)
NOTICE: alloc: /tmp/TSufs/ufstest2: file system full
WARNING: ufs is forcing a ufs log error
WARNING: ufs log for /tmp/TSufs/ufstest2 changed state to Error
WARNING: Please umount(1M) /tmp/TSufs/ufstest2 and run fsck(1M)

panic[cpu0]/thread=ffffffff81d5e520: 
BAD TRAP: type=e (#pf Page fault) rp=fffffe80008d3700 addr=8 occurred in module "ufs" due to a NULL pointer dereference

test_ufs_ops: 
#pf Page fault
Bad kernel fault at addr=0x8
pid=2462, pc=0xfffffffffbad4584, sp=0xfffffe80008d37f0, eflags=0x10282
cr0: 8005003b<pg,wp,ne,et,ts,mp,pe> cr4: 6f0<xmme,fxsr,pge,mce,pae,pse>
cr2: 8 cr3: 9bea000 cr8: c
        rdi: ffffffff88b81900 rsi:                5 rdx:            3ef04
        rcx: ffffffff881bcb68  r8:                0  r9:                0
        rax: ffffffff81d5e520 rbx: ffffffff81d5e520 rbp: fffffe80008d3820
        r10: fb0082c51b2001ff r11: fffffffffbcc1e00 r12: ffffffff8a466b40
        r13: ffffffff81d91700 r14:                0 r15: ffffffff8870b778
        fsb: ffffffff80000000 gsb: fffffffffbc228e0  ds:               43
         es:               43  fs:                0  gs:              1c3
        trp:                e err:                0 rip: fffffffffbad4584
         cs:               28 rfl:            10282 rsp: fffffe80008d37f0
         ss:               30

fffffe80008d3600 unix:die+dd ()
fffffe80008d36f0 unix:trap+67e ()
fffffe80008d3700 unix:cmntrap+13f ()
fffffe80008d3820 ufs:ufs_flush+127 ()
fffffe80008d3860 ufs:ufs_fiosdio+fd ()
fffffe80008d3dd0 ufs:ufs_ioctl+61f ()
fffffe80008d3de0 genunix:fop_ioctl+b ()
fffffe80008d3ec0 genunix:ioctl+1bc ()
fffffe80008d3f10 unix:sys_syscall32+101 ()

> ::status
debugging crash dump vmcore.0 (64-bit) from kickass.Central.Sun.COM
operating system: 5.10.1 fs-test:0520 (i86pc)

ffffffff81d5e520 ffffffff8307e4a8 ffffffff83168200   1   0                0
  PC: panicsys+0x6e    CMD: /opt/TSutil/bin/test_ufs_ops /dev/dsk/c0d0s0 /tmp/TSufs/ufstest /dev/rdsk/c0d0s
  stack pointer for thread ffffffff81d5e520: fffffe80008d34b0
    param_preset()
    die+0xdd()
    trap+0x67e()
    _cmntrap+0x13f()
    ufs_flush+0x127()
    ufs_fiosdio+0xfd()
    ufs_ioctl+0x61f()
    fop_ioctl+0xb()
    ioctl+0x1bc()
    sys_syscall32+0x101()

ffffffff8165e560 ffffffff8116dc30 ffffffff8fa88220   1  29 ffffffff8da0fa0a
  PC: _resume_from_idle+0xde    CMD: mount -o remount,logging /dev/dsk/c0d0s0 /tmp/TSufs/ufstest
  stack pointer for thread ffffffff8165e560: fffffe80009b36d0
  [ fffffe80009b36d0 _resume_from_idle+0xde() ]
    swtch+0x136()
    cv_wait+0x5b()
    rwst_enter_common+0x225()
    rwst_enter+0xe()
    vfs_lock_wait+0x16()
    lufs_enable+0x29f()
    ufs_fiologenable+0x54()
    ufs_ioctl+0x7a7()
    fop_ioctl+0xb()
    ioctl+0x1bc()
    sys_syscall32+0x101()

ufs_flush+0x102:                call   +0x1eb6  <ufs_delete_drain>
ufs_flush+0x107:                movq   %r12,%rdi
ufs_flush+0x10a:                call   +0x26d0  <ufs_idle_drain>
ufs_flush+0x10f:                cmpl   $0x0,-0x28(%rbp)
ufs_flush+0x113:                jne    +0xa     <ufs_flush+0x11d>
ufs_flush+0x115:                movq   %r12,%rdi
ufs_flush+0x118:                call   -0xe9a0  <ufs_checkclean>
ufs_flush+0x11d:                cmpq   $0x0,0x180(%r13)
ufs_flush+0x125:                je     +0x7e    <ufs_flush+0x1a3>
ufs_flush+0x127:                testb  $0x4,0x8(%r14)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

%rip:

> fffffffffbad4584/ai
ufs_flush+0x127:testb  $0x4,0x8(%r14)  ### check for LDL_NOROLL  

%r14:

r14:                0 

233 #define LDL_NOROLL      0x0004  /* Log Not Yet Rollable */ 

my fix does:
 
348         if (TRANS_ISTRANS(ufsvfsp)) {
349                 if (ul->un_flags & LDL_NOROLL) {

the filesystem in question shows:

> ::fsinfo
            VFSP FS              MOUNT
ffffffff8a466b40 ufs             /tmp/TSufs/ufstest

> 0xffffffff81d91700::print ufsvfs_t
{
    vfs_vfs = 0xffffffff8a466b40

    vfs_ulockfs = {
        ul_flag = 0xc   ### ULOCKFS_NOIDEL | ULOCKFS_NOIACC 
        ul_fs_lock = 0x2     ### ULOCKFS_WLOCK  set by lufs_enable() !     
        ul_fs_mod = 0x1
        ul_vnops_cnt = 0  
        ul_lock = {
            _opaque = [ 0xffffffff81d5e520 ]  ### ufs_flush() panic thread !
        }
        ul_cv = {
            _opaque = 0
        }
        ul_sbowner = 0xffffffffffffffff  ### set to -1 from lufs_alloc()
        ul_lockfs = {
            lf_lock = 0x1  LOCKFS_WLOCK ### set by lufs_enable() !
            lf_flags = 0
            lf_key = 0x5
            lf_comlen = 0
            lf_comment = 0
        }
    }

    vfs_log = 0xffffffff8a2cb080  !!!

> 0xffffffff8a2cb080::print ml_unit_t
{
    un_next = 0
    un_flags = 0
    un_bp = 0xfffffe80cd48b940
    un_ufsvfs = 0xffffffff81d91700
    un_dev = 0x6600000000
    un_ebp = 0xffffffff827fd268
    un_nbeb = 0x20
    un_deltamap = 0xffffffff821d2440
    un_logmap = 0xffffffff88b81900

so we do have a log as vfs_log tells us, regardless my fix panics

348         if (TRANS_ISTRANS(ufsvfsp)) {
349                 if (ul->un_flags & LDL_NOROLL) {

it should have failed the TRANS_ISTRANS() check if logging is disabled and
thus no ml_unit_t hangs off the vfs_log field, however we obviously passed
that check, but panic while accessing ufsvfsp->vfs_log->un_flags because
it's 0.

How can this happen, quite easy, ufs_flush() sets ul:

276 ufs_flush(struct vfs *vfsp)
277 {
278         int             error;
279         int             saverror = 0;
280         struct ufsvfs   *ufsvfsp        = (struct ufsvfs *)vfsp->vfs_data;
281         struct fs       *fs             = ufsvfsp->vfs_fs;
282         ml_unit_t       *ul     = ufsvfsp->vfs_log;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
at this point we must have gotten our 0 for ul from vfs_log in ufsvfs_t

later on we check for logging enabled via:

348         if (TRANS_ISTRANS(ufsvfsp)) {

which does again dereference and check ufsvfs->vfs_log

#define TRANS_ISTRANS(ufsvfsp)  (ufsvfsp->vfs_log) 

now we attempt to dereference the un_flags from ml_unit_t that should
be pointed too by ufsvfs->vfs_log, ul=0, un_flags at offset 0x8 within it
and bumm !

The reason this happened is because when we initilized *ul from ufsvfs->vfs_log
it was in fact 0 and logging was disabled, however we then raced with the 
thread that enabled logging and our 2nd dereference of ufsvfs->vfs_log via
TRANS_ISTRANS() succeeded !

okay:

ufs_fiosdio() has done, note it only checks for H/W lockfs lock

 328         /* hold the mutex to prevent race with a lockfs request */ 
 329         vfs_lock_wait(vp->v_vfsp); 
 330         mutex_enter(&ulp->ul_lock); 
[...]
 346         if (error = ufs_flush(vp->v_vfsp)) 
 
 
ufs_fiologenable()->lufs_enable() has done:

### write lock fs

 933         lf.lf_lock = LOCKFS_WLOCK; 
 934         lf.lf_flags = 0; 
 935         lf.lf_comment = NULL; 
 936         error = ufs_fiolfs(vp, &lf, 1);  ### this has set the write lock we have

### lufs_snarf inits a new ml_unit_t and hangs it off ufsvfsp->vfs_log 

 991          * Create all of the incore structs 
 992          */ 
 993         error = lufs_snarf(ufsvfsp, fs, 0);  
 
1010         vfs_lock_wait(vfsp);   ### blocked here
1011         vfs_setmntopt(vfsp, MNTOPT_LOGGING, NULL, 0); 
1012         ulp = &ufsvfsp->vfs_ulockfs; 
1013         mutex_enter(&ulp->ul_lock); 
1014         (void) ufs_quiesce(ulp); 

this shows how the race works, ufs_fiosdio() does hold any appropriate locks,
it owns the vfs_lock and the ulockfs->ul_lock mutex !
lufs_enable() however, yes it applied the write lock, but does all the work
in lufs_alloc() and lufs_snarf() ! without proper locking and syncronisation,
the grab of the vfs_lock and of the ulockfs->ul_lock is way too late, thus
it raced with out call to ufs_flush() and initilized the incore logging releated
fields/structures behind our back.

 xxxxx@xxxxx.com 2005-06-03 13:09:42 GMT
Work Around
N/A
Comments
N/A