|
Description
|
I have been running the build test on fw_08 and I've hit several segmentation faults.
core.akd.100155 core.fmd.100451
core.akd.100588 core.in.routed.100426
core.akd.100155 occurred at the same time as core.fmd.100451, around the same time as a reboot. (Note - core.akd.100155 will be named core.akd/<bugnumber.1> when copied to /net/mdb.sfbay/cores/fishworks)
x4200-07a# mdb core.akd.100155
Loading modules: [ libumem.so.1 libak.so.1 libc.so.1 libsysevent.so.1 libnvpair.so.1 libtopo.so.1 libuutil.so.1 libavl.so.1 libproc.so.1 ld.so.1 ]
> ::stack
libxml2.so.2`xmlFreeNode+0x69(87058d0)
libshare_nfs.so.1`free_protoprops+0x20(8047d40, fe18f095, 86afcf0, fe1a2000,
feef8dff, 0)
libshare_nfs.so.1`nfs_fini+0xb(86afcf0, fe1a2000, feef8dff, 0, 84eb888, 8711148
)
libshare.so.1`proto_plugin_fini+0x35(82d0c48, fea66000, fef51098, 8116008,
82d0c48, 8047d7c)
libshare.so.1`sa_fini+0x5f(86afcf0)
libzfs.so.1`zfs_uninit_libshare+0x32(82d0c48)
libzfs.so.1`libzfs_fini+0x48(82d0c48)
libak.so.1`ak_zfs_fini+0x23(8116008, 85fb4e8)
libak.so.1`akx_fini+0x4f(8116008)
libak.so.1`ak_fini+0x10e(8116008)
main+0x26b(1, 8047eac, 8047eb4)
_start+0x7a(1, 8047f38, 0, 8047f48, 8047f60, 8047f88)
> ::status
debugging core file of akd (32-bit) from x4200-07a
file: /usr/lib/ak/akd
initial argv: /usr/lib/ak/akd
threading model: native threads
status: process terminated by SIGSEGV (Segmentation Fault)
<<eliding fmd core tracked by 6586870>>
<<eliding in.routed core tracked by 6587073>>
The last one is another akd core file that occurred during the aktest run (core.akd.100588 will be named core.akd/<bugnumber.2> when copied to /net/mdb.sfbay/cores/fishworks)
x4200-07a# mdb core.akd.100588
Loading modules: [ libumem.so.1 libak.so.1 libc.so.1 libsysevent.so.1 libnvpair.so.1 libtopo.so.1 libuutil.so.1 libavl.so.1 libproc.so.1 ld.so.1 ]
> ::status
debugging core file of akd (32-bit) from x4200-07a
file: /usr/lib/ak/akd
initial argv: /usr/lib/ak/akd
threading model: native threads
status: process terminated by SIGSEGV (Segmentation Fault)
> ::stack
libxml2.so.2`xmlFreeNode+0x69(32697880)
libshare_nfs.so.1`free_protoprops+0x20(8047d40, fe18f095, 15f69008, fe1a2000,
feef8dff, 0)
libshare_nfs.so.1`nfs_fini+0xb(15f69008, fe1a2000, feef8dff, 0, 252cb948,
12c63e40)
libshare.so.1`proto_plugin_fini+0x35(87bac88, fea66000, fef51098, 8116008,
87bac88, 8047d7c)
libshare.so.1`sa_fini+0x5f(15f69008)
libzfs.so.1`zfs_uninit_libshare+0x32(87bac88)
libzfs.so.1`libzfs_fini+0x48(87bac88, 8116008, f0106748, f0104000, 8047dc8,
f00aaf0b)
libak.so.1`ak_nas_fini+0x74(8116008, 83751a8)
libak.so.1`akx_fini+0x4f(8116008)
libak.so.1`ak_fini+0x10e(8116008)
main+0x26b(1, 8047eac, 8047eb4)
_start+0x7a(1, 8047f38, 0, 8047f48, 8047f60, 8047f88)
>
<<moved output to comments>>
Hit another akd segv today. This time I was running on a Thumper and trying to add a new administrator without using a directory. I filled in the fields and pressed OK, and after waiting a while I got the popup window saying that the script had been running for a long time and asking whether I wanted to let the script continue or cancel it. I pressed continue, and then I got kicked back to the login screen. The akd core file corresponds to this time period.
x4500-01a# cd /var/ak/core
x4500-01a# mdb core.akd.100210
Loading modules: [ libumem.so.1 libak.so.1 libc.so.1 libsysevent.so.1 libnvpair.so.1 libtopo.so.1 libuutil.so.1 libavl.so.1 libproc.so.1 ld.so.1 ]
> ::status
debugging core file of akd (32-bit) from
file: /usr/lib/ak/akd
initial argv: /usr/lib/ak/akd
threading model: native threads
status: process terminated by SIGSEGV (Segmentation Fault)
> ::stack
libak.so.1`ak_user_create+0x55e(fb61f00c, 8727390)
libak.so.1`akx_invoke+0x92(fb61f00c, f0108d68, f0108e18, 8727390)
libak.so.1`akx_call+0x7c(fb61f00c)
libak.so.1`akx_rpc_svc+0x92(8116008, 8c0c60c, fb61fbd4, 22c, 8462b48, fb61fa64)
libak.so.1`ak_rpc_svc+0xe4(8116008, 8c0c60c, fb61fa5c, 8462b48, fb61fa64, 0)
libak.so.1`ak_frontdoor+0x70(8c0c5e8, fb61fa5c, 3a4, 0, 0)
libak.so.1`ak_door_serve+0xd1(8c0c5e8, fb61fa5c, 3a4, 0, 0, f0042c0c)
libc.so.1`__door_return+0x52()
>
The core file has been copied to the corefile server - its name is core.akd.6586857.3
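Focusing on the libshare bug (core.akd.6586857.2), we are attempting to double-free the same XML node. The node passed to xmlFreeNode is already filled with the 0xdeadbeef freed-buffer pattern, and ::whatis shows that its buffer was freed earlier from the same free_protoprops path: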
> $C
08047d00 libxml2.so.2`xmlFreeNode+0x69(32697880)
08047d18 libshare_nfs.so.1`free_protoprops+0x20(8047d40, fe18f095, 15f69008,
fe1a2000, feef8dff, 0)
08047d20 libshare_nfs.so.1`nfs_fini+0xb(15f69008, fe1a2000, feef8dff, 0,
252cb948, 12c63e40)
08047d40 libshare.so.1`proto_plugin_fini+0x35(87bac88, fea66000, fef51098,
8116008, 87bac88, 8047d7c)
08047d5c libshare.so.1`sa_fini+0x5f(15f69008)
08047d7c libzfs.so.1`zfs_uninit_libshare+0x32(87bac88)
08047d98 libzfs.so.1`libzfs_fini+0x48(87bac88, 8116008, f0106748, f0104000,
8047dc8, f00aaf0b)
08047dc4 libak.so.1`ak_nas_fini+0x74(8116008, 83751a8)
08047dfc libak.so.1`akx_fini+0x4f(8116008)
08047e18 libak.so.1`ak_fini+0x10e(8116008)
08047e7c main+0x26b(1, 8047eac, 8047eb4)
08047ea0 _start+0x7a(1, 8047f38, 0, 8047f48, 8047f60, 8047f88)
> 32697880/X
0x32697880: deadbeef
> 32697880::whatis
32697880 is 32697878+8, bufctl 32698ea8 freed from umem_alloc_80
> 32698ea8::bufctl -v
            ADDR          BUFADDR        TIMESTAMP           THREAD
                            CACHE          LASTLOG         CONTENTS
        32698ea8         32697878      365f40053d8                1
                          80d2010          80736a4          808db00
libumem.so.1`umem_cache_free_debug+0x135
libumem.so.1`umem_cache_free+0x42
libumem.so.1`umem_free+0xd8
libumem.so.1`process_free+0x55
libumem.so.1`free+0x17
libxml2.so.2`xmlFreeNode+0x16a
libshare_nfs.so.1`free_protoprops+0x20
libshare_nfs.so.1`nfs_fini+0xb
libshare.so.1`proto_plugin_fini+0x35
libshare.so.1`sa_fini+0x5f
libzfs.so.1`zfs_uninit_libshare+0x32
libzfs.so.1`libzfs_fini+0x48
libak.so.1`ak_nas_fini+0x74
libak.so.1`akx_fini+0x4f
libak.so.1`ak_fini+0x10e
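
The traces above show that the buffer was already freed by the time xmlFreeNode touched it again, but they do not show why the node was visited twice. As a minimal sketch only, here is one common defensive pattern for teardown code: free the object once and clear the pointer so that a repeated fini call becomes a no-op. Every name below (prop_node, protoprops, props_init, props_fini, demo_teardown) is hypothetical and is not taken from libshare_nfs or libxml2; plain malloc-family calls stand in for the XML node allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the property node; not the real libshare_nfs type. */
struct prop_node {
    int placeholder;
};

static struct prop_node *protoprops; /* module-wide state, as a plugin might keep it */
static int free_count;               /* counts real frees, for this demonstration only */

/* Hypothetical init: allocate the node once. */
static int
props_init(void)
{
    if (protoprops == NULL)
        protoprops = calloc(1, sizeof (*protoprops));
    return (protoprops != NULL ? 0 : -1);
}

/* Idempotent teardown: free once, then clear the pointer. */
static void
props_fini(void)
{
    if (protoprops != NULL) {
        free(protoprops);
        free_count++;
        protoprops = NULL; /* a later call sees NULL and does nothing */
    }
}

/* Simulates the same fini routine being reached twice during shutdown. */
int
demo_teardown(void)
{
    free_count = 0;
    if (props_init() != 0)
        return (-1);
    props_fini(); /* first teardown */
    props_fini(); /* repeated teardown: no second free */
    assert(protoprops == NULL);
    return (free_count);
}
```

Without the NULL assignment in props_fini, the second call would hand an already-freed pointer to free(), which is the class of failure that libumem's debugging flags in the bufctl output. Whether the actual fix belongs in free_protoprops or in its callers is not something this sketch can determine.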
|