Description
We had a JVM dump core in a production process at CBOE this morning.
The VM is Java HotSpot(TM) Server VM (1.5.0_08-b03 mixed mode) running on a V40, Solaris 10.
When I map the error ID (4F533F534F4C415249530E4350500096) to the code, it points to line 150 of the following function in os_solaris.cpp:
141 Thread* ThreadLocalStorage::get_thread_via_cache_slowly(uintptr_t raw_id,
142 int index) {
143 Thread *thread = get_thread_slow();
144 if (thread != NULL) {
145 address sp = os::current_stack_pointer();
146 guarantee(thread->_stack_base == NULL ||
147 (sp <= thread->_stack_base &&
148 sp >= thread->_stack_base - thread->_stack_size) ||
149 is_error_reported(),
150 "sp must be inside of selected thread stack");
151
152 thread->_self_raw_id = raw_id; // mark for quick retrieval
153 _get_thread_cache[ index ] = thread;
154 }
155 return thread;
156 }
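For reference, the error ID quoted above appears to carry the file name and line number directly. The sketch below assumes, based only on this one ID, that every byte except the last two is a file name character with 0x20 subtracted, and that the last two bytes are the line number in big-endian hex. The names ErrorLocation and decode_error_id are hypothetical and are not part of HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical result type: source file and line recovered from an error ID.
struct ErrorLocation {
    std::string file;
    unsigned long line;
};

// Assumed encoding (inferred from the single ID in this report):
//   - all bytes but the last two: file name characters minus 0x20
//   - last two bytes: line number, big-endian
ErrorLocation decode_error_id(const std::string& id) {
    ErrorLocation loc{"", 0};
    if (id.size() < 4 || id.size() % 2 != 0) {
        return loc;  // too short or odd length: nothing to decode
    }
    const std::size_t name_hex_len = id.size() - 4;
    for (std::size_t i = 0; i < name_hex_len; i += 2) {
        // Two hex digits make one encoded byte; add 0x20 to undo the shift.
        const int byte = std::stoi(id.substr(i, 2), nullptr, 16);
        loc.file.push_back(static_cast<char>(byte + 0x20));
    }
    loc.line = std::stoul(id.substr(name_hex_len), nullptr, 16);
    return loc;
}

// Check against the ID from this report.
void check_reported_id() {
    const ErrorLocation loc = decode_error_id("4F533F534F4C415249530E4350500096");
    assert(loc.file == "os_solaris.cpp");
    assert(loc.line == 150);
}
```

Under that assumption the ID decodes to os_solaris.cpp, line 150 (0x0096), which agrees with the guarantee() at line 150 in the listing.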
Looking at the stack pointer sp (in the file hs_err_pid21796.log), it does seem to be outside the allocated stack.
Stack: [0x7e2c3000,0x7e2e3000), sp=0x8486d51c, free space=104105k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
Also, the reported free space of 104105K is much larger than the 10240K stack limit configured by the customer.
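As a cross-check, the arithmetic behind these numbers can be sketched in C++. The helper names (sp_inside_stack, free_space_kb) are hypothetical and not part of HotSpot; the constants are taken from the hs_err line above. The sketch covers only the range part of the guarantee() condition, and it assumes the reported free space is simply sp minus the low end of the stack. That assumption reproduces the logged value but has not been confirmed against the VM source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper with the same shape as the range test in guarantee().
// stack_base is the high end of the stack; the stack grows down from it.
bool sp_inside_stack(std::uint64_t sp, std::uint64_t stack_base,
                     std::uint64_t stack_size) {
    return sp <= stack_base && sp >= stack_base - stack_size;
}

// Assumption: free space is reported as (sp - low end of stack), in KB.
std::uint64_t free_space_kb(std::uint64_t sp, std::uint64_t stack_low) {
    return (sp - stack_low) / 1024;
}

// Checks against the values logged in hs_err_pid21796.log.
void check_logged_values() {
    const std::uint64_t low  = 0x7e2c3000;  // Stack: [low, high)
    const std::uint64_t high = 0x7e2e3000;
    const std::uint64_t sp   = 0x8486d51c;

    assert(high - low == 0x20000);                   // stack range is 128K
    assert(!sp_inside_stack(sp, high, high - low));  // sp is outside the range
    assert(free_space_kb(sp, low) == 104105);        // matches the logged figure
}
```

With the logged values, the stack range spans 0x20000 bytes (128K) and sp lies roughly 101 MB above its high end. The formula still yields 104105K, so the large free space figure looks like a by-product of sp being outside the range rather than real headroom.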
$ isainfo -v
64-bit amd64 applications
sse3 sse2 sse fxsr amd_3dnowx amd_3dnow amd_mmx mmx cmov amd_sysc cx8
tsc fpu
32-bit i386 applications
sse3 sse2 sse fxsr amd_3dnowx amd_3dnow amd_mmx mmx cmov amd_sysc cx8
tsc fpu
13:14:52 mdcas02.infrap /sbt/prod/infra/QAINFRA_CRIT_8.0.3
$ ulimit -a
time(seconds) unlimited
file(blocks) unlimited
data(kbytes) unlimited
stack(kbytes) 10240
coredump(blocks) unlimited
nofiles(descriptors) 1024
vmemory(kbytes) unlimited
If you are interested in the kernel settings, this is what I get from a box that should be identical to the production box:
$ dtrace -xnolibs -n BEGIN'{printf("Default = %d LWP Default = %d\n", `default_stksize, `lwp_default_stksize);}'
dtrace: description 'BEGIN' matched 1 probe
CPU ID FUNCTION:NAME
0 1 :BEGIN Default = 20480 LWP Default = 24576
Attached is the tar file with the hs_err file, the stdout and stderr from the process, and the output from running pstack, pmap -x, pldd, pargs, and pargs -e on the core file.
The native stack looks like:
----------------- lwp# 42357 / thread# 42357 --------------------
fef509d7 _lwp_kill (a575, 6) + 7
feefcf93 raise (6) + 1f
feee0a29 abort (fedee000, 8486d5ac, fed19eca, 1, f4000a70, 8486d5c0) + cd
fecbe37f __1cCosFabort6Fi_v_ (1) + 2f
fed19eca __1cHVMErrorOreport_and_die6M_v_ (8486d5c0) + 58a
feb1aa29 __1cMreport_fatal6Fpkci1_v_ (fed9c73b, 96, fed9c710) + 39
fe9e80f7 __1cSThreadLocalStoragebBget_thread_via_cache_slowly6FIi_pnGThread__ (84a41400, 40) + 77
fe8d9fed __1cKSharedHeapXfill_region_with_object6FnJMemRegion__v_ (8486d650) + 12d
fe8cedc2 __1cWThreadLocalAllocBufferFreset6M_v_ (8b28114) + 52
fe8d9c48 __1cWThreadLocalAllocBufferXclear_before_allocation6M_v_ (8b28114) + 38
fe8d9abe __1cNCollectedHeapXallocate_from_tlab_slow6FpnGThread_I_pnIHeapWord__ (8b280d8, aa) + 8e
fecfcc8c __1cNCollectedHeapbAcommon_mem_allocate_noinit6FIipnGThread__pnIHeapWord__ (aa, 0, 8b280d8) + 8c
fe8bfe1b __1cOtypeArrayKlassIallocate6MipnGThread__pnQtypeArrayOopDesc__ (f4000438, 14c, 8b280d8) + 12b
fe8bffeb __1cKoopFactoryNnew_typeArray6FnJBasicType_ipnGThread__pnQtypeArrayOopDesc__ (5, 14c, 8b280d8) + 2b
fe8f7ab1 __1cLOptoRuntimePnew_typeArray_C6FnJBasicType_ipnKJavaThread__v_ (5, 14c, 8b280d8, 9e64b478, a5, 9e64b720) + 31
f822f043 ???????? ()
Also check out the directory /net/cores.central/cores/dir33/831095 for all files and the core.