OpenSolaris

Bug ID 6493689
Synopsis JVM crash: curthread set by kernel incorrect
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:amd64
Keywords
Responsible Engineer Sudheer Abdul-salam
Reported Against b03 , b10 , snv_56 , snv_67
Duplicate Of
Introduced In solaris_10
Commit to Fix snv_68
Fixed In snv_68
Release Fixed solaris_nevada(snv_68) , solaris_10u5(s10u5_08) (Bug ID:2155240)
Related Bugs 6491248 , 6500127 , 6501650 , 6514923 , 6519994 , 6530289 , 6538045 , 6576167
Submit Date 14-November-2006
Last Update Date 29-January-2008
Description
We had a JVM dump core in a production process at CBOE this morning. 

The VM is Java HotSpot(TM) Server VM (1.5.0_08-b03 mixed mode) running on a V40, Solaris 10.

When I map the error ID (4F533F534F4C415249530E4350500096) to the code, it points to line 150 of the following function in os_solaris.cpp:

141 Thread* ThreadLocalStorage::get_thread_via_cache_slowly(uintptr_t raw_id,
142                                                         int index) {
143   Thread *thread = get_thread_slow();
144   if (thread != NULL) {
145     address sp = os::current_stack_pointer();
146     guarantee(thread->_stack_base == NULL ||
147               (sp <= thread->_stack_base &&
148                  sp >= thread->_stack_base - thread->_stack_size) ||
149                is_error_reported(),
150               "sp must be inside of selected thread stack");
151 
152     thread->_self_raw_id = raw_id;  // mark for quick retrieval
153     _get_thread_cache[ index ] = thread;
154   }
155   return thread;
156 }
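
For reference, the mapping from the error ID to line 150 can be reproduced by hand. The sketch below assumes (this is an observation from the bytes, not a documented format) that the last two bytes of the ID hold the line number and that the leading bytes are a mangled, upper-cased file name. The helper names hex_to_bytes and decode_error_id are hypothetical and are not part of HotSpot.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: turn a string of hex digits into raw bytes.
std::string hex_to_bytes(const std::string& hex) {
    std::string out;
    for (std::size_t i = 0; i + 1 < hex.size(); i += 2) {
        out.push_back(static_cast<char>(std::stoi(hex.substr(i, 2), nullptr, 16)));
    }
    return out;
}

struct DecodedId {
    std::string tag;  // leading bytes, non-printable ones shown as '.'
    unsigned line;    // last two bytes read as a big-endian number
};

// Assumption: the ID is <mangled file name bytes><2-byte line number>.
DecodedId decode_error_id(const std::string& hex) {
    const std::string bytes = hex_to_bytes(hex);
    DecodedId d{"", 0};
    if (bytes.size() < 2) {
        return d;
    }
    const std::size_t n = bytes.size();
    d.line = (static_cast<unsigned char>(bytes[n - 2]) << 8) |
             static_cast<unsigned char>(bytes[n - 1]);
    for (std::size_t i = 0; i < n - 2; ++i) {
        const unsigned char c = static_cast<unsigned char>(bytes[i]);
        d.tag.push_back((c >= 0x20 && c < 0x7f) ? static_cast<char>(c) : '.');
    }
    return d;
}
```

Applied to the ID above, the trailing bytes 00 96 give 0x96 = 150, and the leading bytes read as OS?SOLARIS.CPP, which is consistent with os_solaris.cpp line 150. The same value 96 also shows up as the second argument to report_fatal in the native stack.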


Looking at the stack pointer sp (in the file hs_err_pid21796.log), it does seem to be outside the allocated stack.

Stack: [0x7e2c3000,0x7e2e3000),  sp=0x8486d51c,  free space=104105k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)

Also, the reported free space of 104105K is much larger than the 10240K stack limit set by the customer.
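
The numbers in the Stack: line can be checked against the guarantee at lines 146-149. The sketch below is only an illustration under two assumptions: that the interval [0x7e2c3000,0x7e2e3000) is [_stack_base - _stack_size, _stack_base), and that the reported free space is sp minus the low end of that interval, in units of 1024 bytes. The function names sp_in_stack and free_space_kb are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Same test as the guarantee: the stack grows down from base,
// so a valid sp lies in [base - size, base].
bool sp_in_stack(std::uint64_t sp, std::uint64_t base, std::uint64_t size) {
    return sp <= base && sp >= base - size;
}

// Assumed definition of "free space": distance from the low end of the
// stack up to sp, in units of 1024 bytes.
std::int64_t free_space_kb(std::uint64_t sp, std::uint64_t base, std::uint64_t size) {
    const std::int64_t low = static_cast<std::int64_t>(base - size);
    return (static_cast<std::int64_t>(sp) - low) / 1024;
}

// Values copied from the hs_err line quoted above.
constexpr std::uint64_t kStackLow  = 0x7e2c3000;            // low end of the interval
constexpr std::uint64_t kStackBase = 0x7e2e3000;            // high end, i.e. _stack_base
constexpr std::uint64_t kStackSize = kStackBase - kStackLow; // 0x20000 bytes = 128K
constexpr std::uint64_t kSp        = 0x8486d51c;            // reported sp
```

With these values sp_in_stack(kSp, kStackBase, kStackSize) is false because sp lies above _stack_base, and free_space_kb(kSp, kStackBase, kStackSize) gives 104105, matching the log. So the huge free space figure is just a side effect of sp pointing well above the stack that the selected Thread object describes.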

$ isainfo -v
64-bit amd64 applications
        sse3 sse2 sse fxsr amd_3dnowx amd_3dnow amd_mmx mmx cmov amd_sysc cx8 
        tsc fpu 
32-bit i386 applications
        sse3 sse2 sse fxsr amd_3dnowx amd_3dnow amd_mmx mmx cmov amd_sysc cx8 
        tsc fpu 

13:14:52 mdcas02.infrap /sbt/prod/infra/QAINFRA_CRIT_8.0.3
$ ulimit -a
time(seconds)        unlimited
file(blocks)         unlimited
data(kbytes)         unlimited
stack(kbytes)        10240
coredump(blocks)     unlimited
nofiles(descriptors) 1024
vmemory(kbytes)      unlimited

If you are interested in the kernel settings, this is what I get from a box that should be identical to the production box:

$ dtrace -xnolibs  -n BEGIN'{printf("Default = %d LWP Default = %d\n", `default_stksize, `lwp_default_stksize);}'

dtrace: description 'BEGIN' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0      1                    :BEGIN Default = 20480 LWP Default = 24576


Attached is the tar file with the hs_err file, the stdout and stderr from the process, and the output from running pstack, pmap -x, pldd, pargs, and pargs -e on the core file.

The native stack looks like:

-----------------  lwp# 42357 / thread# 42357  --------------------
 fef509d7 _lwp_kill (a575, 6) + 7
 feefcf93 raise    (6) + 1f
 feee0a29 abort    (fedee000, 8486d5ac, fed19eca, 1, f4000a70, 8486d5c0) + cd
 fecbe37f __1cCosFabort6Fi_v_ (1) + 2f
 fed19eca __1cHVMErrorOreport_and_die6M_v_ (8486d5c0) + 58a
 feb1aa29 __1cMreport_fatal6Fpkci1_v_ (fed9c73b, 96, fed9c710) + 39
 fe9e80f7 __1cSThreadLocalStoragebBget_thread_via_cache_slowly6FIi_pnGThread__ (84a41400, 40) + 77
 fe8d9fed __1cKSharedHeapXfill_region_with_object6FnJMemRegion__v_ (8486d650) + 12d
 fe8cedc2 __1cWThreadLocalAllocBufferFreset6M_v_ (8b28114) + 52
 fe8d9c48 __1cWThreadLocalAllocBufferXclear_before_allocation6M_v_ (8b28114) + 38
 fe8d9abe __1cNCollectedHeapXallocate_from_tlab_slow6FpnGThread_I_pnIHeapWord__ (8b280d8, aa) + 8e
 fecfcc8c __1cNCollectedHeapbAcommon_mem_allocate_noinit6FIipnGThread__pnIHeapWord__ (aa, 0, 8b280d8) + 8c
 fe8bfe1b __1cOtypeArrayKlassIallocate6MipnGThread__pnQtypeArrayOopDesc__ (f4000438, 14c, 8b280d8) + 12b
 fe8bffeb __1cKoopFactoryNnew_typeArray6FnJBasicType_ipnGThread__pnQtypeArrayOopDesc__ (5, 14c, 8b280d8) + 2b
 fe8f7ab1 __1cLOptoRuntimePnew_typeArray_C6FnJBasicType_ipnKJavaThread__v_ (5, 14c, 8b280d8, 9e64b478, a5, 9e64b720) + 31
 f822f043 ???????? ()

Also check out the directory /net/cores.central/cores/dir33/831095 for all files and the core.
Work Around
N/A
Comments
N/A