OpenSolaris

Bug ID 6631419
Synopsis gtk-update-icon-cache dies on first boot after install/upgrade
State 10-Fix Delivered:Verified (Fix available in build)
Category:Subcategory gnome:gtk+
Keywords sst-osp
Responsible Engineer Erwann Chenede
Reported Against snv_76 , snv_77 , snv_82 , snv_84 , snv_79a , snv_79b
Duplicate Of
Introduced In
Commit to Fix snv_86
Fixed In snv_86
Release Fixed solaris_nevada(snv_86)
Related Bugs 6648191 , 6666675 , 6668791 , 6670830 , 6672606 , 6674110 , 6677529 , 6680197
Submit Date 19-November-2007
Last Update Date 20-March-2008
Description
after live upgrading my system recently i've noticed that gtk-update-icon-cache
core dumps occasionally.   i've seen this happen twice, with different invocations
but the same stack trace.  here are two examples:
---8<---
root@squee$ (pstack core.gtk-update-icon-.2885.0; pstack core.gtk-update-icon-.2945.0) | sed 's:(.*):():'
core 'core.gtk-update-icon-.2885.0' of 2885:    /usr/bin/gtk-update-icon-cache /usr/share/icons/Tango
 d16f2ee5 _lwp_kill () + 15
 d16a7722 raise    () + 22
 d168586d abort    () + cd
 d15afa7c g_logv   () + 340
 d15afaa5 g_log    () + 25
 0805456c write_dir_index () + dc
 080547f9 write_file () + 1a5
 080549ae build_cache () + ba
 08054cfd main     () + 171
 080523be _start   () + 7a
core 'core.gtk-update-icon-.2945.0' of 2945:    /usr/bin/gtk-update-icon-cache /usr/share/icons/nimbus
 feef2ee5 _lwp_kill () + 15
 feea7722 raise    () + 22
 fee8586d abort    () + cd
 fedafa7c g_logv   () + 340
 fedafaa5 g_log    () + 25
 0805456c write_dir_index () + dc
 080547f9 write_file () + 1a5
 080549ae build_cache () + ba
 08054cfd main     () + 171
 080523be _start   () + 7a
---8<---

i've attached a core file of the crash to this bug report to allow
for further debugging.
During SST testing we found that gtk-update-icon-cache dumped core on an M5000 OPL system. The crash happened as part of a reboot of the system. See comments for the core.

Jan 21 15:04:03 m5000 genunix: [ID 603404 kern.notice] NOTICE: core_log: gtk-update-icon-[10134] core dumped: /autohome/core/core.gtk-update-icon-.10134.1200927839

> ::status
debugging core file of gtk-update-icon (32-bit) from m5000
file: /usr/bin/gtk-update-icon-cache
initial argv: /usr/bin/gtk-update-icon-cache /usr/share/icons/hicolor
threading model: native threads
status: process terminated by SIGABRT (Abort)
> ::showrev
Hostname: m5000
Release: 5.11
Kernel architecture: sun4u
Application architecture: sparc
Kernel version: SunOS 5.11 sun4u snv_79a
Platform: SUNW,SPARC-Enterprise
> $c
libc.so.1`_lwp_kill+8(6, 0, 5, 6, ffffffff, 6)
libc.so.1`abort+0x108(0, 1, 6, ff353940, fbc5c, 0)
libglib-2.0.so.0.1400.2`g_logv+0x484(1570c, 6, 5, 6, 4, ff1e3404)
libglib-2.0.so.0.1400.2`g_log+0x1c(1570c, 4, 15754, 15780, 531, 15880)
write_dir_index+0x100(ff35589c, 524358, 2b230, 52434c, 1570c, 15754)
write_file+0xb4(ff35589c, 29e60, 2b230, 13c00, 72, aaaaaaab)
build_cache+0xdc(ffbffed7, ff0f1678, 2b230, 29e60, ff35589c, 26000)
main+0x1a4(27610, 0, ffbffed7, 26270, 0, ff380140)
_start+0x108(0, 0, 0, 0, 0, 0)
> write_dir_index+0x100::dis
write_dir_index+0xd8:           add       %l2, 0x380, %l1
write_dir_index+0xdc:           add       %o5, 0x354, %i5
write_dir_index+0xe0:           be        +0x28         <write_dir_index+0x108>
write_dir_index+0xe4:           add       %l0, 0x30c, %i4
write_dir_index+0xe8:           sethi     %hi(0x15800), %o7
write_dir_index+0xec:           add       %o7, 0x80, %o5
write_dir_index+0xf0:           mov       0x531, %o4
write_dir_index+0xf4:           mov       0x4, %o1
write_dir_index+0xf8:           mov       %l1, %o3
write_dir_index+0xfc:           mov       %i5, %o2
write_dir_index+0x100:          call      +0x110ec      <PLT=libglib-2.0.so.0.1400.2`g_log>
write_dir_index+0x104:          mov       %i4, %o0
write_dir_index+0x108:          ld        [%i2], %l3
write_dir_index+0x10c:          call      -0x1074       <find_string>
write_dir_index+0x110:          mov       %l3, %o0
write_dir_index+0x114:          add       %o0, 0x1, %l5
write_dir_index+0x118:          cmp       %l5, 0x1
write_dir_index+0x11c:          bgu       +0x28         <write_dir_index+0x144>
write_dir_index+0x120:          mov       %o0, %l4
write_dir_index+0x124:          mov       0x537, %o4
write_dir_index+0x128:          mov       0x4, %o1
> ::regs
%g0 = 0x0000000000000000                 %l0 = 0x00000000 
%g1 = 0x00000000000000a3                 %l1 = 0x00000012 
%g2 = 0x0000000000000002                 %l2 = 0x00001000 
%g3 = 0x0000000000028150                 %l3 = 0xff35069c 
%g4 = 0x0000000000028158                 %l4 = 0x00020000 
%g5 = 0x0000000000000000                 %l5 = 0x00000000 
%g6 = 0x0000000000000000                 %l6 = 0xffffffef 
%g7 = 0x00000000ff382a00                 %l7 = 0xffffffec 
%o0 = 0x0000000000000000                 %i0 = 0x00000006 
%o1 = 0xffffffffffffffff                 %i1 = 0x00000000 
%o2 = 0x0000000000000000                 %i2 = 0x00000005 
%o3 = 0x0000000000000000                 %i3 = 0x00000006 
%o4 = 0x00000000fffffffc                 %i4 = 0xffffffff 
%o5 = 0x0000000000000000                 %i5 = 0x00000006 
%o6 = 0x00000000ffbff518                 %i6 = 0xffbff578 
%o7 = 0x00000000ff27926c libc.so.1`raise+0xc %i7 = 0xff254b44 libc.so.1`abort+0x108

 %psr = 0xfe401000 impl=0xf ver=0xe icc=nZvc
                   ec=0 ef=4096 pil=0 s=0 ps=0 et=0 cwp=0x0
   %y = 0x00000000
  %pc = 0xff2cab10 libc.so.1`_lwp_kill+8
 %npc = 0xff2cab14 libc.so.1`_lwp_kill+0xc
  %sp = 0xffbff518
  %fp = 0xffbff578

 %wim = 0x00000000
 %tbr = 0x00000000
hm.  i've noticed a trend here.
i filed this bug.  it was closed as "not reproducible" with no comments.
someone else saw the same issue, reopened the bug, and once again it
was closed as "not reproducible" with no comments.  this type of back and
forth is silly so i'd like to point some things out.

- there is a bug here.  there are core dumps to prove it.
just because it can't be reproduced trivially doesn't mean the bug
doesn't exist.  so closing the bug report as not reproducible with no
comments is not helpful.  this behavior makes it seem like the responsible
group is ignoring bugs.  this in turn discourages people from filing bugs.
this in turn leads to the "quality death spiral".  (feel free to search
google for this term.)

- when updating the state of bugs, it's a good idea to include a description
of all the action taken on the bug.  in this case, obviously someone tried
to reproduce the problem before closing it as "not reproducible".  so what
was done?  was live upgrade testing done?  was the machine configured with
global core dumps enabled?  was the crashing application run by hand?  if
so what were the command line arguments?  was an analysis done on the core
dump?  if so what was discovered?  etc.

- so when is it ok to mark this bug as "not reproducible"?
the "not reproducible" state makes sense for bugs that have been sitting
in the "incomplete - need more information" state for a long time.  if
a bug is marked as "incomplete - need more information" and no one
can reproduce the problem, and (critically) -no one is seeing the problem
anymore- for a long time, then there's nothing that can be done to
analyze and fix the bug so it makes sense to close it as "not reproducible".
in the case of this bug it was never in the incomplete state and the problem
was last seen 6 days ago, which doesn't really qualify as a "long time".

- a more appropriate state for this bug would be "incomplete - need more info".
this update should also include a description of all the steps that were
taken while trying to reproduce the problem.  it should also include a
request for more information that can be used to debug the problem.
for example, the next time someone sees this core dump, what information
would you want from them to be able to debug the problem?  the next time
i do an upgrade, is there anything you'd like me to do before or after the
upgrade to collect more information about the failure?  etc.

given my comments above, i'm going to change the state of this bug to
dispatched so that we can try this all again.
wow.  gtk-update-icon-cache seems super fragile.
---8<---
edp@mcescher$ uname -a
SunOS mcescher 5.11 snv_81 i86pc i386 i86pc
root@mcescher$ cd /usr/share/icons 
root@mcescher$ gtk-update-icon-cache  --force
zsh: segmentation fault (core dumped)  gtk-update-icon-cache --force
root@mcescher$ pstack core
core 'core' of 677223:  gtk-update-icon-cache --force
 fee4a3c0 strlen   (8054ede, 80475f0, 8047500, 0) + 30
 fee8d930 vsnprintf (8047550, 1, 8054ec4, 80475f0) + 70
 fed8076f g_printf_string_upper_bound (8054ec4, 80475f0) + 27
 feda4af7 g_vasprintf (80475b0, 8054ec4, 80475f0) + 2f
 fed93e2a g_strdup_vprintf (8054ec4, 80475f0) + 2a
 fed8064d g_printerr (8054ec4, 0) + 2d
 08054e4f main     (1, 8047644, 8047650) + 2e3
 080523be _start   (2, 80477b0, 0, 0, 80477ce, 80477eb) + 7a
---8<---
well, i just live upgraded from snv_81 to snv_84, and guess what happened.
gtk-update-icon-cache core dumped again.  HUGE shocker, i know.
the stack trace is exactly the same as all the other stack traces.
---8<---
root@mcescher$ pargs -l core.gtk-update-icon-.122579.0
/usr/bin/gtk-update-icon-cache /usr/share/icons/Crux 
---8<---
Same bug on a fresh b85 installation on Mac VMWare fusion....

I find it hard to understand how such critical issues can appear in promoted Bxxx builds. Could you share with me what the testing plans are and how they are executed between b79, b80, etc.? Why this regression?
I have the same problem on both b84 and b85 on both VMWare (Fusion) and bare metal (v2100z).
Test case to simulate the install/upgrade problem :

copy /usr/share/icons/nimbus to /tmp (cp -r /usr/share/icons/nimbus /tmp)
remove the icon cache file in /tmp/nimbus (rm /tmp/nimbus/icon-theme.cache)
run two instances of the command gtk-update-icon-cache /tmp/nimbus concurrently
   (e.g. gtk-update-icon-cache /tmp/nimbus & then gtk-update-icon-cache /tmp/nimbus)

Error :

Gtk-ERROR **: file updateiconcache.c: line 1329: assertion failed: (offset == ftell (cache))
aborting...

Expected result :
Failed to open file /tmp/nimbus/.icon-theme.cache : File exists
Cache file created successfully.

NOTE : the fix to gtk-update-icon-cache itself only avoids the data corruption and crash,
so if multiple concurrent gtk-update-icon-cache processes still run on the
same directory during post install, an error will still be logged.
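
The "File exists" message in the expected result is what an exclusive create of the temporary cache file would produce when a second writer arrives. The following is only a minimal sketch of that general pattern in shell, not the actual gtk-update-icon-cache change. The function name write_cache_exclusive and the placeholder cache contents are hypothetical.

```shell
# Hypothetical helper, NOT the real gtk-update-icon-cache code.
# Builds DIR/icon-theme.cache through a temp file that is created
# exclusively, so a second concurrent writer fails cleanly instead of
# interleaving its output with the first one.
write_cache_exclusive() {
    dir=$1
    tmp="$dir/.icon-theme.cache"
    # With noclobber (set -C), '>' refuses to overwrite an existing file.
    if ! ( set -C; : > "$tmp" ) 2>/dev/null; then
        echo "Failed to open file $tmp : File exists" >&2
        return 1
    fi
    # Placeholder for the real cache contents.
    echo "cache for $dir" >> "$tmp"
    # Publish the finished file under its final name.
    mv "$tmp" "$dir/icon-theme.cache"
}
```

With this pattern, only the process that created the temporary file writes to it. Any other process sees the error and exits without touching the cache.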
hey erwann, thanks for actually fixing the gtk-update-icon-cache crashes
as well.  fyi, i've gone ahead and filed a new bug to track the remaining
issue wrt jds post install scripts not propagating errors back to smf:
	6677529 jds post install scripts ignore critical errors
Work Around
enable global core dumps:
	mkdir -p /var/core/`uname -n`
	coreadm -g "/var/core/%n/core.%f.%p.%u" -G all
	coreadm -e global -e global-setid -e log

then after you upgrade, check out all the gtk-update-icon-cache
core dumps in /var/core/`uname -n`.  for each core dump run
pargs to see what directory it was processing, and then for
each directory manually run gtk-update-icon-cache.
	gtk-update-icon-cache --force <icon_directory>

weee!  isn't this fun?
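
The manual procedure above could be scripted roughly as follows. This is a sketch under assumptions: it takes the icon directory to be the last word printed by `pargs -l`, as in the outputs shown in this report, and it skips cores whose last argument is not a directory. The function names and the PARGS_CMD and ICON_CACHE_CMD overrides are hypothetical and exist only to allow dry runs.

```shell
# Hypothetical helper: read `pargs -l` output on stdin and print the last
# whitespace-separated word, assumed to be the icon directory.
icon_dir_from_pargs() {
    awk 'NF { print $NF }'
}

# Rebuild the cache for each directory named by a gtk-update-icon-cache core.
# PARGS_CMD and ICON_CACHE_CMD are hypothetical overrides for dry runs.
rebuild_from_cores() {
    core_dir=${1:-/var/core/$(uname -n)}
    for core in "$core_dir"/core.gtk-update-icon-*; do
        [ -f "$core" ] || continue
        dir=$("${PARGS_CMD:-pargs}" -l "$core" | icon_dir_from_pargs)
        # Skip cores whose last argument is not a directory (e.g. "--force").
        [ -d "$dir" ] || continue
        "${ICON_CACHE_CMD:-gtk-update-icon-cache}" --force "$dir"
    done
}
```

Run as root after the upgrade, for example `rebuild_from_cores /var/core/$(uname -n)`.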
an easier workaround is to run the following commands as root:
---8<---
for d in /usr/share/icons/*; do
        # quote $d so directory names with spaces survive
        [ -d "$d" ] &&
                gtk-update-icon-cache --force "$d";
done
---8<---
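
Since the related bug 6677529 is about scripts ignoring errors, a variant of the loop above that reports failures through its exit status may be useful. This is a sketch only. The function name refresh_icon_caches and the ICON_CACHE_CMD override are hypothetical, and the override exists only to allow dry runs.

```shell
# Hypothetical wrapper around the workaround loop: rebuild every theme
# cache under BASE and return non-zero if any rebuild failed.
# ICON_CACHE_CMD is a hypothetical override, handy for dry runs.
refresh_icon_caches() {
    base=${1:-/usr/share/icons}
    cmd=${ICON_CACHE_CMD:-gtk-update-icon-cache}
    failed=0
    for d in "$base"/*; do
        [ -d "$d" ] || continue
        if ! "$cmd" --force "$d"; then
            echo "cache rebuild failed for $d" >&2
            failed=$((failed + 1))
        fi
    done
    # Exit status is 0 only when every directory was rebuilt successfully.
    [ "$failed" -eq 0 ]
}
```

A caller can then check the result, for example `refresh_icon_caches /usr/share/icons || exit 1`.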
Comments
N/A