Description
so i just image-update'd from opensolaris snv_99 to snv_101a. now, i'm having
multiple problems with gdm.
the first issue is that the gdm service has its own copy of dbus that doesn't exit
when the service is disabled. this dbus-daemon hangs around until smf kills it and
puts the service into the maintenance state. the gdm-binary process associated with
the gdm service contract does get killed as expected when the service is disabled.
---8<---
root@mcescher$ svcs -p gdm
STATE STIME FMRI
online 12:00:44 svc:/application/graphical-login/gdm:default
12:00:44 101048 dbus-daemon
12:00:44 101049 gdm-binary
root@mcescher$ svcadm disable gdm
root@mcescher$ svcs -p gdm
STATE STIME FMRI
online* 12:00:58 svc:/application/graphical-login/gdm:default
12:00:44 101048 dbus-daemon
<... wait ...>
root@mcescher$ svcs -p gdm
STATE STIME FMRI
maintenance 12:01:59 svc:/application/graphical-login/gdm:default
---8<---
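for anyone trying to reproduce the first issue, here is a rough sketch of a helper that pulls the leftover processes out of svcs -p output. the function name contract_procs is made up for illustration, and it assumes the three-column process lines shown above (start time, pid, command):

```shell
# contract_procs: read `svcs -p` output on stdin and print "pid command"
# for every process still attached to the service contract.
# (hypothetical helper, not part of smf)
contract_procs() {
    # process lines are "STIME PID CMD"; the header and the service line
    # never have an all-numeric second field, so they are skipped.
    awk 'NF >= 3 && $2 ~ /^[0-9]+$/ { print $2, $3 }'
}
```

running svcs -p gdm | contract_procs shortly after svcadm disable gdm should print only the dbus-daemon line if the problem is present, and nothing once the contract is empty.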
the second problem is that i never get a gdm login screen. the nvidia logo flashes
up on the screen, and then my display goes back to console mode (as indicated
by a flashing cursor at the upper left of the screen). the gdm service does not
go into maintenance mode. when i log in via ssh to look around, i see that there
is no X server running; the only things running are dbus-daemon and gdm-binary. if
i remove my /etc/X11/xorg.conf file, gdm starts up just fine, but i only have
one display active. at first i assumed this was an nvidia driver problem, so i
upgraded to the latest nvidia drivers. that didn't help, so i downgraded to the
snv_99 nvidia drivers, and that didn't make a difference either. i compared the
following log files:
/var/adm/Xorg.0.log
/var/gdm/*
from failed and successful gdm startups with different /etc/X11/xorg.conf files
to try and isolate the problem, all to no avail. eventually i figured out that
the only thing that caused the problem was having two screens. if i took the same
config file and removed one screen, everything worked. so i decided to truss the
Xorg server to see why it was exiting. i used the following commands:
---8<---
dtrace -w -n 'syscall:::return/execname == "Xorg"/{stop(); exit(0)}'
truss -o Xorg.truss.log -p `pgrep -x Xorg`
---8<---
looking at the truss data i saw that Xorg was getting a SIGTERM,
so i wanted to figure out where the SIGTERM was coming from:
---8<---
root@mcescher$ dtrace -n 'proc:::signal-send/args[2] == 15/{trace(execname)}'
dtrace: description 'proc:::signal-send' matched 1 probe
CPU ID FUNCTION:NAME
0 1095 sigtoproc:signal-send gdm-binary
---8<---
so gdm is killing Xorg. to try and figure out why, i stopped gdm-binary
at the point where it was sending the signal:
---8<---
dtrace -w \
-n 'proc:::signal-send/execname == "gdm-binary" && args[2] == 15/{
stop(); trace(pid); exit(0);}'
---8<---
using mdb i then found the following stack trace:
---8<---
> $c1
gdm_server_stop+0xdc(80f1c00)
gdm_slave_quick_exit+0xf3(4) - 4 == DISPLAY_ABORT
gdm_slave_start+0x23c(80f1c00)
gdm_display_manage+0x1de(80f1c00)
gdm_start_first_unborn_local+0x74(0)
main+0x69e(1)
_start+0x7a(1)
---8<---
so then i tried the trick i learned in:
http://defect.opensolaris.org/bz/show_bug.cgi?id=3316
which involved using gdmsetup to enable gdm DEBUG output. (the log file
has been attached to this bug report.) when i checked my log files i noticed
the following at the end of the log file:
---8<---
Nov 6 12:24:05 DEBUG: Handling message: 'START_NEXT_LOCAL'
Nov 6 12:24:05 WARNING: Xinerama active, but <= 0 screens?
Nov 6 12:24:05 DEBUG: slave killing self
Nov 6 12:24:05 DEBUG: term_quit: Final cleanup
Nov 6 12:24:05 DEBUG: gdm_slave_quick_exit: Will kill everything from the display
Nov 6 12:24:05 DEBUG: gdm_server_stop: Server for :0 going down!
Nov 6 12:24:05 DEBUG: gdm_server_stop: Killing server pid 101538
Nov 6 12:24:06 DEBUG: gdm_server_stop: Server pid 101538 dead
Nov 6 12:24:06 DEBUG: gdm_slave_quick_exit: Killed everything from the display
Nov 6 12:24:06 DEBUG: mainloop_sig_callback: Got signal 18
Nov 6 12:24:06 DEBUG: gdm_cleanup_children: child 101537 returned 4
Nov 6 12:24:06 DEBUG: gdm_display_unmanage: Stopping :0 (slave pid: 0)
Nov 6 12:24:06 DEBUG: gdm_display_unmanage: Display stopped
---8<---
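since the debug log is long, a small filter helps to see which warnings show up and how often. this is only a sketch: the function name gdm_warning_counts is hypothetical, it reads the log on stdin, and it assumes the "WARNING: " prefix format shown above and an awk that has sub() (nawk or /usr/xpg4/bin/awk on solaris):

```shell
# gdm_warning_counts: read a gdm debug log on stdin and print each distinct
# WARNING message once, preceded by the number of times it appeared.
# (hypothetical helper; the log format is assumed from the excerpt above)
gdm_warning_counts() {
    awk '
        / WARNING: / {
            sub(/^.* WARNING: /, "")   # drop the timestamp and the prefix
            count[$0]++
        }
        END {
            for (msg in count)
                printf "%d\t%s\n", count[msg], msg
        }
    '
}
```

feed it whatever file the DEBUG output ends up in on your system; the order of the output lines is not defined when there is more than one distinct message.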
Interestingly enough, the Xinerama warning was repeated in my log files. so
i tried adding the following to my xorg.conf file:
---8<---
Section "ServerFlags"
Option "Xinerama" "0"
EndSection
---8<---
Unfortunately, this didn't help and gdm-binary still thinks that Xinerama is
active. i searched all the files delivered with gdm-binary and found these
two:
/usr/share/gdm/defaults.conf
/usr/share/gdm/factory-defaults.conf
which had "XineramaScreen=0" commented out, but uncommenting these lines
didn't make any difference.
sigh.
looking at the gdm code i see that we're dying here in daemon/slave.c:
---8<---
static void
gdm_screen_init (GdmDisplay *display)
{
	...
	gboolean have_xinerama = FALSE;
	have_xinerama = XQueryExtension (display->dsp,
					 "XINERAMA",
					 &opcode,
					 &firstevent,
					 &firsterror);
	if (have_xinerama) {
		int result;
		XRectangle monitors[MAXFRAMEBUFFERS];
		unsigned char hints[16];
		int xineramascreen;
		result = XineramaGetInfo (display->dsp, 0, monitors, hints, &n_monitors);
		/* Yes I know it should be Success but the current implementation
		 * returns the num of monitor
		 */
		if G_UNLIKELY (result <= 0)
			gdm_fail ("Xinerama active, but <= 0 screens?");
---8<---
well, after digging around i found out that on snv_99 both the
XQueryExtension() and XineramaGetInfo() calls return non-zero
values, whereas on snv_101a XineramaGetInfo() returns zero.
i don't know why this happens though...