OpenSolaris

Bug ID 6822628
Synopsis constype returns unexpected value for nvidia card, ogl-select picks generic GLX extension in snv_111
State 10-Fix Delivered:Verified (Fix available in build)
Category:Subcategory xserver:programs
Keywords
Responsible Engineer Alan Coopersmith
Reported Against snv_111
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_111a
Fixed In snv_111a
Release Fixed solaris_nevada(snv_111a)
Related Bugs 6815700 , 6822917 , 6827560
Submit Date 26-March-2009
Last Update Date 10-April-2009
Description
During SST testing of snv_111(RE) I encountered an issue with the X server
not loading the proper GLX extension on an Ultra 24 machine with an NVS 290 card.

The machine was upgraded via Live Upgrade from snv_110(RE), where no such
issue was seen, so this appears to be a regression in snv_111(RE).

From Xorg.0.log:
 (EE) NVIDIA(0): Failed to initialize the GLX module; please check in your X
 (EE) NVIDIA(0):     log file that the GLX module has been loaded in your X
 (EE) NVIDIA(0):     server, and that the module is the NVIDIA GLX module.  If
 (EE) NVIDIA(0):     you continue to encounter problems, Please try
 (EE) NVIDIA(0):     reinstalling the NVIDIA driver.

The SMF service svc:/application/opengl/ogl-select:default seems to pick the
wrong type of GLX extension; the symlink points to the generic Mesa library:

 > ls -l /var/run/opengl/server/libglx.so
 lrwxrwxrwx   1 root     root          46 Mar 26 14:57 /var/run/opengl/server/libglx.so -> /usr/X11/lib/modules/extensions/mesa/libglx.so*

Inspecting the method script shows that it expects a different value to be
returned by /usr/X/bin/constype:

 + /usr/X/bin/constype
 + DRIVER=NVDA
 + REGISTRY=/tmp/ogl_select9280
 + [ -e /tmp/ogl_select9280 ]
 + touch /tmp/ogl_select9280
 + [ -x /lib/opengl/ogl_select/mesa_vendor_select ]
 + /lib/opengl/ogl_select/mesa_vendor_select identify
 + 1>> /tmp/ogl_select9280
 + [ -x /lib/opengl/ogl_select/nvidia_vendor_select ]
 + /lib/opengl/ogl_select/nvidia_vendor_select identify
 + 1>> /tmp/ogl_select9280
 + [ -f /tmp/ogl_select9280 ]
 + readregistry
 + 0< /tmp/ogl_select9280
 + read DRIVERTMP VENDORTMP
 + [ SUNWtext = NVDA -a mesa !=  ]
 + read DRIVERTMP VENDORTMP
 + [ NVDAnvda = NVDA -a nvidia !=  ]
 + read DRIVERTMP VENDORTMP
 + getprop options/vendor
 + PROPVAL=
 + svcprop -q -p options/vendor application/opengl/ogl-select
 + [ 0 -eq 0 ]
 + svcprop -p options/vendor application/opengl/ogl-select
 + PROPVAL=notset
 + [ notset == "" ]
 + return
 + [ notset !=  -a notset != notset ]
 + /usr/bin/tr [A-Z] [a-z]
 + echo MESA
 + VENDOR=mesa
 + SELECT_SCRIPT=/lib/opengl/ogl_select/mesa_vendor_select
 + [ ! -x /lib/opengl/ogl_select/mesa_vendor_select ]
 + /lib/opengl/ogl_select/mesa_vendor_select

The script expects constype to return "NVDAnvda", but "NVDA" is returned instead.
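
The matching step seen in the trace can be sketched in C as follows. This is
only an illustration of why a truncated identifier falls through to Mesa, not
the method script itself: the names registry_entry, sample_registry and
pick_vendor are hypothetical, and the fallback to "mesa" is inferred from the
trace above rather than taken from the script source.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one line written by a "<vendor>_vendor_select identify" call. */
struct registry_entry {
    const char *driver;   /* identifier the vendor script claims, e.g. "NVDAnvda" */
    const char *vendor;   /* vendor to select when the identifier matches */
};

/* The two entries visible in the trace above. */
const struct registry_entry sample_registry[] = {
    { "SUNWtext", "mesa"   },
    { "NVDAnvda", "nvidia" },
};
const size_t sample_registry_len =
    sizeof (sample_registry) / sizeof (sample_registry[0]);

/*
 * Return the vendor whose registered identifier exactly equals the string
 * reported for the console device. The comparison is an exact match, as in
 * the trace ("[ NVDAnvda = NVDA ... ]"), so a truncated identifier such as
 * "NVDA" matches nothing and the assumed default "mesa" is returned.
 */
const char *
pick_vendor(const char *driver, const struct registry_entry *reg, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(reg[i].driver, driver) == 0 && reg[i].vendor[0] != '\0')
            return (reg[i].vendor);
    }
    return ("mesa");
}
```

With the sample registry, pick_vendor("NVDAnvda", ...) selects nvidia, while
pick_vendor("NVDA", ...) falls back to mesa, which is the behaviour observed
on snv_111.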

Full Xorg.0.log and xorg.conf are attached.
Work Around
Override the vendor string:
 # svccfg -s application/opengl/ogl-select setprop options/vendor = nvidia
You also need to restart the service:

 $ svcadm restart application/opengl/ogl-select

Then restart your X server; compiz should then function correctly.
Comments
One of the symptoms of the problem is compiz refusing to start:

 $ compiz
 /usr/bin/compiz-bin (core) - Fatal: Root visual is not a GL visual
 /usr/bin/compiz-bin (core) - Error: Failed to manage screen: 0
 /usr/bin/compiz-bin (core) - Fatal: No manageable screens found on display :0.0
Reproducible starting with b111, but this is not a regression in
the NVIDIA driver.

// b110
# /usr/X11/bin/constype
NVDAnvda
#

// b111 - same system and same NVIDIA driver as b110
# /usr/X11/bin/constype
NVDA
#

A simple test program shows neither the NVIDIA driver nor the
user space interface is broken.  On b111:

# cat a.c
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/visual_io.h>

int
main(int argc, char *argv[], char *envp[])
{
	int fd;
	struct vis_identifier v;

	if ((fd = open("/dev/fb", O_RDONLY)) == -1) {
		perror("/dev/fb");
		return (1);
	}

	if (ioctl(fd, VIS_GETIDENTIFIER, &v) == -1) {
		perror("VIS_GETIDENTIFIER");
		return (1);
	}

	printf("kernel driver returns <%s>\n", v.name);

	return (0);
}
# cc a.c
# ./a.out
kernel driver returns <NVDAnvda>
#

Something changed in /usr/X11/bin/constype delivered in the
SUNWxwplt package.  Transferring to X group.
This appears to be a latent bug that may have been triggered by
the build changes to reduce the size of binaries.  main() calls
wu_fbid() with the second argument char** fbname.  wu_fbid()
declares automatic storage for the structure passed to
the VIS_GETIDENTIFIER ioctl:


  static int
  wu_fbid(const char* devname, char** fbname, int* fbtype)
  {
          struct fbgattr fbattr;
          int fd, ioctl_ret;
  #ifdef VIS_GETIDENTIFIER
          int vistype;
          struct vis_identifier fbid;
  #endif

The compiler has the option to allocate this structure on the
stack and may be motivated to do so by the space optimizations
added recently.  The code then just passes this pointer back to
main():

  #ifdef VIS_GETIDENTIFIER
          if ((vistype = ioctl(fd, VIS_GETIDENTIFIER, &fbid)) >= 0) {
              *fbname = fbid.name;
              *fbtype = vistype;
              close(fd);
              return 0;
          }
  #endif

If the vis_identifier structure is on the stack, there is no
guarantee the data (*fbname) is valid after the stack is unwound
on the return to main().  This structure should be declared as
static.
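
A minimal sketch of the suggested fix, and of one alternative, is shown below.
It is not the constype source: struct fake_identifier and
fake_get_identifier() are hypothetical stand-ins for struct vis_identifier and
the VIS_GETIDENTIFIER ioctl so that the example builds on any system, and
fbid_static() and fbid_copy() are illustrative names.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct vis_identifier. */
struct fake_identifier {
    char name[32];
};

/* Hypothetical stand-in for ioctl(fd, VIS_GETIDENTIFIER, &id). */
static int
fake_get_identifier(struct fake_identifier *id)
{
    strncpy(id->name, "NVDAnvda", sizeof (id->name) - 1);
    id->name[sizeof (id->name) - 1] = '\0';
    return (0);
}

/*
 * Variant 1: the fix suggested above. The structure has static storage
 * duration, so the pointer stored in *fbname stays valid after return.
 */
int
fbid_static(const char **fbname)
{
    static struct fake_identifier fbid;

    if (fake_get_identifier(&fbid) < 0)
        return (-1);
    *fbname = fbid.name;
    return (0);
}

/*
 * Variant 2: an alternative that copies the name into a buffer owned by
 * the caller, so no pointer into the callee's local storage escapes.
 */
int
fbid_copy(char *buf, size_t buflen)
{
    struct fake_identifier fbid;

    if (buflen == 0 || fake_get_identifier(&fbid) < 0)
        return (-1);
    strncpy(buf, fbid.name, buflen - 1);
    buf[buflen - 1] = '\0';
    return (0);
}
```

The static variant is the smaller change but is not reentrant; the copying
variant avoids shared state at the cost of changing the function's interface.
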
This bug is verified on snv_111 + X B111a.
This bug is verified on X NV B113, both x86 and sparc.