Description
We built a cluster using:
Solaris 10
Sun Cluster 3.2
VxVM /VxFS 4.1
True Copy
Samba 3.0.24
HA-Samba for Sun Cluster 3.2 agent
The logical hostname is "mscanberra" (at this site, all logical hostnames start with "ms").
The cluster nodes are mssydney and msmelbourne (the naming theme for the cluster is Australia).
When I start the Samba resource, I receive the following error messages:
Feb 6 15:03:23 msmelbourne SC[SUNWscsmb.samba.probe]:mscanberra-rg:mscanberra-smb-server: [ID 182069 daemon.error] check_smbd - Samba server <mscanberra> not working, failed to connect to samba-resource <scmondir>
Feb 6 15:03:53 msmelbourne last message repeated 12 times
Feb 6 15:03:55 msmelbourne SC[SUNWscsmb.samba.probe]:mscanberra-rg:mscanberra-smb-server: [ID 182069 daemon.error] check_smbd - Samba server <mscanberra> not working, failed to connect to samba-resource <scmondir>
Feb 6 15:07:15 msmelbourne last message repeated 77 times
However, the customer can still access the NetBIOS share from his PC running Windows XP, using Windows Explorer.
Explanation:
The fault-monitoring probe runs check_samba (see the file /opt/SUNWscsmb/bin/functions).
The relevant line (line 859) is:
${SMBCLIENT} '\\'${NETBIOSNAME}'\'${SAMBA_FMRESOURCE} -s ${SMBCONF} -U `/usr/bin/echo ${SAMBA_FMUSER}` \
-c 'pwd;exit' 2>/dev/null | /usr/xpg4/bin/grep -i -e ERR -e FAIL > ${TMPF}
In our case, the smbclient command that is run is:
/mscanberra/product/samba/bin/smbclient '\\mscanberra\scmondir' -s /mscanberra/product/samba/mscanberra/lib/smb.conf -U 'CM\sambamon%sambapass' \
-c 'pwd;exit' 2>/dev/null | /usr/xpg4/bin/grep -i -e ERR -e FAIL
The output of smbclient is:
Current directory is \\mscanberra\scmondir\
Unfortunately, "mscanberra" contains the string "err" (mscanb-err-a), which matches ERR when case is ignored (the -i option of grep). The probe therefore treats this successful output as an error.
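The false positive can be reproduced without a Samba server. The following is a minimal sketch, assuming that a plain "grep" behaves like /usr/xpg4/bin/grep for these options; the variable names "output" and "verdict" are illustrative and do not come from the agent's scripts.

```shell
#!/bin/sh
# Stand-in for the smbclient output quoted above (no Samba server needed).
output='Current directory is \\mscanberra\scmondir\'

# Same filter as the probe. Plain "grep" is used here instead of
# /usr/xpg4/bin/grep so that the sketch also runs on other systems.
TMPF=$(mktemp)
printf '%s\n' "$output" | grep -i -e ERR -e FAIL > "$TMPF"

# The probe treats a non-empty file as a failure.
if [ -s "$TMPF" ]; then
    verdict="probe reports failure"
else
    verdict="probe reports success"
fi
echo "$verdict"
rm -f "$TMPF"
```

Because the success line contains the NetBIOS name, the filter output is non-empty and the sketch reports a failure although nothing went wrong.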
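A similar false positive occurs when "preferred master" is set to "no". This produces the following message:
doing parameter preferred master = No
The word "preferred" contains "err", so this message is in turn matched by "/usr/xpg4/bin/grep -i -e ERR -e FAIL" as an error
(see the line marked with >>>>> in the check_nmbd function below, taken from /opt/SUNWscsmb/bin/functions).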
check_nmbd()
{
    debug_message "Function: check_nmbd - Begin"
    ${SET_DEBUG}
    rc=0
    if [ "${RUN_NMBD}" = "YES" ]
    then
        for lh in ${LHOST}
        do
            ${NMBLOOKUP} -s ${SMBCONF} -U ${lh} ${NETBIOSNAME} | \
>>>>>           /usr/xpg4/bin/grep -i -e ERR -e FAIL > ${TMPF}
            if [ -s "${TMPF}" ]
            then
                # SCMSGS
                # @explanation
                # nmblookup could not be performed.
                # @user_action
                # No user action is needed. The Samba server will be
                # restarted.
                scds_syslog -p daemon.error -t $(syslog_tag) -m \
                    "check_nmbd - Nmbd for <%s> is not working, failed to retrieve ipnumber for %s" \
                    "${NETBIOSNAME}" "${NETBIOSNAME}"
                rc=1
            else
                debug_message "check_nmbd - nmblookup for ${lh} is working"
            fi
        done
    fi
    debug_message "Function: check_nmbd - End"
    return ${rc}
}
This results in the following message:
Jul 17 11:01:03 node1 SC[SUNWscsmb.samba.probe]:tranrg:tran-smb-res: [ID 182069 daemon.error] check_smbd - Samba server <samba> not working, failed to connect to samba-resource <scmondir>
This in turn leads to exit code 100 for the probe script.
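Since any text containing "err" or "fail" trips the filter, it can help to check host names and configuration messages against the same pattern before deploying. The following is a rough sketch only: the function name trips_probe_pattern and the variable "flagged" are hypothetical and not part of the Sun Cluster agent, and a plain "grep" is assumed to behave like /usr/xpg4/bin/grep for these options.

```shell
#!/bin/sh
# Hypothetical helper: returns 0 if the given text would be flagged
# by the probe's case-insensitive ERR/FAIL filter, non-zero otherwise.
trips_probe_pattern() {
    printf '%s\n' "$1" | grep -i -e ERR -e FAIL > /dev/null
}

# Candidate strings taken from this report: the logical hostname,
# the two node names, and the smb.conf parameter message.
flagged=""
for candidate in mscanberra mssydney msmelbourne "preferred master = No"; do
    if trips_probe_pattern "$candidate"; then
        echo "would trip the probe: $candidate"
        flagged="$flagged[$candidate]"
    fi
done
```

With these inputs, only "mscanberra" and "preferred master = No" are reported; the node names mssydney and msmelbourne do not contain either pattern.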