sge_shadowd.8
NAME
sge_shadowd - Grid Engine shadow master daemon
SYNOPSIS
sge_shadowd
DESCRIPTION
sge_shadowd is a lightweight process which can be run on so-called
shadow master hosts in a Grid Engine cluster to detect failure of the
current Grid Engine master daemon, sge_qmaster(8), and to start up a
new sge_qmaster(8) on the host on which sge_shadowd runs. If
multiple shadow daemons are active in a cluster, they run a protocol
which ensures that only one of them will start up a new master daemon.
Hosts suitable as shadow master hosts must have shared read/write
access as root to the directory $SGE_ROOT/$SGE_CELL/common, as well
as to the master daemon spool directory (by default
$SGE_ROOT/$SGE_CELL/spool/qmaster). The names of the shadow master
hosts must be listed in the file
$SGE_ROOT/$SGE_CELL/common/shadow_masters.
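As an illustration, an administrator could check that a host appears in the shadow_masters file before starting sge_shadowd there. The following Python sketch assumes that the file holds one host name per line and that names can be compared case-insensitively. Neither detail is stated above, and the helper names shadow_masters_path and is_listed_shadow_master are hypothetical.

```python
import os


def shadow_masters_path(sge_root, sge_cell="default"):
    """Build <sge_root>/<cell>/common/shadow_masters."""
    return os.path.join(sge_root, sge_cell, "common", "shadow_masters")


def is_listed_shadow_master(path, hostname):
    """Return True if hostname appears in the shadow_masters file.

    Assumption: one host name per line; blank lines are ignored and
    names are compared case-insensitively.
    """
    wanted = hostname.strip().lower()
    if not wanted:
        return False
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            entry = line.strip().lower()
            if entry and entry == wanted:
                return True
    return False
```

A call such as is_listed_shadow_master(shadow_masters_path("/opt/sge"), "node2") would then report whether node2 is listed; the path /opt/sge and the host name node2 are placeholders.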
RESTRICTIONS
sge_shadowd may only be started by root.
ENVIRONMENT VARIABLES
SGE_ROOT Specifies the location of the Grid Engine standard
configuration files.
SGE_CELL If set, specifies the default Grid Engine cell. To
address a Grid Engine cell, sge_shadowd uses (in order of
precedence):
The name of the cell specified in the environment
variable SGE_CELL, if it is set.
The name of the default cell, i.e. default.
SGE_DEBUG_LEVEL
If set, specifies that debug information should be
written to stderr. The value also defines the level of
detail of the debug information generated.
SGE_QMASTER_PORT
If set, specifies the TCP port on which sge_qmaster(8)
is expected to listen for communication requests. Most
installations will use a services map entry for the
service "sge_qmaster" instead to define that port.
SGE_DELAY_TIME This variable controls the time for which sge_shadowd
pauses if a takeover bid fails. This value is used only
when there are multiple sge_shadowd instances and they
are contending to be the master. The default is 600
seconds.
SGE_CHECK_INTERVAL
This variable controls the interval between sge_shadowd
checks of the heartbeat file (60 seconds by default).
SGE_GET_ACTIVE_INTERVAL
This variable controls the interval between attempts by
a sge_shadowd instance to take over when the heartbeat
file has not changed. The default is 240 seconds.
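The defaults and fallbacks described for these variables can be summarized in a short Python sketch. It is not the sge_shadowd implementation. It assumes that an empty value is treated like an unset variable, that the timing values are whole seconds, and that SGE_QMASTER_PORT takes precedence over the services entry. The function names are hypothetical.

```python
import os
import socket

# Defaults in seconds, taken from the variable descriptions above.
DEFAULT_INTERVALS = {
    "SGE_DELAY_TIME": 600,
    "SGE_CHECK_INTERVAL": 60,
    "SGE_GET_ACTIVE_INTERVAL": 240,
}


def resolve_cell(env):
    """Return SGE_CELL if set, else the default cell name "default"."""
    # Assumption: an empty value counts as unset.
    return env.get("SGE_CELL") or "default"


def resolve_qmaster_port(env):
    """Return SGE_QMASTER_PORT if set, else the "sge_qmaster" service port."""
    value = env.get("SGE_QMASTER_PORT")
    if value:
        return int(value)
    # Falls back to the services map; raises OSError if no entry exists.
    return socket.getservbyname("sge_qmaster", "tcp")


def resolve_intervals(env):
    """Return the three timing values in seconds, applying the defaults."""
    intervals = {}
    for name, default in DEFAULT_INTERVALS.items():
        raw = env.get(name)
        # Assumption: values are whole seconds; unset or empty means default.
        intervals[name] = int(raw) if raw else default
    return intervals


if __name__ == "__main__":
    print(resolve_cell(os.environ), resolve_intervals(os.environ))
```

Passing the environment as a mapping keeps the functions easy to exercise with a plain dictionary instead of the real process environment.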
FILES
<sge_root>/<cell>/common
Default configuration directory.
<sge_root>/<cell>/common/shadow_masters
Shadow master hostname file.
<sge_root>/<cell>/spool/qmaster
Default master daemon spool directory.
<sge_root>/<cell>/spool/qmaster/heartbeat
The heartbeat file.
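For scripts that need these locations, the default paths can be derived from the root directory and the cell name. The Python sketch below also shows one possible way to notice that the heartbeat file has stopped changing. The format of the heartbeat file is not described here, so the sketch compares raw bytes only. The helper names are hypothetical, and this is not a description of how sge_shadowd itself performs the check.

```python
import os


def shadowd_paths(sge_root, cell="default"):
    """Return the default file locations listed above for a given cell."""
    common = os.path.join(sge_root, cell, "common")
    # Default spool location; a site may configure a different directory.
    spool = os.path.join(sge_root, cell, "spool", "qmaster")
    return {
        "common": common,
        "shadow_masters": os.path.join(common, "shadow_masters"),
        "spool": spool,
        "heartbeat": os.path.join(spool, "heartbeat"),
    }


def heartbeat_changed(heartbeat_path, previous):
    """Compare the heartbeat file with a previous reading.

    Returns a (changed, current) tuple, where current is the raw file
    content to pass as `previous` on the next call.
    """
    with open(heartbeat_path, "rb") as handle:
        current = handle.read()
    return current != previous, current
```

A monitoring script might call heartbeat_changed once per check interval and treat several consecutive unchanged readings as a sign that the master daemon needs attention.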
SEE ALSO
sge_intro(1), sge_conf(5), sge_qmaster(8)
COPYRIGHT
See sge_intro(1) for a full statement of rights and permissions.
SGE 8.1.3pre 2007-11-08 SGE_SHADOWD(8)