Command Line and Scripting of Administrative Tasks in Grid Engine

The qmon(1) graphical user interface can be used to perform all administrative tasks in Grid Engine, and using it is a good way to learn Grid Engine's capabilities. However, Grid Engine can also be administered entirely through commands issued at the shell prompt or called from shell scripts. Experienced administrators will find this a more flexible, faster, and more powerful way to change Grid Engine settings.

This HOWTO contains an overview and examples of shell-based administration. It also describes techniques and constructions that enable more sophisticated tasks, such as wrapper scripts. For more basic configuration commands, please see the HOWTO entitled "Common Administrative Tasks".

Contents

  1. Add or modify objects using files

  2. Modification of queues, hosts, and environments; qselect

  3. Modification of global configuration and scheduler


Add or modify objects using files

The qconf command can be used to add new objects or modify existing objects from the specification in a file. The syntax is

qconf -{A,M}<object> <filename>

Where -A means add, and -M means modify.

<object> can be:

c: complex
ckpt: checkpoint environment
e: execution host
p: parallel environment
q: queue
u: user set

This option can be used in combination with the "show" option of qconf (qconf -s<obj>) to take an existing object, modify it, and then update the existing object or create a new one.
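As an illustration of the show, modify, and add cycle, the following is a minimal sketch that creates a new queue as a copy of an existing one. The helper names set_qname and clone_queue are hypothetical and chosen only for this example. The sketch assumes that the output of qconf -sq contains a line beginning with "qname" that carries the queue name.

```shell
#!/bin/sh
# set_qname NEWNAME: read a queue configuration (as printed by qconf -sq)
# on stdin and write it to stdout with the qname line set to NEWNAME.
set_qname() {
    awk -v new="$1" '$1 == "qname" { print "qname", new; next } { print }'
}

# clone_queue EXISTING NEW (hypothetical helper): add queue NEW using the
# configuration of EXISTING.
clone_queue() {
    tmp=$(mktemp) || return 1
    qconf -sq "$1" | set_qname "$2" > "$tmp"
    # Only call qconf -Aq if the dump really contained a qname line;
    # otherwise qconf -sq most likely failed.
    if grep -q '^qname ' "$tmp"; then
        qconf -Aq "$tmp"
        rc=$?
    else
        echo "clone_queue: could not read configuration of $1" >&2
        rc=1
    fi
    rm -f "$tmp"
    return $rc
}
```

After sourcing these definitions, "clone_queue all.q test.q" would attempt to add test.q with the settings of all.q. The same pattern should carry over to the other object types by swapping the -s and -A options and the name of the attribute that identifies the object.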

Example: Write a shell script to specify queues of a checkpoint environment from a list in a file

#!/bin/sh
# ckptq.sh: specify queues of a checkpoint from a list in a file
# Usage: ckptq.sh <checkpoint-env-name> <filename>
# <filename> contains a list of queues,
#    separated by commas and/or newlines

CKPT=$1
QUEUELIST=$2
TMPFILE=`mktemp` || exit 1

# Dump the current definition without its queue_list line
qconf -sckpt "$CKPT" | grep -v 'queue_list' > "$TMPFILE"
# Append a new queue_list line; newlines and commas become spaces
echo queue_list `tr "\012" " " < "$QUEUELIST" | tr "," " "` >> "$TMPFILE"
qconf -Mckpt "$TMPFILE"
rm -f "$TMPFILE"

Modification of queues, hosts, and environments

Individual queues, hosts, and both parallel and checkpointing environments can be modified from the command line by using the qconf -M{q,e,p,ckpt} <filename> command as shown above, or by using the qconf -m{q,e,p,ckpt} <objname> command. The latter opens a temporary file in an editor; when you save your changes and exit the editor, the system immediately reflects those changes. However, when you want to change many objects at once, or to change object configuration non-interactively, use the qconf -...attr set of commands.

The first type of command makes modifications according to the specification on the command line:

qconf -{a,m,r,d}attr queue|exechost|pe|ckpt|hostgroup|resource_quota <attrib> <value> <queue_list>|<host_list>

while the second makes modifications according to specifications in a file:

qconf -{A,M,R,D}attr queue|exechost|pe|ckpt|hostgroup|resource_quota <filename>

In both sets of commands, the options indicate the following:

-A/a: add attribute
-M/m: modify attribute
-R/r: replace attribute
-D/d: delete attribute
<attrib>: queue or host attribute to be changed
<value>: value of attribute to be affected
<filename>: a file containing attribute-value pairs

The a, m, and d forms operate on individual values in a list of values, while the r form replaces the entire list with the new one specified on the command line or in the file.

Examples:

Change the queue type of "tcf27-e019.q" to batch-only

% qconf -rattr queue qtype batch tcf27-e019.q

Modify the queue type and shell start behavior of tcf27-e019.q based on the contents of the file "new.cfg":

% cat new.cfg
qtype batch interactive checkpointing
shell_start_mode unix_behavior
% qconf -Rattr queue new.cfg tcf27-e019.q

Attach the complexes named "storage" and "license" to the host "tcf27-e019"

% qconf -rattr exechost complex_list storage,license tcf27-e019

Add the resource named "scratch1" with a value of 1000M and "long" with a value of 2

% qconf -rattr exechost complex_values scratch1=1000M,long=2 tcf27-e019

Attach the resource named "short" to the host with a value of 4

% qconf -aattr exechost complex_values short=4 tcf27-e019

Change the value of "scratch1" to 500M while leaving other values untouched

% qconf -mattr exechost complex_values scratch1=500M tcf27-e019

Delete the resource "long"

% qconf -dattr exechost complex_values long tcf27-e019

Add "tcf27-b011.q" to the list of queues for checkpointing environment "sph"

% qconf -aattr ckpt queue_list tcf27-b011.q sph

Change the number of slots in parallel environment "make" to 50

% qconf -mattr pe slots 50 make

See also the qconf_scripts below.

qselect

The qselect command outputs a list of queues. If called with options, it lists only queues which match the given specifications. This can be used to great advantage in combination with the qconf -...attr queue commands to target specific queues to modify.

Examples:

all queues on Linux machines

% qselect -l arch=glinux

all queues on machines with 2 CPUs

% qselect -l num_proc=2

all queues on all 4 CPU 64-bit Solaris machines

% qselect -l arch=solaris64,num_proc=4

queues that provide an application license (previously configured)

% qselect -l app_lic=TRUE

You can combine qselect with qconf to make wide-reaching changes with a single command line. To do this, put the entire qselect command within backticks and use it in place of the <queue_list> on the qconf command line.

Examples:

Set the prolog script to sol_prolog.sh on all queues on Solaris machines

% qconf -mattr queue prolog /usr/local/scripts/sol_prolog.sh `qselect -l arch=solaris`

set the attribute "fluent_license" to 2 on all queues on two-processor systems

% qconf -mattr queue complex_values fluent_license=2 `qselect -l num_proc=2`

The use of qconf in conjunction with qselect provides the most flexible way to automate the configuration of Grid Engine queues, allowing you to build up your own custom administration scripts.
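One possible building block for such a script is sketched below. The function name apply_queue_attr and the DRY_RUN variable are hypothetical and exist only in this example. The sketch assumes that qselect prints one queue name per line. It calls qconf once per queue and refuses to run when the selection is empty, so that qconf is never invoked without a target.

```shell
#!/bin/sh
# apply_queue_attr ATTRIB VALUE QSELECT-OPTIONS... (hypothetical helper):
# set ATTRIB to VALUE on every queue that qselect reports for the options.
# If DRY_RUN is set to a non-empty value, print the commands instead.
apply_queue_attr() {
    attr=$1
    value=$2
    shift 2
    queues=$(qselect "$@") || return 1
    if [ -z "$queues" ]; then
        echo "apply_queue_attr: no queues matched" >&2
        return 1
    fi
    rc=0
    for q in $queues; do
        if [ -n "${DRY_RUN:-}" ]; then
            echo "qconf -mattr queue $attr $value $q"
        else
            # Remember a failure but keep going with the remaining queues
            qconf -mattr queue "$attr" "$value" "$q" || rc=1
        fi
    done
    return $rc
}
```

After sourcing the definition, a call such as "apply_queue_attr complex_values fluent_license=2 -l num_proc=2" corresponds to the second example above. Setting DRY_RUN=1 first makes it possible to review the generated commands before changing anything.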

For an example of generating a list of hosts on which to operate, see the qselect-node-list script.

qconf -sobjl

Another way to select hosts and queues, which may be more convenient, particularly for selecting hosts, is qconf -sobjl, e.g. to select 64-core hosts:

% qconf -sobjl exechost load_values '*num_proc=64*'

Modification of global configuration and scheduler

To modify the global configuration or the scheduler configuration, use qconf -mconf or qconf -msconf, respectively. Both commands open a temporary file in an editor. When you exit the editor, any changes you have made to this temporary file are processed by the system and take effect immediately. The editor used to open the temporary file is the one specified by the EDITOR environment variable. If this variable is undefined, vi is used.

You can take advantage of the EDITOR environment variable to automate the behavior of the qconf -m... commands. Set this variable to point to a program which modifies the file whose name is given as its first argument. After this program modifies the temporary file and exits, the system reads in the modifications and applies them immediately. NOTE: if the modification time of the file does not change after the edit operation, the system may incorrectly assume the file has not been modified. The program should therefore run "sleep 1" before writing the file, to ensure a different modification time.

Example: write a script to modify the schedule interval of the scheduler

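#!/bin/sh
# sched_int.sh: modify the schedule interval
# usage: sched_int.sh <n>, where <n> is
# the new interval, in seconds. n < 60

if [ -n "$MOD_SGE_SCHED_INT" ]; then
     # Nested invocation: qconf has called this script as the editor,
     # with the temporary file as $1
     TMPFILE=`mktemp` || exit 1
     grep -v schedule_interval "$1" > "$TMPFILE"
     echo "schedule_interval 0:0:$MOD_SGE_SCHED_INT" >> "$TMPFILE"
     # sleep to ensure modification time changes
     sleep 1
     mv "$TMPFILE" "$1"
else
     # First invocation: make qconf use this script as its editor
     EDITOR=$0
     MOD_SGE_SCHED_INT=$1
     export EDITOR MOD_SGE_SCHED_INT
     qconf -msconf
fi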

The sample script above sets the EDITOR environment variable to point to itself, and then calls qconf -msconf. qconf then runs the script a second time as its editor; this nested invocation modifies the temporary file named by its first argument and exits. The Grid Engine system then reads in the changes, and the first invocation of the script terminates. This technique can be used in conjunction with any qconf -m... command. However, it is especially useful for administration of the scheduler and global configuration, since there is no other way to automate this.
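The same technique can be generalized to any single scheduler parameter. The following is a minimal sketch, not a tested production script. The script name sconf_set.sh, the function replace_param, and the variables SCONF_SET_NAME and SCONF_SET_VALUE are hypothetical. It assumes that each parameter occupies exactly one "name value" line in the file that qconf -msconf hands to the editor.

```shell
#!/bin/sh
# sconf_set.sh (hypothetical name): set one scheduler parameter
# usage: sconf_set.sh <parameter> <value>

# replace_param FILE NAME VALUE: rewrite FILE so that NAME is set to VALUE.
replace_param() {
    tmp=$(mktemp) || return 1
    # Keep every line whose first field is not NAME, then append the new line
    awk -v name="$2" '$1 != name' "$1" > "$tmp" &&
        printf '%s %s\n' "$2" "$3" >> "$tmp" || { rm -f "$tmp"; return 1; }
    sleep 1                 # ensure the modification time changes
    cat "$tmp" > "$1"       # overwrite the file in place
    rc=$?
    rm -f "$tmp"
    return $rc
}

if [ -n "${SCONF_SET_NAME:-}" ] && [ $# -eq 1 ]; then
    # Nested invocation: qconf ran this script as the editor,
    # with the temporary file as $1
    replace_param "$1" "$SCONF_SET_NAME" "$SCONF_SET_VALUE"
elif [ $# -eq 2 ]; then
    # Outer invocation: pass the request through the environment and
    # let qconf call this script back as the editor
    SCONF_SET_NAME=$1
    SCONF_SET_VALUE=$2
    EDITOR=$0
    export SCONF_SET_NAME SCONF_SET_VALUE EDITOR
    qconf -msconf
fi
```

With this sketch, "sconf_set.sh schedule_interval 0:0:30" would have the same effect as "sched_int.sh 30". Parameters whose values span several lines are not handled.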

A collection of scripts providing qconf -aattr-like interfaces is available.