Rotating and truncating Sun Grid Engine Log Files

Contents

  1. Overview

  2. Using logrotate

  3. logchecker.sh

    1. Variables

    2. Command line parameters

    3. Examples


Overview

Grid Engine daemons write log files named 'messages' to their respective spool directories, and an 'accounting' file is also created. A specialized script for truncating these files is supplied with Grid Engine and is described below. However, it is probably best to use logrotate, or a similar tool supplied with your operating system.

Using logrotate

You can simply use logrotate(1) (or a similar system) to rotate these files on the same basis as the files from syslog(8) etc. Here is a logrotate example, to be placed in /etc/logrotate.d/sge. You must substitute the actual values of $SGE_ROOT and $SGE_CELL for your installation. The parameters should be changed to suit local needs, though the ones here are in production on a medium-sized cluster with a fairly low average throughput. The main purpose of the example is to identify the relevant files. (It is not necessary to use logrotate hooks to interact with the qmaster, since it doesn't keep the files open.)
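A minimal sketch of such a configuration, written as a small shell function so that the actual paths are substituted when it runs. The function name `sge_logrotate_conf`, the `/opt/sge` path, the assumption that all spool directories live under `$SGE_ROOT/$SGE_CELL/spool`, and the rotation settings are illustrative placeholders rather than values taken from a real installation.

```shell
#!/bin/sh
# Sketch: print a logrotate stanza for the Grid Engine log files.
# Assumptions: spool directories are under <sge_root>/<sge_cell>/spool,
# and the rotation settings below are placeholders to adapt locally.

sge_logrotate_conf() {
    sge_root=$1   # actual value of $SGE_ROOT for the installation
    sge_cell=$2   # actual value of $SGE_CELL, usually "default"
    # Under the assumed layout the wildcard covers both the qmaster
    # and the execd 'messages' files.
    cat <<EOF
$sge_root/$sge_cell/spool/*/messages
$sge_root/$sge_cell/common/accounting
{
    monthly
    rotate 12
    compress
    missingok
    notifempty
}
EOF
}

# Print the stanza; review it before saving it as /etc/logrotate.d/sge.
sge_logrotate_conf /opt/sge default
```

No `postrotate` hook is included, in line with the note that the qmaster does not keep the files open.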

logchecker.sh

The script for truncating log files is found in the following directory:

$SGE_ROOT/util/logchecker.sh

The script is not activated automatically by any of the Sun Grid Engine daemons. It is intended to be edited according to the needs of your site. After customizing the script, you can add an entry for it to your crontab. The script can run in verbose mode or completely silently. It can also run in a mode where it only prints what would be done. The script accepts only two command line parameters: one overrides the ACTION_ON parameter and the other overrides the location of the exec daemon spool directory (see below).

Sun Grid Engine daemons create log files in the qmaster_spool_dir and execd_spool_dir, which are defined in the global cluster configuration. They can be overridden in the local cluster configuration of each execution host, although this is usually not done. The cell directory is usually called 'default'; it has a different name only if the $SGE_CELL variable is set.

Default location of Sun Grid Engine log files:

<qmaster_spool_dir>/messages
<execd_spool_dir>/<hostname>/messages
<sge_root>/<sge_cell>/common/accounting
  

These directories may all be located in the same directory hierarchy on a shared NFS filesystem, or the execd spool directories may point to a local directory. The ACTION_ON parameter (see below) therefore specifies which 'messages' files are rotated when the script is called.
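To see which files a given host would have to handle, the default locations listed above can be checked with a short sketch like the following. The function name `sge_log_files` is hypothetical, and all five values are supplied by the caller; nothing is read from the cluster configuration.

```shell
#!/bin/sh
# Sketch: list the default Grid Engine log files with their sizes.
# Arguments: qmaster_spool_dir execd_spool_dir hostname sge_root sge_cell

sge_log_files() {
    qmaster_spool=$1; execd_spool=$2; host=$3; sge_root=$4; sge_cell=$5
    for f in \
        "$qmaster_spool/messages" \
        "$execd_spool/$host/messages" \
        "$sge_root/$sge_cell/common/accounting"
    do
        if [ -f "$f" ]; then
            # wc -c reports the size in bytes; tr strips the padding
            # that some wc implementations add.
            size=$(wc -c < "$f" | tr -d ' ')
            echo "$f $size"
        else
            echo "$f missing"
        fi
    done
}

# Example call (paths are placeholders):
#   sge_log_files /opt/sge/default/spool/qmaster /var/spool/sge node1 /opt/sge default
```

A file reported as missing on a host is a hint that the corresponding spool directory is local to another machine.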


Variables

The following variables need to be configured in the script. The "|" character separates alternative values. All variables in the script must be entered in Bourne shell syntax, so there must be no white space before or after the equal sign "=".
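The white space rule can be illustrated with a short sketch. ACTION_ON is used because it is named in this document; the value 4 is just the one from the first example below.

```shell
#!/bin/sh
# Valid Bourne shell assignment: no white space around "=".
ACTION_ON=4
echo "ACTION_ON is $ACTION_ON"

# Invalid: with spaces, the shell looks for a command named "ACTION_ON"
# and passes "=" and "4" to it as arguments. The attempt runs in a
# subshell so that the failure does not stop this demonstration.
if ! ( ACTION_ON = 4 ) 2>/dev/null; then
    echo "'ACTION_ON = 4' is not an assignment"
fi
```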


Command line parameters

The script accepts the following command line parameters:


Examples

  1. All Sun Grid Engine spool directories are shared. You can call the script on any one of your Sun Grid Engine hosts or on your file server.

    Set ACTION_ON="4" in the script. Set the other values according to your needs and add the script to the crontab of one of those machines.

  2. The Sun Grid Engine execd spool directories are defined only through the global cluster configuration, but point to a local directory.

    Set ACTION_ON="3" in the script. Add a call of the script to the crontab of every execution host in your cluster. On your qmaster machine (or on your file server) add the following call of the script to your crontab:

       <path_to_script>/logchecker.sh -action_on 1

  3. The Sun Grid Engine spool directories of the execds are defined in the local configuration.

    Set ACTION_ON="2" in the script.

    On your qmaster machine (or on your file server) add the following call of the script to your crontab:

       <path_to_script>/logchecker.sh -action_on 1

    On your exec hosts add the following line:

       <path_to_script>/logchecker.sh -execd_spool
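The crontab entries mentioned in these examples could be built along the following lines. The helper name `logchecker_cron_line`, the schedule (Sundays at 03:00), and the `/opt/sge` path are assumptions for illustration only; choose a schedule that suits how quickly the log files grow at your site.

```shell
#!/bin/sh
# Sketch: build a crontab line for a logchecker.sh call.
# The schedule and the installation path are assumptions.

logchecker_cron_line() {
    script=$1; shift
    # Five time fields: minute hour day-of-month month day-of-week
    printf '0 3 * * 0 %s' "$script"
    if [ $# -gt 0 ]; then
        # Append any remaining arguments, e.g. -action_on 1
        printf ' %s' "$*"
    fi
    printf '\n'
}

# Entry for the qmaster machine or file server:
logchecker_cron_line /opt/sge/util/logchecker.sh -action_on 1

# To append the line to the current user's crontab, one common idiom is:
#   ( crontab -l 2>/dev/null; logchecker_cron_line ... ) | crontab -
```

Printing the line first and installing it in a separate step makes it easy to review the entry before it takes effect.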