Basic Usage of Grid Engine

qstat — Show job/queue status

no arguments: Show user's currently running/pending jobs
-f: Show full listing
-j jobid: Show detailed information on a pending/running job
-u \*: Show jobs for all users
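
As a rough sketch of how the listing can be post-processed, the helper below tallies jobs per state. The name count_job_states is hypothetical and not part of Grid Engine. It assumes the default qstat layout of two header lines with the job state in the fifth column, which may differ between versions and sites.

```shell
#!/bin/sh
# Count jobs per state from `qstat` output read on standard input.
# Assumption: default listing, two header lines, state in column 5.
count_job_states() {
    awk 'NR > 2 && NF > 0 { n[$5]++ } END { for (s in n) print s, n[s] }' | sort
}

# Only call qstat when it is actually installed on this machine.
if command -v qstat >/dev/null 2>&1; then
    qstat | count_job_states
fi
```

To summarise the jobs of all users rather than your own, the same helper could be fed from qstat -u \* instead.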

qhost — Show job/host status

no arguments: Show all execution hosts and information about their configuration
-l attr=val: Show only hosts matching the attribute
-j: Show information on running jobs
-q: Show information on queues at each host

Using Grid Engine

The main submit commands are qsub, qrsh, qmake, and qtcsh. See the man pages for submit(1), qtcsh(1) and qmake(1) for more details.

qsub — submit scripts

With no arguments, qsub reads the job script from STDIN (press ^D to end the input and submit)
-cwd: Run the job from the current working directory (default: home directory)
-v var: Pass the environment variable var to the job (-V passes all environment variables)
-o: Redirect standard output (default: home directory)
-e: Redirect standard error (default: home directory)
-pe PE slots: Use the parallel environment PE with the given number of slots

Example:

qsub -cwd -v SOME_VAR -o /dev/null -e /dev/null myjob.sh

In general, qsub is used for traditional batch submission, that is, where I/O is directed to a file. Use -b y to submit executable files rather than shell scripts.
See the qsub(1) man page for more details.
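
Options can also be embedded in the job script itself, on lines that start with #$, instead of being given on the command line. The following is a minimal sketch of such a script. The output file names and the describe_job helper are illustrative assumptions. JOB_ID and NSLOTS are variables that Grid Engine sets inside a running job.

```shell
#!/bin/sh
# Options for qsub, read from lines that start with "#$".
#$ -cwd
#$ -o myjob.out
#$ -e myjob.err
# A parallel environment could be requested the same way with "-pe";
# PE names are site-specific, so none is given here.

# JOB_ID and NSLOTS are set by Grid Engine inside a job; the fallbacks
# let the script also run outside the cluster for a quick check.
describe_job() {
    echo "job ${JOB_ID:-local} on $(uname -n) using ${NSLOTS:-1} slot(s)"
}

describe_job
```

Saved as myjob.sh, this could be submitted with plain qsub myjob.sh. Options given on the command line would normally take precedence over the embedded ones.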

qdel — delete jobs

qdel jobid removes a waiting or running job with the given id from the system. See the qdel(1) man page for more details.
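
Since qdel needs the job id, it can help to capture it at submit time. The sketch below extracts the id from the message qsub prints. The helper name job_id_from_qsub is hypothetical, and the message shapes shown in the comments are an assumption based on common Grid Engine versions. Where the installed qsub supports the -terse option, that is likely a simpler way to obtain the bare id.

```shell
#!/bin/sh
# Extract the numeric job id from qsub's submission message on stdin.
# Assumed message shapes:
#   Your job 12345 ("myjob.sh") has been submitted
#   Your job-array 12345.1-10:1 ("myjob.sh") has been submitted
job_id_from_qsub() {
    sed -n 's/^Your job\(-array\)\{0,1\} \([0-9][0-9]*\).*/\2/p'
}

# Submit a script and report how to remove the job again.
# Runs only where qsub is installed and myjob.sh exists.
if command -v qsub >/dev/null 2>&1 && [ -f myjob.sh ]; then
    id=$(qsub -cwd myjob.sh | job_id_from_qsub)
    if [ -n "$id" ]; then
        echo "submitted job $id; remove it with: qdel $id"
    fi
fi
```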

qacct — job post mortem

qacct(1) reports accounting information and possible errors for finished jobs.

qacct -m -j jobnumber
This prints information about the master node (relevant for parallel jobs when accounting_summary is false for the PE). The output usually indicates whether the job was killed because it ran for too long or used too much memory, and what the exit status of the script or program was.
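
To pull out just the fields of interest, the accounting record can be filtered. This is a small sketch, assuming qacct prints one "name value" pair per line and that fields such as exit_status, failed, ru_wallclock and maxvmem are present. The helper qacct_field is a hypothetical name, not a Grid Engine command.

```shell
#!/bin/sh
# Print the value of one field from `qacct -j` output read on stdin.
# Assumption: one "name value" pair per line.
qacct_field() {
    awk -v key="$1" '$1 == key { print $2 }'
}

# When run as a script with a job number as first argument, and where
# qacct is installed, summarise the fields that explain how the job ended.
if command -v qacct >/dev/null 2>&1 && [ -n "${1:-}" ]; then
    report=$(qacct -j "$1")
    for field in exit_status failed ru_wallclock maxvmem; do
        echo "$field: $(printf '%s\n' "$report" | qacct_field "$field")"
    done
fi
```

For a parallel job with several accounting records, each field would be printed once per record.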

qrsh

qrsh acts like the rsh or ssh command, except that no host name is given. Instead, a shell script or an executable file is run, potentially on any node in the cluster. I/O is directed back to the submitter's terminal window. By default, qrsh will not queue the job if it cannot be run immediately. Passing -now no to qrsh allows the job to queue. Note that I/O can be redirected with the shell redirect operators. For example, to run the uname -a command:

qrsh uname -a

The uname output of whichever machine in the cluster the scheduler selects is then displayed on the submitting terminal. To redirect the output:

qrsh uname -a > /tmp/myfile

The output from uname will be written to /tmp/myfile on the submitting host. To allow the command to queue:

qrsh -now no uname -a

If a suitable host is not immediately available, the command will block until one becomes available. At that time, the command output will be displayed on the submitting terminal. See the qrsh(1) man page for more details.

qtcsh

Grid Engine provides a modified tcsh, qtcsh, which will automatically submit jobs listed in a task file to the cluster. See the qtcsh(1) and qtask(5) man pages for more details.