

This is a somewhat-edited and expanded version of the original Howtos page from the defunct gridengine.sunsource.net site. Some of it is doubtless no longer useful. Please report broken links, obsolete material etc., especially if you know an alternative. The original Howtos are distributed under the original sunsource terms and conditions. Later additions may have different licences—see their notices.

Grid Engine HOWTOs

Table Of Contents

General Grid Engine concepts
Resource management
Cluster management
Special Applications
Tight Integration of Parallel Libraries
Accounting and Reporting Database (ARCo)
DRMAA
Installation, Upgrade, Patches

Content

General Grid Engine concepts

Introduction to Grid Engine video
Basic Usage
Common Administrative Tasks
Customization of Qmon
Migration of Qmaster to Another Machine
Setting Up a Shadow Master
Commonly Seen Problems
Troubleshooting
Array Jobs

Resource management

Managing Resources Abstractly
Consumable Resources
Setting Up Load Sensors to Track Resource Availability/Utilisation
Different resource management approaches with Grid Engine
Tracking interactive idle time of desktop workstations
Relocating Jobs From a User's Workstation
Grid Engine Enterprise Edition (features now in the generic version)
Sun Grid Engine, Enterprise Edition — Configuration Use Cases and Guidelines (features now in the generic version) [broken link]
Scheduler Policies for Job Prioritization in the N1 Grid Engine 6 System [broken link]
File Staging
Logical resource expressions
Resource quotas

Cluster management

Tuning guide
Master monitoring and bottleneck analysis on Solaris
Command Line and Scripting of Administrative Tasks
Submitting Binaries
Configuring qrsh and qlogin to use ssh (now described in the remote_startup man page)
Rotating and truncating Log Files
Reducing and Eliminating NFS Usage
Installing on a system with multiple network interfaces
Installing on a system with Solaris IP Multipathing
Deploying PCs with Grid Engine enabled KNOPPIX boot images
Using Host Groups and Cluster Queues [broken link]
What Solaris 10 containers are good for? A hands-on sample. [broken link]
Running jobs on data kept (on a USB connected HD) in a separate network via sshfs
Rocks-In-The-Box — A Virtual Rocks Cluster in a VirtualBox
Cluster simulation
Configuration backup
Security
Recipes for commonly-required configurations
Using the Warewulf Node Health Check

Special Applications

SGE Transfer Queue to Globus and GridWay and direct access from GridWay without Globus (not entirely clear it's under the GridWay licence)
Olesen-FLEXlm-Integration, also wiki documentation of the Olesen method
Using Clearcase
Using Mentor ModelSim and Mentor JobSpy
Mathematica
Ansys
Using mpiBLAST [broken link, and the MPI version has been said not to be worth the trouble]
MultiClustering using Transfer Queues
Integration of SGE and Solaris 9 Resource Manager
SGE-Globus integration
Checkpointing jobs using SGE's checkpointing support
Checkpointing under Linux with Berkeley Lab Checkpoint/Restart; see also the BLCR home and updated integration scripts
DMTCP checkpointing
JAM — Job & Application Manager
JGrid — an RMI-based Java interface for Grid Engine
Hostbased authentication for passphraseless SSH communication
Running containers

Tight Integration of Parallel Libraries

Tight Integration of LAM/MPI and SGE
Tight Integration of MPICH and SGE — With Application Notes
Tight Integration of MPICH2 and SGE
Removal of orphaned processes especially for MPICH2's mpd
Tight Integration of PVM and SGE
Mvapich (MPICH Infiniband) + Loose/Tight SGE Integration
Tight integration of Open MPI with SGE and Open MPI suspend/resume

DRMAA

DRMAA C Binding
File Staging in Grid Engine 6.0 with DRMAA
DRMAA Java™ Language Binding
DRMAA Python Tutorial and Information
See also the Ruby, Perl, Clojure, Tcl, alternative Java/Ruby, Go, and Erlang bindings.

Accounting and Reporting Database (ARCo)

Information from ARCo source repository. The webconsole/reporting components aren't supported.
ARCo and Oracle 10g Database
ARCo on MySQL Database (obsolete)
Setting up dbwriter with Postgres
Space Requirements for the ARCo database

Installation, Upgrade, Patches

Install SGE 6.2 patches
Bugfixes for SGE 6.2
Bugfixes for SGE 6.1
Bugfixes for SGE 6.0
Bugfixes for SGE 5.3
Installation on Windows XP/SFU [broken link] and older Windows material