Tools/add-ons &c for SGE
This list needs expanding, and possibly contracting elsewhere
— suggestions welcome. These things aren't all necessarily
recommended, or even known to work. See also the HOWTO section.
Accounting
- dbwriter
- Maintains a database view of the reporting data. Distributed with SGE releases. The ARCo web-based console isn't supported.
- SunGrid Graphical Accounting Engine
- Web graphical display of accounting data.
- UBMoD
- Another database/web system for accounting data.
- XDMoD
- Replacement for UBMoD, but more complicated.
- Gold Allocation Manager
- Tracks resource usage and allows limiting use to given allowances. Seems to have been purged apart from the web site, but a v. 2.2.0.5 download is available. Requires grid engine integration; contributions wanted. See also the old experimental version.
- myJAM
- Could probably be used with SGE with a little work.
Service management
- SDM/Hedeby
- See the distribution of the companion Service Domain Manager system and documentation.
Energy saving (power-up/down) systems
These could use evaluating, but really need support from SGE to do
a better job.
- Liqueur
- Green scheduler
- CLUES
- Greenginecode
- gewake
- SDM (Hedeby) "cloud" component
Licence management
(Preferably just say No.)
- flex-grid
- Load sensor-based integration of external licence managers with consumables (now free software).
- Licence juggler
- Juggling between clusters.
Monitoring
- GE Web Application
- Web application to check the status of jobs on a Grid Engine cluster.
- PHPQstat
- Web interface to monitor jobs and queues.
- xml-qstat
- Display of qstat output via XSLT transformations; the original
author suggests PHPQstat instead.
- Job Monarch
- Puts reporting into Ganglia; the SGE support needs tidying up.
- Node Health Check or nodediag
- Can be used in load sensors to set an alarm load level in case of problems.
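As a rough illustration of the load sensor idea mentioned for the health checkers, here is a minimal sketch of a sensor loop in Python. It assumes the usual SGE load sensor convention: the execution daemon writes a line to the sensor's standard input for each request, the sensor answers with a `begin` / `host:name:value` / `end` block, and the word `quit` ends the sensor. The complex name `health_alarm` and the `check` callable are hypothetical; the complex would have to be defined by the administrator, and `check` would wrap whatever health checker is in use.

```python
import io

COMPLEX_NAME = "health_alarm"  # hypothetical complex; must be defined by the admin


def format_report(host, values):
    """Build one load report: 'begin', host:name:value lines, then 'end'."""
    lines = ["begin"]
    lines.extend(f"{host}:{name}:{value}" for name, value in values.items())
    lines.append("end")
    return "\n".join(lines) + "\n"


def run_sensor(host, check, stdin, stdout):
    """Answer each request line with a report until 'quit' or end of input."""
    while True:
        line = stdin.readline()
        if not line or line.strip() == "quit":
            break
        alarm = 0 if check() else 1  # report 1 when the health check fails
        stdout.write(format_report(host, {COMPLEX_NAME: alarm}))
        stdout.flush()  # the reader is on the other end of a pipe


# Demonstration with in-memory streams: two requests, then "quit".
demo_out = io.StringIO()
run_sensor("node01", lambda: True, io.StringIO("\n\nquit\n"), demo_out)
print(demo_out.getvalue(), end="")
```

In a real sensor the call would presumably be `run_sensor(host, check, sys.stdin, sys.stdout)`, with `host` matching the name SGE uses for the node. A load threshold on the hypothetical `health_alarm` complex in the queue configuration would then put the queue instance into the alarm state when the check fails.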
Networked submission
See also PTP.
- Grid Engine Portal
- The old code might be revived by porting away from the non-free com.oreilly.servlet package and to a current free servlet engine.
- OGF HPC Basic Profile/Basic Execution Service (BES)
- BES++, QCG-Computing, GridSAM (if you must use SOAP web services)
- Rapid Portal Development
- Rapid development of portal-based user interfaces for
submitting jobs.
- Asynchronous Job Operator
- A tool designed and developed to provide a transparent gateway
between a web application or service and an HPC system.
- BLAH
- Light component accepting commands to manage
jobs on different Local Resource Management Systems.
- Redmine/CWA - Cluster Web Access
- Redmine plugin for Cluster Access via the web.
Authentication
- Maintaining Kerberos (or AFS) credentials
- The basic Kerberos security model doesn't fit at all well with batch computing, and it's necessary to subvert it to some extent. See arcxd, discussed in a workshop paper, and AUKS, which is now recommended but currently lacks an SGE integration recipe. There is also an old system, PSR, intended for managing AFS access under PBS, which might be adaptable.
Hadoop
[You might be better off with MR-MPI and/or PHISH for that sort of thing on an HPC system.]
- Herd
- HDFS-aware integration in the SGE distribution. It builds against an old Hadoop distribution. On an HPC system, you probably just want to use a normal parallel filesystem under Hadoop anyway; there are free connectors for OrangeFS, GlusterFS, and now Lustre.
- Hadoop on Demand under SGE (may need updating).
- myHadoop and update(?)
- magpie
- Doesn't have an SGE port, but that's probably easy and worthwhile for a more general framework.
- The Intel scheduling connector
- Also doesn't have an SGE port.
Parallel Debugger Interfaces
See also PTP.
- padb-sge
- Script interfacing the padb parallel debugger to work in terms of SGE jobs.
"Workflow"
In case qmake isn't good enough to express dependencies:
- flow
- "Workflow interpreter and processor" in the distribution as an
example of DRMAA with Ruby.
- Workflow
- Manage the creation, execution and monitoring of a directed acyclic graph (DAG) of commands (somewhat more general than qmake).
- Makeflow
- Wok
- nextflow has an SGE executor
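For simple cases, a DAG of commands can also be expressed directly with SGE's own job dependencies. The following is a minimal sketch, not how any of the tools above work: it walks a dependency graph in topological order and builds `qsub` command lines that use `-N` to name each job and `-hold_jid` to hold it until the jobs it depends on have finished. The job names and script file names are hypothetical.

```python
import shlex
from graphlib import TopologicalSorter  # Python 3.9 or later


def qsub_commands(dag, scripts):
    """Return qsub argument lists in an order that respects the dependencies.

    dag maps a job name to the set of job names it depends on;
    scripts maps every job name to the script to submit (hypothetical paths).
    """
    commands = []
    # static_order() yields each job after all of its dependencies and
    # raises graphlib.CycleError if the graph is not acyclic.
    for name in TopologicalSorter(dag).static_order():
        cmd = ["qsub", "-N", name]
        deps = sorted(dag.get(name, ()))
        if deps:
            # Hold this job until the named jobs have completed.
            cmd += ["-hold_jid", ",".join(deps)]
        cmd.append(scripts[name])
        commands.append(cmd)
    return commands


# Hypothetical three-stage pipeline: fetch, then align, then report.
example_dag = {"align": {"fetch"}, "report": {"align", "fetch"}}
example_scripts = {"fetch": "fetch.sh", "align": "align.sh", "report": "report.sh"}
for command in qsub_commands(example_dag, example_scripts):
    print(shlex.join(command))
```

The sketch only prints the command lines; actually submitting them (for example with `subprocess.run`) is left out so that it can be tried without a cluster. Referring to dependencies by job name assumes the names are unique among the user's jobs.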
IDE support
- Eclipse Parallel Tools Platform
- Supposedly supports SGE, but needs fixing, particularly not to use showq.
Other Interfaces
- Schedule-DRMAAc
- Perl DRMAA binding. Unfortunately the GPL licence isn't
compatible with the SISSL of the SGE DRMAA library.
- Alternative DRMAA bindings
- Gridway has alternative bindings for Java and Ruby to the ones distributed with SGE. It isn't known whether they work against the SGE C library.
- Go DRMAA binding
- Has the same GPL/SISSL licence issue as the Perl one.
- SGE library for Ruby
- Low-level Python interface to Sun Grid Engine
Deployment
- StarCluster
- Deploys an SGE-based cluster on Amazon Web Services (actually needs porting to use SoGE).
- Chef cookbook
Miscellaneous
- sge4vfx
- System for visual effects work.