Job_JAPI -- Grid Engine's API for job submission and control.
JAPI_Implementation -- Functions used to implement JAPI
JAPI_Interface -- The enlisted functions are the interface of the JAPI library
JAPI_Session_state -- All global variables together constitute the state of a JAPI session
static pthread_t japi_event_client_thread;
static int japi_ec_return_value;
static int japi_session = JAPI_SESSION_INACTIVE;
static int japi_ec_state = JAPI_EC_DOWN;
static u_long32 japi_ec_id = 0;
static lList *Master_japi_job_list = NULL;
static int japi_threads_in_session = 0;
static char *japi_session_key = NULL;
static bool japi_delegated_file_staging_is_enabled = false;
japi_event_client_thread - The event client thread. Used by japi_init() and japi_exit() to control start and shutdown of this implementation thread.

japi_ec_return_value - Return value of the event client thread.

japi_session - Reflects the state of a JAPI session. The state is set to JAPI_SESSION_ACTIVE when japi_init() succeeds and to JAPI_SESSION_INACTIVE by japi_exit(). Code using japi_session must be made reentrant with the mutex japi_session_mutex.

japi_ec_state - Used for synchronizing with the startup of the event client thread in japi_init() and with the event client thread in japi_exit(). It also ensures that blocking functions that depend on event client functionality finish when the event client thread finishes as a result of a japi_exit() called by another thread. Code using japi_ec_state must be made reentrant with japi_ec_state_mutex. The condition variable japi_ec_state_starting_cv is used to communicate state transitions.

japi_ec_id - Contains the event client id. It is written by the event client thread and read by the thread doing japi_exit() to unregister the event client from qmaster.

Master_japi_job_list - Contains information about the state of all jobs of this session. It allows japi_wait() and japi_synchronize() to wait for jobs to finish. New jobs are added to this data structure by japi_run_job() and japi_run_bulk_jobs(); job finish information is stored by the event client thread. Jobs are removed by japi_wait() and japi_synchronize() each time a job is reaped. Code depending upon Master_japi_job_list must be made reentrant using the mutex Master_japi_job_list_mutex. The condition variable Master_japi_job_list_finished_cv is used to implement a synchronous wait for job finish information being added. See japi_threads_in_session for the strategy that ensures Master_japi_job_list integrity in case of multiple application threads.

japi_threads_in_session - A counter indicating the number of threads depending upon Master_japi_job_list. Each thread entering such a JAPI call must increase this counter and decrease it again when leaving. Code using japi_threads_in_session must be made reentrant using the mutex japi_threads_in_session_mutex. When the counter is decreased to 0, the condition variable japi_threads_in_session_cv is used to notify japi_exit() that Master_japi_job_list can be released.

japi_session_key - A string key used during event client registration to select only those job events that are related to the JAPI session. Code using japi_session_key must be made reentrant with the mutex japi_session_mutex. It is assumed that the session key is not changed during an active session.

japi_delegated_file_staging_is_enabled - A bool indicating whether delegated file staging is enabled in the cluster configuration. It should always be accessed via japi_is_delegated_file_staging_enabled(), which protects the variable with a mutex.
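The japi_threads_in_session counter described above follows a common pattern: count users under a mutex and signal a condition variable when the last one leaves. The following is a minimal sketch of that pattern, not the JAPI source. All names in it (session_enter(), session_leave(), session_wait_until_idle(), session_thread_count() and the three static variables) are hypothetical stand-ins.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for japi_threads_in_session and its mutex/cv. */
static pthread_mutex_t threads_in_session_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t threads_in_session_cv = PTHREAD_COND_INITIALIZER;
static int threads_in_session = 0;

/* Called by a thread when it enters a call that depends on the job list. */
static void session_enter(void)
{
    pthread_mutex_lock(&threads_in_session_mutex);
    threads_in_session++;
    pthread_mutex_unlock(&threads_in_session_mutex);
}

/* Called when the thread leaves; wakes the shutdown side at count 0. */
static void session_leave(void)
{
    pthread_mutex_lock(&threads_in_session_mutex);
    assert(threads_in_session > 0);
    if (--threads_in_session == 0) {
        pthread_cond_signal(&threads_in_session_cv);
    }
    pthread_mutex_unlock(&threads_in_session_mutex);
}

/* Shutdown side: block until no thread depends on the job list. */
static void session_wait_until_idle(void)
{
    pthread_mutex_lock(&threads_in_session_mutex);
    while (threads_in_session > 0) {
        pthread_cond_wait(&threads_in_session_cv, &threads_in_session_mutex);
    }
    pthread_mutex_unlock(&threads_in_session_mutex);
}

/* Reads the counter under the mutex, for inspection only. */
static int session_thread_count(void)
{
    int n;
    pthread_mutex_lock(&threads_in_session_mutex);
    n = threads_in_session;
    pthread_mutex_unlock(&threads_in_session_mutex);
    return n;
}
```

The wait loop re-checks the counter after every wakeup, so a spurious wakeup cannot let the shutdown side release the list while a thread still uses it.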
japi_add_job() -- Add job/bulk job to library session data
static int japi_add_job(u_long32 jobid, u_long32 start, u_long32 end, u_long32 incr, bool is_array, const char *func)
Add the job/bulk job to the library session data.
u_long32 jobid - the jobid
u_long32 start - start index
u_long32 end - end index
u_long32 incr - increment
bool is_array - true for array/bulk jobs, false otherwise
static int - DRMAA error codes
MT-NOTES: japi_add_job() is not MT safe due to Master_japi_job_list
japi_allocate_string_vector() -- Allocate a string vector
static drmaa_attr_values_t* japi_allocate_string_vector(int type)
Allocate a string vector iterator. Two different variations are supported:
    JAPI_ITERATOR_BULK_JOBS - Provides bulk job id strings in a memory efficient fashion.
    JAPI_ITERATOR_STRINGS - Implements a simple string list.
int type - JAPI_ITERATOR_BULK_JOBS or JAPI_ITERATOR_STRINGS
static drmaa_attr_values_t* - the iterator
MT-NOTE: japi_allocate_string_vector() is MT safe. Should be moved to drmaa.c.
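One way to provide bulk job id strings "in a memory efficient fashion" is to store only the index range and format each id on demand. The sketch below illustrates that idea under assumptions: bulk_iter_t and its functions are hypothetical, and the "jobid.taskid" string form is assumed from the japi_job_ps() notes. It is not the layout of drmaa_attr_values_t.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical iterator state: only the range is kept, never the strings. */
typedef struct {
    unsigned long jobid;
    unsigned long next; /* next task id to hand out */
    unsigned long end;  /* last task id, inclusive */
    unsigned long incr; /* step between task ids */
} bulk_iter_t;

static void bulk_iter_init(bulk_iter_t *it, unsigned long jobid,
                           unsigned long start, unsigned long end,
                           unsigned long incr)
{
    assert(it != NULL && incr > 0);
    it->jobid = jobid;
    it->next = start;
    it->end = end;
    it->incr = incr;
}

/* Writes the next "jobid.taskid" into buf.
 * Returns 1 on success and 0 when the range is exhausted. */
static int bulk_iter_next(bulk_iter_t *it, char *buf, size_t len)
{
    assert(it != NULL && buf != NULL && len > 0);
    if (it->next > it->end) {
        return 0;
    }
    snprintf(buf, len, "%lu.%lu", it->jobid, it->next);
    it->next += it->incr;
    return 1;
}

/* Total number of ids a start/end/incr range yields. */
static unsigned long bulk_iter_count(unsigned long start, unsigned long end,
                                     unsigned long incr)
{
    assert(incr > 0);
    return start > end ? 0 : (end - start) / incr + 1;
}
```

With this layout the memory cost is constant regardless of how many tasks the array job has, which is the point of having a separate bulk variant next to a plain string list.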
japi_clean_up_jobs() -- stops jobs still running in the session
int japi_clean_up_jobs(int flag, dstring *diag)
Deletes jobs running in the session when flag is set to JAPI_EXIT_KILL_ALL or JAPI_EXIT_KILL_PENDING.
int - 0 = OK, 1 = Error
MT-NOTES: japi_clean_up_jobs() is MT safe (assumptions)
japi_control() -- Apply control operation on JAPI jobs.
int japi_control(const char *jobid, int action, dstring *diag)
Apply control operation to the job specified. If 'jobid' is DRMAA_JOB_IDS_SESSION_ALL, then this routine acts on all jobs *submitted* during this DRMAA session. This routine returns once the action has been acknowledged, but does not necessarily wait until the action has been completed.
const char *jobid - The job id or DRMAA_JOB_IDS_SESSION_ALL.
int action - The action to be performed. One of
    DRMAA_CONTROL_SUSPEND: stop the job (qmod -s)
    DRMAA_CONTROL_RESUME: (re)start the job (qmod -us)
    DRMAA_CONTROL_HOLD: put the job on-hold (qhold)
    DRMAA_CONTROL_RELEASE: release the hold on the job (qrls)
    DRMAA_CONTROL_TERMINATE: kill the job (qdel)
dstring *diag - diagnosis information - on error
int - DRMAA error codes
MT-NOTE: japi_control() is MT safe. Would be good to have japi_control() operate on a vector of jobids. Would be good to also interface the operations qmod -r and qmod -c.
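The correspondence between control actions and command line tools listed for japi_control() can be written down as a lookup. This is a small sketch only: the control_action_t enum and control_cli_equivalent() are hypothetical, and the enum is a local stand-in whose numeric values are not claimed to match the DRMAA_CONTROL_* constants.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the DRMAA_CONTROL_* actions (values are arbitrary). */
typedef enum {
    CTRL_SUSPEND,
    CTRL_RESUME,
    CTRL_HOLD,
    CTRL_RELEASE,
    CTRL_TERMINATE
} control_action_t;

/* Returns the command line equivalent named in the parameter list above,
 * or NULL for an action that is not in that list. */
static const char *control_cli_equivalent(control_action_t action)
{
    switch (action) {
    case CTRL_SUSPEND:   return "qmod -s";
    case CTRL_RESUME:    return "qmod -us";
    case CTRL_HOLD:      return "qhold";
    case CTRL_RELEASE:   return "qrls";
    case CTRL_TERMINATE: return "qdel";
    }
    return NULL;
}
```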
japi_delete_string_vector() -- Release all resources of a string vector
void japi_delete_string_vector(drmaa_attr_values_t* iter)
Release all resources of a string vector.
drmaa_attr_values_t* iter - to be released
MT-NOTE: japi_delete_string_vector() is MT safe. Should be moved to drmaa.c.
japi_enable_job_wait() -- Do setup required for doing job waits
int japi_enable_job_wait(const char *session_key_in, dstring *session_key_out, dstring *diag)
Does all of the required setup to be able to use the japi_wait() and japi_synchronize() calls. This includes starting up the event client thread and establishing a session. If japi_init() was called with enable_wait set to false, this method must be called before japi_wait() or japi_synchronize() can be used. This is useful when, for example, one doesn't know for sure at the time japi_init() is called whether japi_wait() will be needed. The overhead associated with starting and stopping the event client thread and creating and destroying a session can thereby be avoided.
const char *session_key_in - if non NULL japi_enable_job_wait() tries to restart a former session using this session key.
error_handler_t handler - A callback to be used for error messages from the event client thread. When NULL, no error messages will be generated by the event client thread. The callback should not free the error message after processing it.
dstring *session_key_out - Returns session key of new session - on success.
dstring *diag - Returns diagnosis information - on failure
int - DRMAA error codes
japi_session_mutex -> japi_ec_state_mutex
MT-NOTE: japi_enable_job_wait() is MT safe
japi_exit() -- Optionally close JAPI session and shutdown JAPI library.
int japi_exit(bool close_session, dstring *diag)
Disengage from JAPI library and allow the JAPI library to perform any necessary internal clean up. Depending on 'close_session' this routine also ends a JAPI Session. japi_exit() has no impact on jobs (e.g., queued and running jobs remain queued and running).
bool close_session - If true the JAPI session is always closed otherwise it remains and can be reopened later on.
dstring *diag - diagnostic information - on error
int - DRMAA error codes
japi_session_mutex -> japi_threads_in_session_mutex
MT-NOTE: japi_exit() is MT safe
japi_get_drm_system() -- Get DRM system implementation information
int japi_get_drm_system(dstring *drm, dstring *diag)
Returns SGE system implementation information. The output contains the DRM name and release information.
dstring *drm - Returns DRM name - on success
dstring *diag - Returns diagnostic information - on error.
int me - Me.wo progname
int - DRMAA error codes
MT-NOTE: japi_get_drm_system() is MT safe
japi_get_job() -- get job and the queue via GDI for job status
static int japi_get_job(u_long32 jobid, lList **retrieved_job_list, dstring *diag)
We use GDI GET to get jobs status. Additionally also the queue list must be retrieved because the (queue) system suspend state is kept in the queue where the job runs.
u_long32 jobid - the job's id
lList **retrieved_job_list - resulting job list
dstring *diag - diagnosis info
static int - DRMAA error codes
MT-NOTES: japi_get_job() is MT safe
Under construction
japi_implementation_thread() -- Control flow implementation thread
MT-NOTE: japi_implementation_thread() is MT safe
japi_init() -- Initialize JAPI library
int japi_init(const char *contact, const char *session_key_in, dstring *session_key_out, dstring *diag)
Initialize JAPI library and create a new JAPI session. This routine must be called before any other JAPI calls, except for japi_version(). Initializes internal data structures. Also registers with qmaster using the event client mechanism if the enable_wait parameter is set to true. If enable_wait is set to false, japi_enable_job_wait() must be called before calling japi_wait() or japi_synchronize(). If enable_wait is set to true, a second thread is spawned as an event client, which imposes threading and synchronization overhead. If japi_wait() and japi_synchronize() are not needed, JAPI can be made much lighter weight by setting enable_wait to false.
const char *contact - 'Contact' is an implementation dependent string which may be used to specify which DRM system to use. If 'contact' is NULL, the default DRM system will be used.
const char *session_key_in - if non NULL japi_init() tries to restart a former session using this session key.
int my_prog_num - the index into prognames to use when registering with the qmaster. See sge_gdi_setup().
bool enable_wait - Whether to start up in multi-threaded mode to allow japi_wait() and japi_synchronize() to function. When true, a new session is created (if needed), and the event client thread is started. When false, no session string is set, and the event client is not started. When false, japi_synchronize() and japi_wait() will return DRMAA_ERRNO_NO_ACTIVE_SESSION. If enable_wait is set to false, job waiting can be explicitly enabled later by calling the japi_enable_job_wait() function.
error_handler_t handler - A callback to be used for error messages from the event client thread. When enable_wait is false, handler should be set to NULL. The callback should not free the error message after processing it.
dstring *session_key_out - Returns session key of new session - on success.
dstring *diag - Returns diagnosis information - on failure
int - DRMAA error codes
japi_session_mutex
MT-NOTE: japi_init() is MT safe
japi_init_mt() -- Per thread library initialization
int japi_init_mt(dstring *diag)
Do all per thread initialization required for libraries JAPI builds upon.
dstring *diag - returns diagnosis information - on error
static int - DRMAA error codes
MT-NOTES: japi_init_mt() is MT safe
japi_job_ps() -- Get job status
int japi_job_ps(const char *job_id_str, int *remote_ps, dstring *diag)
Get the program status of the job identified by 'job_id_str'. The possible values returned in 'remote_ps' and their meanings are:
    DRMAA_PS_UNDETERMINED = 00H : process status cannot be determined
    DRMAA_PS_QUEUED_ACTIVE = 10H : job is queued and active
    DRMAA_PS_SYSTEM_ON_HOLD = 11H : job is queued and in system hold
    DRMAA_PS_USER_ON_HOLD = 12H : job is queued and in user hold
    DRMAA_PS_USER_SYSTEM_ON_HOLD = 13H : job is queued and in user and system hold
    DRMAA_PS_RUNNING = 20H : job is running
    DRMAA_PS_SYSTEM_SUSPENDED = 21H : job is system suspended
    DRMAA_PS_USER_SUSPENDED = 22H : job is user suspended
    DRMAA_PS_USER_SYSTEM_SUSPENDED = 23H : job is user and system suspended
    DRMAA_PS_DONE = 30H : job finished normally
    DRMAA_PS_FAILED = 40H : job finished, but failed
const char *job_id_str - A job id
int *remote_ps - Returns the job state - on success
dstring *diag - Returns diagnosis information - on error.
int - DRMAA error codes
MT-NOTE: japi_job_ps() is MT safe. Would be good to enhance drmaa_job_ps() to operate on an array of jobids. Would be good to have DRMAA_JOB_IDS_SESSION_ALL supported with drmaa_job_ps(). This function should be changed so that local JAPI-internal information is evaluated first and no GDI request is made unless necessary:
    (1) A GDI request isn't actually required for argument checking to prevent "jobid" being passed for array jobs or "jobid.taskid" being passed for non-array jobs. This is true at least for jobs that were submitted during the session, which can be assumed to be the majority. Argument checking can be done based on JJ_type.
    (2) A GDI request isn't actually required if the job finish event has already arrived at JAPI.
In these cases the GDI request could be saved. This would help improve qmaster availability.
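A caller typically turns the value returned in 'remote_ps' into readable text. The sketch below does that with the hexadecimal values listed for japi_job_ps(). The constants are redefined locally so the example compiles without drmaa.h, and ps_to_string() and ps_is_finished() are hypothetical helpers, not part of JAPI.

```c
#include <assert.h>

/* Redefined locally from the table above; real code would use drmaa.h. */
enum {
    DRMAA_PS_UNDETERMINED = 0x00,
    DRMAA_PS_QUEUED_ACTIVE = 0x10,
    DRMAA_PS_SYSTEM_ON_HOLD = 0x11,
    DRMAA_PS_USER_ON_HOLD = 0x12,
    DRMAA_PS_USER_SYSTEM_ON_HOLD = 0x13,
    DRMAA_PS_RUNNING = 0x20,
    DRMAA_PS_SYSTEM_SUSPENDED = 0x21,
    DRMAA_PS_USER_SUSPENDED = 0x22,
    DRMAA_PS_USER_SYSTEM_SUSPENDED = 0x23,
    DRMAA_PS_DONE = 0x30,
    DRMAA_PS_FAILED = 0x40
};

/* Maps a program status to a short description; never returns NULL. */
static const char *ps_to_string(int ps)
{
    switch (ps) {
    case DRMAA_PS_UNDETERMINED:          return "undetermined";
    case DRMAA_PS_QUEUED_ACTIVE:         return "queued and active";
    case DRMAA_PS_SYSTEM_ON_HOLD:        return "queued, system hold";
    case DRMAA_PS_USER_ON_HOLD:          return "queued, user hold";
    case DRMAA_PS_USER_SYSTEM_ON_HOLD:   return "queued, user and system hold";
    case DRMAA_PS_RUNNING:               return "running";
    case DRMAA_PS_SYSTEM_SUSPENDED:      return "system suspended";
    case DRMAA_PS_USER_SUSPENDED:        return "user suspended";
    case DRMAA_PS_USER_SYSTEM_SUSPENDED: return "user and system suspended";
    case DRMAA_PS_DONE:                  return "finished normally";
    case DRMAA_PS_FAILED:                return "finished, but failed";
    default:                             return "unknown status value";
    }
}

/* True for the two states in which the job is no longer in the system. */
static int ps_is_finished(int ps)
{
    return ps == DRMAA_PS_DONE || ps == DRMAA_PS_FAILED;
}
```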
japi_open_session() -- create or reopen JAPI session
static int japi_open_session(const char *key_in, dstring *key_out, dstring *diag)
A JAPI session is created or reopened, depending on the value of key_in. The session key of the opened session is returned.
const char *key_in - If 'key_in' is non NULL it is used to reopen the JAPI session. Otherwise a new session is always created.
dstring *key_out - Returns session key of the session that was opened - on success.
dstring *diag - Diagnosis information - on failure.
static int - DRMAA error codes
MT-NOTE: japi_open_session() is MT safe
japi_parse_jobid() -- Parse jobid string
static int japi_parse_jobid(const char *job_id_str, u_long32 *jp, u_long32 *tp, bool *ap, dstring *diag)
The string is parsed. The job id and task id are returned, along with whether the id appears to be an array task id.
const char *job_id_str - the jobid string
u_long32 *jp - destination for jobid
u_long32 *tp - destination for taskid
bool *ap - was it an array task
dstring *diag - diagnosis
static int - DRMAA error codes
MT-NOTE: japi_parse_jobid() is MT safe
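To illustrate what parsing a job id string involves, here is a minimal sketch that accepts either "jobid" or "jobid.taskid". It rests on assumptions: the two string forms are taken from the japi_job_ps() notes, the task id defaults to 1 for non-array ids as described for japi_sge_state_to_drmaa_state(), and parse_jobid() with its 0/-1 return convention is hypothetical rather than the actual japi_parse_jobid() implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Parses "jobid" or "jobid.taskid".
 * Returns 0 on success and -1 on malformed input. */
static int parse_jobid(const char *s, unsigned long *jobid,
                       unsigned long *taskid, bool *is_array_task)
{
    char *end = NULL;

    assert(s != NULL && jobid != NULL && taskid != NULL && is_array_task != NULL);

    /* strtoul() would accept signs and leading blanks, so check first. */
    if (*s < '0' || *s > '9') {
        return -1;
    }
    errno = 0;
    *jobid = strtoul(s, &end, 10);
    if (errno != 0) {
        return -1;
    }

    if (*end == '\0') {
        /* Plain job id: not an array task, task id defaults to 1. */
        *taskid = 1;
        *is_array_task = false;
        return 0;
    }
    if (*end != '.') {
        return -1;
    }

    s = end + 1;
    if (*s < '0' || *s > '9') {
        return -1;
    }
    errno = 0;
    *taskid = strtoul(s, &end, 10);
    if (errno != 0 || *end != '\0') {
        return -1;
    }
    *is_array_task = true;
    return 0;
}
```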
japi_run_bulk_jobs() -- Submit a bulk of jobs
int japi_run_bulk_jobs(drmaa_attr_values_t **jobidsp, lListElem *sge_job_template, int start, int end, int incr, dstring *diag)
Submit the SGE job template as array job.
lListElem *sge_job_template - SGE job template
int start - array job start index
int end - array job end index
int incr - array job increment
drmaa_attr_values_t **jobidsp - a string array of jobids - on success
int - DRMAA error codes
MT-NOTE: japi_run_bulk_jobs() is MT safe. Would be better to return job_id instead of drmaa_attr_values_t.
japi_run_job() -- Submit a job using a SGE job template.
int japi_run_job(dstring *job_id, lListElem *sge_job_template, bool use_euid_egid, dstring *diag)
The job described in the SGE job template is submitted. The id of the job is returned. If use_euid_egid is true, the job is run with the current effective uid and gid rather than the real ones.
lListElem **sge_job_template - SGE job template. Might be modified by JSV
dstring *job_id - SGE jobid as string - on success.
dstring *diag - diagnosis information - on error.
int - DRMAA error codes
japi_session_mutex -> japi_threads_in_session_mutex
Master_japi_job_list_mutex
japi_threads_in_session_mutex
MT-NOTE: japi_run_job() is MT safe. Would be better to return job_id as u_long32.
japi_send_job() -- Send job to qmaster using GDI
static int japi_send_job(lListElem *job, u_long32 *jobid, dstring *diag)
The job passed is sent to qmaster using GDI. The jobid is returned.
lListElem *job - the job (JB_Type)
u_long32 *jobid - destination for resulting jobid
dstring *diag - diagnosis information
int - DRMAA error codes
MT-NOTE: japi_send_job() is MT safe
japi_sge_state_to_drmaa_state() -- Map Grid Engine state into DRMAA state
static int japi_sge_state_to_drmaa_state(lListElem *job, bool is_array_task, u_long32 jobid, u_long32 taskid, int *remote_ps, dstring *diag)
All Grid Engine state information is used and combined into a DRMAA job state.
lListElem *job - the job (JB_Type)
bool is_array_task - if false jobid is considered the job id of a seq. job, if true jobid and taskid must fit to an existing array task.
u_long32 jobid - the jobid of a seq. job or an array job
u_long32 taskid - the array task id in case of array jobs, 1 otherwise
int *remote_ps - destination of DRMAA job state
dstring *diag - diagnosis information
static int - DRMAA error codes
MT-NOTE: japi_sge_state_to_drmaa_state() is MT safe
japi_standard_error() -- Provide standard diagnosis message.
static void japi_standard_error(int drmaa_errno, dstring *diag)
int drmaa_errno - DRMAA error code
dstring *diag - diagnosis message
MT-NOTE: japi_standard_error() is MT safe
japi_stop_event_client() -- stops the event client
int japi_stop_event_client(void)
Uses the Event Master interface to send a SHUTDOWN event to the event client.
int - 0 = OK, 1 = Error
MT-NOTES: japi_stop_event_client() is MT safe (assumptions)
japi_strerror() -- JAPI strerror()
void japi_strerror(int drmaa_errno, char *error_string, int error_len)
Returns readable text version of errno (constant string)
int drmaa_errno - DRMAA error code
A string describing the DRMAA error case for valid DRMAA error code and NULL otherwise.
MT-NOTE: japi_strerror() is MT safe
japi_string_vector_get_next() -- Return next entry of a string vector
int japi_string_vector_get_next(drmaa_attr_values_t* iter, dstring *val)
The next entry of a string vector is returned. DRMAA_ERRNO_NO_MORE_ELEMENTS is returned for an empty string vector.
drmaa_attr_values_t* iter - The string vector
dstring *val - Returns next string value - on success.
int - DRMAA error codes
MT-NOTE: japi_string_vector_get_next() is MT safe
japi_string_vector_get_num() -- Return number of entries of a string vector
int japi_string_vector_get_num(drmaa_attr_values_t* iter)
Returns the total number of elements in the string vector.
drmaa_attr_values_t* iter - The string vector
int - number of entries, -1 on failure
MT-NOTE: japi_string_vector_get_num() is MT safe
japi_sync_job_tasks() -- adjusts JAPI job structure tasks to match the state of the SGE job structure tasks
int japi_sync_job_tasks(lListElem *japi_job, lListElem *sge_job)
Iterates through the JAPI job structure's JJ_not_yet_finished_task_ids list and moves finished jobs into the JJ_finished_tasks list.
The number of finished tasks
MT-NOTES: japi_sync_job_tasks() is MT safe.
japi_synchronize() -- Synchronize with jobs to finish w/ and w/o reaping job finish information.
int japi_synchronize(const char *job_ids[], signed long timeout, bool dispose, dstring *diag)
Wait until all jobs specified by 'job_ids' have finished execution. When DRMAA_JOB_IDS_SESSION_ALL is used as jobid one can synchronize with all jobs that were submitted during this JAPI session. A timeout can be specified to prevent blocking indefinitely. If the call exits before the timeout, either all the jobs have been waited on or there was an interrupt. If the invocation exits on timeout, the return code is DRMAA_ERRNO_EXIT_TIMEOUT. The dispose parameter specifies whether job finish information shall be reaped. This method requires the event client to have been started, either by passing enable_wait as true to japi_init() or by calling japi_enable_job_wait().
const char *job_ids[] - A vector of job id strings.
signed long timeout - timeout in seconds, or DRMAA_TIMEOUT_WAIT_FOREVER for infinite waiting, or DRMAA_TIMEOUT_NO_WAIT for immediate returning
bool dispose - Whether job finish information shall be reaped.
dstring *diag - Diagnosis information - on error.
int - DRMAA error codes
japi_session_mutex -> japi_threads_in_session_mutex
MT-NOTE: japi_synchronize() is MT safe. The caller must check system time before and after this call in order to check how much time has passed. This should be improved.
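Because the call does not report how much of the timeout was consumed, a caller that retries after an interrupt has to do the bookkeeping itself. The following sketch shows one way to compute the remaining budget from timestamps taken before and after the call. remaining_timeout() is a hypothetical helper, and it applies only to finite, non-negative timeouts, not to the special DRMAA_TIMEOUT_* values.

```c
#include <assert.h>
#include <time.h>

/* Seconds left of a timeout budget after a call that started at 'start'
 * and returned at 'now'. The result is never negative. */
static long remaining_timeout(long timeout, time_t start, time_t now)
{
    double used = difftime(now, start);

    assert(timeout >= 0);
    if (used < 0) {
        used = 0; /* the system clock was stepped backwards */
    }
    if (used >= (double)timeout) {
        return 0;
    }
    return timeout - (long)used;
}
```

A caller would record time(NULL) before invoking japi_synchronize(), record it again after an interrupted return, and pass the result of remaining_timeout() as the timeout of the retry.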
japi_synchronize_jobids_retry() -- Look whether particular jobs finished
static int japi_synchronize_jobids_retry(const char *job_ids[], int dispose)
The Master_japi_job_list is searched to investigate whether particular jobs specified in job_ids finished. If dispose is true job finish information is also removed during this operation.
const char *job_ids[] - the jobids
bool dispose - should job finish information be removed
static int - JAPI_WAIT_ALLFINISHED = there is nothing more to wait for
             JAPI_WAIT_UNFINISHED = there are still unfinished tasks
japi_synchronize_jobids_retry() does no error checking with the job_ids passed. The assumption is that this was ensured before japi_synchronize_jobids_retry() is called.
MT-NOTE: Due to access to Master_japi_job_list, japi_synchronize_jobids_retry() is not MT safe; only one instance may be called at a time!
japi_user_hold_add_jobid() -- Helper function for composing GDI request
static int japi_user_hold_add_jobid(u_long32 gdi_action, lList **request_list, u_long32 jobid, u_long32 taskid, bool array, dstring *diag)
Adds a reduced job structure to the request list that causes the job/task to be held/released when it is used with sge_gdi(SGE_JB_LIST, SGE_GDI_MOD).
u_long32 gdi_action - the GDI action to be performed
lList **request_list - the request list we operate on
u_long32 jobid - the jobid
u_long32 taskid - the taskid
bool array - true in case of an array job
dstring *diag - diagnosis information in case of an error
int - DRMAA error codes
MT-NOTE: japi_user_hold_add_jobid() is MT safe
japi_wait() -- Wait for job(s) to finish and reap job finish info
int japi_wait(const char *job_id, dstring *waited_job, int *stat, signed long timeout, drmaa_attr_values_t **rusage, dstring *diag)
This routine waits for a job with job_id to fail or finish execution. Passing the special string DRMAA_JOB_IDS_SESSION_ANY instead of a job_id waits for any job. If such a job was successfully waited for, its job_id is returned in the second parameter. This routine is modeled on the wait3 POSIX routine. To prevent blocking indefinitely in this call, the caller can use timeout to specify after how many seconds the call times out. If the call exits before the timeout, either the job has been waited on successfully or there was an interrupt. If the invocation exits on timeout, the return code is DRMAA_ERRNO_EXIT_TIMEOUT. The caller should check system time before and after this call in order to check how much time has passed. The routine reaps jobs on a successful call, so any subsequent call to japi_wait() for the same job should fail with DRMAA_ERRNO_INVALID_JOB, meaning that the job has already been reaped. This error is the same as if the job was unknown. Because a call that fails due to an elapsed timeout does not reap the job, japi_wait() can be issued multiple times for the same job_id. This method requires the event client to have been started, either by passing enable_wait as true to japi_init() or by calling japi_enable_job_wait().
const char *job_id - job id string representation of job to wait for or DRMAA_JOB_IDS_SESSION_ANY to wait for any job
signed long timeout - timeout in seconds, or DRMAA_TIMEOUT_WAIT_FOREVER for infinite waiting, or DRMAA_TIMEOUT_NO_WAIT for immediate returning
dstring *waited_job - returns job id string representation of waited job
int *wait_status - returns job finish information about exit status/signal/whatever
int event_mask - Indicates what events to listen for. Can be JAPI_JOB_START, JAPI_JOB_FINISH, or a combination by oring them together.
int *event - returns the actual event that occurred. When the event_mask includes JAPI_JOB_START, this parameter must be checked to be sure that a JAPI_JOB_START event was received. It is possible, such as in the case of a rejected immediate job, that japi_wait() will return DRMAA_ERRNO_SUCCESS for a JAPI_JOB_FINISH event even though the event_mask was set to JAPI_JOB_START.
drmaa_attr_values_t **rusage - returns resource usage information about job run when waiting for JAPI_JOB_FINISH.
dstring *diag - diagnosis information in case japi_wait() fails
DRMAA_ERRNO_SUCCESS - Job finished.
DRMAA_ERRNO_EXIT_TIMEOUT - No job end within specified time.
DRMAA_ERRNO_INVALID_JOB - The job id specified was invalid or DRMAA_JOB_IDS_SESSION_ANY has been specified and all jobs of this session have already finished.
DRMAA_ERRNO_NO_ACTIVE_SESSION - No active session.
DRMAA_ERRNO_DRM_COMMUNICATION_FAILURE
DRMAA_ERRNO_AUTH_FAILURE
DRMAA_ERRNO_NO_RUSAGE
japi_session_mutex -> japi_threads_in_session_mutex
Master_japi_job_list_mutex -> japi_ec_state_mutex
MT-NOTE: japi_wait() is MT safe. Would be good to also return information about job failures in JJAT_failed_text. Would be good to enhance japi_wait() so that it can wait not only for job finish events but also for other events that have a meaning for the end user, e.g. job scheduled, job started, job rescheduled.
japi_wait_retry() -- seek for job_id in JJ_finished_jobs of all jobs
static int japi_wait_retry(lList *japi_job_list, int wait4any, int jobid, int taskid, bool is_array_task, u_long32 *wjobidp, u_long32 *wtaskidp, bool *wis_task_arrayp, int *wait_status)
Search the passed japi_job_list for finished jobs matching the wait4any/ jobid/taskid condition.
lList *japi_job_list - The JJ_Type japi joblist that is searched.
int wait4any - 0 any finished job/task is fine
u_long32 jobid - specifies which job is searched
u_long32 taskid - specifies which task is searched
bool is_array_task - true if it is an array taskid
int event_mask - the events to wait for
u_long32 *wjobidp - destination for jobid of waited job
u_long32 *wtaskidp - destination for taskid of waited job
u_long32 *wis_task_arrayp - destination for whether the waited job is an array task
int *wait_status - destination for status that is finally returned by japi_wait()
int *wevent - destination for actual event received
lList **rusagep - destination for rusage info of waited job
static int - JAPI_WAIT_ALLFINISHED = there is nothing more to wait for
             JAPI_WAIT_UNFINISHED = no job/task finished, but there are still unfinished tasks
             JAPI_WAIT_FINISHED = got a finished task
MT-NOTE: japi_wait_retry() is MT safe
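The three return values can be understood as the outcome of one pass over the session's job list. The sketch below mimics that pass on a plain array. Everything in it is hypothetical: job_entry_t, wait_retry() and the WAIT_* constants are stand-ins, and the real function works on JJ_Type lists and also handles task ids, events and rusage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the JAPI_WAIT_* return values. */
enum { WAIT_ALLFINISHED, WAIT_UNFINISHED, WAIT_FINISHED };

/* Hypothetical, much simplified session job entry. */
typedef struct {
    unsigned long jobid;
    bool finished; /* finish information has arrived */
    bool reaped;   /* finish information was already handed out */
} job_entry_t;

/* One search pass: reap the first finished match, otherwise report
 * whether anything matching is still pending. */
static int wait_retry(job_entry_t *jobs, size_t n, bool wait4any,
                      unsigned long jobid, unsigned long *waited)
{
    bool pending = false;
    size_t i;

    assert(jobs != NULL && waited != NULL);
    for (i = 0; i < n; i++) {
        if (jobs[i].reaped) {
            continue;
        }
        if (!wait4any && jobs[i].jobid != jobid) {
            continue;
        }
        if (jobs[i].finished) {
            jobs[i].reaped = true;
            *waited = jobs[i].jobid;
            return WAIT_FINISHED;
        }
        pending = true;
    }
    return pending ? WAIT_UNFINISHED : WAIT_ALLFINISHED;
}
```

A blocking wait would call such a pass in a loop and sleep on a condition variable whenever the result indicates unfinished jobs, until finish information is added or the timeout elapses.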
japi_wexitstatus() -- Get jobs exit status.
int japi_wexitstatus(int *exit_status, int stat, dstring *diag)
Retrieves the exit status of a job, assuming it exited regularly according to japi_wifexited().
int stat - 'stat' value returned by japi_wait()
int *exit_status - Returns the job's exit status - on success.
dstring *diag - Returns diagnosis information - on error.
int - DRMAA error codes
MT-NOTE: japi_wexitstatus() is MT safe
japi_wifaborted() -- Did the job ever run?
int japi_wifaborted(int *aborted, int stat, dstring *diag)
Evaluates into 'aborted' a non-zero value if 'stat' was returned for a JAPI job that ended before entering the running state.
int stat - 'stat' value returned by japi_wait()
int *aborted - Returns 1 if the job was aborted, 0 otherwise - on success.
dstring *diag - Returns diagnosis information - on error.
int - DRMAA error codes
MT-NOTE: japi_wifaborted() is MT safe
japi_wifcoredump() -- Did job core dump?
int japi_wifcoredump(int *core_dumped, int stat, dstring *diag)
If drmaa_wifsignaled() indicates a job died through a signal this function evaluates into 'core_dumped' a non-zero value if a core image of the terminated job was created.
int stat - 'stat' value returned by japi_wait()
int *core_dumped - Returns 1 if a core image was created, 0 otherwise - on success.
dstring *diag - Returns diagnosis information - on error.
int - DRMAA error codes
MT-NOTE: japi_wifcoredump() is MT safe
japi_wifexited() -- Has job exited?
int japi_wifexited(int *exited, int stat, dstring *diag)
Checks whether a job has exited regularly. If 'exited' returns 1 the exit status can be retrieved using japi_wexitstatus().
int stat - 'stat' value returned by japi_wait()
int *exited - Returns 1 if the job exited, 0 otherwise - on success.
dstring *diag - Returns diagnosis information - on error.
int - DRMAA error codes
MT-NOTE: japi_wifexited() is MT safe
japi_wifsignaled() -- Did the job die through a signal.
int japi_wifsignaled(int *signaled, int stat, dstring *diag)
Checks whether a job died through a signal. If 'signaled' returns 1 the signal can be retrieved using japi_wtermsig().
int stat - 'stat' value returned by japi_wait()
int *signaled - Returns 1 if the job died through a signal, 0 otherwise - on success.
dstring *diag - Returns diagnosis information - on error.
int - DRMAA error codes
MT-NOTE: japi_wifsignaled() is MT safe
japi_wtermsig() -- Retrieve the signal a job died through.
int japi_wtermsig(dstring *sig, int stat, dstring *diag)
Retrieves the signal of a job, assuming it died through a signal according to japi_wifsignaled().
int stat - 'stat' value returned by japi_wait()
dstring *sig - Returns the signal the job died through in string form (e.g. "SIGKILL")
dstring *diag - Returns diagnosis information - on error.
int - DRMAA error codes
MT-NOTE: japi_wtermsig() is MT safe. Would be better to directly return the SGE signal value, instead of a string.
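The status helpers documented above depend on each other: the exit status is meaningful only when the job exited regularly, and the signal name and core dump flag only when it died through a signal. The sketch below shows that decision order on values a caller has already obtained. It does not decode the 'stat' value itself, and job_outcome_t and describe_outcome() are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical container for results already fetched with
 * japi_wifaborted(), japi_wifexited(), japi_wexitstatus(),
 * japi_wifsignaled(), japi_wifcoredump() and japi_wtermsig(). */
typedef struct {
    int aborted;
    int exited;
    int exit_status;         /* valid only if exited is non-zero */
    int signaled;
    int core_dumped;         /* valid only if signaled is non-zero */
    const char *signal_name; /* valid only if signaled is non-zero */
} job_outcome_t;

/* Formats a one-line summary into buf and returns snprintf()'s result. */
static int describe_outcome(const job_outcome_t *o, char *buf, size_t len)
{
    assert(o != NULL && buf != NULL && len > 0);

    if (o->aborted) {
        return snprintf(buf, len, "never ran");
    }
    if (o->exited) {
        return snprintf(buf, len, "exited with status %d", o->exit_status);
    }
    if (o->signaled) {
        return snprintf(buf, len, "killed by %s%s",
                        o->signal_name ? o->signal_name : "unknown signal",
                        o->core_dumped ? " (core dumped)" : "");
    }
    return snprintf(buf, len, "finished with unclear conditions");
}
```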
do_gdi_delete() -- Delete the job list
static int do_gdi_delete(lList **id_list, int action, bool delete_all, dstring *diag)
Deletes all the jobs in the job id list, converts any GDI errors into DRMAA errors, and frees the job id list.
lList **id_list - List of job ids to delete. Gets freed.
int action - The action that caused this delete
bool delete_all - Whether this call is deleting all jobs in the session
dstring *diag - returns diagnosis information - on error
int - DRMAA_ERRNO_SUCCESS on success, DRMAA error code on error.
MT-NOTES: do_gdi_delete() is MT safe
japi_get_contact() -- Return current contact information
void japi_get_contact(dstring *contact)
Current contact information for DRM system
dstring *contact - Returns a string similar to 'contact' of japi_init().
int - DRMAA error code
MT-NOTES: japi_get_contact() is MT safe
japi_is_delegated_file_staging_enabled() -- Is file staging enabled, i.e. is the "delegated_file_staging" configuration entry set to true?
bool japi_is_delegated_file_staging_enabled()
Returns if delegated file staging is enabled.
bool - true if delegated file staging is enabled, else false.
MT-NOTES: japi_is_delegated_file_staging_enabled() is MT safe
japi_read_dynamic_attributes() -- Read the 'dynamic' attributes from the DRM configuration.
static int japi_read_dynamic_attributes(dstring *diag)
Reads from the DRM configuration, which 'dynamic' attributes are enabled.
dstring *diag - returns diagnosis information - on error
int - DRMAA_ERRNO_SUCCESS on success, DRMAA_ERRNO_DRM_COMMUNICATION_FAILURE, DRMAA_ERRNO_INVALID_ARGUMENT on error.
MT-NOTES: japi_read_dynamic_attributes() is not MT safe. It assumes that the calling thread holds the session mutex.
japi_subscribe_job_list() -- Do event subscription for job list
static void japi_subscribe_job_list(const char *japi_session_key, sge_evc_class_t *evc)
Event subscription for the job list can be very costly. It requires qmaster to copy the entire job list temporarily at the time the event is registered. For that reason subscribing to the job list was factored out, so that it is done only when required. Subscribing to the job list event is required only
    (a) when the event client connection breaks down, e.g. due to qmaster being shut down and restarted
    (b) when a JAPI session is restarted, e.g. when DRMAA is used
const char *japi_session_key - JAPI session key
sge_evc_class_t *evc - event client object
MT-NOTE: japi_subscribe_job_list() is MT safe
japi_version() -- Return DRMAA version the JAPI library is compliant to.
void japi_version(unsigned int *major, unsigned int *minor)
Return DRMAA version the JAPI library is compliant to.
unsigned int *major - ???
unsigned int *minor - ???
void - none
MT-NOTE: japi_version() is MT safe
japi_was_init_called() -- Check whether japi_init() was already called
int japi_was_init_called(dstring* diag)
Check if japi_init was already called.
dstring *diag - returns diagnosis information - on error
int - DRMAA_ERRNO_SUCCESS if japi_init was already called,
      DRMAA_ERRNO_NO_ACTIVE_SESSION if japi_init was not called,
      DRMAA_ERRNO_INTERNAL_ERROR if an unexpected error occurs.
MT-NOTES: japi_was_init_called() is MT safe