Workload Management

Workload Management API functions via PyLoadL


SYNOPSIS


  #
  # Workload Management API
  #

  (rc, errObj) = ll_cluster( cluster_list, CLUSTER_SET | CLUSTER_UNSET )

  (rc, errObj) = ll_cluster_auth()

  rc = ll_control( control_op, host_list, user_list, job_list, class_list, priority )

  rc = llctl( LL_CONTROL_RECYCLE | LL_CONTROL_RECONFIG |
              LL_CONTROL_START | LL_CONTROL_STOP |
              LL_CONTROL_DRAIN | LL_CONTROL_DRAIN_STARTD |
              LL_CONTROL_DRAIN_SCHEDD | LL_CONTROL_PURGE_SCHEDD |
              LL_CONTROL_FLUSH | LL_CONTROL_SUSPEND | LL_CONTROL_RESUME |
              LL_CONTROL_RESUME_STARTD | LL_CONTROL_RESUME_SCHEDD |
              LL_CONTROL_FAVOR_JOB | LL_CONTROL_UNFAVOR_JOB |
              LL_CONTROL_FAVOR_USER | LL_CONTROL_UNFAVOR_USER |
              LL_CONTROL_HOLD_USER | LL_CONTROL_HOLD_SYSTEM |
              LL_CONTROL_HOLD_RELEASE | LL_CONTROL_PRIO_ABS |
              LL_CONTROL_PRIO_ADJ | LL_CONTROL_START_DRAINED,
              host_list, class_list )

  rc = llfavorjob( LL_CONTROL_FAVOR_JOB | LL_CONTROL_UNFAVOR_JOB, job_list )

  rc = llfavoruser( LL_CONTROL_FAVOR_USER | LL_CONTROL_UNFAVOR_USER, user_list )

  rc = llhold( LL_CONTROL_HOLD_USER | LL_CONTROL_HOLD_SYSTEM |
                    LL_CONTROL_HOLD_RELEASE, host_list, user_list, job_list )

  (rc, errObj) = ll_modify( EXECUTION_FACTOR | CONSUMABLE_CPUS |
                                    CONSUMABLE_MEMORY | WCLIMIT_ADD_MIN |
                                    JOB_CLASS | ACCOUNT_NO, value, job_step )

  (rc, errObj) = ll_move_job( job_id, cluster_name )

  rc = llprio( LL_CONTROL_PRIO_ABS | LL_CONTROL_PRIO_ADJ, job_list, priority )
  
  (rc, errObj) = ll_preempt( job_step_id, PREEMPT_STEP | RESUME_STEP | SYSTEM_PREEMPT_STEP )

  (rc, errObj) = ll_preempt_jobs( user_list, host_list, job_list, PREEMPT_STEP | RESUME_STEP,
                                    LL_PREEMPT_SUSPEND | LL_PREEMPT_VACATE | LL_PREEMPT_REMOVE |
                                    LL_PREEMPT_SYS_HOLD | LL_PREEMPT_USER_HOLD )

  rc = ll_run_scheduler()

  rc = ll_start_job_ext( cluster, proc, from_host, node_list )
  
  rc = ll_terminate_job( cluster, proc, from_host, msg )


DESCRIPTION

The LoadLeveler Workload Management API via PyLoadL has the following functions:

ll_cluster
ll_cluster_auth
ll_control
llctl
llfavorjob
llfavoruser
llhold
ll_modify
ll_move_job
llprio
ll_preempt
ll_preempt_jobs
ll_run_scheduler
ll_start_job_ext
ll_terminate_job

ll_cluster

Function to select a cluster, so that subsequent function calls are directed to it, or to unselect a previously selected cluster.

  (rc, errObj) = ll_cluster( cluster_list, cluster_op )
  

Parameters

  1. cluster_list

    List : Currently restricted to a list of one.

  2. cluster_op

    Numeric : CLUSTER_SET - select cluster, CLUSTER_UNSET - unselect cluster.
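
The select and unselect operations naturally come in pairs. The following is a minimal sketch of one way to keep them paired; it is not part of PyLoadL. The helper name selected_cluster is hypothetical, and ll_cluster, CLUSTER_SET and CLUSTER_UNSET are passed in as arguments so that the example runs without the module installed. It assumes that a non-zero rc signals failure and that the unselect call accepts the same cluster list.

```python
from contextlib import contextmanager


@contextmanager
def selected_cluster(ll_cluster, cluster_name, set_op, unset_op):
    """Select cluster_name for the calls made inside the with-block.

    ll_cluster, set_op and unset_op are expected to be PyLoadL's
    ll_cluster, CLUSTER_SET and CLUSTER_UNSET (hypothetical wiring).
    """
    # cluster_list is currently restricted to a list of one.
    rc, err_obj = ll_cluster([cluster_name], set_op)
    if rc != 0:  # assumption: non-zero rc means the call failed
        raise RuntimeError("ll_cluster set failed, rc=%s" % rc)
    try:
        yield
    finally:
        # Unselect the cluster even if the body raised an exception.
        ll_cluster([cluster_name], unset_op)
```

With the real module this would be used as "with selected_cluster(ll_cluster, name, CLUSTER_SET, CLUSTER_UNSET):" around the calls that should run against that cluster.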

ll_cluster_auth

Function to generate SSL keys, necessary for secure multicluster communications.

  (rc, errObj) = ll_cluster_auth()
  
ll_control

Function to perform control operations against hosts, jobs, users or job classes.

  rc = ll_control( control_op, host_list, user_list, job_list, class_list, priority )
  

Parameters

  1. control_op

    Constant : Control operation to perform, one of the LL_CONTROL_* constants listed in the SYNOPSIS.

  2. host_list

    List : Host machines to perform control operation on.

  3. user_list

    List : Users to perform control operation on.

  4. job_list

    List : Job step IDs to perform control operation on.

  5. class_list

    List : Job classes to perform control operation on.

  6. priority

    Numeric : Value to be assigned for control operation.

llfavoruser

Function to favor or unfavor the given users. This is a wrapper around ll_control.

  rc = llfavoruser( LL_CONTROL_FAVOR_USER | LL_CONTROL_UNFAVOR_USER, user_list )

Parameters

  1. Favor Operation

    1. LL_CONTROL_FAVOR_USER

      Favor the users in user_list.

    2. LL_CONTROL_UNFAVOR_USER

      Unfavor the users in user_list.

  2. user_list

    List : Users to perform favor operation on.

llhold

Function to hold or release the given job steps or users. This is a wrapper around ll_control.

  rc = llhold( LL_CONTROL_HOLD_USER | LL_CONTROL_HOLD_SYSTEM | LL_CONTROL_HOLD_RELEASE, host_list, user_list, job_list )

Parameters

  1. Hold Operation

    1. LL_CONTROL_HOLD_USER

      Place on user hold.

    2. LL_CONTROL_HOLD_SYSTEM

      Place on system hold. You need to be a LoadLeveler administrator to perform this operation.

    3. LL_CONTROL_HOLD_RELEASE

      Release from hold. You need to be a LoadLeveler administrator to perform this against system held jobs.

  2. host_list

    List : Host machines.

  3. user_list

    List : Users to perform hold operation on.

  4. job_list

    List : Job step IDs to perform hold operation on.

llprio

Function to adjust the priorities of job steps. This is a wrapper around ll_control.

  rc = llprio( LL_CONTROL_PRIO_ABS | LL_CONTROL_PRIO_ADJ, job_list, priority )

Parameters

  1. Priority Operation

    1. LL_CONTROL_PRIO_ABS

      New absolute priority value.

    2. LL_CONTROL_PRIO_ADJ

      New adjusted priority value.

  2. job_list

    List : Job step IDs.

  3. priority

    Numeric : Priority value to assign to the list of job steps.
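
Since llprio is described as a wrapper of ll_control, its behaviour can be pictured as a single forwarding call. The following is a sketch only, not PyLoadL's actual code: the name llprio_sketch is hypothetical, ll_control is passed in so the example runs without the module, and passing empty lists for the unused host, user and class arguments is an assumption.

```python
def llprio_sketch(ll_control, prio_op, job_list, priority):
    """Forward a priority operation to ll_control (illustrative only)."""
    # Argument order follows the documented signature:
    #   ll_control(control_op, host_list, user_list, job_list,
    #              class_list, priority)
    # Only job_list and priority matter for a priority operation; the
    # other lists are left empty (assumption).
    return ll_control(prio_op, [], [], list(job_list), [], priority)
```

The same pattern would apply to llfavoruser and llhold, with user_list or host_list filled in instead.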

ll_preempt

Function to preempt a running job step, or to resume a job step that has already been preempted through the LoadLeveler llpreempt command or via ll_preempt. ll_preempt cannot resume a job step preempted through PREEMPT_CLASS (system-initiated).

  (rc, errObj) = ll_preempt(job_step, preempt_op)

Parameters

  1. job_step

    String : The Job Step ID.

  2. preempt_op

    Constant : Preemption operation, which can be the following -

    1. PREEMPT_STEP

      Preempts the job step.

    2. RESUME_STEP

      Resumes the job step.
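
Functions that return an (rc, errObj) pair, such as ll_preempt, can be wrapped so that failures surface as exceptions. The following is a minimal sketch under stated assumptions: LoadLError and checked are hypothetical names that do not exist in PyLoadL, and a non-zero rc is assumed to indicate failure.

```python
class LoadLError(Exception):
    """Raised when a wrapped call reports failure (hypothetical helper)."""

    def __init__(self, func_name, rc, err_obj):
        super().__init__("%s failed with rc=%s" % (func_name, rc))
        self.rc = rc
        self.err_obj = err_obj  # kept so the caller can inspect it


def checked(func, *args):
    """Call func(*args), which must return (rc, errObj), and check rc."""
    rc, err_obj = func(*args)
    if rc != 0:  # assumption: non-zero rc means the call failed
        raise LoadLError(getattr(func, "__name__", "call"), rc, err_obj)
    return rc
```

With the real module a call would look like checked(ll_preempt, "shivling.5.0", PREEMPT_STEP).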

ll_preempt_jobs

Function to preempt a set of running job steps using the specified preempt method, or to resume job steps that have already been preempted with the preempt method of suspend through the llpreempt command or the ll_preempt_jobs routine. The ll_preempt_jobs routine cannot resume a job step that was preempted through the PREEMPT_CLASS rules, or a job step that was preempted with a preempt method other than suspend.

  (rc, errObj) = ll_preempt_jobs(user_list, host_list, job_list, preempt_op, preempt_method)

Parameters

  1. user_list

    List : Users to be targeted.

  2. host_list

    List : Hosts to be targeted.

  3. job_list

    List : Job steps in the form host.job_id.step_id, e.g. shivling.5.0

  4. preempt_op - Preemption operation to perform
    1. PREEMPT_STEP

      Preempts the job step.

    2. RESUME_STEP

      Resumes the job step.

  5. preempt_method - Preemption method to perform
    1. LL_PREEMPT_SUSPEND

      Suspends the job step.

    2. LL_PREEMPT_VACATE

      Vacates the job step.

    3. LL_PREEMPT_REMOVE

      Removes the job step.

    4. LL_PREEMPT_SYS_HOLD

      Places the job step on system hold.

    5. LL_PREEMPT_USER_HOLD

      Places the job step on user hold.
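
Entries in job_list use the host.job_id.step_id form, which can be validated before the list is handed to ll_preempt_jobs. The following is a small sketch; parse_job_step_id is a hypothetical helper, not a PyLoadL function, and it assumes that job_id and step_id are always integers.

```python
def parse_job_step_id(step_id):
    """Split 'host.job_id.step_id' into (host, job_id, step_id)."""
    # Split from the right so that dotted host names stay intact.
    parts = step_id.rsplit(".", 2)
    if len(parts) != 3 or not parts[0]:
        raise ValueError("expected host.job_id.step_id, got %r" % step_id)
    host, job_id, step = parts
    # int() raises ValueError if either numeric part is malformed.
    return host, int(job_id), int(step)
```

For the example ID used above, parse_job_step_id("shivling.5.0") gives ("shivling", 5, 0).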

ll_start_job_ext

Function to instruct the LoadLeveler negotiator to start a job on the specified nodes and adapters. This is meant for use by people writing external schedulers.

  rc = ll_start_job_ext( step_id, node_list, adapter_list )

Parameters

  1. step_id

    String : The Job Step ID.

  2. node_list

    List : Node names where the job will be started. The first member of the list is the parallel master node.

  3. adapter_list

    List of Lists : Adapter information for each node. The members of the list are :

    1. dev_name

      Device name of the adapter to be used, such as css0.

    2. protocol

      Communication protocol this usage supports. Valid values are MPI, LAPI, and MPI_LAPI.

    3. subsystem

      Communication subsystem this usage supports. Valid values are IP or US.

    4. wid

      For US subsystem usages, this indicates which adapter window ID to use. For IP subsystem usages, this field is ignored.

    5. mem

      For US subsystem usages, this is the amount of adapter memory to dedicate to the adapter usage. For IP subsystem usages, this field is ignored.

    Each element in the adapter_list represents one communication channel for a task. If the subsystem is US (User Space), a communication channel will require a switch adapter window. Adapter windows, and User Space usages, must be specified on actual switch adapters that are only accessible if AGGREGATE_ADAPTERS=False is specified in the configuration file.
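
The adapter_list structure described above can be assembled with a small helper that checks the documented values. This is a sketch only: adapter_usage is a hypothetical helper, not part of PyLoadL, and the member order (dev_name, protocol, subsystem, wid, mem) is assumed to follow the order in which the members are listed above.

```python
VALID_PROTOCOLS = ("MPI", "LAPI", "MPI_LAPI")
VALID_SUBSYSTEMS = ("IP", "US")


def adapter_usage(dev_name, protocol, subsystem, wid=0, mem=0):
    """Build one adapter_list element (one communication channel)."""
    if protocol not in VALID_PROTOCOLS:
        raise ValueError("invalid protocol: %r" % protocol)
    if subsystem not in VALID_SUBSYSTEMS:
        raise ValueError("invalid subsystem: %r" % subsystem)
    # wid and mem are only meaningful for US usages; for IP usages the
    # fields are ignored, so the defaults of 0 are placeholders.
    return [dev_name, protocol, subsystem, wid, mem]


# A list of lists, one inner list per communication channel.
adapter_list = [
    adapter_usage("css0", "MPI", "US", wid=1, mem=1024),
    adapter_usage("css0", "MPI", "IP"),
]
```

The window ID and memory values here are placeholders chosen for illustration, not recommended settings.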

ll_terminate_job

Function to instruct the LoadLeveler negotiator to cancel the specified job_step.

  rc = ll_terminate_job( cluster, proc, from_host, msg )

Parameters

  1. cluster

    String : The Job ID.

  2. proc

    String : The job step to be cancelled.

  3. from_host

    String : Name of the schedd host.

  4. msg

    String : Message explaining why the job was cancelled, retrievable via ll_get_data.

ll_modify

Function to modify the attributes of a submitted job step. This interface supports only one job step per call. The underlying API also allows only one job step at present, but it is designed for expansion, therefore this interface may change in the future.

  (rc, errObj) = ll_modify( modify_op, modify_data, job_step )

Parameters

  1. modify_op

    Constant : The modify operation to perform.

    1. EXECUTION_FACTOR

      New execution factor, modify_data input is a numeric.

    2. CONSUMABLE_CPUS

      New consumable cpus value, modify_data input is a numeric.

    3. CONSUMABLE_MEMORY

      New consumable memory in megabytes, modify_data input is a numeric.

    4. WCLIMIT_ADD_MIN

      Additional minutes to add to hard wallclock limit, modify_data input is a numeric.

    5. JOB_CLASS

      New job class, modify_data input is a string.

    6. ACCOUNT_NO

      New account number, modify_data input is a string.

  2. modify_data

    The new data value for modify_op.

  3. job_step

    String : The job step ID.
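
Because the expected type of modify_data depends on modify_op, a caller may want to check the value before calling ll_modify. The following is a minimal sketch; MODIFY_DATA_TYPES and check_modify_data are hypothetical names, and the operations are keyed by constant name rather than by the real PyLoadL constants so that the example runs without the module.

```python
import numbers

# Expected Python type of modify_data for each modify_op, keyed by the
# constant's name, as described in the parameter list above.
MODIFY_DATA_TYPES = {
    "EXECUTION_FACTOR": numbers.Number,
    "CONSUMABLE_CPUS": numbers.Number,
    "CONSUMABLE_MEMORY": numbers.Number,  # megabytes
    "WCLIMIT_ADD_MIN": numbers.Number,    # minutes
    "JOB_CLASS": str,
}


def check_modify_data(op_name, modify_data):
    """Return modify_data if its type suits op_name, else raise TypeError."""
    expected = MODIFY_DATA_TYPES[op_name]
    # bool counts as a Number in Python, so reject it explicitly.
    if isinstance(modify_data, bool) or not isinstance(modify_data, expected):
        raise TypeError("%s expects %s, got %r"
                        % (op_name, expected.__name__, modify_data))
    return modify_data
```

A caller would run check_modify_data("WCLIMIT_ADD_MIN", 30) and then pass the value on to ll_modify with the matching constant.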

ll_run_scheduler

This is used when the internal scheduling interval has been disabled so that an external program can control when the central manager attempts to schedule job steps. The ll_run_scheduler subroutine sends a request to the central manager to run the scheduling algorithm.

  rc = ll_run_scheduler()
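
An external program that controls scheduling would typically call ll_run_scheduler on its own timetable. The following is a sketch of such a driver under stated assumptions: drive_scheduler is a hypothetical helper, ll_run_scheduler is passed in so the example runs without PyLoadL, and a fixed interval is used purely for illustration.

```python
import time


def drive_scheduler(ll_run_scheduler, passes, interval_seconds,
                    sleep=time.sleep):
    """Request `passes` scheduling runs, pausing between them.

    Returns the list of return codes so the caller can inspect them.
    """
    results = []
    for i in range(passes):
        # Ask the central manager to run the scheduling algorithm.
        results.append(ll_run_scheduler())
        if i < passes - 1:
            sleep(interval_seconds)  # no pause after the final pass
    return results
```

The sleep argument is injectable so the loop can be exercised in tests without waiting.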


SEE ALSO

The LoadLeveler page, the DataAccess page, the Error Handling page

IBM LoadLeveler for AIX 5L: Using and Administering