bqueues
displays information about queues
SYNOPSIS
bqueues [-w | -l | -r] [-m host_name | -m host_group | -m cluster_name | -m all] [-u user_name | -u user_group | -u all] [queue_name ...]
bqueues [-h | -V]
DESCRIPTION
Displays information about queues.
By default, returns the following information about all queues: queue name, queue priority, queue status, job slot statistics, and job state statistics.
In MultiCluster, returns the information about all queues in the local cluster.
Batch queue names and characteristics are set up by the LSF administrator (see lsb.queues(5) and mbatchd(8)).
CPU time is normalized.
OPTIONS
-w
Displays queue information in a wide format. Fields are displayed without truncation.
-l
Displays queue information in a long multiline format. The -l option displays the following additional information: queue description, queue characteristics and statistics, scheduling parameters, resource usage limits, scheduling policies, users, hosts, associated commands, dispatch and run windows, and job controls.
Also displays user shares.
If you specified an administrator comment with the -C option of the queue control commands qclose, qopen, qact, and qinact, qhist displays the comment text.
-r
Displays the same information as the -l option. In addition, if fairshare is defined for the queue, displays recursively the share account tree of the fairshare queue.
-m host_name | -m host_group | -m cluster_name | -m all
Displays the queues that can run jobs on the specified host. If the keyword all is specified, displays the queues that can run jobs on all hosts.
If a host group is specified, displays the queues that include that group in their configuration. For a list of host groups see bmgroup(1).
In MultiCluster, if the all keyword is specified, displays the queues that can run jobs on all hosts in the local cluster. If a cluster name is specified, displays all queues in the specified cluster.
-u user_name | -u user_group | -u all
Displays the queues that can accept jobs from the specified user. If the keyword all is specified, displays the queues that can accept jobs from all users.
If a user group is specified, displays the queues that include that group in their configuration. For a list of user groups see bugroup(1).
queue_name ...
Displays information about the specified queues.
-h
Prints command usage to stderr and exits.
-V
Prints LSF release version to stderr and exits.
OUTPUT
Default Output
Displays the following fields:
QUEUE_NAME
The name of the queue. Queues are named to correspond to the type of jobs usually submitted to them, or to the type of services they provide.
lost_and_found
If the LSF administrator removes queues from the system, LSF creates a queue called lost_and_found and places the jobs from the removed queues into the lost_and_found queue. Jobs in the lost_and_found queue will not be started unless they are switched to other queues (see bswitch).
PRIO
The priority of the queue. The larger the value, the higher the priority. If job priority is not configured, determines the queue search order at job dispatch, suspension and resumption time. Jobs from higher priority queues are dispatched first (this is contrary to UNIX process priority ordering), and jobs from lower priority queues are suspended first when hosts are overloaded.
STATUS
The current status of the queue. The possible values are:
Open
The queue is able to accept jobs.
Closed
The queue is not able to accept jobs.
Active
Jobs in the queue may be started.
Inactive
Jobs in the queue cannot be started for the time being.
At any moment, each queue is either Open or Closed, and is either Active or Inactive. The queue can be opened, closed, inactivated and re-activated by the LSF administrator using badmin (see badmin(8)).
Jobs submitted to a queue that is later closed are still dispatched as long as the queue is active. The queue can also become inactive when either its dispatch window is closed or its run window is closed (see DISPATCH_WINDOWS in the "Output for -l Option" section). In this case, the queue cannot be activated using badmin. The queue is re-activated by LSF when one of its dispatch windows and one of its run windows are open again. The initial state of a queue at LSF boot time is set to open, and either active or inactive depending on its windows.
MAX
The maximum number of job slots that can be used by the jobs from the queue. These job slots are used by dispatched jobs which have not yet finished, and by pending jobs which have slots reserved for them.
A sequential job will use one job slot when it is dispatched to a host, while a parallel job will use as many job slots as are required by bsub -n when it is dispatched. See bsub(1) for details. If `-' is displayed, there is no limit.
JL/U
The maximum number of job slots each user can use for jobs in the queue. These job slots are used by your dispatched jobs which have not yet finished, and by pending jobs which have slots reserved for them. If `-' is displayed, there is no limit.
JL/P
The maximum number of job slots a processor can process from the queue. This includes job slots of dispatched jobs that have not yet finished, and job slots reserved for some pending jobs. The job slot limit per processor (JL/P) controls the number of jobs sent to each host. This limit is configured per processor so that multiprocessor hosts are automatically allowed to run more jobs. If `-' is displayed, there is no limit.
JL/H
The maximum number of job slots a host can allocate from this queue. This includes the job slots of dispatched jobs that have not yet finished, and those reserved for some pending jobs. The job slot limit per host (JL/H) controls the number of jobs sent to each host, regardless of whether a host is a uniprocessor host or a multiprocessor host. If `-' is displayed, there is no limit.
NJOBS
The total number of job slots held currently by jobs in the queue. This includes pending, running, suspended and reserved job slots. A parallel job that is running on n processors is counted as n job slots, since it takes n job slots in the queue. See bjobs(1) for an explanation of batch job states.
PEND
The number of job slots used by pending jobs in the queue.
RUN
The number of job slots used by running jobs in the queue.
SUSP
The number of job slots used by suspended jobs in the queue.
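As an illustration of how the default fields fit together, the following Python sketch parses default-format output into dictionaries. It is not part of LSF. The sample text, the queue names, and the Open:Active style of the STATUS column are assumptions made for the example, so check them against the output of your own cluster (running bqueues -w avoids truncated fields).

```python
# Column names as listed in the "Default Output" section.
FIELDS = ["QUEUE_NAME", "PRIO", "STATUS", "MAX", "JL/U", "JL/P", "JL/H",
          "NJOBS", "PEND", "RUN", "SUSP"]
LIMIT_FIELDS = {"MAX", "JL/U", "JL/P", "JL/H"}
COUNT_FIELDS = {"PRIO", "NJOBS", "PEND", "RUN", "SUSP"}


def parse_bqueues(text):
    """Parse default-format bqueues output into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and lines with an unexpected shape.
        if len(parts) != len(FIELDS) or parts[0] == "QUEUE_NAME":
            continue
        row = dict(zip(FIELDS, parts))
        for name in LIMIT_FIELDS:
            # "-" means there is no limit; represent that as None.
            row[name] = None if row[name] == "-" else int(row[name])
        for name in COUNT_FIELDS:
            row[name] = int(row[name])
        # Assumed layout: the two status values are joined by a colon.
        admission, _, dispatch = row["STATUS"].partition(":")
        row["accepts_jobs"] = admission == "Open"
        row["dispatches_jobs"] = dispatch == "Active"
        rows.append(row)
    return rows


# Hypothetical sample output, invented for this example.
SAMPLE = """\
QUEUE_NAME      PRIO STATUS          MAX JL/U JL/P JL/H NJOBS  PEND   RUN  SUSP
priority         43  Open:Active       -    -    -    -     6     4     2     0
night            30  Open:Inactive    20    5    -    -     3     3     0     0
"""

for q in parse_bqueues(SAMPLE):
    print(q["QUEUE_NAME"], q["MAX"], q["accepts_jobs"], q["dispatches_jobs"])
```

Representing `-' as None keeps "no limit" distinct from a configured limit of zero.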
Output for -l Option
In addition to the above fields, the -l option displays the following:
Description
A description of the typical use of the queue.
Default queue indication
Indicates that this is the default queue.
PARAMETERS/STATISTICS
NICE
The nice value at which jobs in the queue will be run. This is the UNIX nice value for reducing the process priority (see nice(1)).
STATUS
Inactive
The long format for the -l option gives the possible reasons for a queue to be inactive:
Inact_Win
The queue is out of its dispatch window or its run window.
Inact_Adm
The queue has been inactivated by the LSF administrator.
SSUSP
The number of job slots in the queue allocated to jobs that are suspended by LSF because of load levels or run windows.
USUSP
The number of job slots in the queue allocated to jobs that are suspended by the job submitter or by the LSF administrator.
RSV
The number of job slots in the queue that are reserved by LSF for pending jobs.
Migration threshold
The length of time in seconds that a job dispatched from the queue will remain suspended by the system before LSF attempts to migrate the job to another host. See the MIG parameter in lsb.queues and lsb.hosts.
Schedule delay for a new job
The delay time in seconds for scheduling after a new job is submitted. If the schedule delay time is zero, a new scheduling session is started as soon as the job is submitted to the queue. See the NEW_JOB_SCHED_DELAY parameter in lsb.queues.
Interval for a host to accept two jobs
The length of time in seconds to wait after dispatching a job to a host before dispatching a second job to the same host. If the job accept interval is zero, a host may accept more than one job in each dispatching interval. See the JOB_ACCEPT_INTERVAL parameter in lsb.queues and lsb.params.
RESOURCE LIMITS
The hard resource usage limits that are imposed on the jobs in the queue (see getrlimit(2) and lsb.queues(5)). These limits are imposed on a per-job and a per-process basis.
The possible per-job limits are:
CPULIMIT
The maximum CPU time a job can use, in minutes, relative to the CPU factor of the named host. CPULIMIT is scaled by the CPU factor of the execution host so that jobs are allowed more time on slower hosts.
When the job-level CPULIMIT is reached, a SIGXCPU signal is sent to all processes belonging to the job. If the job has no signal handler for SIGXCPU, the job is killed immediately. If the SIGXCPU signal is handled, blocked, or ignored by the application, then after the grace period expires, LSF sends SIGINT, SIGTERM, and SIGKILL to the job to kill it.
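The scaling described for CPULIMIT can be sketched in Python as follows. The formula (limit multiplied by the CPU factor of the named host and divided by the CPU factor of the execution host) is an assumption chosen to match the statement that slower hosts are allowed more time. The function name and the CPU factors are hypothetical.

```python
def effective_cpu_limit(cpulimit_minutes, named_host_factor, exec_host_factor):
    """Return the CPU time limit, in minutes, as applied on the execution host.

    Assumed formula: the configured limit is expressed relative to the
    named host, then rescaled by the ratio of the two CPU factors.
    """
    if named_host_factor <= 0 or exec_host_factor <= 0:
        raise ValueError("CPU factors must be positive")
    return cpulimit_minutes * named_host_factor / exec_host_factor


# A 60-minute limit defined against a host with CPU factor 2.0:
print(effective_cpu_limit(60, 2.0, 1.0))  # slower execution host -> 120.0
print(effective_cpu_limit(60, 2.0, 4.0))  # faster execution host -> 30.0
```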
PROCLIMIT
The maximum number of processors allocated to a job. Jobs that request fewer slots than the minimum PROCLIMIT or more slots than the maximum PROCLIMIT are rejected. If the job requests minimum and maximum job slots, the maximum slots requested cannot be less than the minimum PROCLIMIT, and the minimum slots requested cannot be more than the maximum PROCLIMIT.
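The PROCLIMIT acceptance rules can be expressed as a small check. This Python sketch only restates the rules in the paragraph above; the function and parameter names are hypothetical and it does not call LSF.

```python
def proclimit_accepts(limit_min, limit_max, req_min, req_max=None):
    """Return True if a slot request passes the PROCLIMIT rules.

    limit_min and limit_max are the queue's minimum and maximum PROCLIMIT.
    A job asks either for a fixed slot count (req_min only) or for a
    range of slots (req_min and req_max).
    """
    if req_max is None:
        # Fixed request: rejected if below the minimum or above the maximum.
        return limit_min <= req_min <= limit_max
    if req_min > req_max:
        raise ValueError("requested minimum exceeds requested maximum")
    # Range request: the requested maximum cannot be less than the minimum
    # PROCLIMIT, and the requested minimum cannot exceed the maximum PROCLIMIT.
    return req_max >= limit_min and req_min <= limit_max


print(proclimit_accepts(2, 8, 4))       # fixed request inside the limits
print(proclimit_accepts(2, 8, 1, 4))    # range overlapping the limits
print(proclimit_accepts(2, 8, 9, 12))   # range entirely above the maximum
```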
MEMLIMIT
The maximum resident set size (RSS) of a process, in KB. If a process uses more than MEMLIMIT kilobytes of memory, its priority is reduced so that other processes are more likely to be paged in to available memory. This limit is enforced by the setrlimit system call if it supports the RLIMIT_RSS option.
SWAPLIMIT
The swap space limit that a job may use. If SWAPLIMIT is reached, the system sends the following signals in sequence to all processes in the job: SIGINT, SIGTERM, and SIGKILL.
PROCESSLIMIT
The maximum number of concurrent processes allocated to a job. If PROCESSLIMIT is reached, the system sends the following signals in sequence to all processes belonging to the job: SIGINT, SIGTERM, and SIGKILL.
THREADLIMIT
The maximum number of concurrent threads allocated to a job. If THREADLIMIT is reached, the system sends the following signals in sequence to all processes belonging to the job: SIGINT, SIGTERM, and SIGKILL.
The possible UNIX per-process resource limits are:
RUNLIMIT
The maximum wall clock time a process can use, in minutes. RUNLIMIT is scaled by the CPU factor of the execution host. When a job has been in the RUN state for a total of RUNLIMIT minutes, LSF sends a SIGUSR2 signal to the job. If the job does not exit within 10 minutes, LSF sends a SIGKILL signal to kill the job.
FILELIMIT
The maximum file size a process can create, in kilobytes. This limit is enforced by the UNIX setrlimit system call if it supports the RLIMIT_FSIZE option, or the ulimit system call if it supports the UL_SETFSIZE option.
DATALIMIT
The maximum size of the data segment of a process, in kilobytes. This restricts the amount of memory a process can allocate. DATALIMIT is enforced by the setrlimit system call if it supports the RLIMIT_DATA option, and unsupported otherwise.
STACKLIMIT
The maximum size of the stack segment of a process, in kilobytes. This restricts the amount of memory a process can use for local variables or recursive function calls. STACKLIMIT is enforced by the setrlimit system call if it supports the RLIMIT_STACK option.
CORELIMIT
The maximum size of a core file, in KB. This limit is enforced by the setrlimit system call if it supports the RLIMIT_CORE option.
If a job submitted to the queue has any of these limits specified (see bsub(1)), then the lower of the corresponding job limit and queue limit is used for the job.
If no resource limit is specified, the resource is assumed to be unlimited.
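The rule for combining job-level and queue-level limits can be sketched as below. This is an illustrative Python helper, not LSF code; the name effective_limit and the use of None to mean "not specified" are assumptions made for the example.

```python
def effective_limit(job_limit, queue_limit):
    """Combine a job-level and a queue-level resource limit.

    None means the limit was not specified. The lower of the specified
    values applies; if neither is specified, the resource is unlimited,
    which is also returned as None.
    """
    specified = [v for v in (job_limit, queue_limit) if v is not None]
    return min(specified) if specified else None


print(effective_limit(30, 60))      # job limit is lower
print(effective_limit(None, 60))    # only the queue limit is set
print(effective_limit(None, None))  # unlimited
```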
SCHEDULING PARAMETERS
The scheduling and suspending thresholds for the queue.
The scheduling threshold loadSched and the suspending threshold loadStop are used to control batch job dispatch, suspension, and resumption. The queue thresholds are used in combination with the thresholds defined for hosts (see bhosts(1) and lsb.hosts(5)). If both queue level and host level thresholds are configured, the most restrictive thresholds are applied.
The loadSched and loadStop thresholds have the following fields:
r15s
The 15-second exponentially averaged effective CPU run queue length.
r1m
The 1-minute exponentially averaged effective CPU run queue length.
r15m
The 15-minute exponentially averaged effective CPU run queue length.
ut
The CPU utilization exponentially averaged over the last minute, expressed as a fraction between 0 and 1.
pg
The memory paging rate exponentially averaged over the last minute, in pages per second.
io
The disk I/O rate exponentially averaged over the last minute, in kilobytes per second.
ls
The number of current login users.
it
On UNIX, the idle time of the host (keyboard not touched on all logged in sessions), in minutes.
On Windows, the it index is based on the time a screen saver has been active on a particular host.
tmp
The amount of free space in /tmp, in megabytes.
swp
The amount of currently available swap space, in megabytes.
mem
The amount of currently available memory, in megabytes.
In addition to these internal indices, external indices are also displayed if they are defined in lsb.queues (see lsb.queues(5)).
The loadSched threshold values specify the job dispatching thresholds for the corresponding load indices. If `-' is displayed as the value, it means the threshold is not applicable. Jobs in the queue may be dispatched to a host if the values of all the load indices of the host are within (below or above, depending on the meaning of the load index) the corresponding thresholds of the queue and the host. The same conditions are used to resume jobs dispatched from the queue that have been suspended on this host.
Similarly, the loadStop threshold values specify the thresholds for job suspension. If any of the load index values on a host go beyond the corresponding threshold of the queue, jobs in the queue will be suspended.
JOB EXCEPTION PARAMETERS
Configured job exception thresholds and number of jobs in each exception state for the queue.
Threshold and NumOfJobs have the following fields:
overrun
Configured threshold in minutes for overrun jobs, and the number of jobs in the queue that have triggered an overrun job exception by running longer than the overrun threshold.
underrun
Configured threshold in minutes for underrun jobs, and the number of jobs in the queue that have triggered an underrun job exception by finishing sooner than the underrun threshold.
idle
Configured threshold (CPU time/runtime) for idle jobs, and the number of jobs in the queue that have triggered an idle job exception by having a job idle factor less than the threshold.
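The idle factor is the ratio of CPU time to run time, and a job counts as idle when that ratio falls below the configured threshold. The Python sketch below illustrates the arithmetic only. The function names and the sample figures are hypothetical, and both times are assumed to be in the same unit.

```python
def idle_factor(cpu_time, run_time):
    """Return CPU time divided by run time (both in the same unit)."""
    if run_time <= 0:
        raise ValueError("run_time must be positive")
    return cpu_time / run_time


def is_idle(cpu_time, run_time, threshold):
    """Return True if the job's idle factor is less than the threshold."""
    return idle_factor(cpu_time, run_time) < threshold


# A job that used 30 seconds of CPU in 600 seconds of run time,
# checked against a hypothetical idle threshold of 0.1:
print(idle_factor(30, 600))
print(is_idle(30, 600, 0.1))
print(is_idle(540, 600, 0.1))
```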
SCHEDULING POLICIES
Scheduling policies of the queue. Optionally, one or more of the following policies may be configured:
FAIRSHARE
Queue-level fairshare scheduling is enabled. Jobs in this queue are scheduled based on a fairshare policy instead of the first-come, first-serve (FCFS) policy.
BACKFILL
A job in a backfill queue can use the slots reserved by other jobs if the job can run to completion before the slot-reserving jobs start.
Backfilling does not occur on queue limits and user limit but only on host based limits. That is, backfilling is only supported when MXJ, JL/U, JL/P, PJOB_LIMIT, and HJOB_LIMIT are reached. Backfilling is not supported when MAX_JOBS, QJOB_LIMIT, and UJOB_LIMIT are reached.
IGNORE_DEADLINE
If IGNORE_DEADLINE is set to Y, starts all jobs regardless of the run limit.
EXCLUSIVE
Jobs dispatched from an exclusive queue can run exclusively on a host if the user so specifies at job submission time (see bsub(1)). Exclusive execution means that the job is sent to a host with no other batch job running there, and no further job, batch or interactive, will be dispatched to that host while the job is running. The default is not to allow exclusive jobs.
NO_INTERACTIVE
This queue does not accept batch interactive jobs (see the -I, -Is, and -Ip options of bsub(1)). The default is to accept both interactive and non-interactive jobs.
ONLY_INTERACTIVE
This queue only accepts batch interactive jobs. Jobs must be submitted using the -I, -Is, or -Ip option of bsub(1). The default is to accept both interactive and non-interactive jobs.
FAIRSHARE_QUEUES
Lists queues participating in cross-queue fairshare. The first queue listed is the master queue--the queue in which fairshare is configured; all other queues listed inherit the fairshare policy from the master queue. Fairshare information applies to all the jobs running in all the queues in the master-slave set.
DISPATCH_ORDER
DISPATCH_ORDER=QUEUE is set in the master queue. Jobs from this queue are dispatched according to the order of queue priorities first, then user fairshare priority. Within the queue, dispatch order is based on user share quota. This avoids having users with higher fairshare priority getting jobs dispatched from low-priority queues.
USER_SHARES
A list of [user_name, share] pairs. user_name is either a user name or a user group name. share is the number of shares of resources assigned to the user or user group. A party will get a portion of the resources proportional to that party's share divided by the sum of the shares of all parties specified in this queue.
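The proportion each party receives follows directly from the share values: a party's share divided by the sum of all shares in the queue. The Python sketch below works through that arithmetic. The user and group names and the share values are hypothetical.

```python
def share_fractions(user_shares):
    """Map each party to its fraction of the queue's resources.

    user_shares is a list of (user_name, share) pairs, where user_name
    is a user name or a user group name.
    """
    total = sum(share for _, share in user_shares)
    if total <= 0:
        raise ValueError("the sum of shares must be positive")
    return {name: share / total for name, share in user_shares}


# Hypothetical USER_SHARES: [alice, 3] [bob, 1] [eng_group, 4]
fractions = share_fractions([("alice", 3), ("bob", 1), ("eng_group", 4)])
for name, fraction in fractions.items():
    print(name, fraction)
```

With these values the total is 8 shares, so alice receives 3/8, bob 1/8, and eng_group 4/8 of the resources.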
DEFAULT HOST SPECIFICATION
The default host or host model that will be used to normalize the CPU time limit of all jobs.
If you want to view a list of the CPU factors defined for the hosts in your cluster, see lsinfo(1). The CPU factors are configured in lsf.shared(5).
The appropriate CPU scaling factor of the host or host model is used to adjust the actual CPU time limit at the execution host (see CPULIMIT in lsb.queues(5)). The DEFAULT_HOST_SPEC parameter in lsb.queues overrides the system DEFAULT_HOST_SPEC parameter in lsb.params (see lsb.params(5)). If a user explicitly gives a host specification when submitting a job using bsub -c cpu_limit[/host_name | /host_model], the user specification overrides the values defined in both lsb.params and lsb.queues.
RUN_WINDOWS
The time windows in a week during which jobs in the queue may run.
When a queue is out of its window or windows, no job in this queue will be dispatched. In addition, when the end of a run window is reached, any running jobs from this queue are suspended until the beginning of the next run window, when they are resumed. The default is no restriction, or always open.
DISPATCH_WINDOWS
Dispatch windows are the time windows in a week during which jobs in the queue may be dispatched.
When a queue is out of its dispatch window or windows, no job in this queue will be dispatched. Jobs already dispatched are not affected by the dispatch windows. The default is no restriction, or always open (that is, twenty-four hours a day, seven days a week). Note that such windows are only applicable to batch jobs. Interactive jobs scheduled by LIM are controlled by another set of dispatch windows (see lshosts(1)). Similar dispatch windows may be configured for individual hosts (see bhosts(1)).
A window is displayed in the format begin_time-end_time. Time is specified in the format [day:]hour[:minute], where all fields are numbers in their respective legal ranges: 0(Sunday)-6 for day, 0-23 for hour, and 0-59 for minute. The default value for minute is 0 (on the hour). The default value for day is every day of the week. The begin_time and end_time of a window are separated by `-', with no blank characters (SPACE and TAB) in between. Both begin_time and end_time must be present for a window. Windows are separated by blank characters.
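The window format can be parsed with a few lines of Python, as sketched below. This is an illustration rather than LSF code. It assumes that a two-field time such as 8:30 means hour:minute rather than day:hour, which the format description leaves open, and it represents "every day of the week" as None.

```python
def parse_time(spec):
    """Parse [day:]hour[:minute] into a (day, hour, minute) tuple."""
    parts = [int(p) for p in spec.split(":")]
    if len(parts) == 1:
        day, hour, minute = None, parts[0], 0      # minute defaults to 0
    elif len(parts) == 2:
        # Assumption: two fields are read as hour:minute.
        day, hour, minute = None, parts[0], parts[1]
    elif len(parts) == 3:
        day, hour, minute = parts
    else:
        raise ValueError("too many fields in time: %r" % spec)
    if day is not None and not 0 <= day <= 6:
        raise ValueError("day must be 0 (Sunday) to 6: %r" % spec)
    if not 0 <= hour <= 23:
        raise ValueError("hour must be 0 to 23: %r" % spec)
    if not 0 <= minute <= 59:
        raise ValueError("minute must be 0 to 59: %r" % spec)
    return (day, hour, minute)


def parse_windows(text):
    """Parse blank-separated begin_time-end_time windows."""
    windows = []
    for item in text.split():
        begin, sep, end = item.partition("-")
        # Both begin_time and end_time must be present.
        if not sep or not begin or not end:
            raise ValueError("window needs begin_time-end_time: %r" % item)
        windows.append((parse_time(begin), parse_time(end)))
    return windows


# Two hypothetical windows: nightly from 20:00 to 8:30, and
# from Friday 18:00 to Monday 8:30.
for window in parse_windows("20:00-8:30 5:18:00-1:8:30"):
    print(window)
```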
USERS
A list of users allowed to submit jobs to this queue. LSF cluster administrators can submit jobs to the queue even if they are not listed here.
User group names have a slash (/) added at the end of the group name. See bugroup(1).
If the fairshare scheduling policy is enabled, users cannot submit jobs to the queue unless they also have a share assignment. This also applies to LSF administrators.
HOSTS
A list of hosts where jobs in the queue can be dispatched.
Host group names have a slash (/) added at the end of the group name. See bmgroup(1).
NQS DESTINATION QUEUES
A list of NQS destination queues to which this queue can dispatch jobs.
When you submit a job using bsub -q queue_name, and the specified queue is configured to forward jobs to the NQS system, LSF routes your job to one of the NQS destination queues. The job runs on an NQS batch server host, which is not a member of the LSF cluster. Although running on an NQS system outside the LSF cluster, the job is still managed by LSF in almost the same way as jobs running inside the LSF cluster. Thus, you may have your batch jobs transparently sent to an NQS system to run and then get the results of your jobs back. You may use any supported user interface, including LSF commands and NQS commands (see lsnqs(1)) to submit, monitor, signal and delete your batch jobs that are running in an NQS system. See lsb.queues(5) and bsub(1) for more information.
ADMINISTRATORS
A list of queue administrators. The users whose names are specified here are allowed to operate on the jobs in the queue and on the queue itself. See lsb.queues(5) for more information.
PRE_EXEC
The queue's pre-execution command. The pre-execution command is executed before each job in the queue is run on the execution host (or on the first host selected for a parallel batch job). See lsb.queues(5) for more information.
POST_EXEC
The queue's post-execution command. The post-execution command is run on the execution host when a job terminates. See lsb.queues(5) for more information.
REQUEUE_EXIT_VALUES
Jobs that exit with these values are automatically requeued. See lsb.queues(5) for more information.
RES_REQ
Resource requirements of the queue. Only the hosts that satisfy these resource requirements can be used by the queue.
Maximum slot reservation time
The maximum time in seconds a slot is reserved for a pending job in the queue. See the SLOT_RESERVE=MAX_RESERVE_TIME[n] parameter in lsb.queues.
RESUME_COND
The conditions that must be satisfied to resume a suspended job on a host. See lsb.queues(5) for more information.
STOP_COND
The conditions which determine whether a job running on a host should be suspended. See lsb.queues(5) for more information.
JOB_STARTER
An executable file that runs immediately prior to the batch job, taking the batch job file as an input argument. All jobs submitted to the queue are run via the job starter, which is generally used to create a specific execution environment before processing the jobs themselves. See lsb.queues(5) for more information.
CHUNK_JOB_SIZE
Chunk jobs only. Specifies the maximum number of jobs allowed to be dispatched together in a chunk job. All of the jobs in the chunk are scheduled and dispatched as a unit rather than individually. The ideal candidates for job chunking are jobs that typically take 1 to 2 minutes to run.
SEND_JOBS_TO
MultiCluster. List of remote queue names to which the queue forwards jobs.
RECEIVE_JOBS_FROM
MultiCluster. List of remote cluster names from which the queue receives jobs.
PREEMPTION
PREEMPTIVE
The queue is preemptive. Jobs in a preemptive queue may preempt running jobs from lower-priority queues, even if the lower-priority queues are not specified as preemptive.
PREEMPTABLE
The queue is preemptable. Running jobs in a preemptable queue may be preempted by jobs in higher-priority queues, even if the higher-priority queues are not specified as preemptive.
RERUNNABLE
If the RERUNNABLE field displays yes, jobs in the queue are rerunnable. That is, jobs in the queue are automatically restarted or rerun if the execution host becomes unavailable. However, a job in the queue will not be restarted if you have removed the rerunnable option from the job. See lsb.queues(5) for more information.
CHECKPOINT
If the CHKPNTDIR field is displayed, jobs in the queue are checkpointable. Jobs will use the default checkpoint directory and period unless you specify other values. Note that a job in the queue will not be checkpointed if you have removed the checkpoint option from the job. See lsb.queues(5) for more information.
CHKPNTDIR
Specifies the checkpoint directory using an absolute or relative path name.
CHKPNTPERIOD
Specifies the checkpoint period in seconds.
Although the output of bqueues reports the checkpoint period in seconds, the checkpoint period is defined in minutes (the checkpoint period is defined through the bsub -k "checkpoint_dir [checkpoint_period]" option, or in lsb.queues).
JOB CONTROLS
The configured actions for job control. See the JOB_CONTROLS parameter in lsb.queues.
The configured actions are displayed in the format [action_type, command] where action_type is either SUSPEND, RESUME, or TERMINATE.
ADMIN ACTION COMMENT
If the LSF administrator specified an administrator comment with the -C option of the queue control commands qclose, qopen, qact, and qinact, qhist the comment text is displayed.
SLOT_SHARE
Share of job slots for queue-based fairshare. Represents the percentage of running jobs (job slots) in use from the queue. SLOT_SHARE must be greater than zero.
The sum of SLOT_SHARE for all queues in the pool does not need to be 100%. It can be more or less, depending on your needs.
SLOT_POOL
Name of the pool of job slots the queue belongs to for queue-based fairshare. A queue can only belong to one pool. All queues in the pool must share the same set of hosts.
Output for -r option
In addition to the fields displayed for the -l option, the -r option displays the following:
SCHEDULING POLICIES
FAIRSHARE
The -r option causes bqueues to recursively display the entire share information tree associated with the queue.
SEE ALSO
bugroup(1), nice(1), getrlimit(2), lsb.queues(5), bsub(1), bjobs(1), bhosts(1), badmin(8), mbatchd(8)
Date Modified: February 24, 2004
Platform Computing: www.platform.com
Platform Support: support@platform.com
Platform Information Development: doc@platform.com
Copyright © 1994-2004 Platform Computing Corporation. All rights reserved.