



lsb.params


The lsb.params file defines general parameters used by the LSF system. This file contains only one section, named Parameters. mbatchd uses lsb.params for initialization. The file is optional. If not present, the LSF-defined defaults are assumed.

Some of the parameters that can be defined in lsb.params control timing within the system. The default settings provide good throughput for long-running batch jobs while adding a minimum of processing overhead in the batch daemons.

This file is installed by default in LSB_CONFDIR/cluster_name/configdir.


Parameters Section

This section and all the keywords in this section are optional. If keywords are not present, the default values are assumed. The valid keywords for this section are:

ABS_RUNLIMIT

Syntax

ABS_RUNLIMIT = y | Y

Description

If set, the run time limit specified by the -W option of bsub, or the RUNLIMIT queue parameter in lsb.queues is not normalized by the host CPU factor. Absolute wall-clock run time is used for all jobs submitted with a run limit.

Default

Undefined. Run limit is normalized.
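
The effect of ABS_RUNLIMIT can be sketched as follows. This is a minimal illustration, not the LSF implementation: it assumes the normalization rule in which the limit is scaled by the ratio of the normalization host's CPU factor to the execution host's CPU factor. The function name and the sample CPU factors are hypothetical.

```python
def effective_run_limit(run_limit_min, norm_cpu_factor, exec_cpu_factor,
                        abs_runlimit=False):
    """Return the wall-clock run limit (minutes) applied on the execution host.

    Assumption: a normalized limit is scaled by
    norm_cpu_factor / exec_cpu_factor.
    """
    if abs_runlimit:
        # ABS_RUNLIMIT=Y: the limit is taken as absolute wall-clock time.
        return float(run_limit_min)
    return run_limit_min * norm_cpu_factor / exec_cpu_factor


# A 60-minute limit on an execution host twice as fast as the
# normalization host, with and without ABS_RUNLIMIT.
print(effective_run_limit(60, 1.0, 2.0))
print(effective_run_limit(60, 1.0, 2.0, abs_runlimit=True))
```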

ACCT_ARCHIVE_AGE

Syntax

ACCT_ARCHIVE_AGE = days

Description

Enables automatic archiving of LSF accounting log files, and specifies the archive interval. LSF archives the current log file if the length of time from its creation date exceeds the specified number of days.

Default

Undefined (no limit to the age of lsb.acct).

ACCT_ARCHIVE_SIZE

Syntax

ACCT_ARCHIVE_SIZE = kilobytes

Description

Enables automatic archiving of LSF accounting log files, and specifies the archive threshold. LSF archives the current log file if its size exceeds the specified number of kilobytes.

Default

Undefined (no limit to the size of lsb.acct).
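
The age and size thresholds described for ACCT_ARCHIVE_AGE and ACCT_ARCHIVE_SIZE can be sketched as a single check. This is an illustration only: the function name is hypothetical, and it assumes that exceeding either threshold alone is enough to trigger archiving.

```python
def should_archive(age_days, size_kb, archive_age=None, archive_size=None):
    """Return True if the current lsb.acct should be archived.

    archive_age  -- ACCT_ARCHIVE_AGE in days, or None if undefined
    archive_size -- ACCT_ARCHIVE_SIZE in kilobytes, or None if undefined
    """
    # An undefined parameter imposes no limit.
    too_old = archive_age is not None and age_days > archive_age
    too_big = archive_size is not None and size_kb > archive_size
    return too_old or too_big


print(should_archive(age_days=8, size_kb=100, archive_age=7))
```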

ACCT_ARCHIVE_TIME

Syntax

ACCT_ARCHIVE_TIME = hh:mm

Description

Enables automatic archiving of LSF accounting log file lsb.acct, and specifies the time of day to archive the current log file.

Default

Undefined (no time set for archiving lsb.acct).

CHUNK_JOB_DURATION

Syntax

CHUNK_JOB_DURATION = minutes

Description

Specifies a CPU limit or run limit for jobs submitted to a chunk job queue to be chunked.

When CHUNK_JOB_DURATION is set, the CPU limit or run limit set in the queue (CPULIMIT or RUNLIMIT) or specified at job submission (-c or -W bsub options) must be less than or equal to CHUNK_JOB_DURATION for jobs to be chunked.

If CHUNK_JOB_DURATION is set, jobs whose CPU limit or run limit is greater than CHUNK_JOB_DURATION are not chunked.

If CHUNK_JOB_DURATION is not set, chunk jobs are accepted regardless of the value of CPULIMIT or RUNLIMIT.

The value of CHUNK_JOB_DURATION is displayed by bparams -l.

Default

Undefined

CLEAN_PERIOD

Syntax

CLEAN_PERIOD = seconds

Description

For non-repetitive jobs, the amount of time that the records of jobs that have finished or have been killed are kept in mbatchd core memory.

While a finished job's record is still in memory, users can see the job with the bjobs command.

For jobs that finished more than CLEAN_PERIOD seconds ago, use the bhist command.

Default

3600 (1 hour)
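
The choice between bjobs and bhist can be sketched as a simple comparison against CLEAN_PERIOD. The function name is hypothetical and the sketch only restates the rule given above.

```python
def query_command(seconds_since_finish, clean_period=3600):
    """Pick a command that can still show a finished job."""
    # Job records stay in mbatchd memory for CLEAN_PERIOD seconds
    # after the job finishes; after that, only bhist can show them.
    if seconds_since_finish > clean_period:
        return "bhist"
    return "bjobs"


print(query_command(600))
print(query_command(7200))
```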

CPU_TIME_FACTOR

Syntax

CPU_TIME_FACTOR = number

Description

Used only with fairshare scheduling. CPU time weighting factor.

In the calculation of a user's dynamic share priority, this factor determines the relative importance of the cumulative CPU time used by a user's jobs.

Default

0.7

COMMITTED_RUN_TIME_FACTOR

Syntax

COMMITTED_RUN_TIME_FACTOR = number

Description

Used only with fairshare scheduling. Committed run time weighting factor.

In the calculation of a user's dynamic priority, this factor determines the relative importance of the committed run time. If the -W option of bsub is not specified at job submission and a RUNLIMIT has not been set for the queue, the committed run time is not considered.

Valid Values

Any positive number between 0.0 and 1.0

Default

0.0

DEFAULT_HOST_SPEC

Syntax

DEFAULT_HOST_SPEC = host_name | host_model

Description

The default CPU time normalization host for the cluster.

The CPU factor of the specified host or host model will be used to normalize the CPU time limit of all jobs in the cluster, unless the CPU time normalization host is specified at the queue or job level.

Default

Undefined

DEFAULT_PROJECT

Syntax

DEFAULT_PROJECT = project_name

Description

The name of the default project. Specify any string.

When you submit a job without specifying any project name, and the environment variable LSB_DEFAULTPROJECT is not set, LSF automatically assigns the job to this project.

Default

default

DEFAULT_QUEUE

Syntax

DEFAULT_QUEUE = queue_name ...

Description

Space-separated list of candidate default queues (candidates must already be defined in lsb.queues).

When you submit a job to LSF without explicitly specifying a queue, and the environment variable LSB_DEFAULTQUEUE is not set, LSF puts the job in the first queue in this list that satisfies the job's specifications subject to other restrictions, such as requested hosts, queue status, etc.

Default

Undefined. When a user submits a job to LSF without explicitly specifying a queue, and there are no candidate default queues defined (by this parameter or by the user's environment variable LSB_DEFAULTQUEUE), LSF automatically creates a new queue named default, using the default configuration, and submits the job to that queue.
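
The selection of a default queue can be sketched as follows. This is a simplified illustration under stated assumptions: the function name and the job_fits predicate are hypothetical, LSB_DEFAULTQUEUE is treated as a space-separated candidate list that takes precedence over DEFAULT_QUEUE, and the case where no candidate fits is left open.

```python
def pick_default_queue(env_queues, param_queues, job_fits):
    """Choose a queue for a job submitted without an explicit queue.

    env_queues   -- value of LSB_DEFAULTQUEUE, or None if unset
    param_queues -- value of DEFAULT_QUEUE in lsb.params, or None if undefined
    job_fits     -- hypothetical predicate: does this queue satisfy the job?
    """
    candidates = (env_queues or param_queues or "").split()
    if not candidates:
        # No candidate default queues: the job goes to a queue named "default".
        return "default"
    for queue in candidates:
        # The first candidate that satisfies the job's specifications wins.
        if job_fits(queue):
            return queue
    return None  # no candidate fits; not covered by this sketch


print(pick_default_queue(None, "normal short", lambda q: q == "short"))
```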

DISABLE_UACCT_MAP

Syntax

DISABLE_UACCT_MAP = y | Y

Description

Specify y or Y to disable user-level account mapping.

Default

Undefined

EADMIN_TRIGGER_DURATION

Description

Defines how often LSF_SERVERDIR/eadmin is invoked once a job exception is detected. Used in conjunction with job exception handling parameters JOB_OVERRUN and JOB_UNDERRUN in lsb.queues.

Example

EADMIN_TRIGGER_DURATION=20

Default

5 minutes

ENABLE_HIST_RUN_TIME

Syntax

ENABLE_HIST_RUN_TIME = y | Y

Description

Used only with fairshare scheduling. If set, enables the use of historical run time in the calculation of fairshare scheduling priority.

Default

Undefined

ENABLE_USER_RESUME

Syntax

ENABLE_USER_RESUME = Y | N

Description

Defines job resume permissions.

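When this parameter is set to Y, users can resume their own jobs that were suspended by the administrator. When it is set to N, as when it is undefined, they cannot.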

Default

Undefined (users cannot resume jobs suspended by administrator)

EVENT_UPDATE_INTERVAL

Syntax

EVENT_UPDATE_INTERVAL = seconds

Description

Used with duplicate logging of event and accounting log files. LSB_LOCALDIR in lsf.conf must also be specified. Specifies how often to back up the data and synchronize the directories (LSB_SHAREDIR and LSB_LOCALDIR).

The directories are always synchronized when data is logged to the files, or when mbatchd is started on the first LSF master host.

Use this parameter if NFS traffic is too high and you want to reduce network traffic.

Valid Values

1 to INFINIT_INT

INFINIT_INT is defined in lsf.h

Default

Undefined

See also

See lsf.conf under LSB_LOCALDIR.

HIST_HOURS

Syntax

HIST_HOURS = hours

Description

Used only with fairshare scheduling. Determines a rate of decay for cumulative CPU time and historical run time.

To calculate dynamic user priority, LSF scales the actual CPU time using a decay factor, so that 1 hour of recently-used time is equivalent to 0.1 hours after the specified number of hours has elapsed.

To calculate dynamic user priority with historical run time, LSF scales the accumulated run time of finished jobs using the same decay factor, so that 1 hour of recently-used time is equivalent to 0.1 hours after the specified number of hours has elapsed.

When HIST_HOURS=0, CPU time accumulated by running jobs is not decayed.

Default

5
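
The decay described above can be sketched numerically. The text fixes only two points (a factor of 1 for time just used and 0.1 after HIST_HOURS hours); the exponential curve between them is an assumption, and the function name is hypothetical.

```python
def decayed_hours(hours_used, elapsed_hours, hist_hours=5):
    """Scale used time so that 1 hour counts as 0.1 hours after hist_hours."""
    if hist_hours == 0:
        # HIST_HOURS=0: accumulated time is not decayed.
        return hours_used
    # Assumed exponential decay through the two documented points.
    return hours_used * 0.1 ** (elapsed_hours / hist_hours)


for elapsed in (0, 5, 10):
    print(elapsed, decayed_hours(1.0, elapsed))
```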

JOB_ACCEPT_INTERVAL

Syntax

JOB_ACCEPT_INTERVAL = integer

Description

The number you specify is multiplied by the value of lsb.params MBD_SLEEP_TIME (60 seconds by default). The result of the calculation is the number of seconds to wait after dispatching a job to a host, before dispatching a second job to the same host.

If 0 (zero), a host may accept more than one job. By default, there is no limit to the total number of jobs that can run on a host, so if this parameter is set to 0, a very large number of jobs might be dispatched to a host all at once. This can overload your system to the point that it will be unable to create any more processes. It is not recommended to set this parameter to 0.

JOB_ACCEPT_INTERVAL set at the queue level (lsb.queues) overrides JOB_ACCEPT_INTERVAL set at the cluster level (lsb.params).

Default

1
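
The wait time between dispatches to the same host follows directly from the multiplication described above. A minimal sketch, with a hypothetical function name:

```python
def accept_wait_seconds(cluster_interval=1, mbd_sleep_time=60,
                        queue_interval=None):
    """Seconds to wait before dispatching a second job to the same host."""
    # A queue-level JOB_ACCEPT_INTERVAL overrides the cluster-level value.
    interval = cluster_interval if queue_interval is None else queue_interval
    return interval * mbd_sleep_time


print(accept_wait_seconds())                  # defaults: 1 * 60
print(accept_wait_seconds(queue_interval=2))  # queue-level override
```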

JOB_ATTA_DIR

Syntax

JOB_ATTA_DIR = directory

Description

The shared directory in which mbatchd saves the attached data of messages posted with the bpost command.

Use JOB_ATTA_DIR if you use bpost(1) and bread(1) to transfer large data files between jobs and want to avoid using space in LSB_SHAREDIR. By default, the bread(1) command reads attachment data from the JOB_ATTA_DIR directory.

JOB_ATTA_DIR should be shared by all hosts in the cluster, so that any potential LSF master host can reach it. Like LSB_SHAREDIR, the directory should be owned and writable by the primary LSF administrator. The directory must have at least 1 MB of free space.

The attached data will be stored under the directory in the format:

JOB_ATTA_DIR/timestamp.jobid.msgs/msg$msgindex

On UNIX, specify an absolute path. For example:

JOB_ATTA_DIR=/opt/share/lsf_work

On Windows, specify a UNC path or a path with a drive letter. For example:

JOB_ATTA_DIR=\\HostA\temp\lsf_work

or

JOB_ATTA_DIR=D:\temp\lsf_work

After adding JOB_ATTA_DIR to lsb.params, use badmin reconfig to reconfigure your cluster.

Valid values

JOB_ATTA_DIR can be any valid UNIX or Windows path up to a maximum length of 256 characters.

Default

Undefined

If JOB_ATTA_DIR is not specified, job message attachments are saved in LSB_SHAREDIR/info/.
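
The storage format JOB_ATTA_DIR/timestamp.jobid.msgs/msg$msgindex can be illustrated by building such a path for a UNIX directory. The function name and the sample values are hypothetical, and the exact form of the timestamp is not specified here, so an integer is used purely as a placeholder.

```python
import posixpath


def attachment_path(job_atta_dir, timestamp, job_id, msg_index):
    """Build the path of one message attachment under JOB_ATTA_DIR."""
    # Directory: timestamp.jobid.msgs, file: msg followed by the message index.
    job_dir = f"{timestamp}.{job_id}.msgs"
    return posixpath.join(job_atta_dir, job_dir, f"msg{msg_index}")


print(attachment_path("/opt/share/lsf_work", 1077600000, 1234, 0))
```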

JOB_DEP_LAST_SUB

Description

Used only with job dependency scheduling.

If set to 1, whenever dependency conditions use a job name that belongs to multiple jobs, LSF evaluates only the most recently submitted job.

Otherwise, all the jobs with the specified name must satisfy the dependency condition.

Default

Undefined

JOB_EXIT_RATE_DURATION

Description

Defines how long LSF waits before checking the job exit rate for a host. Used in conjunction with EXIT_RATE in lsb.hosts for LSF host exception handling.

If the job exit rate is exceeded for the period specified by JOB_EXIT_RATE_DURATION, LSF invokes LSF_SERVERDIR/eadmin to trigger a host exception.

Example

JOB_EXIT_RATE_DURATION=5

Default

10 minutes

JOB_PRIORITY_OVER_TIME

Syntax

JOB_PRIORITY_OVER_TIME = increment/interval

Description

JOB_PRIORITY_OVER_TIME enables automatic job priority escalation when MAX_USER_PRIORITY is also defined.

Valid Values

increment

Specifies the value used to increase job priority every interval minutes. Valid values are positive integers.

interval

Specifies the frequency, in minutes, to increment job priority. Valid values are positive integers.

Default

Undefined

Example

JOB_PRIORITY_OVER_TIME=3/20

Specifies that the job priority of pending jobs is incremented by 3 every 20 minutes.

See also

MAX_USER_PRIORITY.
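
The increment/interval syntax of JOB_PRIORITY_OVER_TIME can be sketched as follows. The function names are hypothetical, and the sketch deliberately does not model any upper bound on the escalated priority, since none is stated here.

```python
def parse_escalation(value):
    """Parse 'increment/interval' into two positive integers."""
    increment, interval = (int(part) for part in value.split("/"))
    if increment <= 0 or interval <= 0:
        raise ValueError("increment and interval must be positive integers")
    return increment, interval


def escalated_priority(start_priority, pending_minutes, value):
    """Priority of a pending job after pending_minutes of escalation."""
    increment, interval = parse_escalation(value)
    # One increment is applied for every full interval spent pending.
    return start_priority + increment * (pending_minutes // interval)


print(escalated_priority(50, 60, "3/20"))
```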

JOB_SCHEDULING_INTERVAL

Syntax

JOB_SCHEDULING_INTERVAL = seconds

Description

Time interval at which mbatchd sends jobs for scheduling to the scheduling daemon mbschd along with any collected load information.

Default

5 seconds

JOB_SPOOL_DIR

Syntax

JOB_SPOOL_DIR = dir

Description

Specifies the directory for buffering batch standard output and standard error for a job.

When JOB_SPOOL_DIR is defined, the standard output and standard error for the job is buffered in the specified directory.

Files are copied from the submission host to a temporary file in the directory specified by the JOB_SPOOL_DIR on the execution host. LSF removes these files when the job completes.

If JOB_SPOOL_DIR is not accessible or does not exist, files are spooled to the default job output directory $HOME/.lsbatch.

For bsub -is and bsub -Zs, JOB_SPOOL_DIR must be readable and writable by the job submission user, and it must be shared by the master host and the submission host. If the specified directory is not accessible or does not exist, and JOB_SPOOL_DIR is specified, bsub -is cannot write to the default directory LSB_SHAREDIR/cluster_name/lsf_indir, and bsub -Zs cannot write to the default directory LSB_SHAREDIR/cluster_name/lsf_cmddir, and the job will fail.

As LSF runs jobs, it creates temporary directories and files under JOB_SPOOL_DIR. By default, LSF removes these directories and files after the job is finished. See bsub(1) for information about job submission options that specify the disposition of these files.

On UNIX, specify an absolute path. For example:

JOB_SPOOL_DIR=/home/share/lsf_spool

On Windows, specify a UNC path or a path with a drive letter. For example:

JOB_SPOOL_DIR=\\HostA\share\spooldir

or

JOB_SPOOL_DIR=D:\share\spooldir

In a mixed UNIX/Windows cluster, specify one path for the UNIX platform and one for the Windows platform. Separate the two paths by a pipe character (|):

JOB_SPOOL_DIR=/usr/share/lsf_spool | \\HostA\share\spooldir

Valid value

JOB_SPOOL_DIR can be any valid path up to a maximum length of 256 characters. This maximum path length includes the temporary directories and files that the LSF system creates as jobs run. The path you specify for JOB_SPOOL_DIR should be as short as possible to avoid exceeding this limit.

Default

Undefined

Batch job output (standard output and standard error) is sent to the .lsbatch directory on the execution host.

JOB_TERMINATE_INTERVAL

Syntax

JOB_TERMINATE_INTERVAL = seconds

Description

UNIX only.

Specifies the time interval in seconds between sending SIGINT, SIGTERM, and SIGKILL when terminating a job. When a job is terminated, the job is sent SIGINT, SIGTERM, and SIGKILL in sequence with a sleep time of JOB_TERMINATE_INTERVAL between sending the signals. This allows the job to clean up if necessary.

Default

10
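
The signal sequence can be laid out as a timeline. This sketch only computes the offsets at which each signal would be sent; it does not send any signals, and the function name is hypothetical.

```python
def termination_schedule(job_terminate_interval=10):
    """Return (seconds after termination starts, signal name) pairs."""
    signals = ["SIGINT", "SIGTERM", "SIGKILL"]
    # Each signal follows the previous one after JOB_TERMINATE_INTERVAL seconds.
    return [(index * job_terminate_interval, name)
            for index, name in enumerate(signals)]


for offset, name in termination_schedule():
    print(offset, name)
```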

MAX_ACCT_ARCHIVE_FILE

Syntax

MAX_ACCT_ARCHIVE_FILE = integer

Description

Enables automatic deletion of archived LSF accounting log files and specifies the archive limit.

Compatibility

ACCT_ARCHIVE_SIZE or ACCT_ARCHIVE_AGE should also be defined.

Example

MAX_ACCT_ARCHIVE_FILE=10

LSF maintains the current lsb.acct and up to 10 archives. Every time the old lsb.acct.9 becomes lsb.acct.10, the old lsb.acct.10 gets deleted.

Default

Undefined (no deletion of lsb.acct.n files).
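
The rotation in the example above can be simulated on the archive suffix numbers. This is a sketch of the renaming scheme as described, with a hypothetical function name; it does not touch any files.

```python
def rotate_archives(existing, max_files):
    """Simulate one archive step.

    existing  -- suffix numbers n of the current lsb.acct.n files
    max_files -- value of MAX_ACCT_ARCHIVE_FILE
    Returns (suffixes kept after rotation, old suffixes that were deleted).
    """
    kept, deleted = [1], []  # the current lsb.acct becomes lsb.acct.1
    for n in sorted(existing):
        if n + 1 <= max_files:
            kept.append(n + 1)   # lsb.acct.n becomes lsb.acct.(n+1)
        else:
            deleted.append(n)    # beyond the limit: the old file is deleted
    return kept, deleted


print(rotate_archives(list(range(1, 11)), 10))
```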

MAX_JOB_ARRAY_SIZE

Syntax

MAX_JOB_ARRAY_SIZE = integer

Description

Specifies the maximum number of jobs in a job array that can be created by a user for a single job submission. The maximum number of jobs in a job array cannot exceed this value.

A large job array allows a user to submit a large number of jobs to the system with a single job submission.

Specify an integer value from 1 to 65534.

Default

1000

MAX_JOB_ATTA_SIZE

Syntax

MAX_JOB_ATTA_SIZE = integer | 0

Specify any number less than 20000.

Description

Maximum attached data size, in KB, that can be transferred to a job.

Maximum size for data attached to a job with the bpost(1) command. Useful if you use bpost(1) and bread(1) to transfer large data files between jobs and you want to limit the usage in the current working directory.

0 indicates that jobs cannot accept attached data files.

Default

Undefined. LSF does not set a maximum size of job attachments.

MAX_JOBID

Syntax

MAX_JOBID = integer

Description

The job ID limit. The job ID limit is the highest job ID that LSF will ever assign, and also the maximum number of jobs in the system.

By default, LSF assigns job IDs up to 6 digits. This means that no more than 999999 jobs can be in the system at once.

Specify any integer from 999999 to 9999999 (for practical purposes, any seven-digit integer).

You cannot lower the job ID limit, but you can raise it to seven digits. This means you can have more jobs in the system, and the job ID numbers will roll over less often.

LSF assigns job IDs in sequence. When the job ID limit is reached, the count rolls over, so the next job submitted gets job ID "1". If the original job 1 remains in the system, LSF skips that number and assigns job ID "2", or the next available job ID. If you have so many jobs in the system that the low job IDs are still in use when the maximum job ID is assigned, jobs with sequential numbers could have totally different submission times.

By raising the job ID limit, you allow more time for old jobs to leave the system, and make it more likely that numbers can be assigned in sequence without conflicting with existing jobs.

Example

MAX_JOBID=1234567

Default

999999
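
The rollover behavior can be sketched as follows. The function name and the in_use set are hypothetical; the sketch only models the "skip IDs still in the system" rule described above.

```python
def next_job_id(last_id, in_use, max_jobid=999999):
    """Return the next job ID, rolling over at max_jobid and skipping used IDs."""
    candidate = last_id
    for _ in range(max_jobid):
        # Count up in sequence; after the limit, the count rolls over to 1.
        candidate = candidate + 1 if candidate < max_jobid else 1
        if candidate not in in_use:
            return candidate
    raise RuntimeError("all job IDs are in use")


# Job 1 is still in the system when the limit is reached, so it is skipped.
print(next_job_id(999999, in_use={1}))
```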

MAX_JOBINFO_QUERY_PERIOD

Syntax

MAX_JOBINFO_QUERY_PERIOD = integer

Description

Maximum time for job information query commands (e.g., bjobs) to wait.

When this time expires, the query command processes exit, and all associated threads are terminated.

If the parameter is not defined, query command processes will wait for all threads to finish.

Specify a multiple of MBD_REFRESH_TIME.

Valid values

Any positive integer greater than or equal to one (1)

Default

Undefined

See also

See lsf.conf under LSB_BLOCK_JOBINFO_TIMEOUT.

MAX_JOB_MSG_NUM

Syntax

MAX_JOB_MSG_NUM = integer | 0

Description

Maximum number of message slots for each job. Maximum number of messages that can be posted to a job with the bpost(1) command.

0 indicates that jobs cannot accept external messages.

Default

128

MAX_JOB_NUM

Syntax

MAX_JOB_NUM = integer

Description

The maximum number of finished jobs whose events are to be stored in the lsb.events log file.

Once the limit is reached, mbatchd starts a new event log file. The old event log file is saved as lsb.events.n, with subsequent sequence number suffixes incremented by 1 each time a new log file is started. Event logging continues in the new lsb.events file.

Default

1000

MAX_PREEXEC_RETRY

Syntax

MAX_PREEXEC_RETRY = integer

Description

MultiCluster job forwarding model only. The maximum number of times to attempt the pre-execution command of a job from a remote cluster.

If the job's pre-execution command fails all attempts, the job is returned to the submission cluster.

MAX_SBD_CONNS

Syntax

MAX_SBD_CONNS = integer

Description

The maximum number of file descriptors mbatchd can have open and connected concurrently to sbatchd.

Controls the maximum number of connections that can be maintained to sbatchds in the system. Many sites require more than 32 connections.

The value should not exceed the file descriptor limit of root (the usual limit is 1024). Setting it equal to or larger than this limit can cause mbatchd to die constantly, because mbatchd allocates all file descriptors to sbatchd connections. This could cause mbatchd to run out of descriptors, which results in an mbatchd fatal error, such as failure to open lsb.events.

Example

Reasonable settings are:

Default

32

MAX_SBD_FAIL

Syntax

MAX_SBD_FAIL = integer

Description

The maximum number of retries for reaching a non-responding slave batch daemon, sbatchd.

The interval between retries is defined by MBD_SLEEP_TIME. If mbatchd fails to reach a host and has retried MAX_SBD_FAIL times, the host is considered unavailable. When a host becomes unavailable, mbatchd assumes that all jobs running on that host have exited and that all rerunnable jobs (jobs submitted with the bsub -r option) are scheduled to be rerun on another host.

Default

3

MAX_SCHED_STAY

Syntax

MAX_SCHED_STAY = integer

Description

The time, in seconds, that mbatchd has for a scheduling pass.

Default

3

MAX_USER_PRIORITY

Syntax

MAX_USER_PRIORITY = integer

Description

Enables user-assigned job priority and specifies the maximum job priority a user can assign to a job.

LSF administrators can assign a job priority higher than the specified value.

Compatibility

User-assigned job priority changes the behavior of btop and bbot.

Example

MAX_USER_PRIORITY=100

Specifies that 100 is the maximum job priority that can be specified by a user.

Default

Undefined

MBD_REFRESH_TIME

Syntax

MBD_REFRESH_TIME = seconds

Description

Time interval, in seconds, at which mbatchd will fork a new child mbatchd to service query requests, so that information sent back to clients is kept up to date. A child mbatchd processes query requests by creating threads.

MBD_REFRESH_TIME applies only to UNIX platforms that support thread programming.

MBD_REFRESH_TIME works in conjunction with LSB_QUERY_PORT in lsf.conf. The child mbatchd continues to listen to the port number specified by LSB_QUERY_PORT and creates threads to service requests until the job changes status, a new job is submitted, or MBD_REFRESH_TIME has expired.

The value of this parameter must be between 5 and 300. Any values specified out of this range are ignored, and the system default value is applied.

The bjobs command may not display up-to-date information if two consecutive query commands are issued before a child mbatchd expires because child mbatchd job information is not updated. If you use the bjobs command and do not get up-to-date information, you may need to decrease the value of this parameter. Note, however, that the lower the value of this parameter, the more you negatively affect performance.

The number of concurrent requests is limited by the number of concurrent threads that a process can have. This number varies by platform.

Default

5 seconds if not defined or if defined value is less than 5; 300 seconds if defined value is more than 300
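
The default rule above amounts to clamping the configured value to the range 5 to 300. A minimal sketch, with a hypothetical function name:

```python
def effective_refresh_time(configured=None):
    """Return the MBD_REFRESH_TIME (seconds) that actually takes effect."""
    if configured is None or configured < 5:
        return 5      # undefined or below the valid range
    if configured > 300:
        return 300    # above the valid range
    return configured


for value in (None, 2, 60, 900):
    print(value, effective_refresh_time(value))
```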

MBD_SLEEP_TIME

Syntax

MBD_SLEEP_TIME = seconds

Description

Used in conjunction with the parameters SLOT_RESERVE, MAX_SBD_FAIL.

Amount of time in seconds used for calculating parameter values.

Default

60

MC_RECLAIM_DELAY

Syntax

MC_RECLAIM_DELAY = minutes

Description

MultiCluster resource leasing model only. The reclaim interval (how often to reconfigure shared leases) in minutes.

Shared leases are defined by Type=shared in the lsb.resources HostExport section.

Default

10

MC_PENDING_REASON_PKG_SIZE

Syntax

MC_PENDING_REASON_PKG_SIZE = kilobytes | 0

Description

MultiCluster job forwarding model only. Pending reason update package size, in KB. Defines the maximum amount of pending reason data this cluster will send to submission clusters in one cycle.

Specify the keyword 0 (zero) to disable the limit and allow any amount of data in one package.

Default

512

MC_PENDING_REASON_UPDATE_INTERVAL

Syntax

MC_PENDING_REASON_UPDATE_INTERVAL = seconds | 0

Description

MultiCluster job forwarding model only. Pending reason update interval, in seconds. Defines how often this cluster will update submission clusters about the status of pending MultiCluster jobs.

Specify the keyword 0 (zero) to disable pending reason updating between clusters.

Default

300

MC_RUSAGE_UPDATE_INTERVAL

Syntax

MC_RUSAGE_UPDATE_INTERVAL = seconds

Description

MultiCluster only. Enables resource use updating for MultiCluster jobs running on hosts in the cluster and specifies how often to send updated information to the submission or consumer cluster.

Default

300

NO_PREEMPT_RUN_TIME

Syntax

NO_PREEMPT_RUN_TIME = run_time

Description

If set, jobs that have been running for the specified number of minutes or longer will not be preempted. Run time is wall-clock time, not normalized run time.

You must define a run limit for the job, either at job level by bsub -W option or in the queue by configuring RUNLIMIT in lsb.queues.

NO_PREEMPT_FINISH_TIME

Syntax

NO_PREEMPT_FINISH_TIME = finish_time

Description

If set, jobs that will finish within the specified number of minutes will not be preempted. Run time is wall-clock time, not normalized run time.

You must define a run limit for the job, either at job level by bsub -W option or in the queue by configuring RUNLIMIT in lsb.queues.
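
The two protections against preemption can be sketched together. This is an illustration under an explicit assumption: the time remaining until a job finishes is estimated as its run limit minus its elapsed wall-clock run time, which is one plausible reason a run limit is required. The function name is hypothetical.

```python
def protected_from_preemption(elapsed_min, run_limit_min,
                              no_preempt_run_time=None,
                              no_preempt_finish_time=None):
    """Return True if the job should not be preempted."""
    # NO_PREEMPT_RUN_TIME: the job has already run long enough.
    if no_preempt_run_time is not None and elapsed_min >= no_preempt_run_time:
        return True
    # NO_PREEMPT_FINISH_TIME: the job is expected to finish soon.
    # Assumption: remaining time = run limit - elapsed run time.
    if no_preempt_finish_time is not None:
        if run_limit_min - elapsed_min <= no_preempt_finish_time:
            return True
    return False


print(protected_from_preemption(50, 60, no_preempt_finish_time=15))
```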

NQS_QUEUES_FLAGS

Syntax

NQS_QUEUES_FLAGS = integer

Description

For Cray NQS compatibility only. Used by LSF to get the NQS queue information.

If the NQS version on a Cray is NQS 1.1, 80.42 or NQS 71.3, this parameter does not need to be defined.

For other versions of NQS on Cray, define both NQS_QUEUES_FLAGS and NQS_REQUESTS_FLAGS.

To determine the value of this parameter, run the NQS qstat command. The value of Npk_int[1] in the output is the value you need for this parameter. Refer to the NQS chapter in Administering Platform LSF for more details.

Default

Undefined

NQS_REQUESTS_FLAGS

Syntax

NQS_REQUESTS_FLAGS = integer

Description

For Cray NQS compatibility only.

If the NQS version on a Cray is NQS 80.42 or NQS 71.3, this parameter does not need to be defined.

If the version is NQS 1.1 on a Cray, set this parameter to 251918848. This is the qstat flag which LSF uses to retrieve requests on Cray in long format.

For other versions of NQS on a Cray, run the NQS qstat command. The value of Npk_int[1] in the output is the value you need for this parameter. Refer to the NQS chapter in Administering Platform LSF for more details.

Default

Undefined

PEND_REASON_UPDATE_INTERVAL

Syntax

PEND_REASON_UPDATE_INTERVAL = seconds

Description

Time interval that defines how often pending reasons are calculated by the scheduling daemon mbschd.

Default

30 seconds

PEND_REASON_MAX_JOBS

Syntax

PEND_REASON_MAX_JOBS = integer

Description

Number of jobs for each user per queue for which pending reasons are calculated by the scheduling daemon mbschd. Pending reasons are calculated at a time period set by PEND_REASON_UPDATE_INTERVAL.

Default

20 jobs

PG_SUSP_IT

Syntax

PG_SUSP_IT = seconds

Description

The time interval that a host should be interactively idle (it > 0) before jobs suspended because of a threshold on the pg load index can be resumed.

This parameter is used to prevent the case in which a batch job is suspended and resumed too often as it raises the paging rate while running and lowers it while suspended. If you are not concerned with the interference with interactive jobs caused by paging, the value of this parameter may be set to 0.

Default

180 (seconds)

PREEMPTABLE_RESOURCES

Syntax

PREEMPTABLE_RESOURCES = resource_name...

Description

LicenseMaximizer only. Enables license preemption when preemptive scheduling is enabled (has no effect if PREEMPTIVE is not also specified) and specifies the licenses that will be preemption resources. Specify shared numeric resources, static or decreasing, that LSF is configured to release (RELEASE=Y in lsf.shared, which is the default).

You must also configure LSF's preemption action to make the preempted application release its licenses. To kill preempted jobs instead of suspending them, set TERMINATE_WHEN=PREEMPT in lsb.queues, or set JOB_CONTROLS in lsb.queues and specify brequeue as the SUSPEND action.

Default

Undefined (if preemptive scheduling is configured, LSF preempts on job slots only)

PREEMPT_FOR

Syntax

PREEMPT_FOR = [HOST_JLU | USER_JLP | GROUP_MAX | GROUP_JLP | MINI_JOB | LEAST_RUN_TIME]...

Description

If preemptive scheduling is enabled, this parameter can change the behavior of job slot limits and can also enable the optimized preemption mechanism for parallel jobs.

Specify a space-separated list of the following keywords:

Job slot limits specified at the queue level always count suspended jobs.

Default

Undefined. If preemptive scheduling is configured, the default preemption mechanism is used to preempt parallel jobs, and suspended jobs are ignored for the following limits only:

PREEMPTION_WAIT_TIME

Syntax

PREEMPTION_WAIT_TIME = seconds

Description

LicenseMaximizer only. You must also specify PREEMPTABLE_RESOURCES in lsb.params.

The amount of time LSF waits, after preempting jobs, for preemption resources to become available. Specify at least 300 seconds.

If LSF does not get the resources after this time, LSF might preempt more jobs.

Default

300 (5 minutes)

RESOURCE_RESERVE_PER_SLOT

Syntax

RESOURCE_RESERVE_PER_SLOT = y | Y

Description

If Y, mbatchd reserves resources based on job slots instead of per-host.

By default, mbatchd only reserves static resources for parallel jobs on a per-host basis. For example, by default, the command:

% bsub -n 4 -R "rusage[mem=500]" -q reservation my_job

requires the job to reserve 500 MB on each host where the job runs.

Some parallel jobs need to reserve resources based on job slots, rather than by host. In this example, if per-slot reservation is enabled by RESOURCE_RESERVE_PER_SLOT, the job my_job must reserve 500 MB of memory for each job slot (4 * 500 MB = 2 GB) on the host in order to run.

If RESOURCE_RESERVE_PER_SLOT is set, the following command reserves the resource static_resource on all 4 job slots instead of only 1 on the host where the job runs:

bsub -n 4 -R "static_resource > 0 rusage[static_resource=1]" myjob

Default

Undefined (reserve resources per-host)
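
The arithmetic behind per-host and per-slot reservation can be sketched as follows, using the 500 MB, 4-slot example above. The function name is hypothetical.

```python
def reserved_on_host(rusage_amount, slots_on_host, per_slot=False):
    """Amount of a resource reserved on one host for a parallel job."""
    if per_slot:
        # RESOURCE_RESERVE_PER_SLOT=Y: reserve once per job slot on the host.
        return rusage_amount * slots_on_host
    # Default: reserve once per host, regardless of the number of slots.
    return rusage_amount


print(reserved_on_host(500, 4))                 # per-host, in MB
print(reserved_on_host(500, 4, per_slot=True))  # per-slot, in MB
```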

RUN_JOB_FACTOR

Syntax

RUN_JOB_FACTOR = number

Description

Used only with fairshare scheduling. Job slots weighting factor.

In the calculation of a user's dynamic share priority, this factor determines the relative importance of the number of job slots reserved and in use by a user.

Default

3.0

RUN_TIME_FACTOR

Syntax

RUN_TIME_FACTOR = number

Description

Used only with fairshare scheduling. Run time weighting factor.

In the calculation of a user's dynamic share priority, this factor determines the relative importance of the total run time of a user's running jobs.

Default

0.7
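
How the fairshare weighting factors interact can be sketched with a dynamic priority calculation. The formula below is an assumption, based on the form given in the LSF administration documentation, and is not stated in this file reference; the function name and the units of cpu_time and run_time are likewise hypothetical.

```python
def dynamic_priority(shares, cpu_time, run_time, job_slots,
                     cpu_time_factor=0.7, run_time_factor=0.7,
                     run_job_factor=3.0):
    """Sketch of a user's dynamic share priority (assumed formula).

    Assumed form:
      shares / (cpu_time * CPU_TIME_FACTOR
                + run_time * RUN_TIME_FACTOR
                + (1 + job_slots) * RUN_JOB_FACTOR)
    """
    weighted_use = (cpu_time * cpu_time_factor
                    + run_time * run_time_factor
                    + (1 + job_slots) * run_job_factor)
    return shares / weighted_use


# With equal shares, a user holding more job slots gets a lower priority.
print(dynamic_priority(10, cpu_time=0, run_time=0, job_slots=0))
print(dynamic_priority(10, cpu_time=0, run_time=0, job_slots=4))
```

Raising one factor relative to the others makes the corresponding kind of usage count more heavily against the user.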

SBD_SLEEP_TIME

Syntax

SBD_SLEEP_TIME = seconds

Description

The interval at which LSF checks the load conditions of each host, to decide whether jobs on the host must be suspended or resumed.

The job-level resource usage information is updated at a maximum frequency of every SBD_SLEEP_TIME seconds.

The update is done only if the value for the CPU time, resident memory usage, or virtual memory usage has changed by more than 10 percent from the previous update or if a new process or process group has been created.

Default

30
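
The 10 percent update rule can be sketched as follows. The function name and dictionary keys are hypothetical; the sketch assumes that a change in any one of the three values is enough to trigger an update.

```python
def should_send_update(previous, current, new_process_created=False):
    """Decide whether job-level resource usage should be updated."""
    if new_process_created:
        # A new process or process group always triggers an update.
        return True
    for key in ("cpu_time", "resident_mem", "virtual_mem"):
        old, new = previous[key], current[key]
        if old == 0:
            if new != 0:
                return True
        elif abs(new - old) / old > 0.10:
            # Changed by more than 10 percent since the previous update.
            return True
    return False


before = {"cpu_time": 100, "resident_mem": 200, "virtual_mem": 400}
after = {"cpu_time": 120, "resident_mem": 200, "virtual_mem": 400}
print(should_send_update(before, after))
```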

SYSTEM_MAPPING_ACCOUNT

Syntax

SYSTEM_MAPPING_ACCOUNT = user_account

Description

LSF Windows Workgroup installations only. User account to which all Windows workgroup user accounts are mapped.

Default

Undefined

USER_ADVANCE_RESERVATION

USER_ADVANCE_RESERVATION in lsb.params is obsolete. Use the ResourceReservation section configuration in lsb.resources to configure advance reservation policies for your cluster.


SEE ALSO

lsf.conf(5), lsb.params(5), lsb.hosts(5), lsb.users(5), bsub(1)


      Date Modified: February 24, 2004
Platform Computing: www.platform.com

Platform Support: support@platform.com
Platform Information Development: doc@platform.com

Copyright © 1994-2004 Platform Computing Corporation. All rights reserved.