lsf.conf


The lsf.conf file controls the installation and operation of LSF. This chapter explains the contents of the lsf.conf file.


About lsf.conf

The lsf.conf file is created during installation by the LSF setup program, and records all the settings chosen when LSF was installed. It determines where specific configuration files are located and how individual servers and applications operate.

The lsf.conf file is used by LSF and applications built on top of it. For example, information in lsf.conf is used by LSF daemons and commands to locate other configuration files, executables, and network services. lsf.conf is updated, if necessary, when you upgrade to a new version.

This file can also be expanded to include application-specific parameters.

Location

The default location of lsf.conf is in /etc. This default location can be overridden when necessary by either the environment variable LSF_ENVDIR or the command line option -d available to some of the applications.
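
As a rough illustration of the lookup described above, the following Python sketch uses LSF_ENVDIR when it is set and falls back to /etc otherwise. The function name is hypothetical and not part of LSF, and the -d option is left out because its precedence relative to LSF_ENVDIR is not stated here.

```python
import posixpath


def lsf_conf_path(environ):
    """Return the lsf.conf path implied by an environment mapping.

    Hypothetical helper for illustration only; not part of LSF.
    """
    # LSF_ENVDIR overrides the default directory when it is set and non-empty.
    env_dir = environ.get("LSF_ENVDIR") or "/etc"
    return posixpath.join(env_dir, "lsf.conf")
```

Passing os.environ as the argument would apply the same rule to the current process environment.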

Format

Each entry in lsf.conf has one of the following forms:

NAME=VALUE
NAME=
NAME="STRING1 STRING2 ..."

An equal sign (=) must follow each NAME, even if no value follows, and there should be no spaces on either side of the equal sign.

A value that contains multiple strings separated by spaces must be enclosed in quotation marks.

Lines starting with a pound sign (#) are comments and are ignored. Do not use #if as this is reserved syntax for time-based configuration.
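
The format rules above can be sketched as a small parser. This is a minimal Python illustration, not the parser LSF itself uses: the function name is hypothetical, and time-based configuration (#if) is not handled, so such lines are simply skipped like any other line starting with a pound sign.

```python
def parse_lsf_conf(text):
    """Parse NAME=VALUE entries into a dict (illustrative, not LSF's parser)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Blank lines and lines starting with # carry no setting.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        # Every entry needs an equal sign with no space on either side of it.
        if not sep or not name or name != name.rstrip() or value != value.lstrip():
            raise ValueError("not a NAME=VALUE entry: %r" % raw)
        # A quoted value holds one or more space-separated strings.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        settings[name] = value
    return settings
```

An entry of the form NAME= is kept with an empty string as its value.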


Parameters

LSB_API_CONNTIMEOUT

Syntax

LSB_API_CONNTIMEOUT=time_seconds

Description

The timeout in seconds when connecting to the Batch system.

Valid values

Any positive integer or zero

Default

10

See also

LSB_API_RECVTIMEOUT

LSB_API_RECVTIMEOUT

Syntax

LSB_API_RECVTIMEOUT=time_seconds

Description

Timeout in seconds when waiting for a reply from the Batch system.

Valid values

Any positive integer or zero

Default

10

See also

LSB_API_CONNTIMEOUT

LSB_BLOCK_JOBINFO_TIMEOUT

Syntax

LSB_BLOCK_JOBINFO_TIMEOUT=time_minutes

Description

Timeout in minutes for job information query commands (e.g., bjobs).

Valid values

Any positive integer

Default

Undefined (no timeout)

See also

MAX_JOBINFO_QUERY_PERIOD in lsb.params

LSB_CHUNK_RUSAGE

Syntax

LSB_CHUNK_RUSAGE=y

Description

Applies only to chunk jobs. When set, sbatchd contacts PIM to retrieve resource usage information to enforce resource usage limits on chunk jobs.

By default, resource usage limits are not enforced for chunk jobs because chunk jobs are typically too short to allow LSF to collect resource usage.

If LSB_CHUNK_RUSAGE=Y is defined, limits may not be enforced for chunk jobs that take less than a minute to run.

Default

Undefined. No resource usage is collected for chunk jobs.

LSB_CMD_LOG_MASK

Syntax

LSB_CMD_LOG_MASK=log_level

Description

Specifies the logging level of error messages from LSF batch commands.

To specify the logging level of error messages for LSF commands, use LSF_CMD_LOG_MASK. To specify the logging level of error messages for LSF daemons, use LSF_LOG_MASK.

LSB_CMD_LOG_MASK sets the log level and is used in combination with LSB_DEBUG_CMD, which sets the log class for LSF batch commands. For example:

LSB_CMD_LOG_MASK=LOG_DEBUG 
LSB_DEBUG_CMD="LC_TRACE LC_EXEC" 

Batch commands log error messages in different levels so that you can choose to log all messages, or only log messages that are deemed critical. The level specified by LSB_CMD_LOG_MASK determines which messages are recorded and which are discarded. All messages logged at the specified level or higher are recorded, while lower level messages are discarded.

For debugging purposes, the level LOG_DEBUG records the fewest debugging messages and is used for basic debugging. The level LOG_DEBUG3 records all debugging messages and can cause log files to grow very large, so it is not often used. Most debugging is done at the level LOG_DEBUG2.

The commands log to the syslog facility unless LSB_CMD_LOGDIR is set.
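
The "specified level or higher" rule can be sketched in Python as follows. The LEVELS list is an assumption for illustration: it contains only the levels named in this section, ordered from highest to lowest, and is not the full list of valid values. The function name is hypothetical.

```python
# Levels named in this section, ordered from highest to lowest.
# Illustrative subset only; not the full list LSF accepts.
LEVELS = ["LOG_WARNING", "LOG_DEBUG", "LOG_DEBUG2", "LOG_DEBUG3"]


def is_recorded(message_level, mask):
    """Return True if a message at message_level is kept under the given mask."""
    # A lower index means a higher level; keep messages at the mask level or higher.
    return LEVELS.index(message_level) <= LEVELS.index(mask)
```

With a mask of LOG_DEBUG2, for example, this keeps LOG_WARNING, LOG_DEBUG, and LOG_DEBUG2 messages and discards LOG_DEBUG3 messages.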

Valid values

The log levels from highest to lowest are:

Default

LOG_WARNING

See also

LSB_CMD_LOGDIR, LSB_DEBUG, LSB_DEBUG_CMD, LSB_TIME_CMD, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR, LSF_TIME_CMD

LSB_CMD_LOGDIR

Syntax

LSB_CMD_LOGDIR=path

Description

Specifies the path to the Batch command log files.

Default

/tmp

See also

LSB_CMD_LOG_MASK, LSB_DEBUG, LSB_DEBUG_CMD, LSB_TIME_CMD, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR, LSF_TIME_CMD

LSB_CONFDIR

Syntax

LSB_CONFDIR=path

Description

Specifies the path to the directory containing the LSF configuration files.

The configuration directories are installed under LSB_CONFDIR.

Configuration files for each cluster are stored in a subdirectory of LSB_CONFDIR. This subdirectory contains several files that define user and host lists, operation parameters, and queues.

All files and directories under LSB_CONFDIR must be readable from all hosts in the cluster. LSB_CONFDIR/cluster_name/configdir must be owned by the LSF administrator.

CAUTION


Do not redefine this parameter after LSF has been installed.

Default

LSF_CONFDIR/lsbatch

See also

LSF_CONFDIR

LSB_CRDIR

Syntax

LSB_CRDIR=path

Description

Specifies the directory containing the checkpointing executables on systems that support kernel-level checkpointing. These are the chkpnt and restart utility programs that sbatchd uses to checkpoint or restart a job.

For example:

LSB_CRDIR=/usr/bin

If your platform supports kernel-level checkpointing, and if you want to use the utility programs provided for kernel-level checkpointing, set LSB_CRDIR to the location of the utility programs.

Default

Undefined

If undefined, the system uses /bin.

LSB_DEBUG

Syntax

LSB_DEBUG=1 | 2

Description

Sets the LSF batch system to debug.

If defined, LSF runs in single user mode:

When LSB_DEBUG is defined, LSF will not look in the system services database for port numbers. Instead, it uses the port numbers defined by the parameters LSB_MBD_PORT/LSB_SBD_PORT in lsf.conf. If these parameters are not defined, it uses port number 40000 for mbatchd and port number 40001 for sbatchd.
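
The port fallback described above can be sketched in Python. This is only an illustration under the assumption that the lsf.conf settings have already been read into a dict of strings; the function and constant names are hypothetical.

```python
# Fallback ports used in debug mode when the parameter is not defined.
DEBUG_DEFAULT_PORTS = {"LSB_MBD_PORT": 40000, "LSB_SBD_PORT": 40001}


def debug_port(conf, name):
    """Return the port used for name when LSB_DEBUG is defined (illustrative)."""
    value = conf.get(name)
    # A value from lsf.conf wins; the services database is not consulted.
    return int(value) if value else DEBUG_DEFAULT_PORTS[name]
```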

You should always specify 1 for this parameter unless you are testing LSF.

Can also be defined from the command line.

Valid values

Default

Undefined

See also

LSB_DEBUG_CMD, LSB_DEBUG_MBD, LSB_DEBUG_NQS, LSB_DEBUG_SBD, LSB_DEBUG_SCH, LSF_DEBUG_LIM, LSF_DEBUG_RES, LSF_LIM_PORT, LSF_RES_PORT, LSB_MBD_PORT, LSB_SBD_PORT, LSF_LOGDIR, LSF_LIM_DEBUG, LSF_RES_DEBUG

LSB_DEBUG_CMD

Syntax

LSB_DEBUG_CMD=log_class

Description

Sets the debugging log class for commands and APIs.

Specifies the log class filtering that will be applied to LSF batch commands or the API. Only messages belonging to the specified log class are recorded.

LSB_DEBUG_CMD sets the log class and is used in combination with LSB_CMD_LOG_MASK, which sets the log level. For example:

LSB_CMD_LOG_MASK=LOG_DEBUG 
LSB_DEBUG_CMD="LC_TRACE LC_EXEC" 

Debugging is turned on when you define both parameters.

The commands log to the syslog facility unless LSB_CMD_LOGDIR is defined.

To specify multiple log classes, use a space-separated list enclosed by quotation marks. For example:

LSB_DEBUG_CMD="LC_TRACE LC_EXEC"
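
As a sketch of how the two parameters combine, the following Python function returns the log classes to record only when both LSB_CMD_LOG_MASK and LSB_DEBUG_CMD are defined. It assumes the settings are available as a dict with the surrounding quotation marks already removed; the function name is hypothetical.

```python
def command_debug_classes(conf):
    """Return the log classes to record, or [] if debugging is not turned on."""
    mask = conf.get("LSB_CMD_LOG_MASK", "")
    classes = conf.get("LSB_DEBUG_CMD", "")
    # Debugging is on only when both the log level and the log class are defined.
    if not mask or not classes:
        return []
    # Multiple classes are given as a space-separated list.
    return classes.split()
```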

Can also be defined from the command line.

Valid values

Valid log classes are:

Default

Undefined

See also

LSB_CMD_LOG_MASK, LSB_CMD_LOGDIR, LSB_DEBUG, LSB_DEBUG_MBD, LSB_DEBUG_NQS, LSB_DEBUG_SBD, LSB_DEBUG_SCH, LSF_DEBUG_LIM, LSF_DEBUG_RES, LSF_LIM_PORT, LSF_RES_PORT, LSB_MBD_PORT, LSB_SBD_PORT, LSF_LOGDIR, LSF_LIM_DEBUG, LSF_RES_DEBUG

LSB_DEBUG_MBD

Syntax

LSB_DEBUG_MBD=log_class

Description

Sets the debugging log class for mbatchd.

Specifies the log class filtering that will be applied to mbatchd. Only messages belonging to the specified log class are recorded.

LSB_DEBUG_MBD sets the log class and is used in combination with LSF_LOG_MASK, which sets the log level. For example:

LSF_LOG_MASK=LOG_DEBUG 
LSB_DEBUG_MBD="LC_TRACE LC_EXEC"

To specify multiple log classes, use a space-separated list enclosed in quotation marks. For example:

LSB_DEBUG_MBD="LC_TRACE LC_EXEC"

You need to restart the daemons after setting LSB_DEBUG_MBD for your changes to take effect.

If you use the command badmin mbddebug to temporarily change this parameter without changing lsf.conf, you will not need to restart the daemons.

The daemons log to the syslog facility unless LSF_LOGDIR is defined.

Valid values

Valid log classes are the same as for LSB_DEBUG_CMD, except for the log classes LC_ELIM and LC_JARRAY, which cannot be used with LSB_DEBUG_MBD. See LSB_DEBUG_CMD.

Default

Undefined

See also

LSF_LOG_MASK, LSF_LOGDIR, LSB_DEBUG, LSB_DEBUG_CMD, LSB_DEBUG_NQS, LSB_DEBUG_SBD, LSB_DEBUG_SCH, LSF_DEBUG_LIM, LSF_DEBUG_RES, LSF_LIM_PORT, LSF_RES_PORT, LSB_MBD_PORT, LSB_SBD_PORT, LSF_LIM_DEBUG, LSF_RES_DEBUG, badmin mbddebug

LSB_DEBUG_NQS

Syntax

LSB_DEBUG_NQS=log_class

Description

Sets the log class for debugging the NQS interface.

Specifies the log class filtering that will be applied to NQS. Only messages belonging to the specified log class are recorded.

LSB_DEBUG_NQS sets the log class and is used in combination with LSF_LOG_MASK, which sets the log level. For example:

LSF_LOG_MASK=LOG_DEBUG
LSB_DEBUG_NQS="LC_TRACE LC_EXEC" 

Debugging is turned on when you define both parameters.

To specify multiple log classes, use a space-separated list enclosed in quotation marks. For example:

LSB_DEBUG_NQS="LC_TRACE LC_EXEC"

The daemons log to the syslog facility unless LSF_LOGDIR is defined.

This parameter can also be defined from the command line.

Valid values

For a list of valid log classes, see LSB_DEBUG_CMD.

Default

Undefined

See also

LSB_DEBUG_CMD, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR

LSB_DEBUG_SBD

Syntax

LSB_DEBUG_SBD=log_class

Description

Sets the debugging log class for sbatchd.

Specifies the log class filtering that will be applied to sbatchd. Only messages belonging to the specified log class are recorded.

LSB_DEBUG_SBD sets the log class and is used in combination with LSF_LOG_MASK, which sets the log level. For example:

LSF_LOG_MASK=LOG_DEBUG
LSB_DEBUG_SBD="LC_TRACE LC_EXEC" 

To specify multiple log classes, use a space-separated list enclosed in quotation marks. For example:

LSB_DEBUG_SBD="LC_TRACE LC_EXEC"

You need to restart the daemons after setting LSB_DEBUG_SBD for your changes to take effect.

If you use the command badmin sbddebug to temporarily change this parameter without changing lsf.conf, you will not need to restart the daemons.

The daemons log to the syslog facility unless LSF_LOGDIR is defined.

Valid values

Valid log classes are the same as for LSB_DEBUG_CMD, except for the log classes LC_ELIM and LC_JARRAY, which cannot be used with LSB_DEBUG_SBD. See LSB_DEBUG_CMD.

Default

Undefined

See also

LSB_DEBUG_MBD, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR, badmin

LSB_DEBUG_SCH

Syntax

LSB_DEBUG_SCH=log_class

Description

Sets the debugging log class for mbschd.

Specifies the log class filtering that will be applied to mbschd. Only messages belonging to the specified log class are recorded.

LSB_DEBUG_SCH sets the log class and is used in combination with LSF_LOG_MASK, which sets the log level. For example:

LSF_LOG_MASK=LOG_DEBUG 
LSB_DEBUG_SCH="LC_SCHED"

To specify multiple log classes, use a space-separated list enclosed in quotation marks. For example:

LSB_DEBUG_SCH="LC_SCHED LC_TRACE LC_EXEC"

You need to restart the daemons after setting LSB_DEBUG_SCH for your changes to take effect.

The daemons log to the syslog facility unless LSF_LOGDIR is defined.

Valid values

Valid log classes are the same as for LSB_DEBUG_CMD, except for the log classes LC_ELIM and LC_JARRAY, which cannot be used with LSB_DEBUG_SCH, and LC_HPC, which is only valid for LSB_DEBUG_SCH. See LSB_DEBUG_CMD.

Default

Undefined

See also

LSB_DEBUG_MBD, LSB_DEBUG_SBD, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR, badmin

LSF_DYNAMIC_HOST_TIMEOUT

Syntax

LSF_DYNAMIC_HOST_TIMEOUT=number[m | M]

Description

Defines how long a dynamically added host can remain unavailable before it is removed. A dynamic host is removed from the cluster once it has been unavailable for longer than the specified timeout.

By default, the timeout is in hours. To specify a timeout in minutes, append "m" or "M" to the value. For example:

# 15 minutes
LSF_DYNAMIC_HOST_TIMEOUT=15M
# 120 minutes (2 hours)
LSF_DYNAMIC_HOST_TIMEOUT=120M

The minimum value is 10 minutes. If you define a value smaller than 10 minutes, LSF sets the timeout to 10 minutes and logs a warning message in the lim.log file.

Example

LSF_DYNAMIC_HOST_TIMEOUT=1 

A dynamically added host will be removed once it becomes unavailable for more than one hour.
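
The unit and minimum-value rules above can be sketched in Python. This is an illustration only; the function and constant names are hypothetical, and the warning that LSF writes to lim.log is represented by a comment rather than reproduced.

```python
MIN_TIMEOUT_MINUTES = 10


def dynamic_host_timeout_minutes(value):
    """Convert an LSF_DYNAMIC_HOST_TIMEOUT value to a number of minutes."""
    text = value.strip()
    if text[-1:] in ("m", "M"):
        minutes = int(text[:-1])   # explicit minutes suffix
    else:
        minutes = int(text) * 60   # a bare number is a number of hours
    # Values below the minimum are raised to 10 minutes (LSF also logs a warning).
    return max(minutes, MIN_TIMEOUT_MINUTES)
```

Under these rules, a value of 1 corresponds to 60 minutes and a value of 5m is raised to the 10-minute minimum.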

Default

No timeout (host is never dynamically removed)

LSB_ECHKPNT_METHOD

Syntax

LSB_ECHKPNT_METHOD=method_name

Description

Name of custom echkpnt and erestart methods.

Can also be defined as an environment variable, or specified through the bsub -k option.

The name you specify here is used for both your custom echkpnt and erestart programs. You must name your custom echkpnt and erestart programs echkpnt.method_name and erestart.method_name. The programs echkpnt.method_name and erestart.method_name must be in LSF_SERVERDIR or in the directory specified by LSB_ECHKPNT_METHOD_DIR.

Do not define LSB_ECHKPNT_METHOD=default, because default is a reserved keyword that indicates LSF's default echkpnt and erestart methods. You can, however, specify bsub -k "my_dir method=default" my_job to indicate that you want to use LSF's default checkpoint and restart methods.

When this parameter is undefined in lsf.conf or as an environment variable and no custom method is specified at job submission through bsub -k, LSF uses echkpnt.default and erestart.default to checkpoint and restart jobs.

When this parameter is defined, LSF uses the custom checkpoint and restart methods specified.
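
The naming convention for custom methods can be sketched in Python. The function name is hypothetical, and the sketch assumes that a defined LSB_ECHKPNT_METHOD_DIR is used in place of LSF_SERVERDIR; a method given through bsub -k is not modeled.

```python
import posixpath


def custom_checkpoint_programs(method_name, server_dir, method_dir=None):
    """Return the expected (echkpnt, erestart) program paths for a custom method."""
    if method_name == "default":
        # "default" is reserved for LSF's own checkpoint and restart methods.
        raise ValueError("'default' is a reserved method name")
    # Assumption: LSB_ECHKPNT_METHOD_DIR, when defined, replaces LSF_SERVERDIR.
    directory = method_dir or server_dir
    return (
        posixpath.join(directory, "echkpnt." + method_name),
        posixpath.join(directory, "erestart." + method_name),
    )
```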

Limitations

The method name and directory (LSB_ECHKPNT_METHOD_DIR) combination must be unique in the cluster.

For example, you may have two echkpnt applications with the same name such as echkpnt.mymethod but what differentiates them is the different directories defined with LSB_ECHKPNT_METHOD_DIR. It is the cluster administrator's responsibility to ensure that method name and method directory combinations are unique in the cluster.

Default

Undefined; LSF uses echkpnt.default and erestart.default to checkpoint and restart jobs

See also

LSB_ECHKPNT_METHOD_DIR, LSB_ECHKPNT_KEEP_OUTPUT

LSB_ECHKPNT_METHOD_DIR

Syntax

LSB_ECHKPNT_METHOD_DIR=path

Description

Absolute path name of the directory in which custom echkpnt and erestart programs are located.

The checkpoint method directory should be accessible by all users who need to run the custom echkpnt and erestart programs.

Can also be defined as an environment variable.

Default

Undefined; LSF searches in LSF_SERVERDIR for custom echkpnt and erestart programs

See also

LSB_ECHKPNT_METHOD, LSB_ECHKPNT_KEEP_OUTPUT

LSB_ECHKPNT_KEEP_OUTPUT

Syntax

LSB_ECHKPNT_KEEP_OUTPUT=y|Y

Description

Saves the standard output and standard error of custom echkpnt and erestart methods to:

Can also be defined as an environment variable.

Default

Undefined; standard error and standard output messages from custom echkpnt and erestart programs are directed to /dev/null and discarded by LSF.

See also

LSB_ECHKPNT_METHOD, LSB_ECHKPNT_METHOD_DIR

LSB_ESUB_METHOD

To specify a mandatory esub method that applies to all job submissions, you can configure LSB_ESUB_METHOD in lsf.conf.

LSB_ESUB_METHOD specifies the name of the esub method used in addition to any methods specified in the bsub -a option.

Syntax

LSB_ESUB_METHOD="method_name [method_name] ..."

Description

Specifies the name of the mandatory esub method. Can also be defined as an environment variable.

When this parameter is defined, LSF uses the specified esub method, where method_name is one of:

The name you specify is used to invoke the appropriate esub program. The esub and esub.xxx programs must be located in LSF_SERVERDIR.

Example

LSB_ESUB_METHOD="dce fluent" defines DCE as the mandatory security system, and FLUENT as the mandatory application used on all jobs.

Limitations

LSF does not detect conflicting method specifications. For example, you can specify either openmp or pvm, but not both. If LSB_ESUB_METHOD="openmp" and bsub -a pvm is specified at job submission, the job may fail or be rejected.

If multiple esub methods are specified, and the return value is LSB_ABORT_VALUE, esub exits without running the remaining esub methods and returns LSB_ABORT_VALUE.

Default

Undefined

LSB_INTERACT_MSG_ENH

Syntax

LSB_INTERACT_MSG_ENH=y | Y

Description

If set, enables enhanced messaging for interactive batch jobs. To disable interactive batch job messages, set LSB_INTERACT_MSG_ENH to any value other than y or Y; for example, LSB_INTERACT_MSG_ENH=N.

Default

Undefined

See also

LSB_INTERACT_MSG_INTVAL

LSB_INTERACT_MSG_INTVAL

Syntax

LSB_INTERACT_MSG_INTVAL=time_seconds

Description

Specifies the update interval in seconds for interactive batch job messages. LSB_INTERACT_MSG_INTVAL is ignored if LSB_INTERACT_MSG_ENH is not set.

Job information that LSF uses to get the pending or suspension reason is updated according to the value of PEND_UPDATE_INTERVAL in lsb.params.

Default

Undefined. If LSB_INTERACT_MSG_INTVAL is set to an incorrect value, the default update interval is 60 seconds.

See also

LSB_INTERACT_MSG_ENH in lsf.conf

LSB_JOB_CPULIMIT

Syntax

LSB_JOB_CPULIMIT=y | n

Description

Determines whether the CPU limit is a per-process limit enforced by the OS or whether it is a per-job limit enforced by LSF:

This parameter applies to CPU limits set when a job is submitted with bsub -c, and to CPU limits set for queues by CPULIMIT in lsb.queues.

The setting of LSB_JOB_CPULIMIT has the following effect on how the limit is enforced:

When LSB_JOB_CPULIMIT is   LSF-enforced per-job limit   OS-enforced per-process limit
y                          Enabled                      Disabled
n                          Disabled                     Enabled
undefined                  Enabled                      Enabled
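
The three cases in the table above can be expressed as a small lookup. This Python sketch is an illustration only; the function name is hypothetical, and treating the value as case-insensitive is an assumption.

```python
def cpu_limit_enforcement(setting=None):
    """Return (lsf_per_job, os_per_process) for an LSB_JOB_CPULIMIT setting."""
    table = {
        "y": (True, False),   # only the LSF-enforced per-job limit
        "n": (False, True),   # only the OS-enforced per-process limit
        None: (True, True),   # undefined: both limits are enforced
    }
    # Assumption: Y and N are accepted as well as y and n.
    key = setting.lower() if setting else None
    return table[key]
```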

Default

Undefined

Notes

To make LSB_JOB_CPULIMIT take effect, use the command badmin hrestart all to restart all sbatchds in the cluster.

Changing the default Terminate job control action: You can define a different Terminate action in lsb.queues with the parameter JOB_CONTROLS if you do not want the job to be killed. For more details on job controls, see Administering Platform LSF.

Limitations

If a job is running and the parameter is changed, LSF is not able to reset the type of limit enforcement for running jobs.

See also

lsb.queues(5), bsub(1), JOB_TERMINATE_INTERVAL in lsb.params, LSB_MOD_ALL_JOBS in lsf.conf.

LSB_JOB_MEMLIMIT

Syntax

LSB_JOB_MEMLIMIT=y | n

Description

Determines whether the memory limit is a per-process limit enforced by the OS or whether it is a per-job limit enforced by LSF.

This parameter applies to memory limits set when a job is submitted with bsub -M mem_limit, and to memory limits set for queues with MEMLIMIT in lsb.queues.

The setting of LSB_JOB_MEMLIMIT has the following effect on how the limit is enforced:

When LSB_JOB_MEMLIMIT is   LSF-enforced per-job limit   OS-enforced per-process limit
y                          Enabled                      Disabled
n or undefined             Disabled                     Enabled

Default

Undefined; per-process memory limit enforced by the OS; per-job memory limit enforced by LSF disabled

Notes

To make LSB_JOB_MEMLIMIT take effect, use the command badmin hrestart all to restart all sbatchds in the cluster.

If LSB_JOB_MEMLIMIT is set, it overrides the setting of the parameter LSB_MEMLIMIT_ENFORCE. The parameter LSB_MEMLIMIT_ENFORCE is ignored.

The difference between LSB_JOB_MEMLIMIT set to y and LSB_MEMLIMIT_ENFORCE set to y is that with LSB_JOB_MEMLIMIT, only the per-job memory limit enforced by LSF is enabled. The per-process memory limit enforced by the OS is disabled. With LSB_MEMLIMIT_ENFORCE set to y, both the per-job memory limit enforced by LSF and the per-process memory limit enforced by the OS are enabled.
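
The interaction between the two parameters can be sketched in Python. This is one reading of the notes above, not LSF's implementation: the function name is hypothetical, values are treated as case-insensitive by assumption, and any defined LSB_JOB_MEMLIMIT value is taken to override LSB_MEMLIMIT_ENFORCE.

```python
def memory_limit_enforcement(job_memlimit=None, memlimit_enforce=None):
    """Return (lsf_per_job, os_per_process) for the two memory limit parameters."""
    if job_memlimit:
        # LSB_JOB_MEMLIMIT is set, so LSB_MEMLIMIT_ENFORCE is ignored.
        return (True, False) if job_memlimit.lower() == "y" else (False, True)
    if memlimit_enforce and memlimit_enforce.lower() == "y":
        # LSB_MEMLIMIT_ENFORCE=y enables both kinds of enforcement.
        return (True, True)
    # Neither parameter enables LSF enforcement: the OS enforces the limit.
    return (False, True)
```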

Changing the default Terminate job control action: You can define a different Terminate action in lsb.queues with the parameter JOB_CONTROLS if you do not want the job to be killed. For more details on job controls, see Administering Platform LSF.

Limitations

If a job is running and the parameter is changed, LSF is not able to reset the type of limit enforcement for running jobs.

See also

LSB_MEMLIMIT_ENFORCE, LSB_MOD_ALL_JOBS, lsb.queues(5), bsub(1), JOB_TERMINATE_INTERVAL in lsb.params

LSB_LOCALDIR

Syntax

LSB_LOCALDIR=path

Description

Enables duplicate logging.

Specify the path to a local directory that exists only on the first LSF master host (the first host configured in lsf.cluster.cluster_name). LSF puts the primary copies of the event and accounting log files in this directory. LSF puts the duplicates in LSB_SHAREDIR.

Example

LSB_LOCALDIR=/usr/share/lsbatch/loginfo

Default

Undefined

See also

LSB_SHAREDIR, EVENT_UPDATE_INTERVAL in lsb.params

LSF_LOCAL_RESOURCES

Syntax

LSF_LOCAL_RESOURCES="resource ..."

Description

Defines instances of local resources residing on the slave host.

When the slave host calls the master host to add itself, it also reports its local resources. The local resources to be added must be defined in lsf.shared.

If the same resource is already defined in lsf.shared as default or all, it cannot be added as a local resource. The shared resource overrides the local one.


LSF_LOCAL_RESOURCES is usually set in the slave.config file during installation. If LSF_LOCAL_RESOURCES are already defined in a local lsf.conf on the slave host, lsfinstall does not add resources you define in LSF_LOCAL_RESOURCES in slave.config. You should not have duplicate LSF_LOCAL_RESOURCES entries in lsf.conf. If local resources are defined more than once, only the last definition is valid.

IMPORTANT


Resources must already be mapped to hosts in the ResourceMap section of lsf.cluster.cluster_name. If the ResourceMap section does not exist, local resources are not added.

Example

LSF_LOCAL_RESOURCES="[resourcemap 1*verilog] [resource linux]"

Default

Undefined

LSB_MAILPROG

Syntax

LSB_MAILPROG=file_name

Description

Path and file name of the mail program that the Batch system uses to send email. When LSF needs to send system messages to users, it invokes the program defined by LSB_MAILPROG in lsf.conf. You can write your own custom mail program and set LSB_MAILPROG to the path where this program is stored.

The LSF administrator can set the parameter as part of cluster reconfiguration. Provide the name of any mail program. For your convenience, LSF provides the sendmail mail program, which supports the sendmail protocol on UNIX.

In a mixed cluster, you can specify different programs for Windows and UNIX. You can set this parameter during installation on Windows. For your convenience, LSF provides the lsmail.exe mail program, which supports SMTP and Microsoft Exchange Server protocols on Windows. If lsmail is specified, the parameter LSB_MAILSERVER must also be specified.

If you change your mail program, the LSF administrator must restart sbatchd on all hosts to retrieve the new value.

UNIX

By default, LSF uses /usr/lib/sendmail to send email to users. LSF calls LSB_MAILPROG with two arguments; one argument gives the full name of the sender, and the other argument gives the return address for mail.

LSB_MAILPROG must read the body of the mail message from the standard input. The end of the message is marked by end-of-file. Any program or shell script that accepts the arguments and input, and delivers the mail correctly, can be used.

LSB_MAILPROG must be executable by any user.

Windows

If LSB_MAILPROG is not defined, no email is sent.

Examples

LSB_MAILPROG=lsmail.exe
LSB_MAILPROG=/serverA/tools/lsf/bin/unixhost.exe 

Default

/usr/lib/sendmail (UNIX)

blank (Windows)

See also

LSB_MAILSERVER, LSB_MAILTO

LSB_MAILSERVER

Syntax

LSB_MAILSERVER=mail_protocol:mail_server

Description

Part of mail configuration on Windows.

This parameter only applies when lsmail is used as the mail program (LSB_MAILPROG=lsmail.exe). Otherwise, it is ignored.

Both mail_protocol and mail_server must be indicated.

Set this parameter to either SMTP or Microsoft Exchange protocol (SMTP or EXCHANGE) and specify the name of the host that is the mail server.

This parameter is set during installation of LSF on Windows or is set or modified by the LSF administrator.

If this parameter is modified, the LSF administrator must restart sbatchd on all hosts to retrieve the new value.

Examples

LSB_MAILSERVER=EXCHANGE:Host2@company.com
LSB_MAILSERVER=SMTP:MailHost

Default

Undefined

See also

LSB_MAILPROG

LSB_MAILSIZE_LIMIT

Syntax

LSB_MAILSIZE_LIMIT=email_size_in_KB

Description

Limits the size of the email containing job output information.

The system sends job information such as CPU, process and memory usage, job output, and errors in email to the submitting user account. Some batch jobs can create large amounts of output. To prevent large job output files from interfering with your mail system, use LSB_MAILSIZE_LIMIT to set the maximum size in KB of the email containing the job information. Specify a positive integer.

If the size of the job output email exceeds LSB_MAILSIZE_LIMIT, the output is saved to a file under JOB_SPOOL_DIR or to the default job output directory if JOB_SPOOL_DIR is undefined. The email informs users of where the job output is located.
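
The size check can be sketched in Python. This is an illustration only: the function name is hypothetical, and 1 KB is assumed to mean 1024 bytes, which is not stated here.

```python
def exceeds_mail_limit(email_bytes, limit_kb=None):
    """Return True if a job output email is larger than LSB_MAILSIZE_LIMIT."""
    if limit_kb is None:
        # LSB_MAILSIZE_LIMIT is not enabled, so no limit applies.
        return False
    # Assumption: the limit is counted in units of 1024 bytes.
    return email_bytes > limit_kb * 1024
```

When the function returns True, the output would be saved to a file and the email would only point to its location.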

If the -o option of bsub is used, the size of the job output is not checked against LSB_MAILSIZE_LIMIT.

If you use a custom mail program specified by the LSB_MAILPROG parameter that can use the LSB_MAILSIZE environment variable, it is not necessary to configure LSB_MAILSIZE_LIMIT.

Default

By default, LSB_MAILSIZE_LIMIT is not enabled. No limit is set on the size of batch job output email.

See also

LSB_MAILPROG, LSB_MAILTO

LSB_MAILTO

Syntax

LSB_MAILTO=mail_account

Description

LSF sends electronic mail to users when their jobs complete or have errors, and to the LSF administrator in the case of critical errors in the LSF system. The default is to send mail to the user who submitted the job, on the host on which the daemon is running; this assumes that your electronic mail system forwards messages to a central mailbox.

The LSB_MAILTO parameter changes the mailing address used by LSF. LSB_MAILTO is a format string that is used to build the mailing address.

Common formats are:

All other characters (including any other '!') are copied exactly.

If this parameter is modified, the LSF administrator must restart sbatchd on all hosts to retrieve the new value.

Default

!U

See also

LSB_MAILPROG, LSB_MAILSIZE_LIMIT

LSB_MAX_NQS_QUEUES

Syntax

LSB_MAX_NQS_QUEUES=nqs_queues

Description

The maximum number of NQS queues allowed in the LSF cluster. Required for LSF to work with NQS. You must restart mbatchd if you change the value of LSB_MAX_NQS_QUEUES.

The total number of NQS queues configured by NQS_QUEUES in lsb.queues cannot exceed the value of LSB_MAX_NQS_QUEUES. NQS queues in excess of the maximum queues are ignored.

If you do not define LSB_MAX_NQS_QUEUES or define an incorrect value, LSF-NQS interoperation is disabled.

Valid values

Any positive integer

Default

None

LSB_MBD_PORT

See LSF_LIM_PORT, LSF_RES_PORT, LSB_MBD_PORT, LSB_SBD_PORT.

LSB_MC_CHKPNT_RERUN

Syntax

LSB_MC_CHKPNT_RERUN=y | n

Description

For checkpointable MultiCluster jobs, if a restart attempt fails, the job will be rerun from the beginning (instead of from the last checkpoint) without administrator or user intervention.

The submission cluster does not need to forward the job again. The execution cluster reports the job's new pending status back to the submission cluster, and the job is dispatched to the same host to restart from the beginning.

Default

n

LSB_MC_INITFAIL_MAIL

Syntax

LSB_MC_INITFAIL_MAIL=y | n

Description

MultiCluster job forwarding model only. Specify y to make LSF email the job owner when a job is suspended after reaching the retry threshold.

Default

n

LSB_MC_INITFAIL_RETRY

Syntax

LSB_MC_INITFAIL_RETRY=integer

Description

MultiCluster job forwarding model only. Defines the retry threshold and causes LSF to suspend a job that repeatedly fails to start. For example, specify 2 retry attempts to make LSF attempt to start a job 3 times before suspending it.
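
The relationship between the retry threshold and the number of start attempts can be sketched in Python. This is a simplified illustration with hypothetical names; start_job stands for any callable that returns True when the job starts successfully.

```python
def run_with_retries(start_job, retry_threshold):
    """Try to start a job, suspending it once the retry threshold is reached."""
    attempts = 0
    # One initial attempt plus retry_threshold retries.
    for _ in range(retry_threshold + 1):
        attempts += 1
        if start_job():
            return ("started", attempts)
    return ("suspended", attempts)
```

With a threshold of 2, a job that never starts is attempted 3 times before it is suspended.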

Default

5

LSB_MEMLIMIT_ENFORCE

Syntax

LSB_MEMLIMIT_ENFORCE=y | n

Description

Specify y to enable LSF memory limit enforcement.

If enabled, LSF sends a signal to kill all processes that exceed queue-level memory limits set by MEMLIMIT in lsb.queues or job-level memory limits specified by bsub -M mem_limit.

Otherwise, LSF passes memory limit enforcement to the OS. UNIX operating systems that support RLIMIT_RSS for setrlimit() can apply the memory limit to each process.

The following operating systems do not support memory limit at the OS level:

Default

Undefined. LSF passes memory limit enforcement to the OS.

See also

lsb.queues(5)

LSB_MIG2PEND

Syntax

LSB_MIG2PEND=0 | 1

Description

Applies only to migrating jobs.

If 1, requeues migrating jobs instead of restarting or rerunning them on the next available host. Requeues the jobs in the PEND state, in order of the original submission time, unless LSB_REQUEUE_TO_BOTTOM is also defined.

If you do not want migrating jobs to be run or restarted immediately, set LSB_MIG2PEND=1 so that migrating jobs are considered as pending jobs and inserted in the pending jobs queue.

If you want migrating jobs to be considered as pending jobs but placed at the bottom of the queue without considering submission time, define both LSB_MIG2PEND and LSB_REQUEUE_TO_BOTTOM.

Also considers job priority when requeuing jobs.

Does not work with MultiCluster.

Default

Undefined

See also

LSB_REQUEUE_TO_BOTTOM

LSB_MOD_ALL_JOBS

Syntax

LSB_MOD_ALL_JOBS=y | Y

Description

If set, enables bmod to modify resource limits and location of job output files for running jobs.

After a job has been dispatched, the following modifications can be made:

To modify the CPU limit or the memory limit of running jobs, the parameters LSB_JOB_CPULIMIT=Y and LSB_JOB_MEMLIMIT=Y must be defined in lsf.conf.

Default

Undefined

See also

LSB_JOB_CPULIMIT, LSB_JOB_MEMLIMIT

LSB_NCPU_ENFORCE

Description

When set to 1, enables parallel fairshare (considers the number of CPUs when calculating dynamic priority).

Default

Undefined

LSB_NQS_PORT

Syntax

LSB_NQS_PORT=port_number

Description

Required for LSF to work with NQS.

TCP service port to use for communication with NQS.

Where defined

This parameter can alternatively be set as an environment variable or in the services database such as /etc/services.

Example

LSB_NQS_PORT=607

Default

Undefined

LSB_QUERY_PORT

Syntax

LSB_QUERY_PORT=port_number

Description

Optional. Applies only to UNIX platforms that support thread programming.

This parameter is recommended for busy clusters with many jobs and frequent query requests to increase mbatchd performance when you use the bjobs command.

This may indirectly increase overall mbatchd performance.

The port_number is the TCP/IP port number that mbatchd uses exclusively to service query requests from the LSF system. mbatchd checks the query port during initialization.

If LSB_QUERY_PORT is not defined:

If LSB_QUERY_PORT is defined:

mbatchd prepares this port for connection. The default behavior of mbatchd changes, a child mbatchd is forked, and the child mbatchd creates threads to process requests.

mbatchd responds to requests by forking one child mbatchd. As soon as mbatchd has forked a child mbatchd, the child mbatchd takes over and listens on the port to process more query requests. For each request, the child mbatchd creates a thread to process it.

The child mbatchd continues to listen to the port number specified by LSB_QUERY_PORT and creates threads to service requests until the job changes status, a new job is submitted, or the time specified in MBD_REFRESH_TIME in lsb.params has passed (see MBD_REFRESH_TIME for more details). At this time, the parent mbatchd sends a message to the child mbatchd to exit.

The interval used by mbatchd for forking new child mbatchds is specified by the parameter MBD_REFRESH_TIME in lsb.params.

Operating system support


See the Online Support area of the Platform Computing Web site at www.platform.com for the latest information about operating systems that support multithreaded mbatchd.

Default

Undefined

See also

MBD_REFRESH_TIME in lsb.params

LSB_REQUEUE_TO_BOTTOM

Syntax

LSB_REQUEUE_TO_BOTTOM=0 | 1

Description

Optional. If 1, requeues automatically requeued jobs to the bottom of the queue instead of to the top. Also requeues migrating jobs to the bottom of the queue if LSB_MIG2PEND is defined.

Does not work with MultiCluster.
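As an illustration, the following lsf.conf fragment combines the two parameters so that migrating jobs are treated as pending and requeued to the bottom of the queue. The value 1 for LSB_MIG2PEND is an assumption; the description above only requires that it be defined.

```sh
# lsf.conf fragment (sketch)
# Assumed value: the text only says the parameter must be defined
LSB_MIG2PEND=1
# Requeue to the bottom of the queue instead of the top
LSB_REQUEUE_TO_BOTTOM=1
```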

Default

Undefined

See also

REQUEUE_EXIT_VALUES in lsb.queues, LSB_MIG2PEND

LSF_RSH

Syntax

LSF_RSH=command [command_options]

Description

Specifies shell commands to use when the following LSF commands require remote execution:

By default, rsh is used for these commands. Use LSF_RSH to enable support for ssh.

Default

Undefined

Example

To use an ssh command before trying rsh for LSF commands, specify:

LSF_RSH=ssh -o "PasswordAuthentication no" -o "StrictHostKeyChecking no"

ssh options such as PasswordAuthentication and StrictHostKeyChecking can also be configured in the global SSH_ETC/ssh_config file or $HOME/.ssh/config.

See also

ssh(1) ssh_config(5)

LSB_SBD_PORT

See LSF_LIM_PORT, LSF_RES_PORT, LSB_MBD_PORT, LSB_SBD_PORT.

LSB_SET_TMPDIR

Syntax

LSB_SET_TMPDIR=[y|n]

Description

If y, LSF sets the TMPDIR environment variable, overwriting the current value with /tmp/job_ID.

Default

n

LSB_SHAREDIR

Syntax

LSB_SHAREDIR=dir

Description

Directory in which the job history and accounting logs are kept for each cluster. These files are necessary for correct operation of the system. Like the organization under LSB_CONFDIR, there is one subdirectory for each cluster.

The LSB_SHAREDIR directory must be owned by the LSF administrator. It must be accessible from all hosts that can potentially become the master host, and must allow read and write access from the master host.

The LSB_SHAREDIR directory typically resides on a reliable file server.

Default

LSF_INDEP/work

See also

LSB_LOCALDIR

LSB_SUB_COMMANDNAME

Syntax

LSB_SUB_COMMANDNAME=y

Description

If set, enables esub to use the variable LSB_SUB_COMMAND_LINE in the esub job parameter file specified by the $LSB_SUB_PARM_FILE environment variable.

The LSB_SUB_COMMAND_LINE variable carries the value of the bsub command argument, and is used when esub runs.

Example

esub contains:

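#!/bin/sh
# Read the job submission parameters written by bsub.
. "$LSB_SUB_PARM_FILE"
# Reject the job if the submitted command is netscape.
if [ "$LSB_SUB_COMMAND_LINE" = "netscape" ]; then
    echo "netscape is not allowed to run in batch mode"
    exit "$LSB_SUB_ABORT_VALUE"
fi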

LSB_SUB_COMMAND_LINE is defined in $LSB_SUB_PARM_FILE as:

LSB_SUB_COMMAND_LINE=netscape

A job submitted with:

bsub netscape ...

Causes esub to echo the message:

netscape is not allowed to run in batch mode

Default

Undefined

See also

LSB_SUB_COMMAND_LINE, LSB_SUB_PARM_FILE

LSB_SHORT_HOSTLIST

Syntax

LSB_SHORT_HOSTLIST=1

Description

Displays an abbreviated list of hosts in bjobs and bhist for a parallel job where multiple processes of a job are running on a host. Multiple processes are displayed in the following format:

processes*hostA

For example, if a parallel job is running 5 processes on hostA, the information is displayed in the following manner:

5*hostA

Setting this parameter may improve mbatchd restart performance and accelerate event replay.
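The abbreviated format can be reproduced outside LSF with a short shell function. This is a sketch for illustration, not the code LSF uses: the function name short_hostlist is hypothetical, the input is assumed to be a space-separated list with one entry per process, and a host running a single process is assumed to be printed as the bare host name.

```sh
# Collapse a space-separated host list (one entry per process) into the
# processes*host form, keeping hosts in the order they first appear.
# Assumption: a host with a single process is printed as the bare name.
short_hostlist() {
    printf '%s\n' "$1" | awk '
    {
        for (i = 1; i <= NF; i++) {
            if (!($i in count)) order[++n] = $i   # remember first appearance
            count[$i]++
        }
    }
    END {
        for (j = 1; j <= n; j++) {
            h = order[j]
            out = (count[h] > 1) ? (count[h] "*" h) : h
            printf "%s%s", (j > 1 ? " " : ""), out
        }
        printf "\n"
    }'
}

short_hostlist "hostA hostA hostA hostA hostA"   # prints 5*hostA
```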

Default

Undefined

LSB_SIGSTOP

Syntax

LSB_SIGSTOP=signal_name | signal_value

Description

Specifies the signal sent by the SUSPEND action in LSF. You can specify a signal name or a number.

If LSB_SIGSTOP is set to anything other than SIGSTOP, the SIGTSTP signal that is normally sent by the SUSPEND action is not sent.

If this parameter is undefined, by default the SUSPEND action in LSF sends the following signals to a job:

The same set of signals is not supported on all UNIX systems. To display a list of the symbolic names of the signals (without the SIG prefix) supported on your system, use the kill -l command.
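Before setting LSB_SIGSTOP to a signal name, it can help to confirm that the name appears in the kill -l output on the execution hosts. The function below is a sketch; the name signal_supported is hypothetical, and it handles signal names only, not numeric values.

```sh
# Succeeds if the given signal name (with or without the SIG prefix)
# appears in the output of kill -l on this system.
signal_supported() {
    name=${1#SIG}
    # Shells print the list in different layouts (one name per line, several
    # per line, or numbered "1) SIGHUP" pairs), so split on spaces, tabs, ")".
    kill -l | tr -s ' \t)' '\n\n\n' | grep -q -x -e "$name" -e "SIG$name"
}

if signal_supported SIGTSTP; then
    echo "SIGTSTP is available on this host"
fi
```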

Example

LSB_SIGSTOP=SIGKILL

In this example, the SUSPEND action sends the three default signals sent by the TERMINATE action (SIGINT, SIGTERM, and SIGKILL) 10 seconds apart.

Default

Undefined. Default SUSPEND action in LSF is sent.

LSB_STDOUT_DIRECT

Syntax

LSB_STDOUT_DIRECT=y | Y

Description

When set, and used with the -o or -e options of bsub, redirects standard output or standard error from the job directly to a file as the job runs.

If LSB_STDOUT_DIRECT is not set and you use the bsub -o option, the standard output of a job is written to a temporary file and copied to the file you specify after the job finishes.

LSB_STDOUT_DIRECT is not supported on Windows.

Default

Undefined

LSB_TIME_CMD

Syntax

LSB_TIME_CMD=timing_level

Description

The timing level for checking how long batch commands run.

Time usage is logged in milliseconds; specify a positive integer.

Example: LSB_TIME_CMD=1

Default

Undefined

See also

LSB_TIME_MBD, LSB_TIME_SBD, LSF_TIME_LIM, LSF_TIME_RES

LSB_TIME_MBD

Syntax

LSB_TIME_MBD=timing_level

Description

The timing level for checking how long mbatchd routines run.

Time usage is logged in milliseconds; specify a positive integer.

Example: LSB_TIME_MBD=1

Default

Undefined

See also

LSB_TIME_CMD, LSB_TIME_SBD, LSF_TIME_LIM, LSF_TIME_RES

LSB_TIME_SBD

Syntax

LSB_TIME_SBD=timing_level

Description

The timing level for checking how long sbatchd routines run.

Time usage is logged in milliseconds; specify a positive integer.

Example: LSB_TIME_SBD=1

Default

Undefined

See also

LSB_TIME_CMD, LSB_TIME_MBD, LSF_TIME_LIM, LSF_TIME_RES

LSB_TIME_SCH

Syntax

LSB_TIME_SCH=timing_level

Description

The timing level for checking how long mbschd routines run.

Time usage is logged in milliseconds; specify a positive integer.

Example: LSB_TIME_SCH=1

Default

Undefined

LSB_UTMP

Syntax

LSB_UTMP=y | Y

Description

If set, enables registration of user and account information for interactive batch jobs submitted with bsub -Ip or bsub -Is. To disable utmp file registration, set LSB_UTMP to any value other than y or Y; for example, LSB_UTMP=N.

LSF registers an interactive batch job by adding entries to the utmp file on the execution host when the job starts. After the job finishes, LSF removes the entries for the job from the utmp file.

Limitations

Registration of utmp file entries is supported only on SGI IRIX (6.4 and later).

utmp file registration is not supported in a MultiCluster environment.

Because interactive batch jobs submitted with bsub -I are not associated with a pseudo-terminal, utmp file registration is not supported for these jobs.

Default

Undefined

LSF_AFS_CELLNAME

Syntax

LSF_AFS_CELLNAME=AFS_cell_name

Description

Must be set to the AFS cell name if the AFS file system is in use.

Example:

LSF_AFS_CELLNAME=cern.ch

Default

Undefined

LSF_AM_OPTIONS

Syntax

LSF_AM_OPTIONS=AMFIRST | AMNEVER

Description

Determines the order of file path resolution when setting the user's home directory.

This parameter is rarely needed. Occasionally, however, LSF does not properly change directory to the user's home directory when that directory is automounted. Setting LSF_AM_OPTIONS forces the Batch system to change directory to $HOME before attempting to automount the user's home.

When this parameter is undefined or set to AMFIRST, LSF:

When this parameter is set to AMNEVER, LSF:

Valid Values

The two values are AMFIRST and AMNEVER

Default

Undefined; same as AMFIRST

LSF_API_CONNTIMEOUT

Syntax

LSF_API_CONNTIMEOUT=time_seconds

Description

Timeout in seconds when connecting to LIM.

Default

5

See also

LSF_API_RECVTIMEOUT

LSF_API_RECVTIMEOUT

Syntax

LSF_API_RECVTIMEOUT=time_seconds

Description

Timeout in seconds when receiving a reply from LIM.

Default

20

See also

LSF_API_CONNTIMEOUT

LSF_AUTH

Syntax

LSF_AUTH=eauth | ident

Description

Optional. Determines the type of authentication used by LSF.

External user authentication is configured automatically during installation (LSF_AUTH=eauth). If LSF_AUTH is not defined, privileged ports (setuid) authentication is used. This is the mechanism most UNIX remote utilities use.

External authentication is the only way to provide security for clusters that contain Windows hosts.

If this parameter is changed, all LSF daemons must be shut down and restarted by running lsf_daemons start on each LSF server host so that all daemons use the new authentication method.

When LSF uses privileged ports for user authentication, LSF commands must be installed as setuid programs owned by root to operate correctly. If the commands are installed in an NFS-mounted shared file system, the file system must be mounted with setuid execution allowed (that is, without the nosuid option). See the man page for the mount command for more details.

Windows does not have the concept of setuid binaries and does not restrict access to privileged ports, so the undefined method does not provide any security on Windows.

Valid values

Default

eauth

LSF_AUTH_DAEMONS

Syntax

LSF_AUTH_DAEMONS=any_value

Description

Enables daemon authentication, as long as LSF_AUTH in lsf.conf is set to eauth. Daemons will call eauth to authenticate each other.
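A minimal lsf.conf fragment that enables daemon authentication might look like the following. The value 1 for LSF_AUTH_DAEMONS is an arbitrary choice, since any value enables the feature.

```sh
# lsf.conf fragment (sketch)
# Daemon authentication requires external authentication
LSF_AUTH=eauth
# Any value enables daemon authentication; 1 is arbitrary
LSF_AUTH_DAEMONS=1
```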

Default

Undefined

LSF_BINDIR

Syntax

LSF_BINDIR=dir

Description

Directory in which all LSF user commands are installed.

Default

LSF_MACHDEP/bin

LSF_CMD_LOGDIR

Syntax

LSF_CMD_LOGDIR=path

Description

The path to the log files used for debugging LSF commands.

This parameter can also be set from the command line.

Default

/tmp

See also

LSB_CMD_LOG_MASK, LSB_CMD_LOGDIR, LSB_DEBUG, LSB_DEBUG_CMD, LSB_TIME_CMD, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR, LSF_TIME_CMD

LSF_CMD_LOG_MASK

Syntax

LSF_CMD_LOG_MASK=log_level

Description

Specifies the logging level of error messages from LSF commands.

For example:

LSF_CMD_LOG_MASK=LOG_DEBUG

To specify the logging level of error messages for LSF Batch commands, use LSB_CMD_LOG_MASK. To specify the logging level of error messages for LSF daemons, use LSF_LOG_MASK.

LSF commands log error messages in different levels so that you can choose to log all messages, or only log messages that are deemed critical. The level specified by LSF_CMD_LOG_MASK determines which messages are recorded and which are discarded. All messages logged at the specified level or higher are recorded, while lower level messages are discarded.

For debugging purposes, the level LOG_DEBUG records the fewest debugging messages and is used for basic debugging. The level LOG_DEBUG3 records all debugging messages, and can cause log files to grow very large; it is not often used. Most debugging is done at the level LOG_DEBUG2.

The commands log to the syslog facility unless LSF_CMD_LOGDIR is set.

Valid values

The log levels from highest to lowest are:

Default

LOG_WARNING

See also

LSB_CMD_LOG_MASK, LSB_CMD_LOGDIR, LSB_DEBUG, LSB_DEBUG_CMD, LSB_TIME_CMD, LSF_CMD_LOGDIR, LSF_LOG_MASK, LSF_LOGDIR, LSF_TIME_CMD

LSF_CONF_RETRY_INT

Syntax

LSF_CONF_RETRY_INT=time_seconds

Description

The number of seconds to wait between unsuccessful attempts at opening a configuration file (only valid for LIM). This allows LIM to tolerate temporary access failures.

Default

30

See also

LSF_CONF_RETRY_MAX

LSF_CONF_RETRY_MAX

Syntax

LSF_CONF_RETRY_MAX=integer

Description

The maximum number of unsuccessful attempts at opening a configuration file (only valid for LIM). This allows LIM to tolerate temporary access failures.
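The interaction of LSF_CONF_RETRY_INT and LSF_CONF_RETRY_MAX can be pictured as a simple retry loop. The sketch below is not LIM's code. The function name open_with_retry is hypothetical, and it assumes one initial attempt followed by at most max_retries further attempts, which is one possible reading of the description above.

```sh
# Try to read a configuration file, retrying after failures.
# $1: file, $2: maximum number of retries, $3: seconds between attempts.
# Assumption: one initial attempt plus at most $2 retries.
open_with_retry() {
    file=$1 max_retries=$2 interval=$3
    failures=0
    until [ -r "$file" ]; do
        failures=$((failures + 1))
        if [ "$failures" -gt "$max_retries" ]; then
            return 1          # give up: retry budget exhausted
        fi
        sleep "$interval"     # wait before the next attempt
    done
    return 0
}
```

With the defaults shown in this chapter (an interval of 30 and a maximum of 0), this sketch makes a single attempt and does not wait.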

Default

0

See also

LSF_CONF_RETRY_INT

LSF_CONFDIR

Syntax

LSF_CONFDIR=dir

Description

Directory in which all LSF configuration files are installed. These files are shared throughout the system and should be readable from any host. This directory can contain configuration files for more than one cluster.

The files in the LSF_CONFDIR directory must be owned by the primary LSF administrator, and readable by all LSF server hosts.

Default

LSF_INDEP/conf

See also

LSB_CONFDIR

LSF_DAEMON_WRAP

Syntax

LSF_DAEMON_WRAP=y | Y

Description

Applies only to DCE/DFS and AFS environments; if you are installing LSF in a DCE or AFS environment, set this parameter to y or Y.

When this parameter is set to y or Y, mbatchd, sbatchd, and RES run the executable daemons.wrap in LSF_SERVERDIR.

Default

Undefined

LSF_DEBUG_LIM

Syntax

LSF_DEBUG_LIM=log_class

Description

Sets the log class for debugging LIM.

Specifies the log class filtering that will be applied to LIM. Only messages belonging to the specified log class are recorded.

LSF_DEBUG_LIM sets the log class and is used in combination with LSF_LOG_MASK, which sets the log level. For example:

LSF_LOG_MASK=LOG_DEBUG
LSF_DEBUG_LIM=LC_TRACE 

You need to restart the daemons after setting LSF_DEBUG_LIM for your changes to take effect.

If you use the command lsadmin limdebug to temporarily change this parameter without changing lsf.conf, you will not need to restart the daemons.

The daemons log to the syslog facility unless LSF_LOGDIR is defined.

To specify multiple log classes, use a space-separated list enclosed in quotation marks. For example:

LSF_DEBUG_LIM="LC_TRACE LC_EXEC"

This parameter can also be defined from the command line.

Valid values

Valid log classes are:

Default

Undefined

See also

LSF_DEBUG_RES, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR

LSF_DEBUG_RES

Syntax

LSF_DEBUG_RES=log_class

Description

Sets the log class for debugging RES.

Specifies the log class filtering that will be applied to RES. Only messages belonging to the specified log class are recorded.

LSF_DEBUG_RES sets the log class and is used in combination with LSF_LOG_MASK, which sets the log level. For example:

LSF_LOG_MASK=LOG_DEBUG 
LSF_DEBUG_RES=LC_TRACE 

To specify multiple log classes, use a space-separated list enclosed in quotation marks. For example:

LSF_DEBUG_RES="LC_TRACE LC_EXEC"

You need to restart the daemons after setting LSF_DEBUG_RES for your changes to take effect.

If you use the command lsadmin resdebug to temporarily change this parameter without changing lsf.conf, you will not need to restart the daemons.

The daemons log to the syslog facility unless LSF_LOGDIR is defined.

This parameter can also be defined from the command line.

Valid Values

For a list of valid log classes see LSF_DEBUG_LIM

Default

Undefined

See also

LSF_DEBUG_LIM, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR

LSF_DEFAULT_EXTSCHED

Syntax

LSF_DEFAULT_EXTSCHED="external_scheduler_options"

Description

Default application-specific external scheduling options for the job. If set, and the job is submitted without -extsched options, the options specified in LSF_DEFAULT_EXTSCHED are used.

To enable jobs to accept external scheduler options, set LSF_ENABLE_EXTSCHEDULER=y in lsf.conf.

You can specify only one type of external scheduler option in a single external_scheduler_options string.

For example, SGI IRIX hosts and AlphaServer SC hosts running RMS can exist in the same cluster, but they accept different external scheduler options. Use external scheduler options to define job requirements for either IRIX cpusets OR RMS, but not both. Your job will run either on IRIX or RMS. If external scheduler options are not defined, the job may run on IRIX but it will not run on an RMS host.

The options set by bsub -extsched override options set by LSF_DEFAULT_EXTSCHED.

Use DEFAULT_EXTSCHED in lsb.queues to set default external scheduler options for a queue.

To make certain external scheduler options mandatory for all jobs submitted to a queue, specify MANDATORY_EXTSCHED in lsb.queues with the external scheduler options you need for your jobs.

Default

Undefined

LSF_DHCP_ENV

Syntax

LSF_DHCP_ENV=y

Description

If defined, enables dynamic IP addressing for all LSF client hosts in the cluster.

Dynamic IP addressing is not supported across clusters in a MultiCluster environment.

Default

Undefined

LSF_ENABLE_CSA

Syntax

LSF_ENABLE_CSA=y | Y

Description

If set, enables LSF to write records for LSF jobs to the IRIX 6.5.9 Comprehensive System Accounting facility (CSA).

The IRIX 6.5.9 Comprehensive System Accounting facility (CSA) writes an accounting record for each process in the pacct file, which is usually located in the /var/adm/acct/day directory. IRIX system administrators then use the csabuild command to organize and present the records on a job by job basis.

When LSF_ENABLE_CSA is set, for each job run on the IRIX system, LSF writes an LSF-specific accounting record to CSA when the job starts, and when the job finishes. LSF daemon accounting in CSA starts and stops with the LSF daemon.

To disable IRIX CSA accounting, remove LSF_ENABLE_CSA from lsf.conf.

See the IRIX 6.5.9 resource administration documentation for information about CSA.

Setting up IRIX CSA

  1. Define the LSF_ENABLE_CSA parameter in lsf.conf:
    ...
    LSF_ENABLE_CSA=Y
    ...
    
  2. Set the following parameters in /etc/csa.conf to on:
    • CSA_START
    • WKMG_START
  3. Run the csaswitch command to turn on the configuration changes in /etc/csa.conf.

See the IRIX 6.5.9 resource administration documentation for information about the csaswitch command.

Information written to the pacct file

LSF writes the following records to the pacct file when a job starts and when it exits:

Default

Undefined

LSF_ENABLE_EXTSCHEDULER

Syntax

LSF_ENABLE_EXTSCHEDULER=y | Y

Description

If set, enables mbatchd external scheduling.

Default

Undefined

LSF_ENVDIR

Syntax

LSF_ENVDIR=dir

Description

Directory containing the lsf.conf file.

By default, lsf.conf is installed by creating a shared copy in LSF_CONFDIR and adding a symbolic link from /etc/lsf.conf to the shared copy. If LSF_ENVDIR is set, the symbolic link is installed in LSF_ENVDIR/lsf.conf.

The lsf.conf file is a global environment configuration file for all LSF services and applications. The LSF default installation places the file in LSF_CONFDIR.
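A script that needs to find lsf.conf can apply the same rule: use the directory named by the LSF_ENVDIR environment variable if it is set, and /etc otherwise. The following is a minimal sketch; the function name lsf_conf_path is hypothetical.

```sh
# Print the path of the lsf.conf file that would be read.
# LSF_ENVDIR, when set, overrides the default directory /etc.
lsf_conf_path() {
    printf '%s/lsf.conf\n' "${LSF_ENVDIR:-/etc}"
}

lsf_conf_path
```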

Default

/etc

LSF_EVENT_PROGRAM

Syntax

LSF_EVENT_PROGRAM=event_program_name

Description

Specifies the name of the LSF event program to use.

If a full path name is not provided, the default location of this program is LSF_SERVERDIR.

If a program that does not exist is specified, event generation will not work.

If this parameter is undefined, the default name is genevent on UNIX and genevent.exe on Windows.
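The name resolution described above can be sketched as a small shell function for the UNIX case. The function name event_program_path and the sample paths are hypothetical; the logic simply follows the rules stated in this section.

```sh
# Resolve the event program on UNIX.
# $1: value of LSF_EVENT_PROGRAM (may be empty), $2: value of LSF_SERVERDIR.
event_program_path() {
    prog=${1:-genevent}                       # default name when undefined
    case $prog in
        /*) printf '%s\n' "$prog" ;;          # full path: use as given
        *)  printf '%s/%s\n' "$2" "$prog" ;;  # otherwise look in LSF_SERVERDIR
    esac
}

event_program_path "" /usr/share/lsf/etc      # prints /usr/share/lsf/etc/genevent
```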

Default

Undefined

LSF_EVENT_RECEIVER

Syntax

LSF_EVENT_RECEIVER=event_receiver_program_name

Description

Specifies the LSF event receiver and enables event generation.

Any string may be used as the LSF event receiver; this information is not used by LSF to enable the feature but is only passed as an argument to the event program.

If LSF_EVENT_PROGRAM specifies a program that does not exist, event generation will not work.

If this parameter is undefined, event generation is disabled.

Default

Undefined

LSF_HPC_EXTENSIONS

Syntax

LSF_HPC_EXTENSIONS="extension_name ..."

Description

Enables Platform LSF/HPC extensions for compressed host name list in lsb.events and lsb.acct records, and shortened PID list in bjobs output.

Valid values

The following extension names are supported:

Default

Undefined

LSF_ID_PORT

Syntax

LSF_ID_PORT=port_number

Description

The network port number used to communicate with the authentication daemon when LSF_AUTH is set to ident.

LSF_INCLUDEDIR

Syntax

LSF_INCLUDEDIR=dir

Description

Directory under which the LSF API header files lsf.h and lsbatch.h are installed.

Default

LSF_INDEP/include

See also

LSF_INDEP

LSF_INDEP

Syntax

LSF_INDEP=dir

Description

Specifies the default top-level directory for all machine-independent LSF files.

This includes man pages, configuration files, working directories, and examples. For example, defining LSF_INDEP as /usr/share/lsf/mnt places man pages in /usr/share/lsf/mnt/man, configuration files in /usr/share/lsf/mnt/conf, and so on.
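To make the relationship concrete, the sketch below prints the default locations that this chapter gives for the parameters derived from LSF_INDEP. The function name lsf_indep_defaults is hypothetical, and only parameters whose defaults are stated in this chapter are included.

```sh
# Print the default directories derived from LSF_INDEP.
# $1: value of LSF_INDEP (defaults to /usr/share/lsf/mnt).
lsf_indep_defaults() {
    indep=${1:-/usr/share/lsf/mnt}
    printf 'LSF_CONFDIR=%s/conf\n'       "$indep"
    printf 'LSF_MANDIR=%s/man\n'         "$indep"
    printf 'LSF_INCLUDEDIR=%s/include\n' "$indep"
    printf 'LSB_SHAREDIR=%s/work\n'      "$indep"
}

lsf_indep_defaults
```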

The files in LSF_INDEP can be shared by all machines in the cluster.

As shown in the following list, LSF_INDEP is incorporated into other LSF environment variables.

Default

/usr/share/lsf/mnt

See also

LSF_MACHDEP, LSB_SHAREDIR, LSF_CONFDIR, LSF_INCLUDEDIR, LSF_MANDIR, XLSF_APPDIR

LSF_INTERACTIVE_STDERR

Syntax

LSF_INTERACTIVE_STDERR=y | n

Description

Separates stderr from stdout for interactive tasks and interactive batch jobs.

This is useful to redirect output to a file with regular operators instead of the bsub -e err_file and -o out_file options.
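The effect of keeping the two streams separate can be shown with ordinary shell redirection. In the sketch below, fake_job is a hypothetical stand-in for an interactive job (in practice the command would be an interactive submission such as bsub -I); it is used here only so that the example runs without LSF.

```sh
# Stand-in for an interactive job: writes one line to each stream.
fake_job() {
    echo "result line"
    echo "warning line" >&2
}

out_file=$(mktemp)
err_file=$(mktemp)

# With stderr kept separate from stdout, the regular redirection operators
# send each stream to its own file.
fake_job >"$out_file" 2>"$err_file"
```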

This parameter can also be enabled or disabled as an environment variable.

WARNING


If you enable this parameter globally in lsf.conf, check any custom scripts that manipulate stderr and stdout.

When this parameter is undefined or set to n, the following are written to stdout on the submission host for interactive tasks and interactive batch jobs:

The following are written to stderr on the submission host for interactive tasks and interactive batch jobs:

When this parameter is set to y, the following are written to stdout on the submission host for interactive tasks and interactive batch jobs:

Default

Undefined

Notes

When this parameter is set, the change affects interactive tasks and interactive batch jobs run with the following commands:

Limitations

See also

LSF_NIOS_DEBUG, LSF_CMD_LOGDIR

LSF_IRIX_BESTCPUS

Syntax

LSF_IRIX_BESTCPUS=y | Y

Description

If set, enables the best-fit algorithm for IRIX cpusets.

Default

Undefined

LSF_LIBDIR

Syntax

LSF_LIBDIR=dir

Description

Specifies the directory in which the LSF libraries are installed. Library files are shared by all hosts of the same type.

Default

LSF_MACHDEP/lib

LSF_LICENSE_FILE

Syntax

LSF_LICENSE_FILE=file_name... | port_number@host_name

Description

Specifies one or more demo or FLEXlm-based permanent license files used by LSF.

The value for LSF_LICENSE_FILE can be either of the following:

Multiple license files should be quoted and must be separated by a pipe character (|).

Windows example:

LSF_LICENSE_FILE="C:\licenses\license1|C:\licenses\license2|D:\mydir\license3"

Multiple files may be kept in the same directory, but each one must reference a different license server. When checking out a license, LSF searches the servers in the order in which they are listed, so it checks the second server when there are no more licenses available from the first server.
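The search order can be inspected by splitting an LSF_LICENSE_FILE value at the pipe characters. The function below is a sketch for illustration; its name list_license_sources, the sample paths, and the sample port and host are hypothetical.

```sh
# List the entries of an LSF_LICENSE_FILE value in search order, marking
# each one as a license file or a port_number@host_name reference.
list_license_sources() {
    printf '%s\n' "$1" | tr '|' '\n' | while IFS= read -r entry; do
        case $entry in
            "")   ;;                                  # skip empty fields
            *@*)  printf 'server %s\n' "$entry" ;;
            *)    printf 'file   %s\n' "$entry" ;;
        esac
    done
}

list_license_sources "/usr/share/lsf/conf/license1|/usr/share/lsf/conf/license2"
```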

If this parameter is not defined, LSF assumes the default location.

Default

If you installed LSF with a default installation, the license file is installed in the LSF configuration directory (LSF_CONFDIR/license.dat).

If you installed LSF with a custom installation, you specify the license installation directory. The default is the LSF configuration directory (LSF_SERVERDIR for the custom installation).

If you installed FLEXlm separately from LSF to manage other software licenses, the default FLEXlm installation puts the license file in the following location:

LSF_LIM_DEBUG

Syntax

LSF_LIM_DEBUG=1 | 2

Description

Sets LSF to debug mode.

If LSF_LIM_DEBUG is defined, LIM operates in single user mode. No security checking is performed, so LIM should not run as root.

LIM will not look in the services database for the LIM service port number. Instead, it uses port number 36000 unless LSF_LIM_PORT has been defined.

Specify 1 for this parameter unless you are testing LSF.

Valid Values

Default

Undefined

See also

LSF_RES_DEBUG, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR

LSF_LIM_PLUGINDIR

Syntax

LSF_LIM_PLUGINDIR=path

Description

The path to liblimvcl.so. Used only with SUN HPC.

Default

Path to LSF_LIBDIR

See also

LSF_RES_PLUGINDIR

LSF_LIM_PORT, LSF_RES_PORT, LSB_MBD_PORT, LSB_SBD_PORT

Syntax

Example: LSF_LIM_PORT=port_number

Description

TCP service ports to use for communication with the LSF daemons.

If port parameters are undefined, LSF obtains the port numbers by looking up the LSF service names in the /etc/services file or the NIS (UNIX). If it is not possible to modify the services database, you can define these port parameters to set the port numbers.

With careful use of these settings along with the LSF_ENVDIR and PATH environment variables, it is possible to run two versions of the LSF software on a host. Select between the versions by setting the PATH environment variable to include the correct version of the commands and the LSF_ENVDIR environment variable to point to the directory containing the appropriate lsf.conf file.
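Switching between two installations in a shell session could be wrapped in a small function. This is a sketch under stated assumptions: the function name use_lsf and the directory layout are hypothetical, and each installation is assumed to have its own lsf.conf with distinct port settings.

```sh
# Select an LSF installation for the current shell.
# $1: directory containing that version's lsf.conf
# $2: directory containing that version's commands
use_lsf() {
    LSF_ENVDIR=$1
    PATH=$2:$PATH            # put this version's commands first
    export LSF_ENVDIR PATH
}

# Hypothetical layout for one of the two versions
use_lsf /usr/share/lsf_a/conf /usr/share/lsf_a/bin
```

Each call prepends to PATH, so switching back and forth repeatedly in one session lengthens PATH; starting a fresh shell per version avoids this.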

Default

On UNIX, the default is to get port numbers from the services database.

On Windows, these parameters are mandatory.

Default port number values are:

LSF_LIM_SOL27_PLUGINDIR

Syntax

LSF_LIM_SOL27_PLUGINDIR=path

Description

The path to liblimvcl.so. Used only with Solaris 2.7.

Default

Path to LSF_LIBDIR

See also

LSF_RES_SOL27_PLUGINDIR

LSF_LOG_MASK

Syntax

LSF_LOG_MASK=message_log_level

Description

Specifies the logging level of error messages for LSF daemons.

For example:

LSF_LOG_MASK=LOG_DEBUG

To specify the logging level of error messages for LSF Batch commands, use LSB_CMD_LOG_MASK. To specify the logging level of error messages for LSF commands, use LSF_CMD_LOG_MASK.

On UNIX, this is similar to syslog. All messages logged at the specified level or higher are recorded; lower level messages are discarded. The LSF_LOG_MASK value can be any log priority symbol that is defined in syslog.h (see syslog(8)).

The log levels in order from highest to lowest are:

The most important LSF log messages are at the LOG_ERR or LOG_WARNING level. Messages at the LOG_INFO and LOG_DEBUG level are only useful for debugging.

Although the message log level works much like UNIX syslog priorities, it does not depend on UNIX syslog. It works even if messages are being logged to files instead of syslog.

LSF logs error messages in different levels so that you can choose to log all messages, or only log messages that are deemed critical. The level specified by LSF_LOG_MASK determines which messages are recorded and which are discarded. All messages logged at the specified level or higher are recorded, while lower level messages are discarded.

For debugging purposes, the level LOG_DEBUG records the fewest debugging messages and is used for basic debugging. The level LOG_DEBUG3 records all debugging messages, and can cause log files to grow very large; it is not often used. Most debugging is done at the level LOG_DEBUG2.
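The filtering rule (record messages at the specified level or higher, discard the rest) can be sketched as a comparison of ranks. The function names level_rank and is_recorded are hypothetical. The sketch covers only the eight priorities defined in syslog.h plus the LOG_DEBUG2 and LOG_DEBUG3 levels named in this section, and rejects any other name.

```sh
# Map a log level name to a rank; a smaller rank means a higher level.
# The first eight names are the syslog.h priorities; LOG_DEBUG2 and
# LOG_DEBUG3 are the additional debugging levels named in this section.
level_rank() {
    case $1 in
        LOG_EMERG)   echo 0 ;;
        LOG_ALERT)   echo 1 ;;
        LOG_CRIT)    echo 2 ;;
        LOG_ERR)     echo 3 ;;
        LOG_WARNING) echo 4 ;;
        LOG_NOTICE)  echo 5 ;;
        LOG_INFO)    echo 6 ;;
        LOG_DEBUG)   echo 7 ;;
        LOG_DEBUG2)  echo 8 ;;
        LOG_DEBUG3)  echo 9 ;;
        *)           return 1 ;;   # unknown level name
    esac
}

# Succeeds if a message at level $1 is recorded when the mask is $2.
is_recorded() {
    msg=$(level_rank "$1") || return 2
    mask=$(level_rank "$2") || return 2
    [ "$msg" -le "$mask" ]
}

# With the default mask LOG_WARNING, LOG_ERR messages are kept.
is_recorded LOG_ERR LOG_WARNING && echo "LOG_ERR is recorded"
```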

In versions prior to LSF 4.0, you needed to restart the daemons after setting LSF_LOG_MASK in order for your changes to take effect.

LSF 4.0 implements dynamic debugging, which means you do not need to restart the daemons after setting a debugging environment variable.

The daemons log to the syslog facility unless LSF_LOGDIR is defined.

Default

LOG_WARNING

See also

LSB_CMD_LOG_MASK, LSB_CMD_LOGDIR, LSB_DEBUG, LSB_DEBUG_CMD, LSB_DEBUG_NQS, LSB_TIME_CMD, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_DEBUG_LIM, LSB_DEBUG_MBD, LSF_DEBUG_RES, LSB_DEBUG_SBD, LSB_DEBUG_SCH, LSF_LOG_MASK, LSF_LOGDIR, LSF_TIME_CMD

LSF_LOG_MASK_WIN

Syntax

LSF_LOG_MASK_WIN=message_log_level

Description

Allows you to reduce the information logged to the LSF Windows event log files. Messages of lower severity than the specified level are discarded.

For all LSF files, the types of messages saved depends on LSF_LOG_MASK, so the threshold for the Windows event logs is either LSF_LOG_MASK or LSF_LOG_MASK_WIN, whichever is higher. LSF_LOG_MASK_WIN is ignored if LSF_LOG_MASK is set to a higher level.

The LSF event log files for Windows are:

The log levels you can specify for this parameter, in order from highest to lowest, are:

Default

LOG_ERR

See also

LSF_LOG_MASK

LSF_LOGDIR

Syntax

LSF_LOGDIR=dir

Description

Required if you use Windows.

Error messages from all servers are logged into files in this directory. To effectively use debugging, set LSF_LOGDIR to a directory such as /tmp. This can be done in your own environment from the shell or in lsf.conf.

Windows

If a server is unable to write in this directory, LSF attempts to write in the following directories, in this order:

UNIX

If a server is unable to write in this directory, the error logs are created in /tmp on UNIX.

If LSF_LOGDIR is not defined, then syslog is used to log everything to the system log using the LOG_DAEMON facility. The syslog facility is available by default on most UNIX systems. The /etc/syslog.conf file controls the way messages are logged and the files they are logged to. See the man pages for the syslogd daemon and the syslog function for more information.

Default

Undefined

On UNIX, if undefined, log messages go to syslog.

On Windows, if undefined, no logging is performed.

See also

LSB_CMD_LOG_MASK, LSB_CMD_LOGDIR, LSB_DEBUG, LSB_DEBUG_CMD, LSB_TIME_CMD, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_TIME_CMD

Files

LSF_MACHDEP

Syntax

LSF_MACHDEP=dir

Description

Specifies the directory in which machine-dependent files are installed. These files cannot be shared across different types of machines.

In clusters with a single host type, LSF_MACHDEP is usually the same as LSF_INDEP. The machine dependent files are the user commands, daemons, and libraries. You should not need to modify this parameter.

As shown in the following list, LSF_MACHDEP is incorporated into other LSF variables.

Default

/usr/share/lsf

See also

LSF_INDEP

LSF_MANDIR

Syntax

LSF_MANDIR=dir

Description

Directory under which all man pages are installed.

The man pages are placed in the man1, man3, man5, and man8 subdirectories of the LSF_MANDIR directory. The directory is created by the LSF installation process, and you should not need to modify this parameter.

Man pages are installed in a format suitable for BSD-style man commands.

For most versions of UNIX, you should add the directory LSF_MANDIR to your MANPATH environment variable. If your system has a man command that does not understand MANPATH, you should either install the man pages in the /usr/man directory or get one of the freely available man programs.
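Adding the directory to MANPATH from a shell startup file could look like the sketch below. The function name add_lsf_manpath is hypothetical, and the path in the usage line assumes the default LSF_INDEP/man location with the default LSF_INDEP.

```sh
# Add a man page directory to MANPATH unless it is already listed.
# $1: the LSF_MANDIR directory.
add_lsf_manpath() {
    case ":${MANPATH:-}:" in
        *":$1:"*) ;;                             # already present
        *) MANPATH=${MANPATH:+$MANPATH:}$1 ;;    # append to the list
    esac
    export MANPATH
}

add_lsf_manpath /usr/share/lsf/mnt/man
```

If MANPATH was previously unset, some man implementations may then search only the directories listed in it, so check that system man pages are still found.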

Default

LSF_INDEP/man

LSF_MASTER_LIST

Syntax

LSF_MASTER_LIST="host_name ..."

Description

Optional. Defines a list of hosts that are candidates to become the master host for the cluster.

Listed hosts must be defined in lsf.cluster.cluster_name.

Host names are separated by spaces.

Whenever you reconfigure, only master LIM candidates read lsf.shared and lsf.cluster.cluster_name to get updated information. The elected master LIM sends configuration information to slave LIMs.


Master candidate hosts should share LSF configuration and binaries.


To dynamically add or remove hosts, you must define LSF_MASTER_LIST.
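For illustration, an lsf.conf entry naming three master candidates might look like the following. The host names hostA, hostB, and hostC are hypothetical and would have to be defined in lsf.cluster.cluster_name.

```sh
# lsf.conf fragment (sketch)
# Space-separated list, enclosed in quotation marks
LSF_MASTER_LIST="hostA hostB hostC"
```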

Default

Undefined

LSF_MC_NON_PRIVILEGED_PORTS

Syntax

LSF_MC_NON_PRIVILEGED_PORTS=Y

Description

MultiCluster only. If this parameter is enabled in one cluster, it must be enabled in all clusters.

Specify Y to make LSF daemons use non-privileged ports for communication across clusters.

Compatibility

This disables privileged port daemon authentication, which is a security feature. If security is a concern, you should use eauth for LSF daemon authentication (see LSF_AUTH_DAEMONS in lsf.conf).

Default

Undefined (LSF daemons use privileged port authentication)

LSF_MISC

Syntax

LSF_MISC=dir

Description

Directory in which miscellaneous machine independent files, such as example source programs and scripts, are installed.

Default

LSF_CONFDIR/misc

LSF_NIOS_DEBUG

Syntax

LSF_NIOS_DEBUG=1

Description

Turns on NIOS debugging for interactive jobs.

If LSF_NIOS_DEBUG=1, NIOS debug messages are written to standard error.

This parameter can also be defined as an environment variable.

When LSF_NIOS_DEBUG and LSF_CMD_LOGDIR are defined, NIOS debug messages are logged in nios.log.host_name in the location specified by LSF_CMD_LOGDIR.

If LSF_NIOS_DEBUG is defined, and the directory defined by LSF_CMD_LOGDIR is inaccessible, NIOS debug messages are logged to /tmp/nios.log.host_name instead of stderr.

On Windows, NIOS debug messages are also logged to the temporary directory.
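
The UNIX rules above can be summarized in a short Python sketch. It is illustrative only and is not part of LSF: the function name nios_debug_target and the dir_is_writable hook are hypothetical, and treating "inaccessible" as "not writable" is an assumption.

```python
import os

def nios_debug_target(env, host_name, dir_is_writable=os.access):
    """Return where NIOS debug output would go on UNIX, per the rules above.

    env is a dict of settings; dir_is_writable is a hypothetical hook
    (defaults to os.access) so the decision can be exercised without a
    real log directory.
    """
    if env.get("LSF_NIOS_DEBUG") != "1":
        return None                      # debugging is not turned on
    log_dir = env.get("LSF_CMD_LOGDIR")
    if not log_dir:
        return "stderr"                  # no log directory: standard error
    file_name = "nios.log." + host_name
    if dir_is_writable(log_dir, os.W_OK):
        return log_dir + "/" + file_name
    # Assumption: "inaccessible" is modelled here as "not writable".
    return "/tmp/" + file_name
```

For example, with LSF_NIOS_DEBUG=1 and no LSF_CMD_LOGDIR, the sketch returns "stderr".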

Default

Undefined

See also

LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR

LSF_NIOS_JOBSTATUS_INTERVAL

Syntax

LSF_NIOS_JOBSTATUS_INTERVAL=time_minutes

Description

Applies only to interactive batch jobs.

Time interval at which NIOS polls mbatchd to check if a job is still running. Used to retrieve a job's exit status in the case of an abnormal exit of NIOS, due to a network failure for example.

Use this parameter if you run interactive jobs and you have scripts that depend on an exit code being returned.

When this parameter is undefined and a network connection is lost, mbatchd cannot communicate with NIOS and the return code of a job is not retrieved.

When this parameter is defined, before exiting, NIOS polls mbatchd on the interval defined by LSF_NIOS_JOBSTATUS_INTERVAL to check if a job is still running. NIOS continues to poll mbatchd until it receives an exit code or mbatchd responds that the job does not exist (if the job has already been cleaned from memory for example).

If an exit code cannot be retrieved, NIOS generates an error message and the code -11.
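
The polling behavior described above can be pictured with the following Python sketch. It is a simplified model, not NIOS source code: query_job is a hypothetical callable standing in for the request to mbatchd, and mapping the "job does not exist" reply to the code -11 is one reading of the text.

```python
import time

NIOS_NO_EXIT_CODE = -11  # code the text says NIOS reports when no exit code is retrieved

def poll_exit_status(query_job, interval_minutes, sleep=time.sleep):
    """Poll until the job yields an exit code or is reported as unknown.

    query_job is a hypothetical callable that returns one of
    ("running", None), ("done", exit_code) or ("unknown", None).
    """
    while True:
        state, exit_code = query_job()
        if state == "done":
            return exit_code
        if state == "unknown":           # job already cleaned from memory
            return NIOS_NO_EXIT_CODE
        sleep(interval_minutes * 60)     # wait one LSF_NIOS_JOBSTATUS_INTERVAL
```

A script that depends on the exit code would then see either the real exit status of the job or -11.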

Valid Values

Any integer greater than zero

Default

Undefined

Notes

Set this parameter to large intervals such as 15 minutes or more so that performance is not negatively affected if interactive jobs are pending for too long. NIOS always calls mbatchd on the defined interval to confirm that a job is still pending and this may add load to mbatchd.

See also

Environment variable LSF_NIOS_PEND_TIMEOUT

LSF_NIOS_RES_HEARTBEAT

Syntax

LSF_NIOS_RES_HEARTBEAT=time_minutes

Description

Applies only to interactive non-parallel batch jobs.

Defines how long NIOS waits before sending a message to RES to determine if the connection is still open.

Use this parameter to ensure NIOS exits when a network failure occurs instead of waiting indefinitely for notification that a job has been completed. When a network connection is lost, RES cannot communicate with NIOS and as a result, NIOS does not exit.

When this parameter is defined, if there has been no communication between RES and NIOS for the defined period of time, NIOS sends a message to RES to see if the connection is still open. If the connection is no longer available, NIOS exits.

Valid values

Any integer greater than zero

Default

Undefined

Notes

The time you set this parameter to depends on how long you want to allow NIOS to wait before exiting. Typically, it can be a number of hours or days. Too low a number may add load to the system.

LSF_PAM_HOSTLIST_USE

Syntax

LSF_PAM_HOSTLIST_USE=unique

Description

Used to start applications that use both OpenMP and MPI.

Valid values

unique

Default

Undefined

Notes

You can submit a job to Platform Parallel so that LSF reserves the correct number of processors and PAM starts only 1 process per host. For example, to reserve 32 processors with 4 processors per host, resulting in the use of 8 hosts:

% bsub -n 32 -R "span[ptile=4]" pam yourOpenMPJob
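
The host count in this example follows from dividing the number of reserved processors by the ptile value. A minimal Python sketch of that arithmetic, where hosts_needed is a hypothetical helper and rounding up for uneven divisions is an assumption:

```python
import math

def hosts_needed(processors, ptile):
    """Hosts used when `processors` are reserved at `ptile` processors per host."""
    # Assumption: a remainder occupies one additional, partially used host.
    return math.ceil(processors / ptile)

print(hosts_needed(32, 4))  # the example above: 8 hosts
```

With LSF_PAM_HOSTLIST_USE=unique, PAM would then start one process on each of those hosts.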

Where defined

This parameter can alternatively be set as an environment variable. For example:

setenv LSF_PAM_HOSTLIST_USE unique

Product

Platform Parallel

LSF_PAM_NUMPROC_OPTION

Syntax

LSF_PAM_NUMPROC_OPTION=y | n

Description

Allows bsub -n and pam -n options to be used together.

If set, you can use both bsub -n and pam -n in the same job submission. The pam -n option specifies the number of tasks that PAM should start within the number of processors reserved by bsub -n.

The number specified in the pam -n option should be less than or equal to the number specified by bsub -n. If the number of tasks specified in the pam -n option is greater than the number specified by bsub -n, the pam -n option is ignored.

If LSF_PAM_NUMPROC_OPTION=N, pam -n is ignored.

Example

% bsub -n 5 pam -n 2 -mpi a.out

5 processors are reserved for the job, but PAM only starts 2 parallel tasks. The 2 parallel tasks will spawn threads to take remaining reserved processors.
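
The interaction between the two options can be sketched in Python as follows. The function pam_tasks_started is hypothetical, and the assumption that PAM starts one task per reserved processor when pam -n is ignored is not stated in this section.

```python
def pam_tasks_started(bsub_n, pam_n=None, numproc_option="Y"):
    """Number of parallel tasks PAM starts under the rules above."""
    if numproc_option.upper() != "Y" or pam_n is None:
        return bsub_n        # pam -n is ignored or was not given
    if pam_n > bsub_n:
        return bsub_n        # pam -n exceeds bsub -n, so it is ignored
    return pam_n

print(pam_tasks_started(5, 2))  # the example above: 2 tasks
```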

Default

Y; pam -n is enabled when used as a bsub option

LSF_PAM_PLUGINDIR

Syntax

LSF_PAM_PLUGINDIR=path

Description

The path to libpamvcl.so. Used with SUN HPC and Platform Parallel.

Default

Path to LSF_LIBDIR

See also

LSF_RES_PLUGINDIR

LSF_PAM_USE_ASH

Syntax

LSF_PAM_USE_ASH=y | Y

Description

Enables LSF to use the SGI IRIX Array Session Handles (ASH) to propagate signals to the parallel jobs.

See the IRIX system documentation and the array_session(5) man page for more information about array sessions.

Default

Undefined

LSF_PIM_INFODIR

Syntax

LSF_PIM_INFODIR=path

Description

The path to where PIM writes the pim.info.host_name file.

Specifies the path to where the process information is stored. The process information resides in the file pim.info.host_name. The PIM also reads this file when it starts so that it can accumulate the resource usage of dead processes for existing process groups.

Default

Undefined. If undefined, the system uses /tmp.

LSF_PIM_SLEEPTIME

Syntax

LSF_PIM_SLEEPTIME=time_seconds

Description

The reporting period for PIM.

PIM updates the process information every 15 minutes unless an application queries this information. If an application requests the information, PIM will update the process information every LSF_PIM_SLEEPTIME seconds. If the information is not queried by any application for more than 5 minutes, the PIM will revert back to the 15 minute update period.
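
As a rough model of this behavior, the following Python sketch returns the update period PIM would use, given the time since an application last queried it. The function and constant names are hypothetical; the 15-minute and 5-minute figures come from the paragraph above.

```python
IDLE_PERIOD_S = 15 * 60   # update period when no application is querying
IDLE_AFTER_S = 5 * 60     # time without queries before PIM falls back

def pim_update_period(seconds_since_last_query, sleeptime=15):
    """Update period in seconds; sleeptime models LSF_PIM_SLEEPTIME."""
    if seconds_since_last_query is None:          # never queried
        return IDLE_PERIOD_S
    if seconds_since_last_query > IDLE_AFTER_S:   # queries stopped
        return IDLE_PERIOD_S
    return sleeptime
```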

Default

15

LSF_PIM_SLEEPTIME_UPDATE

Syntax

LSF_PIM_SLEEPTIME_UPDATE=y | n

Description

UNIX only.

Use this parameter to improve job throughput and reduce a job's start time if there are many jobs running simultaneously on a host. This parameter reduces communication traffic between sbatchd and PIM on the same host.

When this parameter is undefined or set to n, sbatchd queries PIM as needed for job process information.

When this parameter is defined, sbatchd does not query PIM immediately as it needs information; sbatchd only queries PIM every LSF_PIM_SLEEPTIME seconds.

Limitations

When this parameter is defined:

Default

Undefined

LSF_RES_ACCT

Syntax

LSF_RES_ACCT=time_milliseconds | 0

Description

If this parameter is defined, RES will log information for completed and failed tasks by default (see lsf.acct(5)).

The value for LSF_RES_ACCT is specified in terms of consumed CPU time (milliseconds). Only tasks that have consumed more than the specified CPU time will be logged.

If this parameter is defined as LSF_RES_ACCT=0, then all tasks will be logged.

For those tasks that consume the specified amount of CPU time, RES generates a record and appends the record to the task log file lsf.acct.host_name. This file is located in the LSF_RES_ACCTDIR directory.

If this parameter is not defined, the LSF administrator must use the lsadmin command (see lsadmin(8)) to turn task logging on after RES has started.
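
The logging rule can be expressed as a small Python sketch. The function res_logs_task is hypothetical and only models the threshold test described above; it does not cover logging that is later turned on with lsadmin.

```python
def res_logs_task(cpu_ms, res_acct):
    """Whether RES would log a task that consumed cpu_ms milliseconds of CPU.

    res_acct is the LSF_RES_ACCT value as a string, or None if undefined.
    """
    if res_acct is None:
        return False              # logging is off until enabled with lsadmin
    threshold = int(res_acct)
    if threshold == 0:
        return True               # LSF_RES_ACCT=0 logs all tasks
    return cpu_ms > threshold     # only tasks above the threshold are logged
```

For example, with LSF_RES_ACCT=100, a task that used 101 ms of CPU time is logged, and a task that used exactly 100 ms is not.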

Default

Undefined

See also

LSF_RES_ACCTDIR

LSF_RES_ACCTDIR

Syntax

LSF_RES_ACCTDIR=dir

Description

The directory in which the RES task log file lsf.acct.host_name is stored.

If LSF_RES_ACCTDIR is not defined, the log file is stored in the /tmp directory.

Default

(UNIX)/tmp

(Windows) C:\temp

See also

LSF_RES_ACCT

LSF_RES_DEBUG

Syntax

LSF_RES_DEBUG=1 | 2

Description

Sets RES to debug mode.

If LSF_RES_DEBUG is defined, the Remote Execution Server (RES) will operate in single user mode. No security checking is performed, so RES should not run as root. RES will not look in the services database for the RES service port number. Instead, it uses port number 36002 unless LSF_RES_PORT has been defined.

Specify 1 for this parameter unless you are testing RES.

Valid values

Default

Undefined

See also

LSF_LIM_DEBUG, LSF_CMD_LOGDIR, LSF_CMD_LOG_MASK, LSF_LOG_MASK, LSF_LOGDIR

LSF_RES_PLUGINDIR

Syntax

LSF_RES_PLUGINDIR=path

Description

The path to lsbresvcl.so. Used only with SUN HPC.

Default

Path to LSF_LIBDIR

See also

LSF_PAM_PLUGINDIR, LSF_LIM_PLUGINDIR

LSF_RES_PORT

See LSF_LIM_PORT, LSF_RES_PORT, LSB_MBD_PORT, LSB_SBD_PORT.

LSF_RES_RLIMIT_UNLIM

Syntax

LSF_RES_RLIMIT_UNLIM=cpu | fsize | data | stack | core | vmem

Description

(LSF Base only) By default, RES sets the hard limits for a remote task to be the same as the hard limits of the local process. This parameter specifies those hard limits which are to be set to unlimited, instead of inheriting those of the local process.

Valid values are cpu, fsize, data, stack, core, and vmem, for CPU, file size, data size, stack, core size, and virtual memory limits, respectively.

Example

The following example sets the CPU, core size, and stack hard limits to be unlimited for all remote tasks:

LSF_RES_RLIMIT_UNLIM="cpu core stack"
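
The effect of this parameter on a remote task can be sketched in Python. This is a model of the described behavior, not LSF code: remote_hard_limits is a hypothetical function, and the limits are represented as a plain dictionary rather than real process limits.

```python
UNLIMITED = float("inf")
VALID_LIMITS = ("cpu", "fsize", "data", "stack", "core", "vmem")

def remote_hard_limits(local_limits, rlimit_unlim=""):
    """Hard limits for a remote task.

    local_limits maps limit names to the hard limits of the local process;
    rlimit_unlim is the unquoted LSF_RES_RLIMIT_UNLIM value.
    """
    names = rlimit_unlim.split()
    for name in names:
        if name not in VALID_LIMITS:
            raise ValueError("not a valid limit name: " + name)
    limits = dict(local_limits)     # default: inherit the local hard limits
    for name in names:
        limits[name] = UNLIMITED    # listed limits become unlimited
    return limits
```

With the value "cpu core stack" from the example, the data limit is still inherited from the local process.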

Default

Undefined

See also

LSF_LIM_SOL27_PLUGINDIR

LSF_RES_SOL27_PLUGINDIR

Syntax

LSF_RES_SOL27_PLUGINDIR=path

Description

The path to libresvcl.so. Used only with Solaris 2.7.

If you want to link a 64-bit object with RES, then you should set LSF_RES_SOL27_PLUGINDIR.

Default

Path to LSF_LIBDIR

LSF_RES_TIMEOUT

Syntax

LSF_RES_TIMEOUT=time_seconds

Description

Timeout when communicating with RES.

Default

15

LSF_ROOT_REX

Syntax

LSF_ROOT_REX=local

Description

UNIX only.

Allows root remote execution privileges (subject to identification checking) on remote hosts, for both interactive and batch jobs. Causes RES to accept requests from the superuser (root) on remote hosts, subject to identification checking.

If LSF_ROOT_REX is undefined, remote execution requests from user root are refused.

Theory

Sites that have separate root accounts on different hosts within the cluster should not define LSF_ROOT_REX. Otherwise, this setting should be based on local security policies.

The lsf.conf file is host-type specific and not shared across different platforms. You must make sure that the lsf.conf files for all your host types are changed consistently.

Default

Undefined (root execution is not allowed)

See also

LSF_TIME_CMD, LSF_AUTH

LSF_SECUREDIR

Syntax

LSF_SECUREDIR=path

Description

(Windows only; mandatory if using lsf.sudoers) Path to the directory that contains the file lsf.sudoers (shared on an NTFS file system).

LSF_SERVER_HOSTS

Syntax

LSF_SERVER_HOSTS="host_name ..."

Description

Defines one or more server hosts that the application should contact to find a Load Information Manager (LIM). This is used on client hosts, where no LIM is running locally. LSF server hosts are hosts that run LSF daemons and provide load-sharing services. Client hosts are hosts that only run LSF commands or applications but do not provide services to any hosts.

If LSF_SERVER_HOSTS is not defined, the application tries to contact the LIM on the local host.

The host names in LSF_SERVER_HOSTS must be enclosed in quotes and separated by white space. For example:

LSF_SERVER_HOSTS="hostA hostD hostB"

The length of the parameter string must be less than 4096 characters.
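
A definition can be checked against these rules before it is added to lsf.conf. The Python sketch below is a hypothetical helper, not an LSF tool; it assumes that the 4096-character limit applies to the quoted value.

```python
def parse_server_hosts(line):
    """Return the host names from an LSF_SERVER_HOSTS="..." line."""
    name, sep, value = line.strip().partition("=")
    if name != "LSF_SERVER_HOSTS" or not sep:
        raise ValueError("not an LSF_SERVER_HOSTS definition")
    if len(value) < 2 or value[0] != '"' or value[-1] != '"':
        raise ValueError("host names must be enclosed in quotes")
    # Assumption: the length limit is applied to the quoted value.
    if len(value) >= 4096:
        raise ValueError("value must be less than 4096 characters")
    return value[1:-1].split()      # host names are separated by white space

print(parse_server_hosts('LSF_SERVER_HOSTS="hostA hostD hostB"'))
```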

Default

Undefined

LSF_SERVERDIR

Syntax

LSF_SERVERDIR=dir

Description

Directory in which all server binaries and shell scripts are installed.

These include lim, res, nios, sbatchd, mbatchd, and mbschd. If you use elim, eauth, eexec, esub, and so on, they are also installed in this directory.

Default

LSF_MACHDEP/etc

See also

LSB_ECHKPNT_METHOD_DIR

LSF_SHELL_AT_USERS

Syntax

LSF_SHELL_AT_USERS="user_name user_name ..."

Description

Applies to lstcsh only. Specifies users who are allowed to use @ for host redirection. Users not specified with this parameter cannot use host redirection in lstcsh.

If this parameter is undefined, all users are allowed to use @ for host redirection in lstcsh.

Default

Undefined

LSF_STRICT_CHECKING

Syntax

LSF_STRICT_CHECKING=Y

Description

If set, enables more strict checking of communications between LSF daemons and between LSF commands and daemons when LSF is used in an untrusted environment, such as a public network like the Internet.

If you enable this parameter, you must enable it in the entire cluster, as it affects all communications within LSF. If it is used in a MultiCluster environment, it must be enabled in all clusters, or none. Ensure that all binaries and libraries are upgraded to LSF Version 6.0, including LSF_BINDIR, LSF_SERVERDIR and LSF_LIBDIR directories, if you enable this parameter.

If your site uses any programs that use the LSF base and batch APIs, or LSF MPI (Message Passing Interface), they need to be recompiled using the LSF Version 6.0 APIs before they can work properly with this option enabled.

IMPORTANT


You must shut down the entire cluster before enabling or disabling this parameter.


If LSF_STRICT_CHECKING is defined, and your cluster has slave hosts that are dynamically added, LSF_STRICT_CHECKING must be configured in the local lsf.conf on all slave hosts.

Valid value

Set to Y to enable this feature.

Default

Undefined. LSF is secure in trusted environments.

LSF_STRIP_DOMAIN

Syntax

LSF_STRIP_DOMAIN=domain_suffix [:domain_suffix ...]

Description

(Optional) If all of the hosts in your cluster can be reached using short host names, you can configure LSF to use the short host names by specifying the portion of the domain name to remove. If your hosts are in more than one domain or have more than one domain name, you can specify more than one domain suffix to remove, separated by a colon (:).

For example, given this definition of LSF_STRIP_DOMAIN,

LSF_STRIP_DOMAIN=.foo.com:.bar.com

LSF accepts hostA, hostA.foo.com, and hostA.bar.com as names for host hostA, and uses the name hostA in all output. The leading period (.) is required.

Example:

LSF_STRIP_DOMAIN=.platform.com:.generic.com

In the above example, LSF accepts hostA, hostA.platform.com, and hostA.generic.com as names for hostA, and uses the name hostA in all output.

Setting this parameter only affects host names displayed through LSF; it does not affect DNS host lookup.
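
The name handling described above can be illustrated with a short Python sketch. The function strip_domain is hypothetical, and removing only the first matching suffix is an assumption about how multiple suffixes are applied.

```python
def strip_domain(host_name, strip_domain_value):
    """Return the short host name, given an LSF_STRIP_DOMAIN value."""
    for suffix in strip_domain_value.split(":"):
        # Each suffix includes its leading period, for example ".foo.com".
        if suffix and host_name.endswith(suffix):
            return host_name[: -len(suffix)]
    return host_name    # no configured suffix matches: name is unchanged

print(strip_domain("hostA.foo.com", ".foo.com:.bar.com"))  # hostA
```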

Default

Undefined

LSF_TIME_CMD

Syntax

LSF_TIME_CMD=timing_level

Description

The timing level for checking how long LSF commands run. Time usage is logged in milliseconds; specify a positive integer.

Default

Undefined

See also

LSB_TIME_MBD, LSB_TIME_SBD, LSB_TIME_CMD, LSF_TIME_LIM, LSF_TIME_RES

LSF_TIME_LIM

Syntax

LSF_TIME_LIM=timing_level

Description

The timing level for checking how long LIM routines run.

Time usage is logged in milliseconds; specify a positive integer.

Default

Undefined

See also

LSB_TIME_CMD, LSB_TIME_MBD, LSB_TIME_SBD, LSF_TIME_RES

LSF_TIME_RES

Syntax

LSF_TIME_RES=timing_level

Description

The timing level for checking how long RES routines run.

Time usage is logged in milliseconds; specify a positive integer.

Default

Undefined

See also

LSB_TIME_CMD, LSB_TIME_MBD, LSB_TIME_SBD, LSF_TIME_LIM

LSF_TMPDIR

Syntax

LSF_TMPDIR=dir

Description

Specifies the path and directory for temporary job output.

When LSF_TMPDIR is defined in lsf.conf, LSF creates a temporary directory under the directory specified by LSF_TMPDIR on the execution host when a job is started and sets the temporary directory environment variable for the job.

When LSF_TMPDIR is defined as an environment variable, it overrides the LSF_TMPDIR specified in lsf.conf. LSF removes the temporary directory and the files that it contains when the job completes.

The name of the temporary directory has the following format:

$LSF_TMPDIR/job_ID.tmpdir

On UNIX, the directory has the permission 0700.
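
The naming and permission rules for UNIX can be sketched in Python as follows. Both functions are hypothetical illustrations, not LSF code; conf and env stand for the lsf.conf settings and the environment of the job.

```python
import os

def job_tmpdir_path(job_id, conf, env):
    """Path of the per-job temporary directory, or None if LSF_TMPDIR is unset."""
    # The environment variable overrides the value from lsf.conf.
    base = env.get("LSF_TMPDIR") or conf.get("LSF_TMPDIR")
    if base is None:
        return None
    return "%s/%s.tmpdir" % (base.rstrip("/"), job_id)

def create_job_tmpdir(path):
    """Create the directory with permission 0700."""
    os.mkdir(path, 0o700)
    os.chmod(path, 0o700)   # mkdir is subject to the umask; chmod makes the mode exact
    return path
```

For example, job 1234 with LSF_TMPDIR=/usr/share/lsf_tmp in lsf.conf would get /usr/share/lsf_tmp/1234.tmpdir.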

After adding LSF_TMPDIR to lsf.conf, use badmin hrestart all to reconfigure your cluster.

This parameter can also be specified from the command line.

Valid values

Specify any valid path up to a maximum length of 256 characters. The 256 character maximum path length includes the temporary directories and files that the system creates as jobs run. The path that you specify for LSF_TMPDIR should be as short as possible to avoid exceeding this limit.

UNIX

Specify an absolute path. For example:

LSF_TMPDIR=/usr/share/lsf_tmp

Windows

Specify a UNC path or a path with a drive letter. For example:

LSF_TMPDIR=\\HostA\temp\lsf_tmp

or

LSF_TMPDIR=D:\temp\lsf_tmp

Default

By default, LSF_TMPDIR is not enabled. If LSF_TMPDIR is not specified either in the environment or in lsf.conf, this parameter is defined as follows:

LSF_TOPD_PORT

Syntax

LSF_TOPD_PORT=port_number

Description

UDP port used for communication between the LSF cpuset topology daemon (topd) and the cpuset ELIM. Used with SGI IRIX cpuset support.

Default

Undefined

LSF_TOPD_WORKDIR

Syntax

LSF_TOPD_WORKDIR=directory

Description

Directory to store the IRIX cpuset permission file and the event file for the cpuset topology daemon (topd). Used with SGI IRIX cpuset support.

You should avoid using /tmp or any other directory that is automatically cleaned up by the system. Unless your installation has restrictions on the LSB_SHAREDIR directory, you should use the default for LSF_TOPD_WORKDIR.

Default

LSB_SHAREDIR/topd_dir.port_number

Where port_number is the value you set for LSF_TOPD_PORT.

LSF_ULDB_DOMAIN

Syntax

LSF_ULDB_DOMAIN=domain_name

Description

LSF_ULDB_DOMAIN specifies the name of the LSF domain in the ULDB domain directive. A domain definition of name domain_name must be configured in the IRIX jlimit.in input file.

Used with IRIX 6.5.8 User Limits Database (ULDB). Configures LSF so that jobs submitted to a host with the IRIX job limits option installed are subject to the job limits configured in the IRIX User Limits Database (ULDB).

The ULDB contains job limit information that system administrators use to control access to a host on a per user basis. The job limits in the ULDB override the system default values for both job limits and process limits. When a ULDB domain is configured, the limits will be enforced as IRIX job limits.

If the ULDB domain specified in LSF_ULDB_DOMAIN is not valid or does not exist, LSF uses the limits defined in the domain named batch. If the batch domain does not exist, then the system default limits are set.

When an LSF job is submitted, an IRIX job is created, and the job limits in the ULDB are applied.

Next, LSF resource usage limits are enforced for the IRIX job under which the LSF job is running. LSF limits override the corresponding IRIX job limits. The ULDB limits are used for any LSF limits that are not defined. If the job reaches the IRIX job limits, the action defined in the IRIX system is used.
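
The precedence rules above (which domain is used, and which limit wins) can be modelled with the following Python sketch. The function names and the dictionary representation are hypothetical; the sketch does not read jlimit.in or call any IRIX interface.

```python
def choose_domain(configured, available):
    """ULDB domain whose limits apply, or None for the system default limits."""
    if configured in available:
        return configured
    if "batch" in available:
        return "batch"      # fallback when the configured domain is not valid
    return None

def effective_limits(lsf_limits, uldb_limits):
    """LSF limits override ULDB limits; ULDB fills in undefined LSF limits."""
    merged = dict(uldb_limits)
    for name, value in lsf_limits.items():
        if value is not None:       # None models an LSF limit that is not defined
            merged[name] = value
    return merged
```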

IRIX job limits in the ULDB apply only to batch jobs.

See the IRIX 6.5.8 resource administration documentation for information about configuring ULDB domains in the jlimit.in file.

LSF resource usage limits controlled by ULDB

Increasing the default MEMLIMIT for ULDB

In some pre-defined LSF queues, such as normal, the default MEMLIMIT is set to 5000 (5 MB). However, if ULDB is enabled (LSF_ULDB_DOMAIN is defined) the MEMLIMIT should be set greater than 8000 in lsb.queues.

Example ULDB domain configuration

The following steps enable the ULDB domain LSF for user user1:

  1. Define the LSF_ULDB_DOMAIN parameter in lsf.conf:
    ...
    LSF_ULDB_DOMAIN=LSF
    ...
    

    Note that you can set the LSF_ULDB_DOMAIN to include more than one domain. For example:

    LSF_ULDB_DOMAIN="lsf:batch:system"
    
  2. Configure the domain directive LSF in the jlimit.in file:
    domain <LSF> {                           # domain for LSF 
            jlimit_numproc_cur = unlimited
            jlimit_numproc_max = unlimited   # JLIMIT_NUMPROC 
            jlimit_nofile_cur = unlimited
            jlimit_nofile_max = unlimited    # JLIMIT_NOFILE 
            jlimit_rss_cur = unlimited
            jlimit_rss_max = unlimited       # JLIMIT_RSS 
            jlimit_vmem_cur = 128M
            jlimit_vmem_max = 256M           # JLIMIT_VMEM 
            jlimit_data_cur = unlimited
            jlimit_data_max = unlimited      # JLIMIT_DATA 
            jlimit_cpu_cur = 80
            jlimit_cpu_max = 160             # JLIMIT_CPU 
    } 
    
  3. Configure the user limit directive for user1 in the jlimit.in file:
    user user1 { 
            LSF { 
               jlimit_data_cur = 128M 
               jlimit_data_max = 256M 
             } 
    } 
    
  4. Use the IRIX genlimits command to create the user limits database:
    genlimits -l -v
    

Default

Undefined

LSF_USE_HOSTEQUIV

Description

(UNIX only; optional)

If LSF_USE_HOSTEQUIV is defined, RES and mbatchd call the ruserok(3) function to decide if a user is allowed to run remote jobs.

The ruserok(3) function checks in the /etc/hosts.equiv file and the user's $HOME/.rhosts file to decide if the user has permission to execute remote jobs.

If LSF_USE_HOSTEQUIV is not defined, all normal users in the cluster can execute remote jobs on any host.

If LSF_ROOT_REX is set, root can also execute remote jobs with the same permission test as for normal users.
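
The combined effect of LSF_USE_HOSTEQUIV and LSF_ROOT_REX can be sketched in Python. This is a simplified reading of the rules above: may_run_remote is a hypothetical function, ruserok is a callable standing in for ruserok(3), and a parameter counts as defined when it appears in conf.

```python
def may_run_remote(user, conf, ruserok):
    """Whether a remote execution request from `user` would be accepted."""
    if user == "root" and "LSF_ROOT_REX" not in conf:
        return False                    # requests from root are refused
    if "LSF_USE_HOSTEQUIV" in conf:
        # Stand-in for the /etc/hosts.equiv and $HOME/.rhosts check.
        return bool(ruserok(user))
    return True                         # no hosts.equiv check is configured
```

With LSF_ROOT_REX defined, root passes through the same test as a normal user.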

Default

Undefined

LSF_USER_DOMAIN

Syntax

LSF_USER_DOMAIN=domain_name | .

Description

Set during LSF installation or setup. If you modify this parameter in an existing cluster, you will probably also have to modify passwords and configuration files.

Windows or mixed UNIX-Windows clusters only.

Enables default user mapping, and specifies the LSF user domain. The period (.) specifies local accounts, not domain accounts.

If this parameter is undefined, the default user mapping is not enabled. You can still configure user mapping at the user or system level. User account mapping is required to run cross-platform jobs in a UNIX-Windows mixed cluster.

Default

LSF_VPLUGIN

Syntax

LSF_VPLUGIN=path

Description

The full path to the vendor MPI library libxmpi.so. Used with Platform Parallel and MPI.

For PAM to access the SGI MPI libxmpi.so library, the file permission mode must be 755 (-rwxr-xr-x).

Examples

Default

Undefined

XLSF_APPDIR

Syntax

XLSF_APPDIR=dir

Description

(UNIX only; optional) Directory in which X application default files for LSF products are installed.

The LSF commands that use X look in this directory to find the application defaults. Users do not need to set environment variables to use the Platform LSF X applications. The application default files are platform-independent.

Default

LSF_INDEP/misc

XLSF_UIDDIR

Syntax

XLSF_UIDDIR=dir

Description

(UNIX only) Directory in which Motif User Interface Definition files are stored.

These files are platform-specific.

Default

LSF_LIBDIR/uid



      Date Modified: February 24, 2004
Platform Computing: www.platform.com

Platform Support: support@platform.com
Platform Information Development: doc@platform.com

Copyright © 1994-2004 Platform Computing Corporation. All rights reserved.