This is the cluster configuration file. There is one for each cluster, called lsf.cluster.cluster_name. The cluster_name suffix is the name of the cluster defined in the Cluster section of lsf.shared. All LSF hosts are listed in this file, along with the list of LSF administrators and the installed LSF features.
This file is typically installed in the directory defined by LSF_ENVDIR.
The lsf.cluster.cluster_name file contains two types of configuration information:
- Cluster definition information--affects all LSF applications. Defines cluster administrators, hosts that make up the cluster, attributes of each individual host such as host type or host model, and resources using the names defined in lsf.shared.
- LIM policy information--affects applications that rely on LIM job placement policy. Defines load sharing and job placement policies provided by LIM.
- NewIndex Section
- Parameters Section
- ClusterAdmins Section
- Host Section
- ResourceMap Section
- RemoteClusters Section
NewIndex Section
The NewIndex section in lsf.cluster.cluster_name is obsolete. To achieve the same effect, use the Resource section of the lsf.shared file to define a dynamic numeric resource, and use the default keyword in the LOCATION field of the ResourceMap section of lsf.cluster.cluster_name.
Parameters Section
(Optional) This section contains miscellaneous parameters for the LIM.
ADJUST_DURATION
ADJUST_DURATION = integer
Integer reflecting a multiple of EXINTERVAL that controls the time period during which load adjustment is in effect.
The lsplace(1) and lsloadadj(1) commands artificially raise the load on a selected host. This increase in load decays linearly to 0 over time.
Default: 3
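The linear decay described for ADJUST_DURATION can be illustrated with a small calculation. This is only a sketch: it assumes the adjustment lasts ADJUST_DURATION * EXINTERVAL seconds and uses the documented defaults (3 and 15 seconds). The function name and arguments are hypothetical, and the way LIM computes this internally is not stated here.

```python
def remaining_adjustment(initial, elapsed_seconds, adjust_duration=3, exinterval=15):
    """Return how much of an artificial load adjustment is still in effect.

    Assumption: the adjustment decays linearly to 0 over
    adjust_duration * exinterval seconds.
    """
    window = adjust_duration * exinterval
    if elapsed_seconds <= 0:
        return float(initial)
    if elapsed_seconds >= window:
        return 0.0
    return initial * (1.0 - elapsed_seconds / window)
```

With the defaults, the window is 45 seconds, so half of the adjustment remains after 22.5 seconds.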
ELIM_POLL_INTERVAL
ELIM_POLL_INTERVAL = time_in_seconds
Time interval, in seconds, in which the LIM daemon samples load information.
This parameter only needs to be set if an ELIM is being used to report information more frequently than every 5 seconds.
Default: 5 seconds
ELIMARGS
ELIMARGS = cmd_line_args
Specifies any necessary command-line arguments for the external LIM on startup.
This parameter is ignored if no external load indices are configured.
Default: None
EXINTERVAL
EXINTERVAL = time_in_seconds
Time interval, in seconds, at which the LIM daemons exchange load information.
On extremely busy hosts or networks, or in clusters with a large number of hosts, load may interfere with the periodic communication between LIM daemons. Setting EXINTERVAL to a longer interval can reduce network load and slightly improve reliability, at the cost of slower reaction to dynamic load changes.
Default: 15 seconds
FLOAT_CLIENTS
FLOAT_CLIENTS = number_of_floating_client_licenses
Sets the size of your license pool in the cluster.
When the master LIM starts, up to number_of_floating_client_licenses will be checked out for use as floating client licenses. If fewer licenses are available than specified by number_of_floating_client_licenses, only the available licenses will be checked out and used.
If FLOAT_CLIENTS is not specified in lsf.cluster.cluster_name or there is an error in either license.dat or in lsf.cluster.cluster_name, the floating LSF client license feature is disabled.
When the LSF floating client feature is enabled, any host will be able to submit jobs to the cluster. You can limit which hosts can be LSF floating clients with the parameter FLOAT_CLIENTS_ADDR_RANGE in lsf.cluster.cluster_name.
Although LSF Floating Client requires a license, LSF_Float_Client does not need to be added to the PRODUCTS line. LSF_Float_Client also cannot be added as a resource for specific hosts already defined in lsf.cluster.cluster_name. Should these lines be present, they are ignored by LSF.
Default: Undefined
FLOAT_CLIENTS_ADDR_RANGE
FLOAT_CLIENTS_ADDR_RANGE = IP_address ...
Optional. IP address or range of addresses, in dotted quad notation (nnn.nnn.nnn.nnn), of domains from which floating client hosts can submit requests. Multiple ranges can be defined, separated by spaces.
If the value of this parameter is undefined, there is no security and any host can be an LSF floating client.
If a value is defined, security is enabled. If there is an error in the configuration of this variable, by default, no host will be allowed to be an LSF floating client.
When this parameter is defined, client hosts that do not belong to the domain will be denied access.
If a requesting host belongs to an IP address that falls in the specified range, the host will be accepted to become an LSF floating client.
IP addresses are separated by spaces, and considered "OR" alternatives.
The asterisk (*) character indicates any value is allowed.
The dash (-) character indicates an explicit range of values. For example 1-4 indicates 1,2,3,4 are allowed.
Open ranges such as *-30, or 10-*, are allowed.
If a range is specified with fewer fields than an IP address, such as 10.161, it is considered as 10.161.*.*.
Address ranges are validated at configuration time, so they must conform to the required format. If any address range is not in the correct format, no host will be accepted as an LSF floating client and an error message will be logged in the LIM log.
This parameter is limited to 255 characters.
After you configure FLOAT_CLIENTS_ADDR_RANGE, check the lim.log.host_name file to make sure this parameter is correctly set. If this parameter is not set or is wrong, this will be indicated in the log file.
FLOAT_CLIENTS_ADDR_RANGE=100
All client hosts with a domain address starting with 100 will be allowed access.
FLOAT_CLIENTS_ADDR_RANGE=100-110.34.1-10.4-56
All client hosts belonging to a domain with an address having the first number between 100 and 110, then 34, then a number between 1 and 10, then, a number between 4 and 56 will be allowed access.
Example: 100.34.9.45, 100.34.1.4, 102.34.3.20, etc.
FLOAT_CLIENTS_ADDR_RANGE=100.172.1.13 100.*.30-54 124.24-*.1.*-34
All client hosts belonging to a domain with the address 100.172.1.13 will be allowed access. All client hosts belonging to domains starting with 100, then any number, then a range of 30 to 54 will be allowed access. All client hosts belonging to domains starting with 124, then from 24 onward, then 1, then from 0 to 34 will be allowed access.
FLOAT_CLIENTS_ADDR_RANGE=12.23.45.*
All client hosts belonging to domains starting with 12.23.45 are allowed.
FLOAT_CLIENTS_ADDR_RANGE=100.*43
The * character can only be used to indicate any value. In this example, an error will be inserted in the LIM log and no hosts will be accepted to become LSF floating clients.
FLOAT_CLIENTS_ADDR_RANGE=100.*43 100.172.1.13
Although one correct address range is specified, because *43 is not in the correct format, the entire line is considered invalid. An error will be inserted in the LIM log and no hosts will be accepted to become LSF floating clients.
Default: Undefined. No security is enabled. Any host in any domain is allowed access to LSF floating client licenses.
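The matching rules for address ranges can be sketched in a few lines of Python. This is an illustration of the documented rules only, not the LIM implementation. The function names are hypothetical, the address being tested is assumed to be a well-formed dotted quad, and treating * as 0 to 255 is an assumption based on the size of an IPv4 field.

```python
def parse_field(text):
    """Turn one dot-separated field into an inclusive (low, high) pair."""
    if text == "*":
        return (0, 255)  # assumption: * covers a full IPv4 field
    low, dash, high = text.partition("-")
    if not dash:
        high = low  # a single number n is the range n-n
    bounds = []
    for part, open_end in ((low, 0), (high, 255)):
        if part == "*":
            bounds.append(open_end)  # open range such as *-30 or 10-*
        elif part.isascii() and part.isdigit() and int(part) <= 255:
            bounds.append(int(part))
        else:
            raise ValueError(f"bad address field: {text!r}")
    if bounds[0] > bounds[1]:
        raise ValueError(f"empty range: {text!r}")
    return (bounds[0], bounds[1])


def parse_ranges(value):
    """Parse a space-separated list of address ranges."""
    ranges = []
    for item in value.split():
        fields = item.split(".")
        if len(fields) > 4:
            raise ValueError(f"too many fields: {item!r}")
        # A range with fewer than four fields is padded with *.
        fields += ["*"] * (4 - len(fields))
        ranges.append([parse_field(field) for field in fields])
    return ranges


def host_allowed(address, value):
    """Return True if address matches any range; value=None means undefined."""
    if value is None:
        return True  # undefined: no security, any host is accepted
    try:
        ranges = parse_ranges(value)
    except ValueError:
        return False  # one malformed range invalidates the whole line
    octets = [int(part) for part in address.split(".")]
    return any(
        all(low <= octet <= high for octet, (low, high) in zip(octets, fields))
        for fields in ranges
    )
```

For example, host_allowed("102.34.3.20", "100-110.34.1-10.4-56") accepts the host, while host_allowed("100.172.1.13", "100.*43 100.172.1.13") rejects it because the first range is malformed.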
HOST_INACTIVITY_LIMIT
HOST_INACTIVITY_LIMIT = integer
Integer reflecting a multiple of EXINTERVAL that controls the maximum time a slave LIM will take to send its load information to the master LIM as well as the frequency at which the master LIM will send a heartbeat message to its slaves.
A slave LIM can send its load information any time from EXINTERVAL to (HOST_INACTIVITY_LIMIT-2)*EXINTERVAL seconds. A master LIM will send a master announce to each host at least every EXINTERVAL*HOST_INACTIVITY_LIMIT seconds.
Default: 5
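To see what these formulas mean in seconds, the following sketch evaluates them for given settings. The function and key names are hypothetical; the default arguments are the documented defaults for EXINTERVAL (15 seconds) and HOST_INACTIVITY_LIMIT (5).

```python
def lim_timing(exinterval=15, host_inactivity_limit=5):
    """Evaluate the HOST_INACTIVITY_LIMIT formulas, in seconds."""
    return {
        # Earliest and latest time at which a slave LIM sends load information.
        "slave_report_min": exinterval,
        "slave_report_max": (host_inactivity_limit - 2) * exinterval,
        # Longest gap between master announces to a host.
        "master_announce_max": exinterval * host_inactivity_limit,
    }
```

With the defaults, a slave LIM reports between 15 and 45 seconds, and the master announces itself at least every 75 seconds.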
LSF_ELIM_BLOCKTIME
LSF_ELIM_BLOCKTIME=seconds
UNIX only
Maximum amount of time LIM waits for a load update string from the ELIM or MELIM if it is not immediately available.
Use this parameter to add fault-tolerance to LIM when using ELIMs. If there is an error in the ELIM or some situation arises that the ELIM cannot send the entire load update string to the LIM, LIM will not wait indefinitely for load information from ELIM. After the time period specified by LSF_ELIM_BLOCKTIME, the LIM writes the last string sent by ELIM in its log file (lim.log.host_name) and restarts the ELIM.
For example, if LIM is expecting 3 name-value-pairs, such as:
3 tmp2 49.5 nio 367.0 licenses 3
If after the time period specified by LSF_ELIM_BLOCKTIME LIM has only received the following:
3 tmp2 47.5
LIM writes whatever was received last (3 tmp2 47.5) in the log file and restarts the ELIM.
Valid values: Non-negative integers
A value of 0 indicates that LIM will not wait at all to receive information from ELIM--it expects to receive the entire load string at once.
For example, if your ELIM writes value-pairs with 1 second intervals between them, and you collect 12 load indices, you need to allow at least 12 seconds for the ELIM to complete writing an entire load string. In that case, you would set LSF_ELIM_BLOCKTIME to 15 or 20 seconds.
Default: 2 seconds
See also: LSF_ELIM_RESTARTS to limit how many times the ELIM can be restarted.
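The difference between a complete and an incomplete load update string can be checked mechanically. The sketch below assumes the format suggested by the example strings above: a count followed by that many name and value pairs, with numeric values. The function names are hypothetical and this is not how LIM itself is implemented.

```python
def parse_load_update(line):
    """Parse 'N name1 value1 ... nameN valueN' into a dict of floats.

    Raises ValueError if the string does not hold exactly N complete pairs.
    """
    tokens = line.split()
    if not tokens:
        raise ValueError("empty load update string")
    expected = int(tokens[0])
    rest = tokens[1:]
    if len(rest) != 2 * expected:
        raise ValueError(f"expected {expected} pairs, got {len(rest) // 2}")
    return {rest[i]: float(rest[i + 1]) for i in range(0, len(rest), 2)}


def minimum_blocktime(index_count, seconds_between_pairs):
    """Shortest wait, in seconds, that lets the ELIM write every pair."""
    return index_count * seconds_between_pairs
```

parse_load_update("3 tmp2 49.5 nio 367.0 licenses 3") returns all three indices, while the partial string "3 tmp2 47.5" raises ValueError. minimum_blocktime(12, 1) gives the 12 seconds from the example above, to which some margin should be added.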
LSF_ELIM_DEBUG
LSF_ELIM_DEBUG=y
UNIX only
This parameter is useful to view which load information an ELIM or MELIM is collecting and to add fault-tolerance to LIM.
When this parameter is set to y:
- All load information received by LIM from the ELIM or MELIM is logged in the LIM log file (lim.log.host_name).
- If LSF_ELIM_BLOCKTIME is undefined, whenever there is an error in the ELIM or some situation arises that the ELIM cannot send the entire load update string to the LIM, LIM does not wait indefinitely for load information from ELIM. After 2 seconds, the LIM restarts the ELIM.
For example, LIM is expecting 3 name-value-pairs, such as:
3 tmp2 47.5 nio 344.0 licenses 5
However, LIM only receives the following from ELIM:
3 tmp2 47.5
LIM waits 2 seconds after the last value is received and if no more information is received, LIM restarts the ELIM.
If LSF_ELIM_BLOCKTIME is defined, the LIM waits for the specified amount of time before restarting the ELIM instead of the 2 seconds.
Default: Undefined--if LSF_ELIM_DEBUG is undefined, load information sent from ELIM to LIM is not logged. In addition, if LSF_ELIM_BLOCKTIME is undefined, LIM waits indefinitely to receive load information from ELIM.
See also: LSF_ELIM_BLOCKTIME to configure how long LIM waits before restarting the ELIM.
LSF_ELIM_RESTARTS to limit how many times the ELIM can be restarted.
LSF_ELIM_RESTARTS
LSF_ELIM_RESTARTS=integer
UNIX only
LSF_ELIM_BLOCKTIME or LSF_ELIM_DEBUG must be defined in conjunction with LSF_ELIM_RESTARTS.
Defines the maximum number of times an ELIM or MELIM can be restarted.
When this parameter is defined:
- If LIM attempts to retrieve load information from the ELIM and there is an error, such as an invalid value, LIM restarts the ELIM.
If the error is consistent and LIM keeps restarting the ELIM, LSF_ELIM_RESTARTS limits how many times the ELIM can be restarted to prevent an ongoing loop.
Valid values: Non-negative integers
Default: Undefined; the number of ELIM restarts is unlimited
See also: LSF_ELIM_BLOCKTIME, LSF_ELIM_DEBUG
LSF_HOST_ADDR_RANGE
LSF_HOST_ADDR_RANGE = IP_address ...
Optional. Identifies the range of IP addresses that are allowed to be LSF hosts that can be dynamically added to or removed from the cluster.
If the value of this parameter is undefined, any host can be dynamically added to the cluster.
If a value is defined, security for dynamically adding and removing hosts is enabled, and only hosts with IP addresses within the specified range can be added to or removed from a cluster dynamically.
Specify an IP address or range of addresses, in dotted quad notation (nnn.nnn.nnn.nnn). Multiple ranges can be defined, separated by spaces.
If there is an error in the configuration of this variable (for example, an address range is not in the correct format), no host will be allowed to join the cluster dynamically and an error message will be logged in the LIM log. Address ranges are validated at configuration time, so they must conform to the required format.
If a requesting host belongs to an IP address that falls in the specified range, the host will be accepted to become an LSF host.
IP addresses are separated by spaces, and considered "OR" alternatives.
The asterisk (*) character indicates any value is allowed.
The dash (-) character indicates an explicit range of values. For example 1-4 indicates 1,2,3,4 are allowed.
Open ranges such as *-30, or 10-*, are allowed.
If a range is specified with fewer fields than an IP address, such as 10.161, it is considered as 10.161.*.*.
This parameter is limited to 255 characters.
After you configure LSF_HOST_ADDR_RANGE, check the lim.log.host_name file to make sure this parameter is correctly set. If this parameter is not set or is wrong, this will be indicated in the log file.
LSF_HOST_ADDR_RANGE=100
All hosts with a domain address starting with 100 will be allowed access.
LSF_HOST_ADDR_RANGE=100-110.34.1-10.4-56
All hosts belonging to a domain with an address having the first number between 100 and 110, then 34, then a number between 1 and 10, then, a number between 4 and 56 will be allowed access.
Example: 100.34.9.45, 100.34.1.4, 102.34.3.20, etc.
LSF_HOST_ADDR_RANGE=100.172.1.13 100.*.30-54 124.24-*.1.*-34
All hosts belonging to a domain with the address 100.172.1.13 will be allowed access. All hosts belonging to domains starting with 100, then any number, then a range of 30 to 54 will be allowed access. All hosts belonging to domains starting with 124, then from 24 onward, then 1, then from 0 to 34 will be allowed access.
LSF_HOST_ADDR_RANGE=12.23.45.*
All hosts belonging to domains starting with 12.23.45 are allowed.
LSF_HOST_ADDR_RANGE=100.*43
The * character can only be used to indicate any value. The format of this example is incorrect, so an error will be inserted in the LIM log and no hosts will be able to join the cluster dynamically.
LSF_HOST_ADDR_RANGE=100.*43 100.172.1.13
Although one correct address range is specified, because *43 is not in the correct format, the entire line is considered invalid. An error will be inserted in the LIM log and no hosts will be able to join the cluster dynamically.
Default: Undefined. No security is enabled. Any host in any domain can join the LSF cluster dynamically.
MASTER_INACTIVITY_LIMIT
MASTER_INACTIVITY_LIMIT = integer
An integer reflecting a multiple of EXINTERVAL. A slave will attempt to become master if it does not hear from the previous master after (HOST_INACTIVITY_LIMIT + host_number*MASTER_INACTIVITY_LIMIT)*EXINTERVAL seconds, where host_number is the position of the host in lsf.cluster.cluster_name.
The master host is host_number 0.
Default: 2
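The takeover formula staggers the candidates so that hosts listed earlier try first. A small sketch, using a hypothetical function name and the documented defaults (EXINTERVAL 15 seconds, HOST_INACTIVITY_LIMIT 5, MASTER_INACTIVITY_LIMIT 2), shows the resulting delays:

```python
def takeover_delay(host_number, exinterval=15, host_inactivity_limit=5,
                   master_inactivity_limit=2):
    """Seconds of silence from the master before this host tries to take over.

    host_number is the position of the host in the Host section;
    the master host is host_number 0.
    """
    intervals = host_inactivity_limit + host_number * master_inactivity_limit
    return intervals * exinterval
```

With the defaults, the second host in the list (host_number 1) waits 105 seconds and the third host (host_number 2) waits 135 seconds.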
PROBE_TIMEOUT
PROBE_TIMEOUT = time_in_seconds
Specifies the timeout in seconds to be used for the connect(2) system call.
Before taking over as the master, a slave LIM will try to connect to the last known master via TCP.
Default: 2 seconds
PRODUCTS
PRODUCTS = product_name ...
Specifies the LSF products and features that the cluster will run (you must also have a license for every product you want to run). The items in the list are separated by spaces.
The PRODUCTS parameter is set automatically during installation to include core features. Here are some of the optional products and features that can be specified:
LSF_Base LSF_Manager LSF_Sched_Fairshare LSF_Sched_Preemption LSF_Sched_Parallel LSF_Sched_Resource_Reservation
RETRY_LIMIT
RETRY_LIMIT = integer
Integer reflecting a multiple of EXINTERVAL that controls the number of retries a master or slave LIM makes before assuming that the slave or master is unavailable.
If the master does not hear from a slave for HOST_INACTIVITY_LIMIT exchange intervals, it will actively poll the slave for RETRY_LIMIT exchange intervals before it will declare the slave as unavailable. If a slave does not hear from the master for HOST_INACTIVITY_LIMIT exchange intervals, it will actively poll the master for RETRY_LIMIT intervals before assuming that the master is down.
Default: 2
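Taken together, the silent period and the polling period give a rough upper bound on how long a LIM can be unreachable before it is declared unavailable. The sketch below assumes that polling starts immediately after the silent period ends; the function name is hypothetical and the defaults are the documented ones.

```python
def seconds_until_unavailable(exinterval=15, host_inactivity_limit=5,
                              retry_limit=2):
    """Estimate the time before a silent LIM is declared unavailable."""
    silent_period = host_inactivity_limit * exinterval  # no message received
    polling_period = retry_limit * exinterval           # active polling
    return silent_period + polling_period
```

With the defaults this comes to 75 + 30 = 105 seconds.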
ClusterAdmins Section
(Optional) The ClusterAdmins section defines the LSF administrators for the cluster. The only keyword is ADMINISTRATORS.
If the ClusterAdmins section is not present, the default LSF administrator is root. Using root as the primary LSF administrator is not recommended.
ADMINISTRATORS
ADMINISTRATORS = administrator_name ...
Specify UNIX user names.
You can also specify UNIX user group names, Windows user names, and Windows user group names.
The first administrator of the expanded list is considered the primary LSF administrator. The primary administrator is the owner of the LSF configuration files, as well as the working files under LSB_SHAREDIR/cluster_name. If the primary administrator is changed, make sure the owner of the configuration files and the files under LSB_SHAREDIR/cluster_name are changed as well.
Administrators other than the primary LSF administrator have the same privileges as the primary LSF administrator except that they do not have permission to change LSF configuration files. They can perform clusterwide operations on jobs, queues, or hosts in the system.
For flexibility, each cluster may have its own LSF administrators, identified by a user name, although the same administrators can be responsible for several clusters.
Use the -l option of the lsclusters(1) command to display all of the administrators within a cluster.
Windows domain
- If the specified user or user group is a domain administrator, member of the Power Users group or a group with domain administrative privileges, the specified user or user group must belong to the LSF user domain.
- If the specified user or user group is a user or user group with a lower degree of privileges than outlined in the previous point, the user or user group must belong to the LSF user domain and be part of the Global Admins group.
Windows workgroup
- If the specified user or user group is not a workgroup administrator, member of the Power Users group, or a group with administrative privileges on each host, the specified user or user group must belong to the Local Admins group on each host.
For backwards compatibility, ClusterManager and Manager are synonyms for ClusterAdmins and ADMINISTRATORS respectively. It is possible to have both sections present in the same lsf.cluster.cluster_name file to allow daemons from different LSF versions to share the same file.
The following gives an example of a cluster with two LSF administrators. The user listed first, user2, is the primary administrator.
Begin ClusterAdmins
ADMINISTRATORS = user2 user7
End ClusterAdmins
Default: lsfadmin
Host Section
The Host section is the last section in lsf.cluster.cluster_name and is the only required section. It lists all the hosts in the cluster and gives configuration information for each host.
The order in which the hosts are listed in this section is important, because the first host listed becomes the LSF master host. Since the master LIM makes all placement decisions for the cluster, it should be on a fast machine.
The LIM on the first host listed becomes the master LIM if this host is up; otherwise, the LIM on the second host becomes the master if that host is up, and so on. To avoid the delays involved in switching masters if the first machine goes down, the master should also be on a reliable machine. It is desirable to arrange the list such that the first few hosts in the list are always in the same subnet. This avoids a situation where the second host takes over as master when there are communication problems between subnets.
Configuration information is of two types:
- Some fields in a host entry simply describe the machine and its configuration.
- Other fields set thresholds for various resources.
Example Host section
This example Host section contains descriptive and threshold information for three hosts:
Begin Host
HOSTNAME  model     type   server  r1m  pg  tmp  RESOURCES      RUNWINDOW
hostA     SparcIPC  Sparc  1       3.5  15  0    (sunos frame)  ()
hostD     Sparc10   Sparc  1       3.5  15  0    (sunos)        (5:18:30-1:8:30)
hostD     !         !      1       2.0  10  0    ()             ()
End Host
Descriptive fields
The following fields are required in the Host section:
- HOSTNAME
- model
- type
The following fields are optional:
- nd
- RESOURCES
- REXPRI
- RUNWINDOW
- server
HOSTNAME
Official name of the host as returned by hostname(1)
The name must be listed in lsf.shared as belonging to this cluster.
model
Host model
The name must be defined in the HostModel section of lsf.shared. This determines the CPU speed scaling factor applied in load and placement calculations.
Optionally, the ! keyword for the model or type column indicates that the host model or type is to be automatically detected by the LIM running on the host.
nd
Number of local disks
This corresponds to the ndisks static resource. On most host types, LSF automatically determines the number of disks, and the nd parameter is ignored.
nd should only count local disks with file systems on them. Do not count either disks used only for swapping or disks mounted with NFS.
Default: The number of disks determined by the LIM, or 1 if the LIM cannot determine this
RESOURCES
The static Boolean resources available on this host
The resource names are strings defined in the Resource section of lsf.shared. You may list any number of resources, enclosed in parentheses and separated by blanks or tabs. For example:
(fs frame hpux)
REXPRI
UNIX only
Default execution priority for interactive remote jobs run under the RES
The range is from -20 to 20. REXPRI corresponds to the BSD-style nice value used for remote jobs. For hosts with System V-style nice values with the range 0 - 39, a REXPRI of -20 corresponds to a nice value of 0, and +20 corresponds to 39. Higher values of REXPRI correspond to lower execution priority; -20 gives the highest priority, 0 is the default priority for login sessions, and +20 is the lowest priority.
Default: 0
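The correspondence between REXPRI and System V-style nice values can be sketched as a simple offset. Only the two endpoints (-20 maps to 0, +20 maps to 39) are given above; the mapping of intermediate values by adding 20 and capping at 39 is an assumption, and the function name is hypothetical.

```python
def rexpri_to_sysv_nice(rexpri):
    """Map a REXPRI value (-20..20) to a System V nice value (0..39).

    Assumption: intermediate values are shifted by 20 and capped at 39.
    """
    if not -20 <= rexpri <= 20:
        raise ValueError("REXPRI must be between -20 and 20")
    return min(rexpri + 20, 39)
```

Under this assumption the default REXPRI of 0 corresponds to a System V nice value of 20.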
RUNWINDOW
Dispatch window for interactive tasks.
When the host is not available for remote execution, the host status is lockW (locked by run window). LIM does not schedule interactive tasks on hosts locked by dispatch windows. Run windows only apply to interactive tasks placed by LIM. The LSF batch system uses its own (optional) host dispatch windows to control batch job processing on batch server hosts.
A dispatch window consists of one or more time windows in the format begin_time-end_time. No blanks can separate begin_time and end_time. Time is specified in the form [day:]hour[:minute]. If only one field is specified, LSF assumes it is an hour. Two fields are assumed to be hour:minute. Use blanks to separate time windows.
Default: Always accept remote jobs
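The time window format can be made concrete with a small parser. This sketch only splits a RUNWINDOW value into its fields according to the rules above; the function names are hypothetical, and it does not check that the day, hour, and minute values are in a valid range, since the accepted ranges are not stated here.

```python
def parse_time(text):
    """Parse [day:]hour[:minute] into (day, hour, minute).

    day is None when it is omitted; minute defaults to 0.
    """
    fields = [int(field) for field in text.split(":")]
    if len(fields) == 1:
        return (None, fields[0], 0)        # a single field is an hour
    if len(fields) == 2:
        return (None, fields[0], fields[1])  # two fields are hour:minute
    if len(fields) == 3:
        return (fields[0], fields[1], fields[2])
    raise ValueError(f"bad time: {text!r}")


def parse_run_window(value):
    """Parse a RUNWINDOW value into a list of (begin, end) pairs."""
    windows = []
    for item in value.strip("()").split():
        begin, dash, end = item.partition("-")
        if not dash:
            raise ValueError(f"bad time window: {item!r}")
        windows.append((parse_time(begin), parse_time(end)))
    return windows
```

For the value used in the example Host section, parse_run_window("(5:18:30-1:8:30)") returns one window running from day 5, 18:30 to day 1, 8:30.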
server
Indicates whether the host can receive jobs from other hosts
Specify 1 if the host can receive jobs from other hosts; specify 0 otherwise. If server is set to 0, the host is an LSF client. Client hosts do not run the LSF daemons. Client hosts can submit interactive and batch jobs to an LSF cluster, but they cannot execute jobs sent from other hosts.
Default: 1
type
Host type as defined in the HostType section of lsf.shared
The strings used for host types are determined by the system administrator: for example, SUNSOL, DEC, or HPPA. The host type is used to identify binary-compatible hosts.
The host type is used as the default resource requirement. That is, if no resource requirement is specified in a placement request, the task is run on a host of the same type as the sending host.
Often one host type can be used for many machine models. For example, the host type name SUNSOL6 might be used for any computer with a SPARC processor running SunOS 6. This would include many Sun models and quite a few from other vendors as well.
Optionally, the ! keyword for the model or type column indicates that the host model or type is to be automatically detected by the LIM running on the host.
Threshold fields
The LIM uses these thresholds in determining whether to place remote jobs on a host. If one or more LSF load indices exceed the corresponding threshold (too many users, not enough swap space, etc.), then the host is regarded as busy, and LIM will not recommend jobs to that host.
The CPU run queue length threshold values (r15s, r1m, and r15m) are taken as effective queue lengths as reported by lsload -E.
All of these fields are optional; you only need to configure thresholds for load indices that you wish to use for determining whether hosts are busy. Fields that are not configured are not considered when determining host status. The keywords for the threshold fields are not case sensitive.
Thresholds can be set for any of the following:
- The built-in LSF load indexes (r15s, r1m, r15m, ut, pg, it, io, ls, swp, mem, tmp)
- External load indexes defined in the Resource section of lsf.shared
ResourceMap Section
The ResourceMap section defines shared resources in your cluster. This section specifies the mapping between shared resources and their sharing hosts. When you define resources in the Resource section of lsf.shared, there is no distinction between a shared and non-shared resource. By default, all resources are not shared and are local to each host. By defining the ResourceMap section, you can define resources that are shared by all hosts in the cluster or define resources that are shared by only some of the hosts in the cluster.
This section must appear after the Host section of lsf.cluster.cluster_name, because it has a dependency on host names defined in the Host section.
ResourceMap section structure
The first line consists of the keywords RESOURCENAME and LOCATION. Subsequent lines describe the hosts that are associated with each configured resource.
Example ResourceMap section
Begin ResourceMap
RESOURCENAME  LOCATION
verilog       (5@[all])
local         ([host1 host2] [others])
End ResourceMap
The resource verilog must already be defined in the Resource section of the lsf.shared file. It is a static numeric resource shared by all hosts. The value for verilog is 5. The resource local is a numeric shared resource that contains two instances in the cluster. The first instance is shared by two machines, host1 and host2. The second instance is shared by all other hosts.
Resources defined in the ResourceMap section can be viewed by using the -s option of the lshosts (for static resource) and lsload (for dynamic resource) commands.
LOCATION
Defines the hosts that share the resource
For a static resource, you must define an initial value here as well. Do not define a value for a dynamic resource.
instance is a list of host names that share an instance of the resource. The reserved words all, others, and default can be specified for the instance:
all--Indicates that there is only one instance of the resource in the whole cluster and that this resource is shared by all of the hosts
Use the not operator (~) to exclude hosts from the all specification. For example:
(2@[all ~host3 ~host4])
means that 2 units of the resource are shared by all server hosts in the cluster made up of host1 host2 ... hostn, except for host3 and host4. This is useful if you have a large cluster but only want to exclude a few hosts.
The parentheses are required in the specification. The not operator can only be used with the all keyword. It is not valid with the keywords others and default.
others--Indicates that the rest of the server hosts not explicitly listed in the LOCATION field comprise one instance of the resource
For example:
2@[host1] 4@[others]
indicates that there are 2 units of the resource on host1 and 4 units of the resource shared by all other hosts.
default--Indicates an instance of a resource on each host in the cluster
This specifies a special case where the resource is in effect not shared and is local to every host. default means at each host. Normally, you should not need to use default, because by default all resources are local to each host. You might want to use ResourceMap for a non-shared static resource if you need to specify different values for the resource on different hosts.
RESOURCENAME
Name of the resource
This resource name must be defined in the Resource section of lsf.shared. You must specify at least a name and description for the resource, using the keywords RESOURCENAME and DESCRIPTION.
- A resource name cannot begin with a number.
- A resource name cannot contain any of the following characters:
: . ( ) [ + - * / ! & | < > @ =
- A resource name cannot be any of the following reserved names:
cpu cpuf io logins ls idle maxmem maxswp maxtmp type model status it mem ncpus ndisks pg r15m r15s r1m swap swp tmp ut
- Resource names are case sensitive
- Resource names can be up to 29 characters in length
RemoteClusters Section
Optional. This section is used only in a MultiCluster environment. By default, the local cluster can obtain information about all other clusters specified in lsf.shared. The RemoteClusters section limits the clusters that the local cluster can obtain information about.
The RemoteClusters section is required if you want to configure cluster equivalency, cache interval, daemon authentication across clusters, or if you want to run parallel jobs across clusters. To maintain compatibility in this case, make sure the list includes all clusters specified in lsf.shared, even if you only configure the default behavior for some of the clusters.
The first line consists of keywords. CLUSTERNAME is mandatory and the other parameters are optional.
Subsequent lines configure the remote cluster.
Example RemoteClusters section
Begin RemoteClusters
CLUSTERNAME  EQUIV  CACHE_INTERVAL  RECV_FROM  AUTH
cluster1     Y      60              Y          KRB
cluster2     N      60              Y          -
cluster4     N      60              N          PKI
End RemoteClusters
CLUSTERNAME
Remote cluster name
Defines the Remote Cluster list. Specify the clusters that you want the local cluster to recognize. Recognized clusters must also be defined in lsf.shared. Additional clusters listed in lsf.shared but not listed here will be ignored by this cluster.
EQUIV
Specify `Y' to make the remote cluster equivalent to the local cluster. Otherwise, specify `N'. The master LIM considers all equivalent clusters when servicing requests from clients for load, host, or placement information.
EQUIV changes the default behavior of LSF commands and utilities and causes them to automatically return load (lsload(1)), host (lshosts(1)), or placement (lsplace(1)) information about the remote cluster as well as the local cluster, even when you don't specify a cluster name.
CACHE_INTERVAL
Specify the load information cache threshold, in seconds. The host information threshold is twice the value of the load information threshold.
To reduce overhead and avoid updating information from remote clusters unnecessarily, LSF displays information in the cache, unless the information in the cache is older than the threshold value.
Default: 60 (seconds)
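The two cache thresholds can be expressed as a short check. This is a sketch of the rule as described, with hypothetical function and argument names: load information uses CACHE_INTERVAL as its threshold, host information uses twice that value, and cached data is used unless it is older than the threshold.

```python
def cache_is_fresh(age_seconds, kind, cache_interval=60):
    """Return True if cached remote-cluster information may still be shown.

    kind is "load" or "host"; host information tolerates twice the age.
    """
    if kind == "load":
        threshold = cache_interval
    elif kind == "host":
        threshold = 2 * cache_interval
    else:
        raise ValueError(f"unknown kind: {kind!r}")
    return age_seconds <= threshold
```

With the default of 60 seconds, information that is 90 seconds old is too old for load queries but still usable for host queries.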
RECV_FROM
Specifies whether the local cluster accepts parallel jobs that originate in a remote cluster
RECV_FROM does not affect regular or interactive batch jobs.
Specify `Y' if you want to run parallel jobs across clusters. Otherwise, specify `N'.
Default: Y
AUTH
Defines the preferred authentication method for LSF daemons communicating across clusters. Specify the same method name that is used to identify the corresponding eauth program (eauth.method_name). If the remote cluster does not prefer the same method, LSF uses default security between the two clusters.
Default: - (only privileged port (setuid) authentication is used between clusters)
Date Modified: February 24, 2004
Platform Computing: www.platform.com
Platform Support: support@platform.com
Platform Information Development: doc@platform.com
Copyright © 1994-2004 Platform Computing Corporation. All rights reserved.