



bacct


displays accounting statistics about finished jobs

SYNOPSIS

bacct [-b | -l] [-d] [-e] [-w] [-C time0,time1] [-D time0,time1] [-f logfile_name] [-m host_name ...] [-N host_name | -N host_model | -N CPU_factor] [-P project_name ...] [-q queue_name ...] [-sla service_class_name ...] [-S time0,time1] [-u user_name ... | -u all] [-x] [job_ID ...]

bacct -U reservation_ID ... | -U all [-u user_name ... | -u all]

bacct [-h | -V]

DESCRIPTION

By default, displays accounting statistics for all finished jobs (with a DONE or EXIT status) submitted by the user who invoked the command, on all hosts, projects, and queues in the LSF system.

By default, bacct displays statistics for all jobs logged in the current LSF accounting log file: LSB_SHAREDIR/cluster_name/logdir/lsb.acct (see lsb.acct(5)).

By default, CPU time is not normalized.

If neither -l nor -b is present, displays the fields in SUMMARY only (see OUTPUT).

Statistics not reported by bacct but of interest to individual system administrators can be generated by directly using awk(1) or perl(1) to process the lsb.acct file.
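
As a rough illustration of that kind of direct processing, the following Python sketch tallies the records in an accounting log by event type. It assumes that each record occupies one line and that the first whitespace-separated field is a quoted event-type string such as "JOB_FINISH"; the authoritative record layout is described in lsb.acct(5). The function name count_event_types and the sample records are hypothetical.

```python
from collections import Counter


def count_event_types(lines):
    """Count records by their first field (assumed to be the event type)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # First whitespace-separated field, with surrounding quotes removed.
        event_type = line.split(None, 1)[0].strip('"')
        counts[event_type] += 1
    return counts


# Hypothetical, heavily shortened records; real records carry many more fields.
sample = [
    '"JOB_FINISH" "6.0" 1060643934 1743',
    '"JOB_FINISH" "6.0" 1060712708 1948',
]
print(count_event_types(sample))
```

To apply the sketch to a real log, pass an open file object instead of the sample list.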

All times are in seconds.

When combined with the -U option, -u is interpreted as the user name of the reservation creator. For example:

% bacct -U all -u user2

Shows all the advance reservations created by user user2.

Without the -u option, bacct -U shows all advance reservation information about jobs submitted by the user who invoked the command.

In a MultiCluster environment, advance reservation information is only logged in the execution cluster, so bacct displays advance reservation information for local reservations only. You cannot see information about remote reservations.

Throughput Calculation

The throughput (T) of the LSF system, certain hosts, or certain queues is calculated by the formula:

T = N/(ET-BT)

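where:

N is the total number of jobs for which accounting statistics are reported

BT is the Start time: when the first job was logged

ET is the End time: when the last job was logged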

You can use the option -C time0,time1 to specify the Start time as time0 and the End time as time1. In this way, you can examine throughput during a specific time period.

Only logged jobs (that is, jobs with a DONE or EXIT status) are included in the throughput calculation. Jobs that are running, suspended, or that have never been dispatched after submission are not considered, because they are still in the LSF system and not yet logged in lsb.acct.

To calculate the total throughput of the LSF system, specify -u all without any of the -m, -q, -S, -D or job_ID options. To calculate the throughput of certain hosts, specify -u all with -m, without the -q, -S, -D or job_ID options. To calculate the throughput of certain queues, specify -u all with -q, without the -m, -S, -D or job_ID options.
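
The calculation can be checked by hand against the SUMMARY shown under EXAMPLES. The Python sketch below takes N as the number of finished jobs and BT and ET as the Beginning time and Ending time printed in that SUMMARY, and expresses the result in jobs per hour, the unit bacct reports. The function name throughput_per_hour is hypothetical, and the year is an assumption because the sample output does not show one.

```python
from datetime import datetime


def throughput_per_hour(n_jobs, begin, end):
    """Return T = N / (ET - BT) in jobs per hour."""
    hours = (end - begin).total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("end time must be later than begin time")
    return n_jobs / hours


# Values from the default-format example: 60 done + 118 exited jobs,
# Beginning time Aug 8 15:48, Ending time Aug 19 17:59.
# The year is not shown in the output; 2003 is an assumption.
begin = datetime(2003, 8, 8, 15, 48)
end = datetime(2003, 8, 19, 17, 59)
t = throughput_per_hour(60 + 118, begin, end)
print(f"{t:.2f} jobs/hour")  # prints "0.67 jobs/hour"
```

The result agrees with the "Total throughput: 0.67 (jobs/hour) during 266.18 hours" line of the example.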

OPTIONS

-b

Brief format. Displays accounting statistics in brief format. See OUTPUT for a description of information that is displayed.

-d

Displays accounting statistics for only successfully completed jobs (with a DONE status).

-e

Displays accounting statistics for only exited jobs (with an EXIT status).

-l

Long format. Displays additional accounting statistics. See OUTPUT for a description of information that is displayed.

-w

Wide format. Displays accounting statistics in a wide format. No truncation is performed.

-C time0,time1

Displays accounting statistics for only jobs that completed or exited during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.

The time format is the same as in bhist(1).

-D time0,time1

Displays accounting statistics for only jobs dispatched during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.

The time format is the same as in bhist(1).

-f logfile_name

Searches only the specified job log file for accounting statistics. Specify either an absolute or relative path.

Useful for offline analysis.

-m host_name ...

Displays accounting statistics for only jobs dispatched to the specified hosts.

If a list of hosts is specified, host names must be separated by spaces and enclosed in double (") or single (') quotation marks.

-N host_name | -N host_model | -N CPU_factor

Normalizes CPU time by the CPU factor of the specified host or host model, or by the specified CPU factor.

If you use bacct offline by indicating a job log file, you must specify a CPU factor.

Use lsinfo to get host model and CPU factor information.

-P project_name ...

Displays accounting statistics for only jobs belonging to the specified projects. If a list of projects is specified, project names must be separated by spaces and enclosed in double (") or single (') quotation marks.

-q queue_name ...

Displays accounting statistics only for jobs submitted to the specified queues.

If a list of queues is specified, queue names must be separated by spaces and enclosed in double (") or single (') quotation marks.

-S time0,time1

Displays accounting statistics for only jobs submitted during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.

The time format is the same as in bhist(1).

-sla service_class_name

Displays accounting statistics for jobs that ran under the specified service class.

Use bsla to display the properties of service classes configured in LSB_CONFDIR/cluster_name/configdir/lsb.serviceclasses (see lsb.serviceclasses(5)) and dynamic information about the state of each service class.

-U reservation_ID ... | -U all

Displays accounting statistics for the specified advance reservation IDs, or for all reservation IDs if the keyword all is specified.

If a list of reservation IDs is specified, the IDs must be separated by spaces and enclosed in double (") or single (') quotation marks.

In a MultiCluster environment, you cannot see information about remote reservations. You cannot specify a remote reservation ID, and the keyword all only displays information about reservations in the local cluster.

-u user_name ... | -u all

Displays accounting statistics only for jobs submitted by the specified users, or by all users if the keyword all is specified.

If a list of users is specified, user names must be separated by spaces and enclosed in double (") or single (') quotation marks. You can specify both user names and user IDs in the list of users.

-x

Displays jobs that have triggered a job exception (overrun, underrun, idle). Use with the -l option to show the exception status for individual jobs.

job_ID ...

Displays accounting statistics for only jobs with the specified job IDs.

This option overrides all other options except -b, -l, -f, -h, and -V. If the reserved job ID 0 is used, it will be ignored.

-h

Prints command usage to stderr and exits.

-V

Prints LSF release version to stderr and exits.
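
Several options above (-m, -P, -q, -U, -u) accept a list that must reach bacct as a single quoted argument. When bacct is driven from a script, this means each list is one element of the argument vector. The Python sketch below builds such a vector and prints the equivalent shell command line; the user names come from the examples in this page, and the queue name night is hypothetical.

```python
import shlex

# Each list option takes ONE argument; the names inside it are separated by spaces.
argv = ["bacct", "-u", "user1 user2", "-q", "normal night"]

# shlex.join shows the equivalent shell command line, with quoting added.
print(shlex.join(argv))  # prints: bacct -u 'user1 user2' -q 'normal night'
```

On a host where bacct is installed, the same list could be passed unchanged to subprocess.run(argv).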

OUTPUT

SUMMARY (default format)

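Statistics on jobs. The following fields are displayed:

  - Total number of done jobs
  - Total number of exited jobs
  - Total CPU time consumed
  - Average CPU time consumed
  - Maximum CPU time of a job
  - Minimum CPU time of a job
  - Total wait time in queues
  - Average wait time in queue
  - Maximum wait time in queue
  - Minimum wait time in queue
  - Average turnaround time (seconds/job)
  - Maximum turnaround time
  - Minimum turnaround time
  - Average hog factor of a job (cpu time / turnaround time)
  - Maximum hog factor of a job
  - Minimum hog factor of a job
  - Total throughput (jobs/hour)
  - Beginning time
  - Ending time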

The total, average, minimum, and maximum statistics are on all specified jobs.

The wait time is the elapsed time from job submission to job dispatch.

The turnaround time is the elapsed time from job submission to job completion.

The hog factor is the amount of CPU time consumed by a job divided by its turnaround time.

The throughput is the number of completed jobs divided by the time period to finish these jobs (jobs/hour). For more details, see DESCRIPTION.
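
The wait time, turnaround time, and hog factor definitions above can be reproduced from the timestamps that bacct -l prints. The Python sketch below does so for job 1743 from the "bacct -x -l" example under EXAMPLES. The variable names are illustrative, and the year is an assumption because the sample output does not show one.

```python
from datetime import datetime

# Timestamps and CPU time of job 1743 from the "bacct -x -l" example.
# The year is not shown in the output; 2003 is an assumption.
submitted = datetime(2003, 8, 11, 18, 16, 17)
dispatched = datetime(2003, 8, 11, 18, 17, 22)
completed = datetime(2003, 8, 11, 18, 18, 54)
cpu_time = 0.19  # seconds (CPU_T)

wait = (dispatched - submitted).total_seconds()       # submission to dispatch
turnaround = (completed - submitted).total_seconds()  # submission to completion
hog_factor = cpu_time / turnaround                    # CPU time / turnaround time

print(int(wait), int(turnaround), f"{hog_factor:.4f}")  # prints "65 157 0.0012"
```

These values match the WAIT, TURNAROUND, and HOG_FACTOR columns reported for that job.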

Brief Format (-b)

In addition to the default format SUMMARY, displays the following fields:

U/UID

Name of the user who submitted the job. If LSF fails to get the user name by getpwuid(3), the user ID is displayed.

QUEUE

Queue to which the job was submitted.

SUBMIT_TIME

Time when the job was submitted.

CPU_T

CPU time consumed by the job.

WAIT

Wait time of the job.

TURNAROUND

Turnaround time of the job.

FROM

Host from which the job was submitted.

EXEC_ON

Host or hosts to which the job was dispatched to run.

JOB_NAME

Name of the job (see bsub(1)).

Long Format (-l)

In addition to the fields displayed by default in SUMMARY and by -b, displays the following fields:

JOBID

Identifier that LSF assigned to the job.

PROJECT_NAME

Project name assigned to the job.

STATUS

Status that indicates the job was either successfully completed (DONE) or exited (EXIT).

DISPAT_TIME

Time when the job was dispatched to run on the execution hosts.

COMPL_TIME

Time when the job exited or completed.

HOG_FACTOR

Average hog factor, equal to "CPU time" / "turnaround time".

MEM

Maximum resident memory usage of all processes in a job, in kilobytes.

SWAP

Maximum virtual memory usage of all processes in a job, in kilobytes.

CWD

Current working directory of the job.

INPUT_FILE

File from which the job reads its standard input (see bsub(1)).

OUTPUT_FILE

File to which the job writes its standard output (see bsub(1)).

ERR_FILE

File in which the job stores its standard error output (see bsub(1)).

EXCEPTION STATUS

Possible values for the exception status of a job include:

idle

The job is consuming less CPU time than expected. The job idle factor (CPU time/runtime) is less than the configured JOB_IDLE threshold for the queue and a job exception has been triggered.

overrun

The job is running longer than the number of minutes specified by the JOB_OVERRUN threshold for the queue and a job exception has been triggered.

underrun

The job finished sooner than the number of minutes specified by the JOB_UNDERRUN threshold for the queue and a job exception has been triggered.
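
The three definitions above amount to simple threshold comparisons. The Python sketch below restates them for a single job; it is a simplification that ignores when and how often LSF actually evaluates the thresholds, so it is not expected to reproduce bacct output exactly. The function name exception_status, the parameter names, and all numeric values are hypothetical.

```python
def exception_status(cpu_time_s, run_time_s, finished,
                     job_idle, job_overrun_min, job_underrun_min):
    """Return the exception labels suggested by the definitions above."""
    status = []
    run_time_min = run_time_s / 60.0
    # overrun: the job ran longer than JOB_OVERRUN minutes
    if run_time_min > job_overrun_min:
        status.append("overrun")
    # underrun: the job finished sooner than JOB_UNDERRUN minutes
    if finished and run_time_min < job_underrun_min:
        status.append("underrun")
    # idle: idle factor (CPU time / run time) is below JOB_IDLE
    if run_time_s > 0 and cpu_time_s / run_time_s < job_idle:
        status.append("idle")
    return status


# Hypothetical queue thresholds: JOB_IDLE=0.10, JOB_OVERRUN=5, JOB_UNDERRUN=2.
print(exception_status(0.2, 600, True, 0.10, 5, 2))  # ['overrun', 'idle']
print(exception_status(30, 60, True, 0.10, 5, 2))    # ['underrun']
```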

Advance Reservations (-U)

Displays the following fields:

RSVID

Advance reservation ID assigned by the brsvadd command.

TYPE

Type of reservation: user or system.

CREATOR

User name of the advance reservation creator, who submitted the brsvadd command.

USER

User name of the advance reservation user, who submitted the job with bsub -U.

NCPUS

Number of CPUs reserved.

RSV_HOSTS

List of hosts for which processors are reserved, and the number of processors reserved.

TIME_WINDOW

Time window for the reservation.

EXAMPLES

Default format

% bacct 

Accounting information about jobs that are: 
  - submitted by users user1. 
  - accounted on all projects.
  - completed normally or exited.
  - executed on all hosts.
  - submitted to all queues.
  - accounted on all service classes.
------------------------------------------------------------------------------

SUMMARY:      ( time unit: second ) 
 Total number of done jobs:      60      Total number of exited jobs:   118
 Total CPU time consumed:    1011.5      Average CPU time consumed:     5.7
 Maximum CPU time of a job:   991.4      Minimum CPU time of a job:     0.0
 Total wait time in queues: 134598.0
 Average wait time in queue:  756.2
 Maximum wait time in queue: 7069.0      Minimum wait time in queue:    0.0
 Average turnaround time:      3585 (seconds/job)
 Maximum turnaround time:     77524      Minimum turnaround time:         6
 Average hog factor of a job:  0.00 ( cpu time / turnaround time )
 Maximum hog factor of a job:  0.56      Minimum hog factor of a job:  0.00
 Total throughput:             0.67 (jobs/hour)  during  266.18 hours
 Beginning time:       Aug  8 15:48      Ending time:          Aug 19 17:59

Jobs that have triggered job exceptions

% bacct -x -l

Accounting information about jobs that are: 
  - submitted by users user1, 
  - accounted on all projects.
  - completed normally or exited
  - executed on all hosts.
  - submitted to all queues.
  - accounted on all service classes.
------------------------------------------------------------------------------

Job <1743>, User <user1>, Project <default>, Status <DONE>, Queue <normal>,  
Command
                     <sleep 30>
Mon Aug 11 18:16:17: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File
                     </dev/null>;
Mon Aug 11 18:17:22: Dispatched to <hostC>;
Mon Aug 11 18:18:54: Completed <done>.

 EXCEPTION STATUS:  underrun 

Accounting information about this job:
     CPU_T     WAIT     TURNAROUND   STATUS     HOG_FACTOR    MEM    SWAP
      0.19       65            157     done         0.0012     4M      5M
------------------------------------------------------------------------------

Job <1948>, User <user1>, Project <default>, Status <DONE>, Queue <normal>,
Command 
                     <sleep 550>
Tue Aug 12 14:15:03: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File
                     </dev/null>;
Tue Aug 12 14:15:15: Dispatched to <hostC>;
Tue Aug 12 14:25:08: Completed <done>.

 EXCEPTION STATUS:  overrun  idle 

Accounting information about this job:
     CPU_T     WAIT     TURNAROUND   STATUS     HOG_FACTOR    MEM    SWAP
      0.20       12            605     done         0.0003     4M      5M
------------------------------------------------------------------------------


Job <1949>, User <user1>, Project <default>, Status <DONE>, Queue <normal>,
Command 
                     <sleep 400>
Tue Aug 12 14:26:11: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File
                     </dev/null>;
Tue Aug 12 14:26:18: Dispatched to <hostC>;
Tue Aug 12 14:33:16: Completed <done>.

 EXCEPTION STATUS:  idle 

Accounting information about this job:
     CPU_T     WAIT     TURNAROUND   STATUS     HOG_FACTOR    MEM    SWAP
      0.17        7            425     done         0.0004     4M      5M

Job <719[14]>, Job Name <test[14]>, User <user1>, Project <default>, Status
                     <EXIT>, Queue <normal>, Command </home/user1/job1>
Mon Aug 18 20:27:44: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File
                     </dev/null>;
Mon Aug 18 20:31:16: [14] dispatched to <hostA>;
Mon Aug 18 20:31:18: Completed <exit>.

 EXCEPTION STATUS:  underrun 

Accounting information about this job:
     CPU_T     WAIT     TURNAROUND   STATUS     HOG_FACTOR    MEM    SWAP
      0.19      212            214     exit         0.0009     2M      4M
------------------------------------------------------------------------------

SUMMARY:      ( time unit: second ) 
 Total number of done jobs:      45      Total number of exited jobs:    56
 Total CPU time consumed:    1009.1      Average CPU time consumed:    10.0
 Maximum CPU time of a job:   991.4      Minimum CPU time of a job:     0.1
 Total wait time in queues: 116864.0
 Average wait time in queue: 1157.1
 Maximum wait time in queue: 7069.0      Minimum wait time in queue:    7.0
 Average turnaround time:      1317 (seconds/job)
 Maximum turnaround time:      7070      Minimum turnaround time:        10
 Average hog factor of a job:  0.01 ( cpu time / turnaround time )
 Maximum hog factor of a job:  0.56      Minimum hog factor of a job:  0.00
 Total throughput:             0.59 (jobs/hour)  during  170.21 hours
 Beginning time:       Aug 11 18:18      Ending time:          Aug 18 20:31

Advance reservation accounting information

% bacct -U user1#2
Accounting for:
  - advanced reservation IDs: user1#2
  - advanced reservations created by user1
-----------------------------------------------------------------------------
RSVID       TYPE      CREATOR    USER    NCPUS       RSV_HOSTS     TIME_WINDOW
user1#2     user        user1   user1      1           hostA:1    9/16/17/36-9/16/17/38
SUMMARY:
Total number of jobs:               4
Total CPU time consumed:      0.5 second
Maximum memory of a job:     4.2 MB
Maximum swap of a job:         5.2 MB
Total duration time:                 0 hour    2 minute    0 second
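
The "Total duration time" in this example is consistent with the TIME_WINDOW value when that value is read as month/day/hour/minute for the start and the end of the window. That reading is an assumption drawn from the sample alone, as are the function name window_duration and the year used below. A minimal Python sketch of the check:

```python
from datetime import datetime


def window_duration(time_window, year=2003):
    """Return the length of a 'M/D/h/m-M/D/h/m' window as a timedelta."""
    start_text, end_text = time_window.split("-")

    def parse(text):
        # Assumed field order: month/day/hour/minute
        month, day, hour, minute = (int(part) for part in text.split("/"))
        return datetime(year, month, day, hour, minute)

    return parse(end_text) - parse(start_text)


duration = window_duration("9/16/17/36-9/16/17/38")
print(duration)  # prints "0:02:00"
```

The two-minute result matches the "0 hour 2 minute 0 second" duration reported in the SUMMARY.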

FILES

Reads lsb.acct, lsb.acct.n.

SEE ALSO

bhist(1), bsub(1), bjobs(1), lsb.acct(5), brsvadd(8), brsvs(1), bsla(1), lsb.serviceclasses(5)



      Date Modified: February 24, 2004
Platform Computing: www.platform.com

Platform Support: support@platform.com
Platform Information Development: doc@platform.com

Copyright © 1994-2004 Platform Computing Corporation. All rights reserved.