IBM Books

Hitchhiker's Guide


Appendix A. A Sample Program to Illustrate Messages

This appendix provides sample output for a program run under POE with the maximum level of message reporting. It also points out the different types of messages you can expect, and explains what they mean.

To set the level of messages that get reported when you run your program, you can use the -infolevel (or -ilevel) option when you invoke POE, or the MP_INFOLEVEL environment variable. Setting either of these to 6 gives you the maximum number of diagnostic messages when you run your program. For more information about setting the POE message level, see IBM Parallel Environment for AIX: Operation and Use, Vol. 1.
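
As a rough illustration of the two equivalent ways to request a message level, the sketch below builds both invocations without running them. The function name `poe_invocations` is ours, not part of POE, and the lower bound of 0 is an assumption (the text above only tells us that 6 is the maximum). The `-procs` and `-hostfile` options are the ones used in the sample command in this appendix.

```python
import os


def poe_invocations(program, procs, hostfile, level=6):
    """Return two (argv, env) pairs that request the same POE message level.

    Hypothetical helper for illustration; either pair could be handed to
    subprocess.run(argv, env=env) on a system where poe is installed.
    """
    # Assumed range: the appendix states only that 6 is the maximum.
    if not 0 <= level <= 6:
        raise ValueError("message level must be between 0 and 6")

    base = ["poe", program, "-procs", str(procs), "-hostfile", hostfile]

    # Style 1: the -infolevel command-line option.
    with_flag = (base + ["-infolevel", str(level)], dict(os.environ))

    # Style 2: the MP_INFOLEVEL environment variable.
    env = dict(os.environ)
    env["MP_INFOLEVEL"] = str(level)
    with_env = (list(base), env)

    return with_flag, with_env
```

Calling `poe_invocations("hello_world_c", 2, "pool.list")` gives one argument list ending in `-infolevel 6` and one that relies on `MP_INFOLEVEL=6` in the environment instead.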

Note that we're using numbered prefixes along the left-hand edge of the output you see below as a way to refer to particular lines; they are not part of the output you'll see when you run your program. For an explanation of the messages denoted by these numbered prefixes, see "Figuring Out What All of This Means".

The following command:

> poe hello_world_c -procs 2 -hostfile pool.list -infolevel 6

produces the following output. Note that the Resource Manager was used in this example:

1   INFO: DEBUG_LEVEL changed from 0 to 4
2   D1<L4>: Open of file pool.list successful
3   D1<L4>: mp_euilib = ip
4   D1<L4>: task 0 5 1
5   D1<L4>: extended 1  5 1
6   D1<L4>: node allocation strategy = 2
7   INFO: 0031-690  Connected to Resource Manager
8   INFO: 0031-118  Pool 1 requested for task 0
9   INFO: 0031-118  Pool 1 requested for task 1
10  D1<L4>: Elapsed time for call to jm_allocate: 0 seconds
11  INFO: 0031-119  Host k10n01.ppd.pok.ibm.com allocated for task 0
12  INFO: 0031-119  Host k10n02.ppd.pok.ibm.com allocated for task 1
13  D1<L4>: Requesting service pmv2
14  D1<L4>: Jobid = 803755221
15  D4<L4>: Command args:<>
16  D1<L4>: Task 0 pulse count is 0
17  D1<L4>: Task 1 pulse count is 0
18  D3<L4>: Message type 34 from source 0
19  D1<L4>: Task 0 pulse count is 1
20  D1<L4>: Task 1 pulse count is 0
21  D3<L4>: Message type 21 from source 0
22    0: INFO: 0031-724  Executing program: <hello_world_c>
23  D3<L4>: Message type 34 from source 1
24  D1<L4>: Task 0 pulse count is 1
25  D1<L4>: Task 1 pulse count is 1
26  D3<L4>: Message type 21 from source 0
27    0: INFO: DEBUG_LEVEL changed from 0 to 4
28    0: D1<L4>: mp_euilib is <ip>
29    0: D1<L4>: mp_css_interrupt is <0>
30  D3<L4>: Message type 21 from source 1
31    1: INFO: 0031-724  Executing program: <hello_world_c>
32  D3<L4>: Message type 21 from source 0
33    0: D1<L4>: cssAdapterType is <1>
34  D3<L4>: Message type 21 from source 1
35    1: INFO: DEBUG_LEVEL changed from 0 to 4
36    1: D1<L4>: mp_euilib is <ip>
37    1: D1<L4>: mp_css_interrupt is <0>
38    1: D1<L4>: cssAdapterType is <1>
39  D3<L4>: Message type 23 from source 0
40  D1<L4>: init_data for task 0: <129.40.161.65: 1675>
41  D3<L4>: Message type 23 from source 1
42  D1<L4>: init_data for task 1: <129.40.161.66: 1565>
43  D2<L4>: About to call pm_address
44  D2<L4>: Elapsed time for pm_address: 0 seconds
45  D3<L4>: Message type 21 from source 1
46    1: D1<L4>: About to call mpci_connect
47  D3<L4>: Message type 21 from source 0
48  D3<L4>: Message type 21 from source 1
49  D3<L4>: Message type 21 from source 0
50  D3<L4>: Message type 21 from source 1
51  D3<L4>: Message type 21 from source 0
52  D3<L4>: Message type 21 from source 1
53  D3<L4>: Message type 21 from source 0
54  D3<L4>: Message type 21 from source 1
55  D3<L4>: Message type 21 from source 0
56  D3<L4>: Message type 21 from source 1
57  D3<L4>: Message type 21 from source 0
58  D3<L4>: Message type 21 from source 1
59  D3<L4>: Message type 21 from source 0
60    0: D1<L4>: About to call mpci_connect
61  D3<L4>: Message type 21 from source 1
62  D3<L4>: Message type 21 from source 0
63  D3<L4>: Message type 21 from source 1
64  D3<L4>: Message type 21 from source 0
65  D3<L4>: Message type 21 from source 1
66    1: D1<L4>: Elapsed time for mpci_connect: 1 seconds
67  D3<L4>: Message type 21 from source 0
68  D3<L4>: Message type 44 from source 1
69  D3<L4>: Message type 21 from source 0
70  D3<L4>: Message type 21 from source 0
71    0: D1<L4>: Elapsed time for mpci_connect: 0 seconds
72  D3<L4>: Message type 44 from source 0
73  D2<L4>: <C O N N E C T   D A T A>
74  D2<L4>: Task	Down Count	Nodes
75  D2<L4>: ====	==========	=====
76  D2<L4>: 0	0		
77  D2<L4>: 1	0		
78  D2<L4>: <E N D   O F   C O N N E C T   D A T A>
79  D3<L4>: Message type 21 from source 0
80    0: D1<L4>: About to call _ccl_init
81  D3<L4>: Message type 21 from source 1
82  D3<L4>: Message type 21 from source 1
83  D3<L4>: Message type 21 from source 1
84  D3<L4>: Message type 21 from source 1
85  D3<L4>: Message type 21 from source 1
86  D3<L4>: Message type 21 from source 1
87  D3<L4>: Message type 21 from source 1
88    1: D1<L4>: About to call _ccl_init
89  D3<L4>: Message type 21 from source 0
90  D3<L4>: Message type 21 from source 1
91  D3<L4>: Message type 21 from source 0
92  D3<L4>: Message type 21 from source 0
93  D3<L4>: Message type 21 from source 1
94  D3<L4>: Message type 21 from source 0
95  D3<L4>: Message type 21 from source 1
96  D3<L4>: Message type 21 from source 0
97  D3<L4>: Message type 21 from source 1
98  D3<L4>: Message type 21 from source 0
99  D3<L4>: Message type 21 from source 1
100 D3<L4>: Message type 21 from source 0
101 D3<L4>: Message type 21 from source 1
102 D3<L4>: Message type 21 from source 0
103 D3<L4>: Message type 21 from source 1
104 D3<L4>: Message type 21 from source 0
105   0: D1<L4>: Elapsed time for _ccl_init: 0 seconds
106 D3<L4>: Message type 21 from source 1
107 D3<L4>: Message type 20 from source 0
108   0: Hello, World!
109 D3<L4>: Message type 21 from source 1
110   1: D1<L4>: Elapsed time for _ccl_init: 0 seconds
111 D3<L4>: Message type 21 from source 0
112   0: INFO: 0033-3075 VT Node Tracing completed.  Node merge beginning
113 D3<L4>: Message type 20 from source 1
114   1: Hello, World!
115 D3<L4>: Message type 21 from source 0
116   0: INFO: 0031-306  pm_atexit: pm_exit_value is 0.
117 D3<L4>: Message type 21 from source 1
118   1: INFO: 0033-3075 VT Node Tracing completed.  Node merge beginning
119 D3<L4>: Message type 17 from source 0
120 D3<L4>: Message type 21 from source 1
121   1: INFO: 0031-306  pm_atexit: pm_exit_value is 0.
122 D3<L4>: Message type 17 from source 1
123 D3<L4>: Message type 22 from source 0
124 INFO: 0031-656  I/O file STDOUT closed by task 0
125 D3<L4>: Message type 22 from source 1
126 INFO: 0031-656  I/O file STDOUT closed by task 1
127 D3<L4>: Message type 15 from source 0
128 D1<L4>: Accounting data from task 0 for source 0:
129 D3<L4>: Message type 15 from source 1
130 D1<L4>: Accounting data from task 1 for source 1:
131 D3<L4>: Message type 22 from source 0
132 INFO: 0031-656  I/O file STDERR closed by task 0
133 D3<L4>: Message type 22 from source 1
134 INFO: 0031-656  I/O file STDERR closed by task 1
135 D3<L4>: Message type 1 from source 0
136 INFO: 0031-251  task 0 exited: rc=0
137 D3<L4>: Message type 1 from source 1
138 INFO: 0031-251  task 1 exited: rc=0
139 D1<L4>: All remote tasks have exited: maxx_errcode = 0
140 INFO: 0031-639  Exit status from pm_respond = 0
141 D1<L4>: Maximum return code from user = 0
142 D2<L4>: In pm_exit... About to call pm_remote_shutdown
143 D2<L4>: Sending PMD_EXIT to task 0
144 D2<L4>: Sending PMD_EXIT to task 1
145 D2<L4>: Elapsed time for pm_remote_shutdown: 0 seconds
146 D2<L4>: In pm_exit... About to call jm_disconnect
147 D2<L4>: Elapsed time for jm_disconnect: 0 seconds
148 D2<L4>: In pm_exit... Calling exit with status = 0 at Wed Jun 21 07: 15: 07 19
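
If you want to sort through output like this mechanically, the following sketch splits a line into its parts: the task label (if any), the kind of message (INFO, a D1 through D4 diagnostic, or your program's own output), the message number, and the text. The line shapes are inferred from the sample above rather than taken from a format specification, and the names `PoeLine` and `parse_line` are hypothetical. Pass `numbered=True` only for lines copied from this appendix, since real output does not carry the numbered prefixes.

```python
import re
from typing import NamedTuple, Optional


class PoeLine(NamedTuple):
    task: Optional[int]  # task label for remote-task lines, None for home-node lines
    kind: str            # "INFO", "D1".."D4", or "OUTPUT" for the program's own output
    code: Optional[str]  # message number such as "0031-690", when present
    text: str


_LISTING_NO = re.compile(r"^\d+\s+")    # numbered prefix used only in this appendix
_TASK = re.compile(r"^\s*(\d+):\s*")    # task label such as "0: " or "1: "
_INFO = re.compile(r"^INFO:\s*(?:(\d{4}-\d{3,4})\s+)?(.*)$")
_DEBUG = re.compile(r"^(D\d)<L\d>:\s*(.*)$")


def parse_line(line, numbered=False):
    """Classify one line of output (shapes inferred from the sample listing)."""
    line = line.rstrip("\n")
    if numbered:
        line = _LISTING_NO.sub("", line, count=1)

    task = None
    m = _TASK.match(line)
    if m:
        task = int(m.group(1))
        line = line[m.end():]

    m = _INFO.match(line)
    if m:
        return PoeLine(task, "INFO", m.group(1), m.group(2))

    m = _DEBUG.match(line)
    if m:
        return PoeLine(task, m.group(1), None, m.group(2))

    # Anything else is treated as output written by the user's program.
    return PoeLine(task, "OUTPUT", None, line)
```

For example, `parse_line("108   0: Hello, World!", numbered=True)` reports task 0, kind `OUTPUT`, and the text `Hello, World!`.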

Figuring Out What All of This Means

When you set -infolevel to 6, you get the full complement of diagnostic messages, which we'll explain here.

The example above includes numbered prefixes along the left-hand edge of the output so that we can refer to particular lines and tell you what they mean. Remember, these prefixes are not part of your output. The table below points you to the line numbers of the messages that are of most interest, and gives a short explanation of each.

Line(s)              Message Description
-------------------  -------------------
7, 8, 9              Connected to the Resource Manager; pool 1 was requested in the host list file, pool.list.
11-12                Names the hosts that are used.
13                   Indicates that service pmv2, from /etc/services, is being used.
18                   Message type 34 indicates pulse activity (the pulse mechanism checked that each remote node was actively participating with the home node).
21                   Message type 21 indicates a STDERR message.
28                   Shows the euilib message passing protocol that was specified (ip in this example).
40, 42               String returned from _eui_init, which initializes mpci_library.
66, 71               Indicates initialization of mpci_library.
107, 108, 113, 114   Message type 20 shows STDOUT from your program.
116, 121             Indicates that the user's program has reached the exit handler. The exit code is 0.
119, 122             Message type 17 indicates the tasks have requested to exit.
124, 126, 132, 134   Indicates that the STDOUT and STDERR pipes have been closed by the tasks.
127, 129             Message type 15 indicates accounting data.
143-144              Indicates that the home node is sending an exit.
146-147              Indicates that the home node is disconnecting from the job management system.
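
To get a quick feel for what a run was doing, you can tally the "Message type N from source M" lines. The sketch below is one way to do that. The meanings in `MESSAGE_TYPES` come only from the table above; message types that appear in the sample but are not described there (such as 1, 22, 23, and 44) are deliberately left without a meaning rather than guessed. The names `summarize` and `describe` are hypothetical.

```python
import re
from collections import Counter

# Meanings taken from the table in this appendix; nothing else is assumed.
MESSAGE_TYPES = {
    15: "accounting data",
    17: "task requested to exit",
    20: "STDOUT from the program",
    21: "STDERR message",
    34: "pulse activity",
}

_MSG = re.compile(r"Message type (\d+) from source (\d+)")


def summarize(lines):
    """Count (message type, source task) pairs found in the given lines."""
    counts = Counter()
    for line in lines:
        m = _MSG.search(line)
        if m:
            counts[(int(m.group(1)), int(m.group(2)))] += 1
    return counts


def describe(msg_type):
    """Return the table's meaning for a message type, if it gives one."""
    return MESSAGE_TYPES.get(msg_type, "not described in this appendix")
```

Feeding the saved output of a run to `summarize` and printing `describe(t)` for each type shows, for instance, how much of the traffic was STDERR diagnostics (type 21) compared with your program's own STDOUT (type 20).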

