Diagnosing Bottlenecks with Performance Toolbox


About this document
     Related documentation
Monitoring system performance
Recording performance data
Analysis of collected data
Sample configuration files

About this document

This document describes how to check for resource bottlenecks using the Performance Toolbox for AIX. Resources on a system include memory, CPU, and input/output (I/O). This document covers bottlenecks across an entire system and applies to Performance Toolbox versions 2.1 and 2.2. It does not address how to find the bottlenecks of a particular application. The commands described include xmpeek, chmon, xmperf, 3dmon, ptxrlog, 3dplay, and azizo, along with the xmservd and filtd daemons.

The Performance Toolbox package is a very powerful and diverse tool for identifying performance problems and potential performance bottlenecks. The following sections of this document will help you become familiar with what it has to offer and give you an idea of how to proceed with a strategy for monitoring system performance over a period of time. Please note that modifications to this basic strategy may have to be made in order to accommodate certain specific system environments.

Related documentation

Performance Analyzer - The AIX Support Family offers a product that utilizes the filter daemon of Performance Toolbox. When one of the alarms is triggered, AIX Performance Analyzer is invoked and sends a message to a specified user alerting them that a problem has been detected and provides a list of suggestions on how to alleviate the performance problem.

Consult Line Performance Analysis - The AIX Support Family offers a system analysis with tuning recommendations. For more information contact your AIX support center.

Performance Toolbox for AIX Guide and Reference (SC23-2625) - This IBM publication covers the use of the Performance Toolbox for AIX licensed product.

Performance Tuning Guide (SC23-2365) - This IBM publication covers performance monitoring and tuning of AIX systems. Order through your local IBM representative.

Monitoring system performance

All data is gathered by the xmservd daemon and forwarded to the program requesting the information.

For a listing of all the possible statistics and metrics which the xmservd daemon gathers, enter the following command (the listing is dependent upon each system's hardware and software configuration):

   # xmpeek -l | pg

There are four tools available for monitoring system performance in real-time: chmon, xmperf, 3dmon and ptxrlog.

The chmon program is a character-based program suited to run on a tty or within an xterm that allows for real-time monitoring of general performance data for checking the overall health of the system. The following syntax includes information about the three most active processes along with general system information:
   chmon -p3
The xmperf program is an X Windows-based program which allows for real-time monitoring of any set of statistics and metrics available to the xmservd daemon. The manner in which data is displayed is highly customizable. Simply run:
The 3dmon program is an X Windows-based program which provides real-time monitoring of any system running xmservd. Its purpose is to provide a 3D bar graph of performance statistics for multiple hosts/resources at the same time. The /usr/lpp/perfmgr/3dmon.cf file lists the available statistics and configuration sets. Run the following command to monitor LAN statistics:
   3dmon -c LAN
The ptxrlog command can be used much like vmstat to view system statistics as line-by-line ASCII output at a set interval. Statistics listed in a control file called ptxrlog.cf can be monitored every 5 seconds with the syntax:
   ptxrlog -f ptxrlog.cf -i 5

All four of these programs can be started from the command line. The commands 3dmon, chmon, and ptxrlog can be started from within xmperf under the Utilities Menu.

Another useful feature of Performance Toolbox is the ability to have the system monitor and react to specific situations. This is done with the filter daemon (filtd). One useful approach is to install bos.perf.pmr and have the filter daemon launch the scripts it provides, so that related performance data is gathered whenever an alarm condition is detected.

NOTE: IBM's Consult Line services can help analyze data gathered from these scripts. In addition, IBM offers a product called Performance Analyzer that utilizes filtd and provides alarms that detect potential performance problems. See the "Related documentation" section of this document for more information.
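
As a rough illustration of what an alarm condition tests, the highcpuload entry in the sample filter.cf at the end of this document can be read as "user plus kernel CPU above 90 percent, and a run queue longer than the number of CPUs plus one". That reading is an assumption based on the comment that precedes the entry. The sketch below mirrors the arithmetic with plain integers; high_cpu_load is a hypothetical helper name and is not part of Performance Toolbox.

```shell
#!/bin/sh
# Sketch only: reproduces the assumed highcpuload test with plain integers.
# It does not communicate with filtd or xmservd.
# Arguments: user CPU %, kernel CPU %, run queue length, number of CPUs.
high_cpu_load() {
    user=$1
    kern=$2
    runque=$3
    ncpu=$4
    # True when combined CPU use exceeds 90% and the run queue
    # is longer than (number of CPUs + 1).
    [ $((user + kern)) -gt 90 ] && [ "$runque" -gt $((ncpu + 1)) ]
}

# Example use on a 2-CPU system:
#   if high_cpu_load 70 25 4 2; then echo "High CPU load"; fi
```

Both parts of the test must hold, so a busy CPU with a short run queue does not raise the alarm.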

Using the filtd daemon

Using the filtd daemon requires working with the xmservd.res and filter.cf files. Sample versions of these files can be found in /usr/lpp/perfagent. It is recommended that these files be copied from /usr/lpp/perfagent to /etc/perf for customization.
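
The copy step can be sketched as a small shell function that leaves any file already customized in the target directory untouched. The function name copy_perf_samples is hypothetical, and the two directories default to the paths named above.

```shell
#!/bin/sh
# Sketch only: copy the sample files into a working directory without
# overwriting copies that have already been customized.
# copy_perf_samples is a hypothetical helper, not part of Performance Toolbox.
copy_perf_samples() {
    src=${1:-/usr/lpp/perfagent}   # sample location named in this document
    dst=${2:-/etc/perf}            # customization directory named in this document
    mkdir -p "$dst" || return 1
    for f in xmservd.res filter.cf; do
        if [ ! -f "$src/$f" ]; then
            echo "missing sample: $src/$f" >&2
            return 1
        fi
        if [ -f "$dst/$f" ]; then
            echo "keeping existing $dst/$f"
        else
            cp "$src/$f" "$dst/$f" && echo "copied $f to $dst"
        fi
    done
}

# Example use (as root):
#   copy_perf_samples
```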

At the end of /etc/rc.tcpip, add the following lines:
   # Starting xmservd (Performance Toolbox Daemon)
   sleep 2
Modify the line beginning with xmquery in /etc/inetd.conf to include -l0. A value of zero with the -l flag tells xmservd to remain active at all times. By default, it exits after 15 minutes of inactivity. The modified line should read:
   xmquery dgram udp wait root /usr/bin/xmservd xmservd -p3 -l0
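
One way to confirm the change took effect is to search for an uncommented xmquery line that carries the -l0 flag. The sketch below assumes the path of the file holding the xmquery entry is passed as an argument; has_keepalive_flag is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: succeed when the given file has an uncommented line that
# starts with "xmquery" and includes the -l0 flag as a separate word.
# has_keepalive_flag is a hypothetical helper, not part of Performance Toolbox.
has_keepalive_flag() {
    conf=$1
    grep -E '^xmquery[[:space:]].*[[:space:]]-l0([[:space:]]|$)' "$conf" >/dev/null
}

# Example use, with the file path supplied by the caller:
#   has_keepalive_flag "$file" && echo "xmservd will stay active"
```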

Recording performance data

There are three types of recording files that can be created with Performance Toolbox: a binary recording file, an ASCII recording file, and a formatted ASCII recording file which can be read into a spreadsheet program.

The xmperf, 3dmon, and ptxrlog programs, as well as the xmservd daemon itself, can all be used to record performance data.

In xmperf, recordings can be started for each active monitor window or for each instrument within a monitor window. Previously recorded data can be played back by xmperf.

Recordings can be started while 3dmon monitoring is active. For those with Version 2.2, the program 3dplay also offers a playback function for 3dmon recordings.

The ptxrlog command can record in binary format, in plain ASCII, or in formatted ASCII for reading into a spreadsheet. It can be started from the command line or from within xmperf. The start and end times of the recording can be controlled with the -b and -e flags.

The xmservd daemon can be used to generate a binary recording file in the background while xmservd is running. The recording takes place on the system which is actually running the xmservd daemon, not over the network. The binary recording files are stored in /etc/perf.

To record performance statistics with xmservd, create an xmservd.cf file in /etc/perf on the system whose data you wish to record. A sample xmservd.cf file is provided at the end of this document.

Analysis of collected data

Collected data can be played back in xmperf or 3dplay (Version 2.2 only). The various commands beginning with the letters ptx allow for exporting collected data into a spreadsheet program. The azizo utility displays or prints recording files in graphical or tabulated formats. Files containing graphs are in PostScript format.

Sample configuration files

#   "SAMPLE filter.cf FILE"
#   output files are created in /var/perf/tmp
#   bos.perf.pmr (AIX 4.1 only) must be installed to run
#   the scripts in /usr/sbin/perf/pmr.  Personal scripts can
#   be used in place of those provided with bos.perf.pmr.
diskmax = Disk_>_busy "Busy percent - most busy disk"
cpunum  = CPU_#_user  "Number of CPU's on the system"
# Better number for the run queue limit: 1 more than the # of processors on the machine
@highcpuload:[/usr/sbin/perf/pmr/tprof_ 10] \
(CPU_gluser + CPU_glkern) > 90 && (Proc_runque > \
(DDS_IBM_Filters_cpunum + 1)) \
"High CPU load"
@thrashing:   [/usr/sbin/perf/pmr/vmstat_ 60] \
(Mem_Virt_pgspgout * 6) > Mem_Virt_steal \
"System is thrashing"
@lowfree:     [/usr/sbin/perf/pmr/vmstat_ 60] \
Mem_Real_numfrb < 40 \
"Low free frames in RAM"
@lowpgsp:     [/usr/sbin/lsps -a >> /var/perf/tmp/pagingspace 2>&1] \
PagSp_%totalused > 90 \
"Low paging space"
@wait:        [/usr/sbin/perf/pmr/filemon_ 30] \
CPU_cpu0_wait > 0 && \
DDS_IBM_Filters_diskmax > 33 \
"I/O wait"

# "SAMPLE xmservd.cf FILE"
# Keep files at least 14 days and let each file contain
# one day's recordings
retain 14 1
# Set default sampling interval to 1 second
frequency 1000
# Statistics to record with default frequency
# Statistics recorded every 5 minutes
Mem/Real/%comp 300000
Mem/Real/%noncomp 300000
PagSp/%totalused 300000
# Record every weekday from 7 am to 7 pm
start 1-5 7 0 1-5 19 0
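
The intervals in the sample xmservd.cf above appear to be in milliseconds: 1000 is commented as 1 second, and 300000 as 5 minutes. The sketch below does that conversion for anyone adapting the sample; to_ms is a hypothetical helper name, and the s and m suffixes are a convention of this sketch only, not of xmservd.cf.

```shell
#!/bin/sh
# Sketch only: convert an interval with a unit suffix (s = seconds,
# m = minutes) to milliseconds.
# to_ms is a hypothetical helper, not part of Performance Toolbox.
to_ms() {
    case $1 in
        *s) echo $(( ${1%s} * 1000 )) ;;
        *m) echo $(( ${1%m} * 60 * 1000 )) ;;
        *)  echo "usage: to_ms <n>s | <n>m" >&2; return 1 ;;
    esac
}

# Example use:
#   to_ms 1s    prints 1000   (the default frequency in the sample)
#   to_ms 5m    prints 300000 (the interval used for the Mem and PagSp lines)
```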

[ Doc Ref: 90605219014820     Publish Date: Oct. 19, 2000     4FAX Ref: 6835 ]