Not really. It is officially called performance troubleshooting, which is part of my job. The following checks have proven very useful in my problem determination (PD) sessions on hanging WAS and DB2 applications. They've saved me a lot of time. Other application server and database combinations, such as WebLogic and Oracle, can use a similar strategy. First reproduce the hang, then do the following during the hang:
- Use vmstat and iostat to check AIX's CPU and I/O status and see how busy or idle they are. This at least tells us whether the problem is CPU-bound or I/O-bound. Depending on the deployment, do this on both the WebSphere machine and the DB2 machine.
- Take a javacore dump on AIX with "kill -3 " followed by the process ID of the JVM. The Java stack trace is extremely valuable for determining what the server is doing during the hang. It is the key to the treasure hunt. This helped me identify a JSP recompilation problem that I had only suspected before taking the dump.
- Take a DB2 event monitor dump during the hang and identify the long-running SQL statements. Running them through the DB2 CLP makes the problem easier to reproduce. This helped me identify a couple of long-running SQL statements, which I then sped up with optimization techniques such as adding indexes or raising the optimization level for the statement.
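To make the first check concrete, here is a minimal sketch of how the CPU-bound versus I/O-bound call could be made from vmstat output. The function name classify_vmstat and the default thresholds (20% I/O wait, 10% idle) are my own assumptions for illustration, not tuned values. It finds the "id" and "wa" columns by name in the header line, so it does not depend on a fixed column position.

```shell
# Classify vmstat output read from stdin as cpu-bound, io-bound, or not-saturated.
# $1 = I/O wait threshold in percent (assumed default 20)
# $2 = idle threshold in percent (assumed default 10)
classify_vmstat() {
    awk -v wa_max="${1:-20}" -v id_min="${2:-10}" '
        # Skip lines until the header that names both the "id" and "wa" columns.
        !found {
            for (i = 1; i <= NF; i++) {
                if ($i == "id") idcol = i
                if ($i == "wa") wacol = i
            }
            if (idcol && wacol) found = 1
            next
        }
        # Data lines: accumulate idle and I/O wait percentages.
        $idcol ~ /^[0-9]+$/ && $wacol ~ /^[0-9]+$/ {
            idsum += $idcol; wasum += $wacol; n++
        }
        END {
            if (n == 0) { print "no-data"; exit 1 }
            if (wasum / n >= wa_max)      print "io-bound"
            else if (idsum / n <= id_min) print "cpu-bound"
            else                          print "not-saturated"
        }'
}
```

During the hang it could be fed a few samples, for example: vmstat 5 3 | classify_vmstat. The result is only a first hint; iostat is still needed to see which disks are busy.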
Here is more detail on step 3: a shell script that captures a DB2 event monitor dump.
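# EVENTS and outdir were not defined in the original; the values below are assumptions.
EVENTS="statements"
outdir="/tmp/evmtest"   # the directory must exist before the monitor is created
mkdir -p "$outdir"
db2 "create event monitor EVMTEST for $EVENTS write to file '$outdir'"
db2 "set event monitor EVMTEST state = 1"
echo "Please execute your SQL, then press Enter"
read dummy              # wait here while the hang is reproduced
echo "Dumping the event monitor ..."
db2 "set event monitor EVMTEST state = 0"
echo "Using db2evmon to get event monitor output ..."
db2evmon -db dbname -evm EVMTEST > evm.out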
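Once evm.out exists, the long-running statements still have to be picked out of it. The sketch below is one way to do that, under the assumption that each statement event in the db2evmon output has a "Text     : ..." line followed later by an "Exec Time: ... seconds" line. Check your own evm.out first, since the layout can differ between DB2 versions. The function name slow_statements and the default threshold of 1 second are hypothetical.

```shell
# Print statements from db2evmon output (stdin) whose execution time is at
# least $1 seconds (assumed default: 1), slowest first.
# Assumes "Text     : <sql>" and "Exec Time:  <n> seconds" lines per statement event.
slow_statements() {
    awk -v min="${1:-1}" '
        # Remember the statement text (first line only) of the current event.
        /^[[:space:]]*Text[[:space:]]*:/ {
            text = $0
            sub(/^[[:space:]]*Text[[:space:]]*:[[:space:]]*/, "", text)
        }
        # When the execution time shows up, report the statement if it is slow.
        /^[[:space:]]*Exec Time:/ {
            secs = $3 + 0
            if (text != "" && secs >= min) printf "%.6f\t%s\n", secs, text
            text = ""
        }' | sort -rn
}
```

For example, slow_statements 5 < evm.out would list statements that ran for five seconds or more. Only the first line of a multi-line statement is kept, which is usually enough to find it again and rerun it through the DB2 CLP.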