Using MPICH P4 procgroup Files
If you use MPICH with a P4 procgroup file (the -p4pg option), you must use the same absolute path name in your procgroup file and on the mpirun command line. If the procgroup file contains a different path name than the one used in the mpirun command, TotalView assumes that it is a different executable, even if both names resolve to the same file, and this causes debugging problems.
The following example uses the same absolute path name on the mpirun command line and in the procgroup file:
% cat p4group
local 1 /users/smith/mympichexe
bigiron 2 /users/smith/mympichexe
% mpirun -p4pg p4group -tv /users/smith/mympichexe
In this example, TotalView does the following:
1. Reads the symbols from mympichexe only once.
2. Places MPICH processes in the same TotalView share group.
3. Names the processes mympichexe.0, mympichexe.1, mympichexe.2, and mympichexe.3.
If TotalView assigns names such as mympichexe<mympichexe>.0, a problem occurred. Compare the path names in your procgroup file with the path name on your mpirun command line.
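One way to catch a mismatch before starting TotalView is to compare the program field of each procgroup line with the path name you intend to pass to mpirun. The following Bourne shell sketch is not part of TotalView or MPICH. The function name check_procgroup is hypothetical, and the sketch assumes that the program path is the third whitespace-separated field on each line and that lines beginning with # are comments. It compares the path names as text, which mirrors the requirement that the names be identical and not merely resolve to the same file.

```shell
# check_procgroup PROCGROUP_FILE EXECUTABLE_PATH   (hypothetical helper)
# Prints every procgroup line whose program path differs textually from
# EXECUTABLE_PATH. Returns 0 if all paths match, 1 otherwise.
check_procgroup() {
    awk -v exe="$2" '
        # Assumption: skip comment lines and lines without a program field.
        /^[ \t]*#/ || NF < 3 { next }
        # Field 3 is assumed to hold the program path.
        $3 != exe {
            print FILENAME ":" FNR ": " $3 " differs from " exe
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

For the procgroup file shown above, check_procgroup p4group /users/smith/mympichexe prints nothing and returns 0.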
MPICH Debugging Tips
These debugging tips apply only to MPICH:
*Passing options to mpirun
You can pass options to TotalView through the MPICH mpirun command by setting the TOTALVIEW environment variable. For example, the following C shell command causes mpirun to invoke TotalView with the -no_stop_all option:
setenv TOTALVIEW "totalview -no_stop_all"
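If you use a Bourne-compatible shell (sh, bash, or ksh) instead of the C shell, the equivalent setting can be sketched as follows. Only the shell syntax differs; the variable name and value are the ones from the C shell example.

```shell
# Bourne-shell equivalent of the C shell setenv command.
# Set the variable before you run mpirun with the -tv option.
TOTALVIEW="totalview -no_stop_all"
export TOTALVIEW
```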
*Using ch_p4
If you start remote processes with MPICH/ch_p4, you may need to change the way TotalView starts its servers.
By default, TotalView uses ssh to start its remote server processes, which is the same mechanism that ch_p4 uses. If you configure ch_p4 to use a different startup mechanism, you probably also need to change the way that TotalView starts its servers.