MPICH2 Applications

Use MPICH2 version 1.0.5p4 or later. Earlier versions had problems that prevented TotalView from attaching to all the processes or viewing message queue data.

Downloading and Configuring MPICH2

You can download the current MPICH2 version from:

http://www.mpich.org/downloads/versions/

To use all of the TotalView MPI features, you must configure MPICH2 accordingly. Do this by passing one of the following options to the configure script included in the download:

--enable-debuginfo

or

--enable-totalview

The configure script looks for the following file:

python2.x/config/Makefile

It fails if the file is not there.
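Checking for this file before running configure can save a failed run. The following is a minimal sketch for a Bourne-compatible shell. The function name find_python_config_makefile is hypothetical, and the directory you pass to it is an assumption, because the location of the python2.x directory varies between systems.

```shell
#!/bin/sh
# Sketch: look for python2.x/config/Makefile under a given directory.
# The directory to search is an assumption; pass the one your system uses.
find_python_config_makefile() {
    libdir=$1
    for f in "$libdir"/python2.*/config/Makefile; do
        # If the pattern matches nothing, it stays literal and -f fails.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1
}
```

For example, find_python_config_makefile /usr/lib prints the first matching file and returns success, or prints nothing and returns failure if no such file exists under that directory.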

The next steps are:

1.  Run make

2.  Run make install

This places the binaries and libraries in the directory specified by the optional --prefix option to configure.

3.  Set the PATH and LD_LIBRARY_PATH environment variables to include the MPICH2 bin and lib directories.
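The configure, build, and environment steps above can be sketched as two shell functions. This is an illustration under stated assumptions, not the only way to do it: the install location in MPICH2_PREFIX and the function names build_mpich2 and use_mpich2 are hypothetical, and the sketch uses Bourne shell syntax rather than csh.

```shell
#!/bin/sh
# Hypothetical install location; substitute your own.
MPICH2_PREFIX=${MPICH2_PREFIX:-$HOME/mpich2-install}

# Configure, build, and install. Run this from the unpacked MPICH2
# source directory. --enable-totalview could be used in place of
# --enable-debuginfo.
build_mpich2() {
    ./configure --prefix="$MPICH2_PREFIX" --enable-debuginfo &&
    make &&
    make install
}

# Step 3: put the MPICH2 bin and lib directories first on the search paths.
use_mpich2() {
    PATH=$MPICH2_PREFIX/bin:$PATH
    # Append the old value only if LD_LIBRARY_PATH was already set.
    LD_LIBRARY_PATH=$MPICH2_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export PATH LD_LIBRARY_PATH
}
```

Call build_mpich2 once to install, then call use_mpich2 in each shell session from which you start TotalView or mpiexec.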

Starting TotalView Debugging on an MPICH2 Hydra Job

As of MPICH2 1.4.1, the default job type for MPICH2 is Hydra. If you are instead using MPD, see Starting TotalView Debugging on an MPICH2 MPD Job.

Start a Hydra job as follows:

totalview -args mpiexec mpiexec-args program program-args
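As a concrete instance of the command form above, the following sketch wraps the launch in a small shell function. The function name tv_hydra and the TV_DRY_RUN variable are hypothetical helpers, and -n (the number of processes) is used here as one example of mpiexec-args.

```shell
#!/bin/sh
# Sketch: launch an MPI program under TotalView through Hydra's mpiexec.
# Usage: tv_hydra NPROCS PROGRAM [PROGRAM-ARGS...]
# With TV_DRY_RUN=1 the command is printed instead of executed.
tv_hydra() {
    nprocs=$1
    shift
    if [ "${TV_DRY_RUN:-0}" = 1 ]; then
        echo totalview -args mpiexec -n "$nprocs" "$@"
    else
        totalview -args mpiexec -n "$nprocs" "$@"
    fi
}
```

With TV_DRY_RUN set to 1, tv_hydra 4 ./my_app input.dat prints totalview -args mpiexec -n 4 ./my_app input.dat, where ./my_app and input.dat stand in for your own program and its arguments.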

You may not see the source for your program at first. If you do see it, you can set breakpoints. In either case, press the Go button to start your process. When your program goes parallel, TotalView displays a dialog box that lets you stop execution so that you can set breakpoints.

(This is the default behavior. You can change it using the options on the File > Preferences > Parallel page. See Parallel Attach Behaviors.)

Starting TotalView Debugging on an MPICH2 MPD Job

Before starting an MPICH2 MPI job under MPD, you must start the mpd daemon.

As of MPICH2 1.4.1, the default job type is Hydra, rather than MPD, so if you are using the default, there is no need to start the daemon. See Starting TotalView Debugging on an MPICH2 Hydra Job.

Starting the MPI MPD Job with MPD Process Manager

To start the mpd daemon, use the mpdboot command. For example:

mpdboot -n 4 -f hostfile

where:

-n 4

The number of hosts on which you wish to run the daemon. In this example, the daemon runs on four hosts.

-f hostfile

Lists the hosts on which the application will run. In this example, a file named hostfile contains this list.
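The following sketch shows one way to prepare the host file and derive a matching mpdboot command. The host names and the function names write_hostfile and mpdboot_command are hypothetical. The sketch also assumes that the value of -n should equal the number of hosts listed in the file, as in the example above; check the mpdboot help for how it counts the local host on your installation.

```shell
#!/bin/sh
# Write a host file with one host name per line.
# Usage: write_hostfile FILE HOST [HOST...]
write_hostfile() {
    out=$1
    shift
    printf '%s\n' "$@" > "$out"
}

# Print an mpdboot command whose -n value equals the number of
# non-empty lines in the host file (an assumption, see the lead-in).
mpdboot_command() {
    n=$(grep -c . "$1")
    echo "mpdboot -n $n -f $1"
}
```

For example, write_hostfile hostfile node01 node02 node03 node04 followed by mpdboot_command hostfile prints mpdboot -n 4 -f hostfile.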

You are now ready to start debugging your application.

Starting an MPICH2 MPD Job

Start an MPICH2 MPD job in one of the following ways:

mpiexec mpi-args -tv program -a program-args

This command tells MPI to start TotalView. Before you start a program using mpiexec, you must set the TOTALVIEW environment variable to the path of the TotalView executable. For example:

setenv TOTALVIEW \
          /opt/totalview/bin/totalview

With this method, you cannot restart your program without exiting TotalView, and you cannot attach to a running MPI job.

totalview python -a `which mpiexec` \
          -tvsu mpiexec-args program program-args

This command lets you restart your MPICH2 job. It also lets you attach to a running MPICH2 job by using the Attach to a Running Program dialog box. Take care to attach to the right instance of python, because several instances are likely to be running. The one to which you want to attach has no attached children. In the dialog box, child processes are indented, with a line showing the connection to the parent.

You may not see the source for your program at first. If you do see it, you can set breakpoints. In either case, press the Go button to start your process. When your program goes parallel, TotalView displays a dialog box that lets you stop execution. (This is the default behavior. You can change it using the options on the File > Preferences > Parallel page.)

You will also need to set the TOTALVIEW environment variable as indicated in the previous method.
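The setenv example shown earlier uses csh syntax. If you use a Bourne-compatible shell, the equivalent can be sketched as follows. The path is the one from that example and may differ on your system, and the function name check_totalview is a hypothetical helper that confirms the variable points to an executable before you launch a job.

```shell
#!/bin/sh
# Bourne-shell counterpart of the csh setenv example.
# The default path is the one used in that example; adjust as needed.
TOTALVIEW=${TOTALVIEW:-/opt/totalview/bin/totalview}
export TOTALVIEW

# Return success only if TOTALVIEW names an executable file.
check_totalview() {
    if [ -x "$TOTALVIEW" ]; then
        return 0
    fi
    echo "TOTALVIEW does not point to an executable: $TOTALVIEW" >&2
    return 1
}
```

You might then guard a launch with check_totalview && mpiexec -n 4 -tv ./my_app, where -n 4 and ./my_app are placeholders for your own mpiexec arguments and program.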