Starting TotalView on an MPICH Job

Before you can bring an MPICH job under TotalView’s control, both TotalView and tvdsvr must be in your search path. The easiest way to ensure this is to set the path in a login or shell startup script.
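One way to confirm the path requirement before launching a job is a small shell check such as the sketch below. The function name check_on_path and the commented-out install directory are hypothetical and not part of TotalView or MPICH; only the executable names totalview and tvdsvr come from the text above.

```shell
# Hypothetical helper (not part of TotalView): report whether each
# named command can be found on the current PATH.
check_on_path() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Assumed install location; substitute the real directory on your system.
# PATH="/opt/totalview/bin:$PATH"; export PATH

# Both executables must be found before mpirun can start TotalView.
check_on_path totalview tvdsvr || echo "add the TotalView directory to PATH"
```

Placing the PATH assignment in a login or shell startup script makes the setting apply to every new session.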

For MPICH version 1.1.2, the following command-line syntax starts a job under TotalView control:

mpirun [ MPICH-arguments ] -tv program [ program-arguments ]

For example:

mpirun -np 4 -tv sendrecv

The MPICH mpirun command obtains information from the TOTALVIEW environment variable and then uses this information when it starts the first process in the parallel job.

For MPICH version 1.2.4, the syntax changes to the following:

mpirun -dbg=totalview [ other_mpich-args ] program [ program-args ]

For example:

mpirun -dbg=totalview -np 4 sendrecv

In this case, mpirun obtains the information it needs from the -dbg command-line option.
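The two syntaxes can be compared side by side with a dry-run sketch like the one below. The function tv_mpirun_cmd is a hypothetical helper that only prints a command line; it handles just the two MPICH versions named in this section and rejects anything else rather than guessing which syntax applies.

```shell
# Hypothetical helper: print (do not run) the mpirun command line
# for the two MPICH versions described in this section.
# Usage: tv_mpirun_cmd VERSION NPROCS PROGRAM [PROGRAM-ARGS...]
tv_mpirun_cmd() {
    version=$1
    nprocs=$2
    shift 2
    case "$version" in
        1.1.2) echo "mpirun -np $nprocs -tv $*" ;;
        1.2.4) echo "mpirun -dbg=totalview -np $nprocs $*" ;;
        *)     echo "no syntax listed for MPICH $version" >&2
               return 1 ;;
    esac
}

tv_mpirun_cmd 1.1.2 4 sendrecv
tv_mpirun_cmd 1.2.4 4 sendrecv
```

Because the helper only echoes the command, it is safe to run on a machine that has neither MPICH nor TotalView installed.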

In other contexts, setting the TOTALVIEW environment variable lets you use a different version of TotalView or pass command-line options to TotalView.

For example, the following is the C shell command that sets the TOTALVIEW environment variable so that mpirun passes the -no_stop_all option to TotalView:

setenv TOTALVIEW "totalview -no_stop_all"
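If your login shell is a Bourne-style shell (sh, ksh, or bash) rather than the C shell, the equivalent setting might look like the following sketch. The variable name and value are taken from the C shell command above; the final echo is added here only to confirm the result.

```shell
# Bourne-style equivalent of the C shell setenv command.
TOTALVIEW="totalview -no_stop_all"
export TOTALVIEW

# Confirm what mpirun will see in its environment.
echo "TOTALVIEW=$TOTALVIEW"
```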

TotalView begins by starting the first process of your job, the master process, under its control. You can then set breakpoints and begin debugging your code.

On the IBM SP computer with the ch_mpl device, the mpirun command uses the poe command to start an MPI job. You must still use the MPICH mpirun command (and its -tv option) to start an MPICH job, but the way the job is started differs. For details on using TotalView with poe, see “Starting TotalView on a PE Program”.

Starting TotalView using the ch_p4mpd device is similar to starting TotalView using poe on an IBM computer or other methods you might use on Sun and HP platforms. In general, you start TotalView using the totalview command, with the following syntax:

totalview mpirun [ totalview_args ] -a [ mpich-args ] program [ program-args ]

CLI:

totalviewcli mpirun [ totalview_args ] \
-a [ mpich-args ] program [ program-args ]
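As an illustration of the syntax above, the following dry-run sketch assembles one possible command line with no TotalView arguments. The process count and the program name sendrecv are assumptions reused from the earlier mpirun examples; the script prints the command instead of launching the debugger.

```shell
# Dry run: build and print the command instead of launching TotalView.
nprocs=4            # assumed process count
program=sendrecv    # sample program name from the earlier examples

# Everything after -a is passed to mpirun rather than to TotalView.
cmd="totalview mpirun -a -np $nprocs $program"
echo "$cmd"
```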

As your program executes, TotalView automatically acquires the processes that are part of your parallel job as your program creates them. Before TotalView begins to acquire them, it asks if you want to stop the spawned processes. If you click Yes, you can stop processes as they are initialized. This lets you check their states or set breakpoints that are unique to the process. TotalView automatically copies breakpoints from the master process to the slave processes as it acquires them. Consequently, you don’t have to stop them just to set these breakpoints.

If you’re using the GUI, TotalView updates the Root Window to show these newly acquired processes. For more information, see “Attaching to Processes Tips”.