Everything in Task 1: “Getting Started” also applies to obtaining memory information from parallel programs. Adding a parallel program requires some additional information. Begin by selecting Home | Add Program, which is also the screen displayed if you do not name a program when starting MemoryScape. Next, select Add parallel program.
Figure 37 shows the screen that MemoryScape displays.
Most of the controls on this screen help you start your program. Specifying the same information on the command line produces no difference in how your program behaves. For detailed information, see “Setting Up MPI Debugging Sessions”.
Enter the number of processes that your program should create. This is equivalent to the -np argument used by most MPI systems.
Some MPI systems let you specify the number of nodes upon which your tasks will execute. For example, suppose your program will use sixteen tasks. If you specify four nodes, four tasks would execute on each node.
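The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration that assumes tasks are spread as evenly as possible across nodes. The function name `tasks_per_node` is hypothetical, and the handling of counts that do not divide evenly is an assumption, because actual placement is decided by your MPI launcher or resource manager.

```python
def tasks_per_node(total_tasks, nodes):
    """Return how many tasks land on each node under an even split.

    Assumption: any leftover tasks go to the lowest-numbered nodes.
    A real MPI system may place tasks differently.
    """
    if total_tasks < 0 or nodes <= 0:
        raise ValueError("total_tasks must be >= 0 and nodes must be > 0")
    base, extra = divmod(total_tasks, nodes)
    # The first `extra` nodes each receive one additional task.
    return [base + 1 if node < extra else base for node in range(nodes)]


# Sixteen tasks on four nodes: four tasks per node.
print(tasks_per_node(16, 4))  # [4, 4, 4, 4]
```

When the task count is not a multiple of the node count, the sketch gives some nodes one more task than others. Check your MPI system's documentation for what it actually does in that case.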
After you select memory debugging options (described in the next task) and start execution, MemoryScape begins capturing information from each executing task. The sole difference between using MemoryScape on a parallel program and on a non-parallel program is that you have many processes to examine instead of just one.
Figure 38 shows the Process Status and Control area for a 32-process MPI job.
MemoryScape collects information for each of the 32 processes, so you can examine memory information for any of them. Examining every process, however, is seldom productive. Instead, focus on where problems may be occurring. The Memory Usage charts are often the best place to start.
Figure 39 shows part of a stacked bar chart.
This chart suggests that you might want to focus on what process 6 is doing. You might also want to use the memory comparison features to compare process 6 (the process using the most memory) with process 9 (the process using the least memory).
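If you have per-process memory totals available outside the chart, the same selection of the heaviest and lightest process can be sketched in Python. This is a minimal illustration, not a MemoryScape feature. The function name `memory_extremes` is hypothetical, and the byte counts are invented placeholders that do not come from Figure 39.

```python
def memory_extremes(usage_by_process):
    """Return (process using the most memory, process using the least).

    `usage_by_process` maps a process number to a memory total in bytes.
    """
    if not usage_by_process:
        raise ValueError("no per-process memory totals supplied")
    highest = max(usage_by_process, key=usage_by_process.get)
    lowest = min(usage_by_process, key=usage_by_process.get)
    return highest, lowest


# Invented placeholder values, chosen only to mirror the example in the text.
sample_usage = {5: 410_000, 6: 930_000, 7: 455_000, 8: 420_000, 9: 120_000}

print(memory_extremes(sample_usage))  # (6, 9)
```

The two process numbers returned are natural candidates for a memory comparison.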
If possible, run your program a few times, stopping it periodically to get an idea of how it uses memory and whether its use follows any patterns. On a later run, you may want to stop the program when you see memory use changing and then export memory data so that you can perform detailed analyses later. This is particularly important when access to HPC machines is limited.