TotalView® for HPC User Guide
About This Guide
Content Organization
TotalView Family Differences
Using the CLI
Audience
Conventions
TotalView Documentation
Contacting Us
PART I Introduction to Debugging with TotalView
Chapter 1 About TotalView
Sessions Manager
GUI and Command Line Interfaces
The GUI
The CLI
Stepping and Breakpoints
Data Display and Visualization
Data Display
Diving in a Variable Window
Viewing a Variable Value across Multiple Processes or Threads
Simplifying Array Display
Viewing a Variable’s Changing Value
Setting Watchpoints
Data Visualization
The Array Visualizer
The Parallel Backtrace View
The Call Tree and Call Graph
The Message Queue Graph
C++ View
Tools for Multi-Threaded and Parallel Applications
Program Using Almost Any Execution Model
View Process and Thread State
Control Program Execution
Using Groups
Synchronizing Execution with Barrier Points
Batch and Automated Debugging
Remote Display
Debugging on a Remote Host
CUDA Debugger
Memory Debugging
Reverse Debugging
What’s Next
Chapter 2 Basic Debugging
Program Load and Navigation
Load the Program to Debug
The Root and Process Windows
Program Navigation
Stepping and Executing
Simple Stepping
Canceling
Setting Breakpoints (Action Points)
Basic Breakpoints
Evaluation Points
Saving and Reloading Action Points
Examining Data
Viewing Built-in Data
Viewing Variables in the Process Window
Viewing Variables in an Expression List Window
Viewing Compound Variables Using the Variable Window
Basic Diving
Nested Dives
Rediving and Undiving
Diving in a New Window
Displaying an Element in an Array of Structures
Visualizing Arrays
Launching the Visualizer from an Eval Point
Viewing Options
Moving On
Chapter 3 Accessing TotalView Remotely
About Remote Display
Remote Display Supported Platforms
Remote Display Components
Installing the Client
Installing on Linux
Installing on Microsoft Windows
Installing on Apple Mac OS X Intel
Client Session Basics
Working on the Remote Host
Advanced Options
Naming Intermediate Hosts
Submitting a Job to a Batch Queuing System
Setting Up Your Systems and Security
Session Profile Management
Batch Scripts
tv_PBS.csh Script
tv_LoadLeveler.csh Script
PART II Debugging Tools and Tasks
Chapter 4 Starting TotalView
Compiling Programs
Using File Extensions
Starting TotalView
Starting TotalView
Creating or Loading a Session
Debugging a Program
Debugging a Core File
Debugging with a Replay Recording File
Passing Arguments to the Program Being Debugged
Debugging a Program Running on Another Computer
Debugging an MPI Program
Using gnu_debuglink Files
Initializing TotalView
Exiting from TotalView
Chapter 5 Loading and Managing Sessions
Setting up Debugging Sessions
Loading Programs from the Sessions Manager
Starting a Debugging Session
Debugging a New Program
Attaching to a Running Program
Debugging a Core File
Debugging with a Replay Recording File
Launching your Last Session
Loading Programs Using the CLI
Debugging Options and Environment Setup
Adding a Remote Host
Options: Reverse Debugging, Memory Debugging, and CUDA
Setting Environment Variables and Altering Standard I/O
Environment Variables
Standard I/O
Adding Notes to a Session
Managing Sessions
Editing or Starting New Sessions in a Sessions Window
Other Configuration Options
Handling Signals
Setting Search Paths
Setting Startup Parameters
Setting Preferences
Setting Preferences, Options, and X Resources
Chapter 6 Using and Customizing the GUI
Overview
Using Mouse Buttons
Using the Root Window
Controlling the Display of Processes and Threads
Default View
Changing the Display
Grouping by Status and Source Line
Grouping by All Properties
Using the Old Root Window
Suppressing the Root Window
Using the Process Window
Resizing and Positioning Windows
About Diving into Objects
Saving the Data in a Window
Searching and Navigating Program Elements
Searching for Text
Looking for Functions and Variables
Finding the Source Code for Functions
Resolving Ambiguous Names
Finding the Source Code for Files
Resetting the Stack Frame
Viewing the Assembler Version of Your Code
Editing Source Text
Chapter 7 Stepping through and Executing your Program
Using Stepping Commands
Stepping into Function Calls
Stepping Over Function Calls
Executing to a Selected Line
Executing Out of a Function
Continuing with a Specific Signal
Killing (Deleting) Programs
Restarting Programs
Setting the Program Counter
Chapter 8 Setting Action Points
About Action Points
Print Statements vs. Action Points
Setting Breakpoints and Barriers
Setting Source-Level Breakpoints
Choosing Source Lines
Setting Breakpoints at Locations
Ambiguous Functions and Pending Breakpoints
Displaying and Controlling Action Points
Disabling Action Points
Deleting Action Points
Enabling Action Points
Suppressing Action Points
Setting Breakpoints on Classes and Functions
Setting Machine-Level Breakpoints
Setting Breakpoints for Multiple Processes
Setting Breakpoints When Using the fork()/execve() Functions
Debugging Processes That Call the fork() Function
Debugging Processes That Call the execve() Function
Example: Multi-process Breakpoint
Setting Barrier Points
About Barrier Breakpoint States
Setting a Barrier Breakpoint
Creating a Satisfaction Set
Hitting a Barrier Point
Releasing Processes from Barrier Points
Deleting a Barrier Point
Changing Settings and Disabling a Barrier Point
Defining Eval Points and Conditional Breakpoints
Setting Eval Points
Creating Conditional Breakpoint Examples
Patching Programs
Branching Around Code
Adding a Function Call
Correcting Code
About Interpreted and Compiled Expressions
About Interpreted Expressions
About Compiled Expressions
Allocating Patch Space for Compiled Expressions
Allocating Dynamic Patch Space
Using Watchpoints
Using Watchpoints on Different Architectures
Creating Watchpoints
Displaying Watchpoints
Watching Memory
Triggering Watchpoints
Using Multiple Watchpoints
Copying Previous Data Values
Using Conditional Watchpoints
Saving Action Points to a File
Chapter 9 Examining and Editing Data and Program Elements
Changing How Data is Displayed
Displaying STL Variables
Changing Size and Precision
Displaying Variables
Displaying Program Variables
Controlling the Displayed Information
Seeing Value Changes
Seeing Structure Information
Displaying Variables in the Current Block
Viewing Variables in Different Scopes as Program Executes
Scoping Issues
Freezing Variable Window Data
Locking the Address
Browsing for Variables
Displaying Local Variables and Registers
Interpreting the Status and Control Registers
Dereferencing Variables Automatically
Examining Memory
Displaying Areas of Memory
Displaying Machine Instructions
Rebinding the Variable Window
Closing Variable Windows
Diving in Variable Windows
Displaying an Array of Structure’s Elements
Changing What the Variable Window Displays
Viewing a List of Variables
Entering Variables and Expressions
Seeing Variable Value Changes in the Expression List Window
Entering Expressions into the Expression Column
Using the Expression List with Multi-process/Multi-threaded Programs
Reevaluating, Reopening, Rebinding, and Restarting
Seeing More Information
Sorting, Reordering, and Editing
Changing the Values of Variables
Changing a Variable’s Data Type
Displaying C and C++ Data Types
Viewing Pointers to Arrays
Viewing Arrays
Viewing typedef Types
Viewing Structures
Viewing Unions
Casting Using the Built-In Types
Viewing Character Arrays ($string Data Type)
Viewing Wide Character Arrays ($wchar Data Types)
Viewing Areas of Memory ($void Data Type)
Viewing Instructions ($code Data Type)
Viewing Opaque Data
Type-Casting Examples
Displaying Declared Arrays
Displaying Allocated Arrays
Displaying the argv Array
Changing the Address of Variables
Displaying C++ Types
Viewing Classes
C++View
Displaying Fortran Types
Displaying Fortran Common Blocks
Displaying Fortran Module Data
Debugging Fortran 90 Modules
Viewing Fortran 90 User-Defined Types
Viewing Fortran 90 Deferred Shape Array Types
Viewing Fortran 90 Pointer Types
Displaying Fortran Parameters
Displaying Thread Objects
Scoping and Symbol Names
Qualifying Symbol Names
Chapter 10 Examining Arrays
Examining and Analyzing Arrays
Displaying Array Slices
Using Slices and Strides
Using Slices in the Lookup Variable Command
Array Slices and Array Sections
Viewing Array Data
Expression Field
Type Field
Slice Definition
Update View Button
Data Format Selection Box
Filtering Array Data Overview
Filtering Array Data
Filtering by Comparison
Filtering for IEEE Values
Filtering a Range of Values
Creating Array Filter Expressions
Using Filter Comparisons
Sorting Array Data
Obtaining Array Statistics
Displaying a Variable in all Processes or Threads
Diving on a “Show Across” Pointer
Editing a “Show Across” Variable
Visualizing Array Data
Visualizing a “Show Across” Variable Window
Chapter 11 Visualizing Programs and Data
Displaying Call Trees and Call Graphs
Parallel Backtrace View
Array Visualizer
Command Summary
How the Visualizer Works
Viewing Data Types in the Visualizer
Viewing Data
Visualizing Data Manually
Using the Visualizer
Using Dataset Window Commands
Using View Window Commands
Using the Graph Window
Displaying Graph Views
Using the Surface Window
Displaying Surface Views
Manipulating Surface Data
Visualizing Data Programmatically
Launching the Visualizer from the Command Line
Configuring TotalView to Launch the Visualizer
Setting the Visualizer Launch Command
Adapting a Third Party Visualizer
Chapter 12 Evaluating Expressions
Why is There an Expression System?
Calling Functions: Problems and Issues
Expressions in Eval Points and the Evaluate Window
Using C++
Using Programming Language Elements
Using C and C++
Using Fortran
Fortran Statements
Fortran Intrinsics
Using the Evaluate Window
Writing Assembler Code
Using Built-in Variables and Statements
Using TotalView Variables
Using Built-In Statements
Expression Evaluation with ReplayEngine
Chapter 13 About Groups, Processes, and Threads
A Couple of Processes
Threads
Complicated Programming Models
Types of Threads
Organizing Chaos
How TotalView Creates Groups
Simplifying What You’re Debugging
Chapter 14 Manipulating Processes and Threads
Viewing Process and Thread States
Seeing Attached Process States
Seeing Unattached Process States
Using the Toolbar to Select a Target
Stopping Processes and Threads
Using the Processes/Ranks and Threads Tabs
The Processes Tab
The Threads Tab
Updating Process Information
Holding and Releasing Processes and Threads
Using Barrier Points
Barrier Point Illustration
Examining Groups
Placing Processes in Groups
Starting Processes and Threads
Creating a Process Without Starting It
Creating a Process by Single-Stepping
Stepping and Setting Breakpoints
Chapter 15 Debugging Strategies for Parallel Applications
General Parallel Debugging Tips
Breakpoints, Stepping, and Program Execution
Setting Breakpoint Behavior
Synchronizing Processes
Using Group Commands
Stepping at Process Level
Viewing Processes, Threads, and Variables
Identifying Process and Thread Execution
Viewing Variable Values
Restarting from within TotalView
Attaching to Processes Tips
MPI Debugging Tips and Tools
MPI Display Tools
MPI Rank Display
Displaying the Message Queue Graph Window
Displaying the Message Queue
MPICH Debugging Tips
IBM PE Debugging Tips
PART III Using the CLI
Chapter 16 Using the Command Line Interface (CLI)
About Tcl and the CLI
About the CLI and TotalView
Using the CLI Interface
Starting the CLI
Startup Example
Starting Your Program
About CLI Output
‘more’ Processing
Using Command Arguments
Using Namespaces
About the CLI Prompt
Using Built-in and Group Aliases
How Parallelism Affects Behavior
Types of IDs
Controlling Program Execution
Advancing Program Execution
Using Action Points
Chapter 17 Seeing the CLI at Work
Setting the CLI EXECUTABLE_PATH Variable
Initializing an Array Slice
Printing an Array Slice
Writing an Array Variable to a File
Automatically Setting Breakpoints
PART IV Advanced Tools and Customization
Chapter 18 Setting Up Remote Debugging Sessions
About Remote Debugging
Platform Issues when Remote Debugging
Automatically Launching a Process on a Remote Server
Troubleshooting Server Autolaunch
Changing the Remote Shell Command
Changing Arguments
Autolaunching Sequence
Starting the TotalView Server Manually
TotalView Server Launch Options and Commands
Server Launch Options
Setting Single-Process Server Launch Options
Setting Bulk Launch Window Options
Customizing Server Launch Commands
Setting the Single-Process Server Launch Command
Setting the Bulk Server Launch Command
Debugging Over a Serial Line
Starting the TotalView Debugger Server
Chapter 19 Setting Up MPI Debugging Sessions
Debugging MPI Programs
Starting MPI Programs
Starting MPI Programs Using File > Debug New Parallel Program
The Parallel Program Session Dialog
MPICH Applications
Starting TotalView on an MPICH Job
Attaching to an MPICH Job
Using MPICH P4 procgroup Files
MPICH2 Applications
Downloading and Configuring MPICH2
Starting TotalView Debugging on an MPICH2 Hydra Job
Starting TotalView Debugging on an MPICH2 MPD Job
Starting the MPI MPD Job with MPD Process Manager
Starting an MPICH2 MPD Job
Cray MPI Applications
IBM MPI Parallel Environment (PE) Applications
Preparing to Debug a PE Application
Using Switch-Based Communications
Performing a Remote Login
Setting Timeouts
Starting TotalView on a PE Program
Setting Breakpoints
Starting Parallel Tasks
Attaching to a PE Job
Attaching from a Node Running poe
Attaching from a Node Not Running poe
IBM Blue Gene Applications
Open MPI Applications
QSW RMS Applications
Starting TotalView on an RMS Job
Attaching to an RMS Job
SGI MPI Applications
Starting TotalView on an SGI MPI Job
Attaching to an SGI MPI Job
Using ReplayEngine with SGI MPI
Sun MPI Applications
Attaching to a Sun MPI Job
Starting MPI Issues
Using ReplayEngine with InfiniBand MPIs
Chapter 20 Setting Up Parallel Debugging Sessions
Debugging OpenMP Applications
Debugging OpenMP Programs
About TotalView OpenMP Features
About OpenMP Platform Differences
Viewing OpenMP Private and Shared Variables
Viewing OpenMP THREADPRIVATE Common Blocks
Viewing the OpenMP Stack Parent Token Line
Using SLURM
Debugging Cray XT Applications
Cray XT Catamount
Configuring Cray XT for TotalView
Using TotalView with your Cray XT System
Cray Linux Environment (CLE)
Support for Cray Abnormal Termination Processing (ATP)
Special Requirements for Using ReplayEngine
Debugging Global Arrays Applications
Debugging Shared Memory (SHMEM) Code
Debugging UPC Programs
Invoking TotalView
Viewing Shared Objects
Displaying Pointer to Shared Variables
Debugging CoArray Fortran (CAF) Programs
Invoking TotalView
Viewing CAF Programs
Using CLI with CAF
Chapter 21 Group, Process, and Thread Control
Defining the GOI, POI, and TOI
Recap on Setting a Breakpoint
Stepping (Part I)
Understanding Group Widths
Understanding Process Width
Understanding Thread Width
Using Run To and duntil Commands
Setting Process and Thread Focus
Understanding Process/Thread Sets
Specifying Arenas
Specifying Processes and Threads
Defining the Thread of Interest (TOI)
About Process and Thread Widths
Specifier Examples
Setting Group Focus
Specifying Groups in P/T Sets
About Arena Specifier Combinations
‘All’ Does Not Always Mean ‘All’
Setting Groups
Using the g Specifier: An Extended Example
Merging Focuses
Naming Incomplete Arenas
Naming Lists with Inconsistent Widths
Stepping (Part II): Examples
Using P/T Set Operators
Creating Custom Groups
Chapter 22 Scalability in HPC Computing Environments
Overview
Configuring TotalView for Scalability
Process Window’s Process Tab
dlopen Options
dlopen Event Filtering
Handling dlopen Events in Parallel
MRNet
TotalView Infrastructure Models
TotalView Infrastructure Models
Using MRNet with TotalView
General Use
Using MRNet on Blue Gene
Using MRNet on Cray Computers
Chapter 23 Checkpointing
Chapter 24 Fine-Tuning Shared Library Use
Preloading Shared Libraries
Controlling Which Symbols TotalView Reads
Specifying Which Libraries are Read
Reading Excluded Information
PART V Using the CUDA Debugger
Chapter 25 About the TotalView CUDA Debugger
TotalView CUDA Debugging Model
Installing the CUDA SDK Tool Chain
Backward Compatibility with CUDA Device Drivers
Directive-Based Accelerator Programming Languages
Chapter 26 CUDA Debugging Tutorial
Compiling for Debugging
Compiling for Fermi
Compiling for Fermi and Tesla
Compiling for Kepler
Starting a TotalView CUDA Session
Loading the CUDA Kernel
Controlling Execution
Running to a Breakpoint in the GPU code
Viewing the Kernel’s Grid Identifier
Single-Stepping GPU Code
Halting a Running Application
Displaying CUDA Program Elements
GPU Assembler Display
GPU Variable and Data Display
CUDA Built-In Runtime Variables
Type Casting
PTX Registers
Enabling CUDA MemoryChecker Feature
GPU Core Dump Support
GPU Error Reporting
Displaying Device Information
Chapter 27 CUDA Problems and Limitations
Hangs or Initialization Failures
CUDA and ReplayEngine
Chapter 28 Sample CUDA Program
PART VI Appendices
Appendix A Glossary
Appendix B Licenses
3rd-Party Licenses
CUDA License Information
Debugging Memory Problems with MemoryScape™
Chapter 1 Locating Memory Problems
Checking for Problems
Programs and Memory
Behind the Scenes
Your Program’s Data
The Data Section
The Stack
The Heap
Finding Heap Allocation Problems
Finding Heap Deallocation Problems
realloc() Problems
Finding Memory Leaks
Starting MemoryScape
Using MemoryScape Options
Preloading MemoryScape
Understanding How Your Program is Using Memory
Finding free() and realloc() Problems
Event and Error Notification
Types of Problems
Freeing Stack Memory
Freeing bss Data
Freeing Data Section Memory
Freeing Memory That Is Already Freed
Tracking realloc() Problems
Freeing the Wrong Address
Finding Memory Leaks
Fixing Dangling Pointer Problems
Dangling Pointers
Batch Scripting and Using the CLI
Batch Scripting Using tvscript
Using the dheap Command
dheap Example
dheap
Notification When free Problems Occur
Showing Backtrace Information: dheap -backtrace
Guarding Memory Blocks: dheap -guards
Memory Reuse: dheap -hoard
Writing Heap Information: dheap -export
Filtering Heap Information: dheap -filter
Checking for Dangling Pointers: dheap -is_dangling
Detecting Leaks: dheap -leaks
Block Painting: dheap -paint
Red Zones Bounds Checking: dheap -red_zones
Deallocation Notification: dheap -tag_alloc
TVHEAP_ARGS
Examining Memory
Block Properties
Memory Contents Tab
Additional Memory Block Information
Filtering
Using Guard Blocks
Using Red Zones
Using Guard Blocks and Red Zones
Block Painting
Hoarding
Example 1: Finding a Multithreading Problem
Example 2: Finding Dangling Pointer References
Debugging with TotalView
Chapter 2 Memory Tasks
Task 1: Getting Started
Starting MemoryScape
Adding Programs and Files to MemoryScape
Attaching to Programs and Adding Core Files
Stopping Before Finishing Execution
Exporting Memory Data
MemoryScape Information
Where to Go Next
Task 2: Adding Parallel Programs
Task 3: Setting MemoryScape Options
Basic Options
Advanced Options
Halt execution at process exit (standalone MemoryScape only)
Halt execution on memory event or error
Guard allocated memory
Use Red Zones to find memory access violations
Restricting Red Zones
Customizing Red Zones
Paint memory
Hoard deallocated memory
Where to Go Next
Task 4: Controlling Program Execution
Controlling Program Execution from the Home | Summary Screen
Controlling Program Execution from the Manage Processes Screen
Controlling Program Execution from a Context Menu
Where to Go Next
Task 5: Seeing Memory Usage
Information Types
Process and Library Reports
Chart Report
Where to Go Next
Task 6: Using Runtime Events
Error Notifications
Deallocation and Reuse Notifications
Where to Go Next
Task 7: Graphically Viewing the Heap
Window Sections
Block Information
Bottom Tabbed Areas
Where to Go Next
Task 8: Obtaining Detailed Heap Information
Heap Status Source Report
Heap Status Source Backtrace Report
Where to Go Next
Task 9: Seeing Leaks
Adding, Deleting, Enabling and Disabling Filters
Adding and Editing Filters
Where to Go Next
Task 11: Viewing Corrupted Memory
Examining Corrupted Memory Blocks
Viewing Memory Contents
Task 12: Saving and Restoring Memory State Information
Procedures for Exporting and Adding Memory Data
Using Saved State Information
Where to Go Next
Task 13: Comparing Memory
Overview
Obtaining a Comparison
Memory Comparison Report
Where to Go Next
Task 14: Saving Memory Information as HTML
Saving Report Information
Task 15: Hoarding Deallocated Memory
Task 16: Painting Memory
Chapter 3 Remote Access
Using Remote Display
Chapter 4 Creating Programs for Memory Debugging
Compiling Programs
Linking with the dbfork Library
dbfork on IBM AIX on RS/6000 Systems
Linking C++ Programs with dbfork
dbfork and Linux or Mac OS X
dbfork and SunOS 5 SPARC
Ways to Start MemoryScape
Attaching to Programs
Setting Up MPI Debugging Sessions
Debugging MPI Programs
Debugging MPICH Applications
Starting MemoryScape on an MPICH Job
Attaching to an MPICH Job
Using MPICH P4 procgroup Files
Starting MPI Issues
Debugging IBM MPI Parallel Environment (PE) Applications
Using Switch-Based Communications
Performing a Remote Login
Starting MemoryScape on a PE Program
Attaching to a PE Job
Debugging LAM/MPI Applications
Debugging QSW RMS Applications
Starting MemoryScape on an RMS Job
Attaching to an RMS Job
Debugging Sun MPI Applications
Attaching to a Sun MPI Job
Linking Your Application with the Agent
Using env to Insert the Agent
Installing tvheap_mr.a on AIX
LIBPATH and Linking
Using MemoryScape in Selected Environments
MPICH
IBM PE
RMS MPI
Chapter 5 MemoryScape Scripting
display_specifiers Command-Line Option
event_action Command-Line Option
Other Command Line Options
memscript Example
Chapter 6 MemoryScape Command-Line Options
Invoking MemoryScape
Syntax
Options
Reverse Debugging with ReplayEngine™
Chapter 1 Understanding ReplayEngine
TotalView ReplayEngine: A New Paradigm in Debugging
How ReplayEngine Works
System Resources ReplayEngine Uses
Replaying Your Program
Threads and Processes
Attaching to Running Programs
Saving and Loading the Execution History
Chapter 2 Using ReplayEngine
Enabling and Disabling ReplayEngine
Enabling ReplayEngine at Program Load
Enabling and Disabling ReplayEngine for a Loaded Program
Enabling Replay
Disabling Replay
ReplayEngine and CUDA
ReplayEngine and Expression Evaluation
Examining Program State and History
Setting Preferences
CLI Support
Known Limitations and Issues
Limitations
Performance Issues