AMD ROCm Debugging Model and Unified Display
Debugging HIP programs that run on an AMD GPU complicates setting action points. When the host process starts, the GPU code objects have not yet been loaded onto the GPU, so the debugger cannot yet see the GPU code and cannot set breakpoints in it. (The same is true of any library that is loaded dynamically with dlopen and that the program was not originally linked against.)
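The dlopen case is the easier of the two to reproduce without GPU hardware, so the following minimal sketch uses it to show why a debugger has nothing to attach a breakpoint to at process start. It assumes a POSIX system that provides dlfcn.h; the helper name resolve_at_runtime is hypothetical and is not part of TotalView or ROCm.

```cpp
#include <cassert>
#include <dlfcn.h>

// Hypothetical helper (not a TotalView or ROCm API): load a shared library
// at runtime and look up one symbol in it. Until dlopen() returns, the
// library's code is not mapped into the process, so neither the program
// nor a debugger can resolve an address for the symbol.
void* resolve_at_runtime(const char* library_path, const char* symbol_name) {
    assert(library_path != nullptr && symbol_name != nullptr);

    // RTLD_NOW resolves all of the library's undefined symbols immediately.
    void* handle = dlopen(library_path, RTLD_NOW);
    if (handle == nullptr) {
        return nullptr;  // library not found or not loadable
    }

    // dlsym() returns nullptr if the symbol is absent from the library.
    // The handle is deliberately left open so that a returned address
    // stays valid for the lifetime of the process.
    return dlsym(handle, symbol_name);
}
```

A caller would cast the returned address to the appropriate function-pointer type before using it. A breakpoint requested on that function before the dlopen call has no address to bind to, which is the same situation a GPU kernel is in before its code object is loaded.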
Because GPU code is loaded late, TotalView lets you set a breakpoint on any line in the Source view, whether or not it can identify executable code for that line. The breakpoint remains either a pending breakpoint or a sliding breakpoint until the GPU code is loaded onto the GPU agent at runtime.
The Source view provides a unified display whose line number symbols and breakpoints span the host executable, host shared libraries, and the GPU ELF images loaded into the GPU agents. With this design, you can set breakpoints and view line number information for the host code and the HIP code at the same time. The unified display is made possible by the way TotalView groups GPU agents, discussed in the section The TotalView AMD ROCm Debugging Model.