Single-Stepping GPU Code
TotalView lets you single-step GPU code just as you would host code. The GPU focus work-item guides the single-step operation, much as the focus thread does when you single-step a CPU thread. To implement source-level single-stepping on the GPU, TotalView uses GPU-level instruction disassembly, instruction-level stepping, and breakpoint hopping. How the other waves behave during a step depends on the single-stepping width:
* Thread-width single-stepping: When the single-stepping width is a single GPU agent thread, all the waves on the agent are allowed to execute until the GPU focus work-item reaches the source-level single-stepping goal.
* Process-width or group-width single-stepping: When the single-stepping width is a process or a group, all the GPU agent threads in the process or group are allowed to execute. This technique tends to keep all the waves executing in lockstep because they normally hit the temporary breakpoints planted by the source-level single stepper. However, wave and work-item divergence are allowed.
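The lockstep-with-divergence behavior described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not TotalView's implementation: the `Wave` class, the `step_all` function, and the idea of a single temporary breakpoint at the focus wave's next source line are all hypothetical simplifications. Each wave is reduced to the list of source lines it will execute, and a step resumes every wave until it reaches the temporary breakpoint.

```python
from dataclasses import dataclass


@dataclass
class Wave:
    """Hypothetical stand-in for a GPU wave (not a TotalView type)."""
    name: str
    path: list    # source line numbers, in the order this wave executes them
    pos: int = 0  # index into path of the line the wave is stopped at

    @property
    def line(self):
        # None means the wave has run off the end of its path (finished).
        return self.path[self.pos] if self.pos < len(self.path) else None


def step_all(waves, focus):
    """Toy source-level step over a group of waves.

    Assumption: the stepper plants one temporary breakpoint at the focus
    wave's next source line and resumes every wave. A wave stops when it
    reaches that line; a divergent wave that never reaches it keeps running.
    """
    goal = focus.path[focus.pos + 1]
    for w in waves:
        if w.line is None:
            continue              # already finished, nothing to resume
        w.pos += 1                # leave the current line
        while w.line is not None and w.line != goal:
            w.pos += 1            # keep running until the breakpoint is hit
    return goal


# Two waves take the same branch; w2 diverges at line 10 and skips line 11.
waves = [
    Wave("w0", [10, 11, 14]),
    Wave("w1", [10, 11, 14]),
    Wave("w2", [10, 12, 14]),
]
goal = step_all(waves, focus=waves[0])
stopped_at = [w.line for w in waves]
```

In this model `w0` and `w1` stop together at the goal line, which is the lockstep case, while `w2` never executes that line and runs on. A real stepper has to cope with more than this (multiple possible successor lines, per-work-item execution masks), so treat the model only as an aid for reading the two bullet points.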