SDK 3.0, 3.1, 3.2, 4.0, and 4.1 Limitations
* Kernel launches: The CUDA debugging environment enforces blocking kernel launches.
* Device memory: Device memory allocated via cudaMalloc() is not visible outside the kernel function.
* Illegal program behavior: The debugger does not catch all illegal program behavior; examples include out-of-bounds memory accesses and divide-by-zero. For information on detecting addressing violations and errors in general, see “Enabling CUDA MemoryChecker Feature” and “GPU Error Reporting”.
* Device allocations: Device allocations larger than 100 MB on Tesla GPUs, and larger than 32 MB on Fermi GPUs, may not be accessible in the debugger.
* Breakpoints: Breakpoints in divergent code may not behave as expected.
* Textures: Debugging applications using textures is not supported on GPUs with sm_type less than sm_20.
* Multiple CUDA contexts: With SDK 3.0, 3.1, 3.2, and 4.0, debugging applications that run multiple CUDA contexts on the same GPU is not supported on any GPU. With SDK 4.1, this limitation applies only to GPUs with sm_type less than sm_20.
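
The numeric limits in this list (allocation size, texture support, and multiple contexts) can be written down as a small pre-flight check that an application runs before a debugging session. The following is only a sketch under stated assumptions, not part of TotalView or the CUDA SDK: every name in it (GpuArch, allocMayBeInaccessible, textureDebuggingSupported, multiContextDebuggingSupported) is hypothetical, "MB" is assumed to mean 2^20 bytes, and sm_type is passed as an integer such as 13 for sm_13 or 20 for sm_20.

```cpp
#include <cassert>
#include <cstdint>

// GPU generations named in the limitations list (hypothetical enum).
enum class GpuArch { Tesla, Fermi };

// Assumption: "MB" is read as 2^20 bytes; the guide does not state the unit.
constexpr std::uint64_t kMiB = 1024ull * 1024ull;

// Size above which a device allocation may not be accessible in the
// debugger: 100 MB on Tesla GPUs, 32 MB on Fermi GPUs.
constexpr std::uint64_t debuggerAllocLimitBytes(GpuArch arch) {
    return (arch == GpuArch::Tesla ? 100 : 32) * kMiB;
}

// True when an allocation of `bytes` exceeds the limit for `arch`.
constexpr bool allocMayBeInaccessible(GpuArch arch, std::uint64_t bytes) {
    return bytes > debuggerAllocLimitBytes(arch);
}

// Texture debugging is not supported when sm_type is less than sm_20.
// smVersion is the numeric part of sm_type, e.g. 13 or 20 (assumption).
constexpr bool textureDebuggingSupported(int smVersion) {
    return smVersion >= 20;
}

// Multiple CUDA contexts on one GPU: unsupported on any GPU for SDK 3.0,
// 3.1, 3.2, and 4.0; for SDK 4.1, unsupported only below sm_20.
// This covers only the SDK versions listed here; any other version
// conservatively reports "not supported".
constexpr bool multiContextDebuggingSupported(int sdkMajor, int sdkMinor,
                                              int smVersion) {
    return sdkMajor == 4 && sdkMinor == 1 && smVersion >= 20;
}

// Example checks that follow directly from the limits above.
inline void exampleChecks() {
    assert(allocMayBeInaccessible(GpuArch::Fermi, 64 * kMiB));   // over 32 MB
    assert(!allocMayBeInaccessible(GpuArch::Tesla, 64 * kMiB));  // under 100 MB
    assert(!textureDebuggingSupported(13));
    assert(multiContextDebuggingSupported(4, 1, 20));
    assert(!multiContextDebuggingSupported(4, 0, 20));
}
```

A result of true from allocMayBeInaccessible means only that the buffer may not be viewable in the debugger; the list does not say the allocation itself fails.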
