dcuda

Manages GPU threads

Format

dcuda block [(Bx,By,Bz)]

dcuda thread [(Tx,Ty,Tz)]

dcuda kernel

dcuda device [<n>]

dcuda sm [<n>]

dcuda warp [<n>]

dcuda lane [<n>]

dcuda info-system

dcuda info-device

dcuda info-sm

dcuda info-warp

dcuda info-lane

dcuda focus (Bx,By,Bz),(Tx,Ty,Tz)

dcuda hwfocus <D/S/W/L>

Arguments

Bx, By, Bz

The x, y, and z block indices

Tx, Ty, Tz

The x, y, and z thread indices

D/S/W/L

The coordinates defining the physical space of the hardware:

D: device number

S: streaming multiprocessor (SM) number on the device

W: warp (WP) number on the SM

L: lane (LN) number on the warp

Description

The dcuda commands let you view and manage GPU threads in either of two coordinate spaces: the logical space of block and thread indices, written <<<(Bx,By,Bz),(Tx,Ty,Tz)>>>, or the physical space of the hardware (the device number, the streaming multiprocessor number on the device, the warp number on the SM, and the lane number on the warp).

dcuda block [(Bx,By,Bz)]

  • With no arguments, shows the current CUDA block.

  • With a block argument of the form (Bx,By,Bz), changes the CUDA focus to that block. Trailing indices (By and Bz, or just Bz) may be omitted; omitted indices keep their current values.

dcuda thread [(Tx,Ty,Tz)]

  • With no arguments, shows the current CUDA thread.

  • With a thread argument of the form (Tx,Ty,Tz), changes the CUDA focus to that thread. Trailing indices (Ty and Tz, or just Tz) may be omitted; omitted indices keep their current values.
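The rule for omitted trailing indices can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not TotalView code; the helper name merge_coords is hypothetical.

```python
def merge_coords(current, given):
    """Overlay a partial (x[, y[, z]]) argument onto the current 3-tuple.

    Hypothetical helper: models the documented rule that indices omitted
    on the right keep their current values.
    """
    if not 1 <= len(given) <= 3:
        raise ValueError("expected one to three indices")
    # Leading indices come from the argument; the rest come from the focus.
    return tuple(given) + tuple(current[len(given):])


current_thread = (1, 1, 0)
print(merge_coords(current_thread, (0,)))    # (0, 1, 0)
print(merge_coords(current_thread, (0, 0)))  # (0, 0, 0)
```

Under this model, a one-index argument changes only the x index and leaves y and z as they were.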

dcuda kernel

Displays the logical and hardware coordinates of the current CUDA context.

dcuda device [<n>]

  • With no arguments, shows the current CUDA device.

  • With a numeric argument, changes the CUDA device focus to that device.

dcuda sm [<n>]

  • With no arguments, shows the current CUDA SM (streaming multiprocessor).

  • With a numeric argument, changes the CUDA SM focus to that SM.

dcuda warp [<n>]

  • With no arguments, shows the current CUDA warp.

  • With a numeric argument, changes the CUDA warp focus to that warp.

dcuda lane [<n>]

  • With no arguments, shows the current CUDA lane.

  • With a numeric argument, changes the CUDA lane focus to that lane.

dcuda info-system

Displays the CUDA devices in the system.

dcuda info-device

Displays currently running SMs in the current device.

dcuda info-sm

Displays valid warps in the current SM.

dcuda info-warp

Displays valid lanes in the current warp.

dcuda info-lane

Displays the current lane.

dcuda focus (Bx,By,Bz),(Tx,Ty,Tz)

Changes the focus via CUDA logical coordinates of the form <<<(Bx,By,Bz),(Tx,Ty,Tz)>>>.

The following abbreviations are also accepted:

<<<Tx>>>

<<<(Tx)>>>

<<<(Tx,Ty)>>>

<<<(Tx,Ty,Tz)>>>

<<<(Bx),(Tx)>>>

<<<(Bx),(Tx,Ty)>>>

<<<(Bx),(Tx,Ty,Tz)>>>

<<<(Bx,By),(Tx)>>>

<<<(Bx,By),(Tx,Ty)>>>

<<<(Bx,By),(Tx,Ty,Tz)>>>

<<<(Bx,By,Bz),(Tx)>>>

<<<(Bx,By,Bz),(Tx,Ty)>>>

<<<(Bx,By,Bz),(Tx,Ty,Tz)>>>

Angle brackets are optional, but must be balanced.
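For scripts that build or validate focus strings, the accepted forms listed above can be modeled with a small Python parser. This is a sketch, not the parser TotalView uses. The name parse_focus is hypothetical, and "balanced" is assumed here to mean the same number of opening and closing angle brackets.

```python
import re

# One parenthesized group of one to three non-negative integers.
_TUPLE = r"\(\s*\d+(?:\s*,\s*\d+){0,2}\s*\)"
# Optional block group, a comma, then a thread group.
_SPEC = re.compile(rf"^(?:(?P<block>{_TUPLE})\s*,\s*)?(?P<thread>{_TUPLE})$")


def parse_focus(text):
    """Return (block, thread) tuples; block is None when it was omitted.

    Hypothetical helper. Each tuple holds only the indices that were given.
    """
    s = text.strip()
    opens = len(s) - len(s.lstrip("<"))
    closes = len(s) - len(s.rstrip(">"))
    if opens != closes:  # assumption: balanced means equal counts
        raise ValueError("unbalanced angle brackets")
    s = s[opens:len(s) - closes].strip()
    if re.fullmatch(r"\d+", s):  # the bare <<<Tx>>> form
        return None, (int(s),)
    m = _SPEC.match(s)
    if m is None:
        raise ValueError(f"unrecognized focus: {text!r}")

    def to_tuple(group):
        return tuple(int(n) for n in re.findall(r"\d+", group))

    block = to_tuple(m.group("block")) if m.group("block") else None
    return block, to_tuple(m.group("thread"))


print(parse_focus("<<<(0,0,0),(1,1,0)>>>"))  # ((0, 0, 0), (1, 1, 0))
print(parse_focus("(1,1)"))                  # (None, (1, 1))
print(parse_focus("<<<3>>>"))                # (None, (3,))
```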

dcuda hwfocus <D/S/W/L>

Changes the focus via CUDA hardware coordinates of the form D/S/W/L, S/W/L, W/L, or L.
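The four accepted forms are right-aligned: a single number is a lane, two numbers are warp/lane, and so on. A minimal Python sketch of that alignment follows. The name apply_hwfocus is hypothetical, and the sketch assumes that coordinates not named in a shortened form keep their current values, which this page does not state explicitly.

```python
FIELDS = ("device", "sm", "warp", "lane")


def apply_hwfocus(current, spec):
    """Return a new focus dict after applying a D/S/W/L-style spec.

    Hypothetical helper. Assumption: leading coordinates missing from a
    shortened spec are left unchanged.
    """
    values = [int(part) for part in spec.split("/")]
    if not 1 <= len(values) <= len(FIELDS):
        raise ValueError("expected one to four numbers separated by '/'")
    updated = dict(current)
    # Right-align: the last number is always the lane.
    for name, value in zip(FIELDS[-len(values):], values):
        updated[name] = value
    return updated


focus = {"device": 0, "sm": 0, "warp": 0, "lane": 3}
print(apply_hwfocus(focus, "2"))        # only the lane changes
print(apply_hwfocus(focus, "0/1/0/1"))  # all four coordinates are given
```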

Command alias

Alias    Definition    Description

cuda     dcuda         Writes out the focus thread, as in dcuda kernel.

Examples

Displaying device information

dcuda info-device

Output:

DEV: 0/1 Device Type: gt200 SM Type: sm_13 SM/WP/LN: 30/32/32 Regs/LN: 128

SM: 0/30 valid warps: 0x0000000000000001

 

dcuda info-sm

Output:

DEV: 0/1 Device Type: gt200 SM Type: sm_13 SM/WP/LN: 30/32/32 Regs/LN: 128

SM: 0/30 valid warps: 0x0000000000000001

WP: 0/32 valid/active/divergent lanes: 0x0000000f/0x0000000f/0x00000000 block: (0,0,0)
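The warp and lane fields in this output are hexadecimal bitmasks. The Python sketch below decodes one, assuming bit i (counting from the least significant bit) stands for warp or lane i. That assumption matches the sample output, where the lane mask 0x0000000f corresponds to lanes 0 through 3. The function name set_bits is hypothetical.

```python
def set_bits(mask_text):
    """Return the positions of the bits set in a hex mask string.

    Hypothetical helper. Assumption: bit 0 is the least significant bit
    and stands for warp 0 or lane 0.
    """
    mask = int(mask_text, 16)  # base 16 accepts the "0x" prefix
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


print(set_bits("0x0000000f"))          # [0, 1, 2, 3]
print(set_bits("0x0000000000000001"))  # [0]
```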

 

dcuda info-warp

Output:

DEV: 0/1 Device Type: gt200 SM Type: sm_13 SM/WP/LN: 30/32/32 Regs/LN: 128

SM: 0/30 valid warps: 0x0000000000000001

WP: 0/32 valid/active/divergent lanes: 0x0000000f/0x0000000f/0x00000000 block: (0,0,0)

LN: 0/32 pc=0x000000001ef2efa8 thread: (0,0,0)

LN: 1/32 pc=0x000000001ef2efa8 thread: (1,0,0)

LN: 2/32 pc=0x000000001ef2efa8 thread: (0,1,0)

LN: 3/32 pc=0x000000001ef2efa8 thread: (1,1,0)
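The lane-to-thread pairing shown above is consistent with CUDA's usual thread linearization, in which x varies fastest and consecutive linear indices fill a warp. The Python sketch below reproduces it under two assumptions that are not stated on this page: the block was launched with dimensions (2,2,1), and the lane equals the linear index modulo the warp size. The debugger output remains the authority on where a thread actually resides.

```python
def linear_thread_index(thread, block_dim):
    """Linearize (Tx,Ty,Tz) within a block, with x varying fastest."""
    tx, ty, tz = thread
    dx, dy, _dz = block_dim
    return tx + ty * dx + tz * dx * dy


def lane_of(thread, block_dim, warp_size=32):
    """Assumed mapping: lane = linear thread index modulo the warp size."""
    return linear_thread_index(thread, block_dim) % warp_size


block_dim = (2, 2, 1)  # assumption, inferred from the four threads shown
for thread in [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]:
    print(thread, "-> lane", lane_of(thread, block_dim))
```

With these assumptions, thread (0,1,0) lands in lane 2 and thread (1,1,0) in lane 3, as in the LN lines above.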

 

dcuda info-lane

Output:

DEV: 0/1 Device Type: gt200 SM Type: sm_13 SM/WP/LN: 30/32/32 Regs/LN: 128

SM: 0/30 valid warps: 0x0000000000000001

WP: 0/32 valid/active/divergent lanes: 0x0000000f/0x0000000f/0x00000000 block: (0,0,0)

Displaying the focus

dcuda warp sm

Output:

sm 0 warp 0

 

dcuda lane device

Output:

device 0 lane 3

 

dcuda thread

Output:

thread (1,1,0)

 

dcuda kernel

Output:

device 0, sm 0, warp 0, lane 3, block (0,0,0), thread (1,1,0)

Changing the focus

Note that TotalView assigns CUDA threads negative thread IDs. In the examples below, the CUDA thread is labeled "1.-1".

dcuda thread (1,1,0)

Changes the CUDA focus to the thread represented by logical coordinates 1,1,0.

New CUDA focus (1.-1): device 0, sm 0, warp 0, lane 3, block (0,0,0), thread (1,1,0)

 

dcuda lane 2

Changes the CUDA focus to lane 2.

New CUDA focus (1.-1): device 0, sm 0, warp 0, lane 2, block (0,0,0), thread (0,1,0)

 

dcuda lane 1 sm 0

Changes the CUDA focus to lane 1 and to SM 0.

New CUDA focus (1.-1): device 0, sm 0, warp 0, lane 1, block (0,0,0), thread (1,0,0)

 

dcuda thread 0,0,0

Changes the CUDA focus to thread 0,0,0.

New CUDA focus (1.-1): device 0, sm 0, warp 0, lane 0, block (0,0,0), thread (0,0,0)

 

dcuda thread 1

Changes the CUDA focus to thread 1,0,0.

New CUDA focus (1.-1): device 0, sm 0, warp 0, lane 1, block (0,0,0), thread (1,0,0)

Related Topics

Debugging NVIDIA CUDA Programs