Using MRNet on Cray Computers

The following sections describe the options and state variables that control the configuration and use of MRNet on Cray. For more information, see TotalView Command Syntax and TotalView Variables.

For more information on Cray, see Debugging Cray XT/XE/XK/XC Applications on page 540.

is_cray_xt Flag

State variable: TV::is_cray_xt boolean

Default value: Set to true if TotalView is running on Linux-x86_64 or Linux-ARM64 (aarch64) and /proc/cray_xt/nid exists; otherwise, set to false.

Note that some Cray front-end (elogin) nodes do not have a /proc/cray_xt/nid file. In that case, either submit a job to start TotalView on a Cray XT/XE/XK/XC node, or use tvconnect in your batch job. (For details on tvconnect, see Reverse Connections on page 491.)

is_cray_cti Flag

State variable: TV::is_cray_cti boolean

Default value: Set to true if TotalView is running on Linux-x86_64 or Linux-ARM64 (aarch64) and /opt/cray/pe/cti/ exists; otherwise, set to false. TotalView uses the CTI (Cray Tools Interface) library to deploy debugger processes on the node where your application is running.
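The default-value rules for the two flags can be sketched as follows. This is only an illustration of the documented conditions, not TotalView's implementation; the function name cray_flags and its injectable exists parameter are hypothetical and exist only to make the logic easy to exercise.

```python
import os
import platform

# Architectures named in the default-value descriptions above.
SUPPORTED_MACHINES = {"x86_64", "aarch64"}

def cray_flags(system=None, machine=None, exists=os.path.exists):
    """Return the documented defaults for is_cray_xt and is_cray_cti.

    system/machine default to the running platform; exists can be
    replaced with a stub to simulate a particular node.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    supported = system == "Linux" and machine in SUPPORTED_MACHINES
    return {
        "is_cray_xt": bool(supported and exists("/proc/cray_xt/nid")),
        "is_cray_cti": bool(supported and exists("/opt/cray/pe/cti/")),
    }

if __name__ == "__main__":
    print(cray_flags())
```

Running the sketch on an elogin node that lacks /proc/cray_xt/nid would report is_cray_xt as false, which matches the note above about submitting a job or using tvconnect.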

-cray_xt_mrnet_server_launch_string

Option: -cray_xt_mrnet_server_launch_string string

State variable: TV::cray_xt_mrnet_server_launch_string string

Default value: /var/spool/alps/%A/toolhelper%A/tvdsvr%K \
             -working_directory %D -set_pw %P -verbosity %V %F

Analogous to the standard MRNet server launch string, the Cray XT MRNet server launch string is used when MRNet launches the TotalView debugger servers on Cray through the ATH (ALPS Tool Helper) library. TotalView expands the launch string using the normal launch string expansion rules.
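The mechanics of expanding %-tokens in a launch string can be illustrated with a small sketch. It does not reproduce TotalView's actual expansion rules, and this section does not define what each token stands for, so every value in example_values is an invented placeholder. The function name expand_launch_string is likewise hypothetical.

```python
import re

def expand_launch_string(template, values):
    """Replace each %X token with values['X']; unknown tokens are left as-is."""
    def substitute(match):
        return values.get(match.group(1), match.group(0))
    return re.sub(r"%([A-Za-z])", substitute, template)

# The documented default launch string, joined onto one line.
DEFAULT_TEMPLATE = (
    "/var/spool/alps/%A/toolhelper%A/tvdsvr%K "
    "-working_directory %D -set_pw %P -verbosity %V %F"
)

if __name__ == "__main__":
    # Placeholder values only; not what TotalView would substitute.
    example_values = {"A": "<A>", "K": "<K>", "D": "<D>",
                      "P": "<P>", "V": "<V>", "F": "<F>"}
    print(expand_launch_string(DEFAULT_TEMPLATE, example_values))
```

Note that %A appears twice in the default string, so a single substitution pass has to replace every occurrence of a token, which re.sub does here.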

-cray_xt_mrnet_xfer_file_list

Option: -cray_xt_mrnet_xfer_file_list stringlist

State variable: TV::cray_xt_mrnet_xfer_file_list stringlist

Default value:

The default value is calculated at TotalView startup time as follows. TotalView starts from a "base" list of the files it needs on the Cray compute nodes when MRNet and the Cray ATH libraries are in use:

TVROOT/bin/mrnet_commnode_main_cray_xt

TVROOT/bin/tvdsvr_mrnet

TVROOT/bin/tvdsvrmain_mrnet

TVROOT/shlib/mpa/obj_cray_xt/libmpattr.so.1

TVROOT/shlib/unwind/obj/libunwind-*.so.8

TVROOT/shlib/mrnet/obj_cray_xt/libmrnet.so

TVROOT/shlib/mrnet/obj_cray_xt/libxplat.so

TVROOT/shlib/mrnet/obj_cray_xt/libservertree_filters.so.1

TVROOT/shlib/mrnet/obj_cray_xt/libtvwrapalps.so.1

/lib64/libthread_db.so.1

Note that the name of the "libunwind-*.so.8" library depends on the platform, and will be either "libunwind-x86_64.so.8" for x86_64 or "libunwind-aarch64.so.8" for ARM64.
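As a minimal sketch of the platform-dependent naming just described, the lookup below maps a machine name to the corresponding libunwind file name. The function name libunwind_name is hypothetical, and using Python's platform.machine() to identify the platform is an assumption of this example.

```python
import platform

# The two names given in the note above.
LIBUNWIND_BY_MACHINE = {
    "x86_64": "libunwind-x86_64.so.8",
    "aarch64": "libunwind-aarch64.so.8",
}

def libunwind_name(machine=None):
    """Return the libunwind file name for the given (or current) machine."""
    machine = platform.machine() if machine is None else machine
    try:
        return LIBUNWIND_BY_MACHINE[machine]
    except KeyError:
        raise ValueError(f"no libunwind name documented for {machine!r}") from None
```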

On the x86_64 platform, TotalView also stages the files required to support ReplayEngine, which include:

/usr/bin/ld

/usr/bin/objcopy

TVROOT/lib/libundodb_debugger_x64.so

TVROOT/lib/undodb_a_x64.o

TVROOT/lib/libundodb_infiniband_preload_x64.so

TVROOT/lib/undodb_a_x32.o

TVROOT/lib/libundodb_infiniband_preload_x32.so

The above list is then passed to the shell script named "cray_sysdso_deps.sh" to calculate the system shared libraries needed by the executables and shared libraries on the base list. The actual list of system libraries can vary from system to system, but typically consists of the following files:

/lib64/libgcc_s.so.1

/usr/lib64/libbfd-<version>.so

/usr/lib64/libstdc++.so.6

The version of libbfd, which is needed by ld and objcopy, varies from system to system.
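The contents of cray_sysdso_deps.sh are not shown here, so the following is only a sketch of one way such a dependency calculation could work: parsing the output of the standard ldd tool to collect the absolute paths of resolved shared libraries. The function name parse_ldd_output and the sample text are illustrative assumptions, not part of TotalView.

```python
def parse_ldd_output(text):
    """Collect absolute library paths from ldd-style output, in order, without duplicates."""
    paths = []
    for line in text.splitlines():
        line = line.strip()
        # "name => /path (addr)" or "/path (addr)"; take the part that may hold a path.
        target = line.split("=>", 1)[1] if "=>" in line else line
        parts = target.split()
        candidate = parts[0] if parts else ""
        # Skips "not found" entries and virtual objects such as linux-vdso.
        if candidate.startswith("/") and candidate not in paths:
            paths.append(candidate)
    return paths

SAMPLE = (
    "\tlinux-vdso.so.1 (0x00007ffd0000)\n"
    "\tlibstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x00007f010000)\n"
    "\tlibgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f020000)\n"
    "\tlibmissing.so => not found\n"
)

if __name__ == "__main__":
    for path in parse_ldd_output(SAMPLE):
        print(path)
```

In practice the text would come from running ldd on each executable and shared library in the base list, and the results would vary from system to system, as noted above for libbfd.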

The default value is a space-separated string-list of file names that are transferred (staged) to the compute nodes. These files are the shell script, executable and shared library files required to run the MRNet commnode and TotalView debugger server processes on the compute nodes. When instantiating the MRNet tree on Cray, the ALPS Tool Helper library is used to broadcast these files into the compute nodes' ramdisk under the /var/spool/alps/apid directory. TVROOT is the path to the platform-specific files in the TotalView installation.
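To make the shape of the resulting value concrete, the sketch below assembles a space-separated list from the files named in this section. It is an illustration under assumptions: the function name build_ath_xfer_list is hypothetical, the TVROOT value passed in is up to the caller, and the ordering of the system dependencies after the other files is a guess rather than documented behavior.

```python
# Paths relative to TVROOT, taken from the base list in this section.
ATH_BASE_FILES = [
    "bin/mrnet_commnode_main_cray_xt",
    "bin/tvdsvr_mrnet",
    "bin/tvdsvrmain_mrnet",
    "shlib/mpa/obj_cray_xt/libmpattr.so.1",
    "shlib/unwind/obj/libunwind-{machine}.so.8",
    "shlib/mrnet/obj_cray_xt/libmrnet.so",
    "shlib/mrnet/obj_cray_xt/libxplat.so",
    "shlib/mrnet/obj_cray_xt/libservertree_filters.so.1",
    "shlib/mrnet/obj_cray_xt/libtvwrapalps.so.1",
]

# ReplayEngine support files under TVROOT (x86_64 only).
REPLAY_TVROOT_FILES = [
    "lib/libundodb_debugger_x64.so",
    "lib/undodb_a_x64.o",
    "lib/libundodb_infiniband_preload_x64.so",
    "lib/undodb_a_x32.o",
    "lib/libundodb_infiniband_preload_x32.so",
]

def build_ath_xfer_list(tvroot, machine, system_deps=()):
    """Return a space-separated file list for machine 'x86_64' or 'aarch64'."""
    files = [f"{tvroot}/{rel.format(machine=machine)}" for rel in ATH_BASE_FILES]
    files.append("/lib64/libthread_db.so.1")
    if machine == "x86_64":
        files += ["/usr/bin/ld", "/usr/bin/objcopy"]
        files += [f"{tvroot}/{rel}" for rel in REPLAY_TVROOT_FILES]
    # Assumed ordering: system dependencies go last.
    files += list(system_deps)
    return " ".join(files)
```

Because the value is a single space-separated string, none of the paths may themselves contain spaces; the sketch does not attempt any quoting.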

Note that most up-to-date Cray systems support the debugger through the Cray Tools Interface (CTI) library. For older legacy Cray systems that do not have CTI available, TotalView attempts to provide support through the ALPS Tool Helper (ATH) library.

-cray_cti_mrnet_xfer_file_list

Option: -cray_cti_mrnet_xfer_file_list stringlist

State variable: TV::cray_cti_mrnet_xfer_file_list stringlist

Default value:

The default value is calculated at TotalView startup time as follows. TotalView starts from a "base" list of the files it needs on the Cray compute nodes when MRNet and the Cray CTI libraries are in use:

TVROOT/bin/mrnet_commnode_main_cray_cti

TVROOT/bin/tvdsvrmain_mrnet

TVROOT/shlib/mpa/obj_cray_xt/libmpattr.so.1

TVROOT/shlib/unwind/obj/libunwind-*.so.8

TVROOT/shlib/mrnet/obj_cray_cti/libmrnet.so

TVROOT/shlib/mrnet/obj_cray_cti/libxplat.so

TVROOT/shlib/mrnet/obj_cray_cti/libservertree_filters.so.1

TVROOT/shlib/mrnet/obj_cray_cti/libtvwrapcti.so.1

/lib64/libthread_db.so.1

Note that the name of the "libunwind-*.so.8" library depends on the platform, and will be either "libunwind-x86_64.so.8" for x86_64 or "libunwind-aarch64.so.8" for ARM64.

On the x86_64 platform, TotalView also stages the files required to support ReplayEngine, which include:

/usr/bin/ld

/usr/bin/objcopy

TVROOT/lib/libundodb_debugger_x64.so

TVROOT/lib/undodb_a_x64.o

TVROOT/lib/libundodb_infiniband_preload_x64.so

Note that CTI does not support staging 32-bit ELF files, so they are not included in the above list. CTI calculates shared library dependencies itself and takes care of staging any additional shared libraries that are required.
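For readers who want to check whether a given file is a 32-bit ELF file, the sketch below inspects the ELF identification bytes: the four-byte magic number followed by the class byte (1 for 32-bit, 2 for 64-bit). This is a generic ELF check, not a description of how TotalView or CTI builds the list; the function names are hypothetical.

```python
ELF_MAGIC = b"\x7fELF"
ELFCLASS32 = 1
ELFCLASS64 = 2

def elf_class(header):
    """Return 32 or 64 for ELF header bytes, or None if they are not ELF."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        return None
    return {ELFCLASS32: 32, ELFCLASS64: 64}.get(header[4])

def read_header(path):
    """Read just enough of a file to classify it."""
    with open(path, "rb") as f:
        return f.read(5)

def without_32bit_elf(paths, read=read_header):
    """Keep every path except those identified as 32-bit ELF files."""
    return [p for p in paths if elf_class(read(p)) != 32]
```

Non-ELF files such as shell scripts are kept by without_32bit_elf, since only files positively identified as 32-bit ELF are dropped.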