Support for Cray Abnormal Termination Processing (ATP)
Cray's ATP module stops a running job at the moment it crashes, which allows you to attach TotalView to the held job and begin debugging it. To hold a job as it is crashing, you must set the ATP_HOLD_TIME environment variable before launching your job with aprun.
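As a rough illustration, the sequence of setting the variable and then launching might look like the sketch below. The hold time value of 5, the assumption that it is interpreted as minutes, the application name ./my_app, and the process count are all placeholders chosen for this example, not values taken from this guide; consult the intro_atp man page for the accepted values on your system.

```shell
#!/bin/sh
# Sketch only: the value 5 (assumed to mean minutes) is an illustrative
# placeholder; see the intro_atp man page for what your system accepts.
export ATP_HOLD_TIME=5

# Hypothetical application name and process count.
APP=./my_app
NPROCS=4

# Launch with aprun only if it is available on this machine.
if command -v aprun >/dev/null 2>&1; then
    aprun -n "$NPROCS" "$APP"
else
    echo "aprun not found; would run: aprun -n $NPROCS $APP"
fi
```

The variable must be exported in the same shell or batch script that invokes aprun, so that the launcher inherits it.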
When your job crashes, aprun outputs a message stating that your job has crashed and that ATP is holding it. You can now attach TotalView to aprun using the normal attach procedure (see "Attaching to a Running Program").
For more information on ATP, see the Cray intro_atp man page.