Chapter 10 ReplayEngine
ReplayEngine is embedded functionality on Linux x86 and x86-64 platforms that allows you to go backward in a debugging session.
Note: If your platform does not support ReplayEngine, the ReplayEngine toolbar and ReplayEngine-related menu items do not display.
To make best use of this functionality, see "How ReplayEngine Works".
For information on using ReplayEngine in a debugging session, see "Using ReplayEngine".
ReplayEngine complements NextGen TotalView for HPC, so this discussion assumes a working knowledge of how the NextGen TotalView for HPC product works.
How ReplayEngine Works
Play It Backwards
The hardest step in locating software bugs involves working backward from a failure to identify the error that caused it. Conventional debugging techniques don’t make it easy to find the cause of an error, as they allow you to control program execution only in a forward direction.
Instead of going back to the beginning to try to recreate the conditions of a problem, ReplayEngine starts from the point of failure and works backward in time to find the cause. Recreating the conditions of a crash, sometimes the hardest problem in conventional forward debugging, is no longer necessary. You can move backward to locate errors that occurred long before the failure they caused.
The Process of Recording and Playback
In order to move backward in your program, ReplayEngine saves state information as your program executes. This information includes the order in which your program executes and any changes to its data. When ReplayEngine is saving state information, it’s in record mode.
The saved state information is the program’s execution history. You can save the execution history at any time and then reload the recording when debugging the executable in a subsequent session.
Using a ReplayEngine command, either by clicking a toolbar button or by entering a command into the CLI, shifts ReplayEngine into replay mode. In this mode, you can move to any previously executed statement, at which point ReplayEngine displays its saved state information. The information displayed in replay mode is identical to the information displayed in record mode.
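The record-then-replay cycle described above can be pictured with a small conceptual sketch. This is a toy model only, not ReplayEngine's actual mechanism: the ToyRecorder class and its execute() and replay_at() methods are hypothetical names invented for illustration, and the sketch assumes the whole program state fits in a dictionary that is copied after every statement.

```python
import copy


class ToyRecorder:
    """Toy model of record and replay; NOT how ReplayEngine is implemented."""

    def __init__(self, initial_state):
        self.state = dict(initial_state)
        # history[i] holds (statement label, snapshot of the state after it ran).
        self.history = [("<start>", copy.deepcopy(self.state))]

    def execute(self, label, update):
        """Record mode: run one 'statement' and save the resulting state."""
        update(self.state)
        self.history.append((label, copy.deepcopy(self.state)))

    def replay_at(self, index):
        """Replay mode: return the saved state at a previously executed statement."""
        label, snapshot = self.history[index]
        return label, copy.deepcopy(snapshot)


rec = ToyRecorder({"total": 0})
rec.execute("total += 5", lambda s: s.update(total=s["total"] + 5))
rec.execute("total *= 3", lambda s: s.update(total=s["total"] * 3))

# Move backward to the state saved just after the first statement.
label, past = rec.replay_at(1)
print(label, past)
```

The point of the sketch is that replay_at() never re-runs a statement. It only reads what was saved during recording, which is why the information shown in replay mode matches what was shown in record mode.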
Most debugging commands work the same in replay mode as in record mode. Commands such as viewing a variable or setting a breakpoint work as you would expect. The commands that do not work as expected are those that would alter a recorded state. Typically, these are commands that:
• Change a variable’s value.
• Call functions that alter memory.
• Run threads asynchronously.
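One way to picture why these commands behave differently is to treat the recorded state as read-only. The following is a conceptual sketch under that assumption, not a description of how ReplayEngine responds to such commands; the names recorded and replay_view are hypothetical.

```python
from types import MappingProxyType

# A state captured during record mode (hypothetical example data).
recorded = {"count": 3}

# Read-only view of that state, standing in for replay mode.
replay_view = MappingProxyType(recorded)

# Viewing a variable works as usual.
print(replay_view["count"])

# Changing a variable's value would alter recorded history, so it is refused here.
try:
    replay_view["count"] = 10
    changed = True
except TypeError:
    changed = False

print("value changed:", changed)
```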
If your program calls a routine that displays information, the routine does not display this information in replay mode. For example, suppose your program calls printf(). When the printf() is executed in record mode, it writes text. However, when the printf() is replayed, the text is not written again. Similarly, if your program unlinks a file in record mode, the file remains unlinked in replay mode. That is, the file stays unlinked even if you move back to a point before the unlink statement.
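The difference between recorded state and external side effects can be sketched as follows. This is a toy illustration in Python rather than a ReplayEngine session: the log list is a hypothetical stand-in for the execution history, and the temporary file stands in for the file your program unlinks.

```python
import os
import tempfile

# Hypothetical execution history: (statement, whether the file existed afterward).
log = []

# Create a scratch file that the "program" will later unlink.
fd, path = tempfile.mkstemp()
os.close(fd)

# Record mode: statements really execute, including their side effects.
print("hello from record mode")
log.append(("printf", os.path.exists(path)))
os.unlink(path)
log.append(("unlink", os.path.exists(path)))

# Replay mode: step back to before the unlink by reading the history only.
statement, existed_then = log[0]
exists_now = os.path.exists(path)

print("recorded state says the file existed:", existed_then)
print("file exists on disk now:", exists_now)
```

Stepping back reads the saved history without re-executing anything, so the message is not printed a second time and the file on disk is not restored.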
When executing in record mode, your program runs more slowly than it does without ReplayEngine turned on. Usually, you will not notice the extra execution time. However, when you are in replay mode, the computational overhead required to recreate the program's state may be noticeable. When it needs extra time, ReplayEngine displays a dialog box that lets you cancel the operation.