Two state variables and their related command line options enable you to filter dlopen events to defer planting breakpoints in the dlopened libraries until the process stops for some other reason. Deferring dlopen event processing allows the debugger to handle all dynamically loaded shared libraries at the same time, which is much more efficient than handling them serially.
These variables provide three dlopen filtering modes: Slow, Medium, and Fast. Fast provides the best performance, although it is not suitable for some debugging situations. For details, see Filtering Modes.
You can configure TotalView to filter dlopen events for all invocations of TotalView using your .tvdrc file. For example, to use Fast mode by default for every TotalView session, add the following to your .tvdrc file:
dset TV::dlopen_always_recalculate false
dset TV::dlopen_recalculate_on_match ""
Or, launch just an individual instance of TotalView with these settings by entering:
Filtering modes for dlopen include Fast, Medium, and Slow. In Fast mode, the process never stops for a dlopen event, not even "null" dlopen events. Using this option can result in significant performance gains, but may be impractical for some applications. In Medium mode, you specify which libraries cause breakpoint specifications to be reevaluated immediately and which are deferred, rather than treating all libraries the same way. In Slow mode, every dlopen event results in the immediate reevaluation of breakpoint specifications.
• Slow Mode: Reloads libraries on every dlopen event
Option:
dset TV::dlopen_always_recalculate true
Reloads libraries on every dlopen event, retaining TotalView’s traditional breakpoint reevaluation semantics. This mode is compatible with CUDA and is a good choice when your session has pending breakpoints. However, this mode does not perform or scale as well as the other modes, because it requires the TotalView client to handle every (non-null) dlopen event for every process.
If performance is not the primary concern, or the application or runtime environment does not perform many dlopen events, then this may be a good choice.
In this mode, when the target stops with a dlopen event, the debugger server reports the event to the debugger client, where the library list is reloaded and checked to see if any additional breakpoint locations need to be planted in the newly loaded libraries.
• Medium Mode: Reports or defers libraries that match defined patterns on a dlopen event
Options:
dset TV::dlopen_always_recalculate false
dset TV::dlopen_recalculate_on_match {glob-list}
A glob-list is a colon-separated list of positive or negated Tcl glob match patterns used to determine if the dlopened library event should be reported or deferred. For example:
Immediately report dlopen events for libraries that match any of the patterns on the glob-list, but defer reporting other dlopen events:
"*/libcuda.so*:*/libmylib1*:*/libmylib2.so"
Defer reporting dlopen events for libraries that match any of the patterns on the glob-list, but immediately report other dlopen events:
The glob match rules are defined by the standard Tcl string match command. For details and examples, see glob-list Matching Rules. Note that the library names are typically absolute path names, for example "/lib64/libc.so", so the glob patterns must take that into account.
This mode strikes a balance between performance and enabling breakpoints to be planted in dlopened libraries, and is useful if you have specific shared libraries that you know you always, or never, want to defer. For example, Open MPI performs many dlopen calls in parallel programs, but most users are not interested in planting breakpoints in, or debugging, the Open MPI libraries themselves. Therefore, it makes sense to defer reporting dlopen events for Open MPI libraries.
In Medium mode, the target process stops on every dlopen event (just as in Slow mode), but:
— If a newly loaded library matches a positive glob-list entry, the event is immediately reported to the client, but all other libraries are deferred. For example:
Here, when /home/jones/libfoo.so or /home/jones/libbar.so is loaded, the dlopen event is immediately reported and breakpoints are reevaluated because the library name matches a pattern in the glob-list. However, when /usr/lib64/libompi.so is loaded, reporting the event is deferred because its name does not match a pattern in the glob-list.
— If a newly loaded library matches a negated glob-list pattern, and the list contains only negated patterns (i.e., does not contain a combination of negated and positive patterns), the event is deferred for that library, but all other libraries not matching a negated pattern are immediately reported.
— If a newly loaded library matches a negated glob-list pattern, and the list contains a combination of positive and negated patterns, the event might be deferred, depending on other library names in the library list. See glob-list Matching Rules for details.
— If you are debugging CUDA, add "*/libcuda.so*" to the glob-list; otherwise, TotalView will miss CUDA kernel launch events.
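The positive-only and negated-only cases above can be modeled in a few lines. This is a minimal sketch, not TotalView's implementation: it assumes that Python's fnmatch.fnmatchcase behaves like Tcl string match for patterns that use only * and ?. The function name medium_mode_action, the negated list "!*/libompi*", and the path /usr/lib64/libcuda.so.1 are hypothetical. Mixed lists are rejected here because they follow the ordered rules described in glob-list Matching Rules.

```python
from fnmatch import fnmatchcase


def medium_mode_action(glob_list, library):
    """Return "report" or "defer" for one library under a single-polarity glob-list."""
    patterns = glob_list.split(":")
    negated = [p.startswith("!") for p in patterns]
    if any(negated) and not all(negated):
        raise ValueError("mixed lists follow the ordered matching rules")
    if all(negated):
        # Negated-only list: a match defers the event, anything else is reported.
        hit = any(fnmatchcase(library, p[1:]) for p in patterns)
        return "defer" if hit else "report"
    # Positive-only list: a match reports the event, anything else is deferred.
    hit = any(fnmatchcase(library, p) for p in patterns)
    return "report" if hit else "defer"


positive = "*/libcuda.so*:*/libmylib1*:*/libmylib2.so"  # list quoted in the text
negated = "!*/libompi*"                                  # hypothetical list
print(medium_mode_action(positive, "/usr/lib64/libcuda.so.1"))  # report
print(medium_mode_action(positive, "/usr/lib64/libompi.so"))    # defer
print(medium_mode_action(negated, "/usr/lib64/libompi.so"))     # defer
print(medium_mode_action(negated, "/home/jones/libfoo.so"))     # report
```

In this model, an empty list ("") contains only a positive empty pattern, so every event is deferred, which corresponds to Fast mode.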
• Fast Mode: Does not stop for dlopen events
Options:
dset TV::dlopen_always_recalculate false
dset TV::dlopen_recalculate_on_match ""
This mode provides the best performance, deferring planting breakpoints in all dlopened libraries when a library is loaded. Breakpoints (pending or not) are planted in the dlopened libraries only when the process stops for some other reason; however, be aware that with this option, an application may have executed past the point at which you want to start debugging inside the dlopened library.
Because the debugger does not plant the dlopen breakpoint in the process, the process never stops for a dlopen event, not even "null" dlopen events.
While this mode may be impractical for some debugging situations, the performance gains are significant.
Table 4 summarizes the pros and cons of each mode.
Table 4: dlopen Event Filtering Modes
Mode/Speed: Slow
Option:
TV::dlopen_always_recalculate true
Pros:
• Retains TotalView’s traditional breakpoint reevaluation semantics.
• Works best with pending breakpoints.
• Compatible with CUDA.
Cons:
• Does not perform or scale as well as the other modes because the TotalView client handles every (non-null) dlopen event for every process.

Mode/Speed: Medium
Options:
TV::dlopen_always_recalculate false
TV::dlopen_recalculate_on_match {glob-list}
Pros:
• Allows the TotalView client to process multiple dlopen events at a time, which is much more efficient.
• Compatible with CUDA.
Cons:
• Process stops at the dlopen breakpoint, even for "null" dlopen events.
• An application may execute past the point at which you want to start debugging inside the dlopened library.
• Requires adding to the glob-list any libraries that should or should not cause breakpoint specifications to be reevaluated immediately when the library is loaded.
• Requires adding "*/libcuda.so*" to the glob-list for CUDA support.

Mode/Speed: Fast
Options:
TV::dlopen_always_recalculate false
TV::dlopen_recalculate_on_match ""
Pros:
• Performs best by never stopping the process at dlopen events.
• Allows the TotalView client to process multiple dlopen events at a time.
Cons:
• Breakpoints are not recalculated when a particular library is loaded, which breaks pending breakpoints and traditional breakpoint semantics.
• Breaks CUDA support.
glob-list Matching Rules
The glob-list is a colon-separated list of positive or negated Tcl glob match patterns. A glob match is negated if it starts with an exclamation point (!), which is removed from the pattern before testing for a match. The glob match rules are defined by the standard Tcl string match command. For example:
• Positive match pattern: /lib/libfoo*
• Negated match pattern: !/lib/libfoo*
Note that:
• The order of the glob-list patterns matters if you mix positive and negated patterns.
• Spaces are included in a match, so stray spaces will impact the result.
• Empty patterns (see below) are allowed, but will result in no match.
• A trailing, negated empty pattern is allowed, which affects only the default result.
• Library names are typically absolute path names (e.g., "/lib64/libc.so"), so the glob patterns must take that into account.
• Tcl string match glob rules are not the same as shell glob rules, in that a "*" will match across directory boundaries. For example, the glob pattern "*/libfoo.so" will match "/lib/libfoo.so" and "/usr/lib/libfoo.so".
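The notes above about directory boundaries, absolute path names, and stray spaces can be checked with a short sketch. It uses Python's fnmatch.fnmatchcase as a stand-in for Tcl string match, which is an assumption that holds for patterns built only from * and ?. The helper name matches is hypothetical.

```python
from fnmatch import fnmatchcase


def matches(pattern, library):
    """Case-sensitive glob test in which "*" is not stopped by "/"."""
    return fnmatchcase(library, pattern)


print(matches("*/libfoo.so", "/lib/libfoo.so"))      # True
print(matches("*/libfoo.so", "/usr/lib/libfoo.so"))  # True: "*" spans "/usr/lib"
print(matches("libfoo.so", "/lib/libfoo.so"))        # False: the name is an absolute path
print(matches("*/libfoo.so ", "/lib/libfoo.so"))     # False: the stray space is part of the pattern
```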
Mixed positive and negated patterns
While combining positive and negated patterns is likely to be rare, in some cases it is useful, for example, to defer reporting dlopen events for all shared libraries in a directory except one.
A glob-list that contains a combination of positive and negated glob patterns can return varied results:
1. When a library name matches a positive match pattern, the dlopen event is reported immediately, even if there are more library names on the library list that would have resulted in a negated match.
2. When a library name matches a negated match pattern, the reporting of the dlopen event might be deferred. If there are more library names on the library list (because loading the library resulted in loading its dependent libraries), they are also checked for a positive match. If a positive match is found for any of the dependent libraries, the dlopen event is reported immediately.
3. In both cases, once a library name matches a pattern, any remaining patterns on the glob-list are not checked.
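The three rules above can be sketched as a scan over the library list. This is an illustrative model, not TotalView's code. It assumes fnmatch.fnmatchcase approximates Tcl string match for simple patterns, and that a library name matching no pattern leaves an earlier negated match in effect. The function names, the glob-list, and the /opt/app paths are hypothetical.

```python
from fnmatch import fnmatchcase


def first_match(patterns, library):
    """Return "positive", "negated", or None for the first pattern that matches."""
    for pattern in patterns:
        is_negated = pattern.startswith("!")
        body = pattern[1:] if is_negated else pattern
        if body and fnmatchcase(library, body):
            # Rule 3: stop at the first matching pattern.
            return "negated" if is_negated else "positive"
    return None


def scan_event(glob_list, library_list):
    """Return "report", "defer", or None (no match, so the default result applies)."""
    patterns = glob_list.split(":")
    outcome = None
    for library in library_list:
        kind = first_match(patterns, library)
        if kind == "positive":
            return "report"    # Rule 1: a positive match reports immediately.
        if kind == "negated":
            outcome = "defer"  # Rule 2: deferred unless a later name matches positively.
    return outcome


glob_list = "!*/plugins/*:*/libdep.so"  # hypothetical mixed list
print(scan_event(glob_list, ["/opt/app/plugins/libplug.so"]))                            # defer
print(scan_event(glob_list, ["/opt/app/plugins/libplug.so", "/opt/app/lib/libdep.so"]))  # report
print(scan_event(glob_list, ["/lib64/libc.so"]))                                         # None
```

The second call shows a dependent library overriding the deferral: the plugin matches the negated pattern, but its dependency matches the positive one, so the event is reported.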
Empty match patterns
The glob-list is allowed to have empty match patterns, which may either be a positive, empty match pattern ("") or a negated, empty match pattern ("!").
An empty match pattern never matches a library name, but might affect the default result:
• A positive empty match pattern is ignored and does not affect the default result.
• A negated empty match pattern is ignored but might affect the default result.
Results when there is no match
If no match is found in the glob-list for any library name on the library list, the default result is determined as follows:
1. If the last, non-empty pattern on the pattern list is a positive match pattern, reporting the dlopen event is deferred.
2. If the last, non-empty pattern on the pattern list is a negated match pattern (empty or not), the dlopen event is reported.
3. If the pattern list consists solely of positive empty match patterns (e.g., ":::"), reporting the dlopen event is deferred.
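The default rules above can be written as a small function. This sketch reads "the last, non-empty pattern" as "the last pattern that is not a positive empty pattern", so a trailing negated empty pattern ("!") counts; that reading is an assumption, as are the function name default_result and the sample patterns.

```python
def default_result(glob_list):
    """Return the result used when no library name matches any pattern."""
    for pattern in reversed(glob_list.split(":")):
        if pattern == "":
            continue  # Positive empty patterns are ignored.
        # A negated pattern, empty or not, makes the default "report".
        return "report" if pattern.startswith("!") else "defer"
    return "defer"  # Only positive empty patterns, for example ":::".


print(default_result("*/liba*:!*/libb*"))  # report
print(default_result("!*/libb*:*/liba*"))  # defer
print(default_result("*/liba*:!"))         # report
print(default_result(":::"))               # defer
```

The third call shows how a trailing "!" changes only the default: it never matches a library name, but it switches the no-match result from deferred to reported.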
Examples
Defer all libraries except those in a specific directory
A glob-list can contain a path to a directory containing shared libraries:
In this case, TotalView defers calculating breakpoint specifications for all shared libraries except those in the /home/jones/project/lib/ directory.
Negated and positive patterns
You can control the results, based on the combination of negated and positive patterns, the order of the patterns, or the use of negated empty match patterns.
Consider the following glob-list containing both a negated and a positive pattern:
matches the first positive glob pattern “*/libopen-rte.so*”, so the dlopen event is reported immediately.
• /opt/mware/openmpi/lib/openmpi/mca_gizmo.so
matches the second negated glob pattern “/*/mware/*”, so reporting the dlopen event is deferred.
• /home/jones/project/lib/libmine.so
does not match either glob pattern, therefore a default result is returned. Since the last glob pattern on the list is a negated pattern, the dlopen event is reported.
Pattern order
If the glob-list contains both negated and positive patterns, the order in which the patterns appear matters and can result in unintended behavior. Consider what would happen if the patterns used in the previous example were swapped:
matches the first negated glob pattern "/*/mware/*", so the dlopen event is deferred. The second glob pattern "*/libopen-rte.so*" is not checked, because once a library name matches a pattern, any remaining patterns on the glob-list are not checked.
• /opt/mware/openmpi/lib/openmpi/mca_gizmo.so
matches the first negated glob pattern "/*/mware/*", so the dlopen event is deferred.
• /home/jones/project/lib/libmine.so
does not match either glob pattern, therefore a default result is returned. Since the last glob pattern on the list is a positive pattern, the dlopen event is deferred.
Simply swapping the patterns resulted in deferring every dlopen event, which is probably not the intention.
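The effect of pattern order can be replayed with a compact model of the matching rules. This is a sketch under stated assumptions, not TotalView's implementation: fnmatch.fnmatchcase stands in for Tcl string match, each library is evaluated on its own, the two glob-lists are assembled from the patterns quoted in the text, and the libopen-rte path and the function name decide are hypothetical.

```python
from fnmatch import fnmatchcase


def decide(glob_list, library):
    """Return "report" or "defer" for a single library name."""
    patterns = glob_list.split(":")
    for pattern in patterns:
        is_negated = pattern.startswith("!")
        body = pattern[1:] if is_negated else pattern
        if body and fnmatchcase(library, body):
            return "defer" if is_negated else "report"  # First match wins.
    # No match: the last pattern that is not a positive empty pattern sets the default.
    last = next((p for p in reversed(patterns) if p != ""), "")
    return "report" if last.startswith("!") else "defer"


libraries = [
    "/opt/mware/openmpi/lib/libopen-rte.so.0",  # hypothetical path
    "/opt/mware/openmpi/lib/openmpi/mca_gizmo.so",
    "/home/jones/project/lib/libmine.so",
]
for glob_list in ("*/libopen-rte.so*:!/*/mware/*", "!/*/mware/*:*/libopen-rte.so*"):
    print(glob_list, [decide(glob_list, lib) for lib in libraries])
```

With the positive pattern first, the model returns report, defer, report; with the patterns swapped, it returns defer for all three libraries, which agrees with the walkthrough above.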