Nsight Compute 2022.1 - New Features
Updates in 2022.1.1
General
- Filtering kernel launches or profile results based on NVTX domains/ranges now takes registered strings in the payload field into account, if the range name is empty.
- Added support for the suffix .max_rate for ratio metrics.
Resolved Issues
- Fixed a crash during the disassembly of the kernel's SASS code for the Source page.
- Fixed a crash on exit of the NVIDIA Nsight Compute UI.
- Fixed a hang during profiling when CPU call stack collection is enabled.
- Fixed a missing flush of UVM buffers before taking memory checkpoints during Range Replay.
- Fixed tracking of memory during Range Replay, if the CUDA context has any device mapped memory allocations.
- Fixed the maximum available shared memory sizes in the Occupancy Calculator for NVIDIA Ampere GPUs.
- Fixed incorrect initialization of the kernel's shared memory usage when opening the Occupancy Calculator from a profile report.
Updates in 2022.1.0
General
- Added support for the CUDA toolkit 11.6.
- Added a new Range Replay mode to profile ranges of multiple, concurrent kernels. Range replay is available in the NVIDIA Nsight Compute CLI and the non-interactive Profile activity.
- Added a new rule to detect non-fused floating-point instructions.
- The Uncoalesced Memory access rules now show results in a dynamic table.
- Unix Domain Sockets and Windows Named Pipes are used for local connection between the host and target processes on x86_64 Linux and Windows, respectively.
- The NvRules API now supports querying action names using different function name bases (e.g. demangled).
NVIDIA Nsight Compute
- The default report page is now chosen automatically when opening a report.
- Added coverage for ECC (Error Correction Code) operations in the L2 Cache table of the Memory Analysis section.
- Added a new L2 Evict Policies table to the Memory Analysis section.
- The Occupancy Calculator now updates automatically when the input changes.
- Added new metric Thread Instructions Executed to the Source page.
- Added tooltips to the Register Dependency columns in the Source page to identify the associated register more conveniently.
- Improved the selection of Sections and Sets in the Profile activity connection dialog.
- NVLink utilization is shown in the NVLink Tables section.
- NVLink links are colored according to the measured throughput.
NVIDIA Nsight Compute CLI
- The --kernel-regex and --kernel-regex-base options are no longer supported. Alternate options are --kernel-name and --kernel-name-base, respectively, added in 2021.1.0.
- Added support to resolve CUDA source files in the --page source output with the new --resolve-source-file command line option.
- Added new option --target-processes-filter to filter the processes being profiled by name.
- The CPU Stack Trace is shown in the NVIDIA Nsight Compute CLI output.
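For scripts that still pass the removed --kernel-regex options, the migration can be sketched as follows. This is a minimal illustration, not part of the release: the helper name ncu_command, the application path ./my_app, and the filter values are hypothetical, and the helper only assembles an argument list without launching ncu. It assumes that --kernel-name accepts a regex: prefix for regular-expression matching and that --target-processes-filter is used together with --target-processes all; check the CLI documentation for your version before relying on either.

```python
import shlex


def ncu_command(app, kernel_regex=None, name_base=None, process_filter=None):
    """Assemble an ncu argument list (hypothetical helper, does not run ncu)."""
    args = ["ncu"]
    if kernel_regex is not None:
        # Stand-in for the removed --kernel-regex option:
        # --kernel-name with a "regex:" prefix (assumed syntax).
        args += ["--kernel-name", f"regex:{kernel_regex}"]
    if name_base is not None:
        # Stand-in for the removed --kernel-regex-base option.
        args += ["--kernel-name-base", name_base]
    if process_filter is not None:
        # Profile child processes too, then narrow them down by name
        # (pairing these two options is an assumption of this sketch).
        args += ["--target-processes", "all",
                 "--target-processes-filter", process_filter]
    # The application and its own arguments always come last.
    args += list(app)
    return args


cmd = ncu_command(["./my_app"], kernel_regex="gemm",
                  name_base="demangled", process_filter="my_app")
# Print a shell-ready command line for inspection.
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run unchanged, which avoids shell quoting issues when a kernel expression contains special characters.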
Resolved Issues
- Fixed the calculation of aggregated average instruction execution metrics in non-SASS views on the Source page.
- Fixed that atomic instructions are counted as both loads and stores in the Memory Analysis tables.
For a complete overview of all NVIDIA Nsight Compute features and access to resources, please visit the main Nsight Compute page.
NVIDIA Nsight Compute 2022.1 is available for download under the NVIDIA Registered Developer Program.