
    NVIDIA® Nsight™ Compute is an interactive kernel profiler for CUDA applications. It provides detailed performance metrics and API debugging via a user interface and a command-line tool. Its baseline feature lets users compare results within the tool. Both the user interface and the metric collection are customizable and data-driven, and Nsight Compute can be extended with analysis scripts for post-processing results.

    Version 2019.1 New Features  |  Revision History

    NVIDIA® Nsight™ Compute is freely offered through the NVIDIA Registered Developer Program and as part of the CUDA Toolkit.

    Baseline Comparisons

    • Set multiple baselines to compare variations in GPU architecture, kernel launch parameters, memory usage, ...
    • Compare performance metrics between baselines and the current run
    • New: Now with the ability to compare child processes
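
    To make the idea of a baseline comparison concrete, here is a minimal Python sketch. It is not how Nsight Compute implements baselines; it only illustrates comparing each metric of a current run against a stored baseline. The function name, the metric names, and the values are all hypothetical.

```python
def compare_to_baseline(baseline, current):
    """Return {metric: (baseline_value, current_value, percent_change)}.

    Only metrics present in both runs are compared. percent_change is None
    when the baseline value is zero, because the relative change is undefined.
    """
    diff = {}
    for name in sorted(baseline.keys() & current.keys()):
        base, cur = baseline[name], current[name]
        change = None if base == 0 else (cur - base) / base * 100.0
        diff[name] = (base, cur, change)
    return diff


# Hypothetical metric values for illustration, not real profiler output.
baseline_run = {"duration_us": 200.0, "achieved_occupancy_pct": 50.0}
current_run = {"duration_us": 150.0, "achieved_occupancy_pct": 75.0}

for name, (base, cur, change) in compare_to_baseline(baseline_run, current_run).items():
    label = "n/a" if change is None else f"{change:+.1f}%"
    print(f"{name}: {base} -> {cur} ({label})")
```

    Reporting the change relative to the baseline, rather than as an absolute difference, keeps metrics with different units comparable at a glance.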

    Run from NsCompute GUI or from Console Command Line

    • NsCompute GUI provides text for console commands
    • GUI/Console provide similar features, functionality, output, and reports
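
    For automated runs, a console invocation can be assembled from a script. The Python sketch below only builds and prints the command; it does not run the profiler. It assumes the command-line executable is named nv-nsight-cu-cli, that -o names the report file, and that --metrics takes a comma-separated list. Check these against the documentation for your installed version. The application path and its arguments are hypothetical.

```python
import shlex


def build_profile_command(app, app_args=(), report=None, metrics=()):
    """Build an argv list for a command-line profiling run.

    Assumptions (verify against your installed version): the executable is
    nv-nsight-cu-cli, -o sets the report name, and --metrics accepts a
    comma-separated list of metric names.
    """
    cmd = ["nv-nsight-cu-cli"]
    if report:
        cmd += ["-o", report]
    if metrics:
        cmd += ["--metrics", ",".join(metrics)]
    cmd.append(app)          # application to profile
    cmd.extend(app_args)     # arguments passed through to the application
    return cmd


# ./my_app and its arguments are placeholders for a real CUDA application.
cmd = build_profile_command(
    "./my_app",
    ["--size", "1024"],
    report="my_report",
    metrics=["smsp__inst_executed.sum"],
)
print(" ".join(shlex.quote(part) for part in cmd))
```

    Keeping the command as a list makes it safe to pass to subprocess.run without shell quoting issues.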

    CUDA 10.1 Task Graph Profiling

    • Stop at a kernel launch from a graph node
    • State of graph node shown in resource page
    • Export graph visualization

    Source Code Correlation

    • Correlate individual Source, SASS, or PTX lines and metrics
    • Shown here with PC Sampling data available in Volta and Turing architectures
    • New: Improved heat map for identifying high metric values
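
    The idea behind a per-line heat map can be sketched in a few lines of Python. This is an illustration only, not the tool's algorithm: it assumes samples arrive as hypothetical (source_line, sample_count) pairs, sums them per source line, and scales each total by the hottest line so values fall between 0 and 1.

```python
from collections import defaultdict


def heat_by_source_line(samples):
    """Sum sample counts per source line and scale them to the range 0..1.

    samples is an iterable of (source_line, sample_count) pairs. The line
    with the most samples gets heat 1.0; an empty input returns {}.
    """
    totals = defaultdict(int)
    for line, count in samples:
        totals[line] += count
    peak = max(totals.values(), default=0)
    if peak == 0:
        return {line: 0.0 for line in totals}
    return {line: total / peak for line, total in totals.items()}


# Hypothetical sample counts: several instructions can map to one source line.
print(heat_by_source_line([(10, 5), (12, 20), (10, 5)]))
```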


    • Interactive kernel profiler
    • Profiler report for kernels and/or child processes
    • Diffing results across one or multiple reports using baselines
    • Fast data collection
    • Intuitive UI for interactive profiling
    • Command line operation for manual and automated profiling
    • Fully customizable reports and rules

    Differences from the Nsight Compute version included in CUDA Toolkit 10.1

      Bug Fixes:
      • Metric smsp__inst_executed.sum incorrectly reported as zero
      • Extra triggers/records reported when profiling in a multi-context environment

    System Requirements

    Supported platforms


    Host

    • Linux x86_64[1]
    • Windows x86_64[1]
    • macOS[1]

    Target

    • Linux x86_64[1]
    • Windows x86_64[1]
    • DRIVE OS QNX aarch64[2][3]
    • DRIVE OS Linux aarch64[2][3]

    [1] Available in this download and the CUDA Desktop Toolkit.
    [2] Available in the Embedded or Drive toolkits only.
    [3] Only the command line interface (CLI) is provided for these platforms. There is no Nsight Compute GUI application for these platforms.

    Supported GPU architectures

    • Pascal: GP10x (excluding GP100)
    • Volta: GV100
    • Turing: TU1xx
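
    A script can check whether a GPU falls into one of the architectures above before starting a profiling run. The Python sketch below is one way to do that. It assumes the following mapping from chip families to CUDA compute capability, which this page does not state: GP10x other than GP100 is 6.1 (GP100 itself is 6.0), GV100 is 7.0, and TU1xx is 7.5. The function name is hypothetical, and obtaining the compute capability of the installed GPU is left to the caller.

```python
# Assumed mapping from compute capability to the architectures listed above.
SUPPORTED_COMPUTE_CAPABILITIES = {
    (6, 1): "Pascal (GP10x, excluding GP100)",
    (7, 0): "Volta (GV100)",
    (7, 5): "Turing (TU1xx)",
}


def supported_architecture(major, minor):
    """Return the architecture label for a compute capability, or None."""
    return SUPPORTED_COMPUTE_CAPABILITIES.get((major, minor))


for capability in [(6, 0), (6, 1), (7, 0), (7, 5)]:
    label = supported_architecture(*capability)
    status = label if label is not None else "not in the supported list"
    print(f"compute capability {capability[0]}.{capability[1]}: {status}")
```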


    • Please use the drivers provided with CUDA Toolkit 10.1 production release or a more recent version.


    Nsight Compute Documentation


    To provide feedback, request additional features, or report Nsight Compute issues, please use the Developer Forums.