NVIDIA Nsight Perf SDK
The NVIDIA® Nsight™ Perf SDK is a graphics profiling toolbox for DirectX, Vulkan, and OpenGL that lets you collect GPU performance metrics directly from your application.
Get Started
Integrate GPU performance metric collection into your application or graphics developer tool of choice. Activate profiling from your own custom programmatic triggers. Choose the list of GPU metrics to collect, customize your output, and keep control over your workflow.
Upgrade Your CI/CD
Generate detailed profiler reports on every developer and artist change. Add dedicated perf regression criteria by inspecting GPU metric values.
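As a rough illustration of a perf regression criterion, the sketch below compares metric values from a new run against a stored baseline and flags any metric that grew beyond its allowed budget. It is not part of the Nsight Perf SDK: the metric names, the `Criterion` type, and the `find_regressions` helper are hypothetical, and it assumes the metric values have already been extracted from a report into plain dictionaries where a higher value is worse.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    metric: str          # hypothetical metric name, not an SDK identifier
    max_increase: float  # allowed relative growth over baseline (0.05 = 5%)


def find_regressions(baseline, current, criteria):
    """Return (metric, baseline_value, current_value) for each exceeded budget."""
    failures = []
    for c in criteria:
        base = baseline[c.metric]
        cur = current[c.metric]
        # Assumes "higher is worse" for every metric being checked.
        if cur > base * (1.0 + c.max_increase):
            failures.append((c.metric, base, cur))
    return failures


# Illustrative values only; real values would come from your own reports.
baseline = {"frame_gpu_time_ms": 8.0, "l2_miss_pct": 20.0}
current = {"frame_gpu_time_ms": 8.9, "l2_miss_pct": 20.5}
criteria = [
    Criterion("frame_gpu_time_ms", 0.05),
    Criterion("l2_miss_pct", 0.10),
]

failures = find_regressions(baseline, current, criteria)
for metric, base, cur in failures:
    print(f"REGRESSION {metric}: baseline={base} current={cur}")
```

In a CI job, a non-empty `failures` list would typically translate into a non-zero exit code so that the change is blocked or flagged for review.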
Real-Time Performance HUD
Add continuous performance metrics collection to your code, and leverage the built-in HUD renderer to effortlessly enable real-time, high-level performance triage.
Explore panels with metrics on SM, L2 cache, ROP, VRAM and various other subunits to gain an early understanding of the performance characteristics and potential bottlenecks of the scene as you move through it.
The HUD and Periodic Sampler utility classes also serve as examples for building your own powerful, low-overhead, real-time workflows on top of the low-level Nsight Perf SDK API.
HTML Profiler Report Generator
Generate detailed profiler reports with minimal effort. Simply insert a few calls at graphics API device initialization, at Present/SwapBuffers, and in a keypress handler or an automated trigger.
Insert annotations (PushRange/PopRange) around GPU workloads to collect additional reports per region of execution. The report generator automatically collects hundreds of GPU metrics of interest, so there is no need to study these complex topics before first use.
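The idea of bracketing workloads with paired range annotations can be modeled with a small, SDK-independent sketch. The `RangeRecorder` class below is hypothetical and only mimics the push/pop pairing; it does not call the Nsight Perf SDK, and the "/"-separated region path is an assumption made for illustration. It shows how nested ranges map to distinct regions and why every push needs a matching pop.

```python
from contextlib import contextmanager


class RangeRecorder:
    """Toy model of nested range annotations (not the SDK's API)."""

    def __init__(self):
        self._stack = []
        self.regions = []  # full region paths, in the order they were opened

    @property
    def depth(self):
        return len(self._stack)

    def push_range(self, name):
        self._stack.append(name)
        self.regions.append("/".join(self._stack))

    def pop_range(self):
        if not self._stack:
            raise RuntimeError("pop_range called without a matching push_range")
        self._stack.pop()

    @contextmanager
    def scoped(self, name):
        # Guarantees the pop even if the annotated work raises.
        self.push_range(name)
        try:
            yield
        finally:
            self.pop_range()


rec = RangeRecorder()
with rec.scoped("Frame"):
    with rec.scoped("ShadowPass"):
        pass  # the shadow-pass draw calls would be issued here
    with rec.scoped("Lighting"):
        pass  # the lighting draw calls would be issued here

print(rec.regions)
```

Wrapping the pair in a scoped helper is one way to keep annotations balanced; in C++ the same effect is usually achieved with an RAII guard object.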
The reports provide a top-down representation of GPU performance, with fast navigation to the top performance limiters. Quickly determine the workload type, pipeline activity and utilization, shader latency reasons, and 3D data flow.
NVIDIA Nsight Tools News
Improving GPU Performance by Reducing Instruction Cache Misses
Instruction cache misses can cause performance degradation for kernels that have a large instruction footprint, which is often caused by substantial loop unrolling.
CUDA 12.1 Supports Large Kernel Parameters
CUDA 12.1 offers you the option of passing up to 32,764 bytes using kernel parameters, which can be exploited to simplify applications as well as gain performance improvements.
Upcoming Event: Level Up with NVIDIA Nsight Graphics and Optimize Your Game
Learn how to use the latest NVIDIA RTX technology in NVIDIA Nsight Graphics and get your questions answered in a live Q&A session with experts.
Accelerating Data Center and HPC Performance Analysis with NVIDIA Nsight Systems
NVIDIA Nsight Systems 2023.2 previews profiling for multinode systems alongside support for profiling Python, networking hardware metrics, and a new analysis framework.
Ready to download NVIDIA Nsight™ Perf SDK?