Provides support for the NVIDIA Blackwell SM100 architecture. CUTLASS is a collection of CUDA C++ templates and abstractions for implementing high-performance GEMM computations.
Bringing support for NVIDIA Blackwell architecture across data center and GeForce products, NVIDIA cuDNN 9.7 delivers speedups of up to 84% for FP8 Flash Attention operations and optimized GEMM capabilities with advanced fusion support to accelerate deep learning workloads.
The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of deep learning primitives with state-of-the-art performance. cuDNN is integrated with popular deep learning frameworks like PyTorch, TensorFlow, and XLA (Accelerated Linear Algebra). These frameworks abstract the complexities of direct GPU programming, enabling you to focus on designing and…
NVIDIA announces the newest CUDA Toolkit software release, 12.0. This release is the first major release in many years and it focuses on new programming models and CUDA application acceleration through new hardware capabilities. For more information, watch the YouTube Premiere webinar, CUDA 12.0: New Features and Beyond. You can now target architecture-specific features and instructions…
Today, NVIDIA is announcing the availability of cuTENSOR, version 1.4, which supports up to 64-dimensional tensors and distributed multi-GPU tensor operations, and helps improve tensor contraction performance models. This software can be downloaded now free of charge. Download the cuTENSOR software. For more information, see the cuTENSOR Release Notes. cuTENSOR is a high…
NVIDIA continues to enhance CUTLASS to provide extensive support for mixed-precision computations, with specialized data-movement and multiply-accumulate abstractions. Today, NVIDIA is announcing the availability of CUTLASS version 2.8. Download the free CUTLASS v2.8 software. See the CUTLASS Release Notes for more information. CUTLASS is a collection of CUDA…
Today, NVIDIA is announcing the availability of nvCOMP, version 2.1.0. This software can be downloaded now free of charge. Download now. See the nvCOMP Release Notes for more information. nvCOMP is a CUDA library that features generic compression interfaces to enable developers to use high-performance GPU compressors in their applications.
Today, NVIDIA is announcing the availability of cuSPARSELt, version 0.2.0, which increases performance on activation functions, bias vectors, and Batched Sparse GEMM. This software can be downloaded now free of charge. Download the cuSPARSELt software. For more technical information, see the cuSPARSELt Release Notes. NVIDIA cuSPARSELt is a high-performance CUDA…
Today, cuSOLVERMp version 0.0.1 is available at no charge for members of the NVIDIA Developer Program. cuSOLVERMp provides a distributed-memory multi-node and multi-GPU solution for solving systems of linear equations at scale. In the future, it will also solve eigenvalue and singular value problems.
Today, NVIDIA is announcing the availability of nvCOMP version 2.0.0. This software can be downloaded now free for members of the NVIDIA Developer Program. See the nvCOMP Release Notes for more information. nvCOMP is a CUDA library that features generic compression interfaces to enable developers to use high-performance GPU…
NVIDIA announced the latest update to the HPL-AI Benchmark, version 2.0.0, which will reside in the HPC-Benchmarks container version 21.4. The HPL-AI (High Performance Linpack – Artificial Intelligence) benchmark helps evaluate the convergence of HPC and data-driven AI workloads. Historically, HPC workloads are benchmarked at double-precision, representing the accuracy requirements in…
Today, NVIDIA is announcing the availability of cuTENSOR version 1.3.0. This software can be downloaded now free for members of the NVIDIA Developer Program. See the cuTENSOR Release Notes for more information. cuTENSOR is a high-performance CUDA library for tensor primitives; its key features are: Learn more…
Today, NVIDIA is announcing the availability of cuSPARSELt version 0.1.0. This software can be downloaded now free for members of the NVIDIA Developer Program. See the cuSPARSELt Release Notes for more information. NVIDIA cuSPARSELt is a high-performance CUDA library dedicated to general matrix-matrix operations in which at least one operand is a…
Python plays a key role within the science, engineering, data analytics, and deep learning application ecosystem. NVIDIA has long been committed to helping the Python ecosystem leverage the accelerated massively parallel performance of GPUs to deliver standardized libraries, tools, and applications. Today, we’re introducing another step towards simplification of the developer experience with…