CUDA

May 23, 2025

AI Transforms Brain MRIs Into Potential Stroke Predictors

Researchers, using AI to analyze routine brain scans, have discovered a promising new method to reliably identify a common but hard-to-detect precursor of many...

3 MIN READ

May 22, 2025

Blackwell Breaks the 1,000 TPS/User Barrier With Meta’s Llama 4 Maverick

NVIDIA has achieved a world-record large language model (LLM) inference speed. A single NVIDIA DGX B200 node with eight NVIDIA Blackwell GPUs can achieve over...

9 MIN READ

May 14, 2025

Get Trained and Certified at GTC Paris at VivaTech 2025

Join us at GTC Paris on June 10th and choose from six full-day, instructor-led workshops.

1 MIN READ

May 12, 2025

Just Released: NVIDIA Warp is Now Open-Source Under Apache 2.0

NVIDIA Warp, a simulation computing framework, is now accessible to all developers.

1 MIN READ

May 09, 2025

CUDA C++ Compiler Updates Impacting ELF Visibility and Linkage

In the next CUDA major release, CUDA 13.0, NVIDIA is introducing two significant changes to the NVIDIA CUDA Compiler Driver (NVCC) that will impact ELF...

11 MIN READ

May 05, 2025

Just Released: CUDA 12.9

New features include enhancements to confidential computing and family-specific features and targets supported by NVCC.

1 MIN READ

May 02, 2025

An Even Easier Introduction to CUDA (Updated)

Note: This blog post was originally published on Jan 25, 2017, but has been edited to reflect new updates. This post is a super simple introduction to CUDA, the...

16 MIN READ

May 01, 2025

NVIDIA Blackwell and NVIDIA CUDA 12.9 Introduce Family-Specific Architecture Features

One of the earliest architectural design decisions that went into the CUDA platform for NVIDIA GPUs was support for backward compatibility of GPU code. This...

14 MIN READ

May 01, 2025

Boosting Matrix Multiplication Speed and Flexibility with NVIDIA cuBLAS 12.9

The NVIDIA CUDA-X math libraries empower developers to build accelerated applications for AI, scientific computing, data processing, and more. Two...

8 MIN READ

Apr 23, 2025

NVIDIA cuPyNumeric 25.03 Now Fully Open Source with PIP and HDF5 Support

NVIDIA cuPyNumeric is a library that aims to provide a distributed and accelerated drop-in replacement for NumPy built on top of the Legate framework. It brings...

4 MIN READ

Apr 16, 2025

Announcing ComputeEval, an Open Source Framework for Evaluating LLMs on CUDA

Large language models (LLMs) are revolutionizing how developers code and how they learn to code. For seasoned and junior developers alike, today’s...

4 MIN READ

Mar 13, 2025

Networking Reliability and Observability at Scale with NCCL 2.24

The NVIDIA Collective Communications Library (NCCL) implements multi-GPU and multinode (MGMN) communication primitives optimized for NVIDIA GPUs and networking....

14 MIN READ

Mar 12, 2025

Understanding PTX, the Assembly Language of CUDA GPU Computing

Parallel thread execution (PTX) is a virtual machine instruction set architecture that has been part of CUDA from its beginning. You can think of PTX as the...

13 MIN READ

Mar 10, 2025

Optimizing Compile Times for CUDA C++

In modern software development, time is an incredibly valuable resource, especially during the compilation process. For developers working with CUDA C++ on...

10 MIN READ

Mar 04, 2025

GPU-Accelerate Algorithmic Trading Simulations by over 100x with Numba

Quantitative developers need to run back-testing simulations to see how financial algorithms perform from a profit and loss (P&L) standpoint. Statistical...

12 MIN READ

Feb 25, 2025

NVIDIA cuDSS Advances Solver Technologies for Engineering and Scientific Computing

NVIDIA cuDSS is a first-generation sparse direct solver library designed to accelerate engineering and scientific computing. cuDSS is increasingly adopted in...

12 MIN READ