Models / Libraries / Frameworks

Jul 07, 2025
Asking an Encyclopedia-Sized Question: How To Make the World Smarter with Multi-Million Token Real-Time Inference
Modern AI applications increasingly rely on models that combine huge parameter counts with multi-million-token context windows. Whether it is AI agents...
8 MIN READ

Jul 01, 2025
Per-Tensor and Per-Block Scaling Strategies for Effective FP8 Training
In this blog post, we’ll break down the main FP8 scaling strategies—per-tensor scaling, delayed and current scaling, and per-block scaling (including the...
10 MIN READ

Jun 26, 2025
Run Google DeepMind’s Gemma 3n on NVIDIA Jetson and RTX
As of today, NVIDIA supports the general availability of Gemma 3n on NVIDIA RTX and Jetson. Gemma, previewed by Google DeepMind at Google I/O last month,...
4 MIN READ

Jun 25, 2025
Join Us at We Are Developers World Congress 2025
Join us at We Are Developers World Congress from July 9 to 11 to attend our workshops and connect with experts.
1 MIN READ

Jun 18, 2025
How Early Access to NVIDIA GB200 Systems Helped LMArena Build a Model to Evaluate LLMs
LMArena at the University of California, Berkeley is making it easier to see which large language models excel at specific tasks, thanks to help from NVIDIA and...
6 MIN READ

Jun 18, 2025
AI in Manufacturing and Operations at NVIDIA: Accelerating ML Models with NVIDIA CUDA-X Data Science
NVIDIA leverages data science and machine learning to optimize chip manufacturing and operations workflows—from wafer fabrication and circuit probing to...
8 MIN READ

Jun 13, 2025
Run High-Performance LLM Inference Kernels from NVIDIA Using FlashInfer
Best-in-class LLM inference requires two key elements: speed and developer velocity. Speed refers to maximizing the efficiency of the underlying hardware by...
6 MIN READ

Jun 12, 2025
Accelerated Sequence Alignment for Protein Science with MMseqs2-GPU and NVIDIA NIM
Protein sequence alignment—comparing protein sequences for similarities—is fundamental to modern biology and medicine. It illuminates gene functions by...
9 MIN READ

Jun 11, 2025
Accelerated Molecular Modeling with NVIDIA cuEquivariance and NVIDIA NIM Microservices
The emergence of models like AlphaFold2 has skyrocketed the demand for faster inference and training of molecular AI models. The need for speed comes with...
8 MIN READ

Jun 11, 2025
Advancing Literature Review & Target Discovery With NVIDIA Biomedical AI-Q Research Agent Blueprint
Biomedical research and drug discovery have long been constrained by labor-intensive processes. To kick off a drug discovery campaign, researchers...
4 MIN READ

Jun 11, 2025
Simplify LLM Deployment and AI Inference with a Unified NVIDIA NIM Workflow
Integrating large language models (LLMs) into a production environment, where real users interact with them at scale, is the most important part of any AI...
10 MIN READ

Jun 11, 2025
Accelerate Decision Optimization Using Open Source NVIDIA cuOpt
Businesses make thousands of decisions every day—what to produce, where to ship, how to allocate resources. At scale, optimizing these decisions becomes a...
5 MIN READ

Jun 11, 2025
Develop Custom Physical AI Foundation Models with NVIDIA Cosmos Predict-2
Building smarter robots and autonomous vehicles (AVs) starts with physical AI models that understand real-world dynamics. These models serve two critical roles:...
7 MIN READ

Jun 11, 2025
Accelerating AV Simulation with Neural Reconstruction and World Foundation Models
Autonomous vehicle (AV) stacks are evolving from a hierarchy of discrete building blocks to end-to-end architectures built on foundation models. This transition...
7 MIN READ

Jun 05, 2025
Supercharge Tree-Based Model Inference with Forest Inference Library in NVIDIA cuML
Tree-ensemble models remain a go-to for tabular data because they're accurate, comparatively inexpensive to train, and fast. But deploying Python inference on...
11 MIN READ

Jun 04, 2025
NVIDIA Speech AI Models Deliver Industry-Leading Accuracy and Performance
NVIDIA is driving state-of-the-art performance, efficiency, and accessibility in both speech AI and language models, setting the stage for innovations that are...
5 MIN READ