RAPIDS

Jul 03, 2025
RAPIDS Adds GPU Polars Streaming, a Unified GNN API, and Zero-Code ML Speedups
RAPIDS, a suite of NVIDIA CUDA-X libraries for Python data science, has released version 25.06 with several new features. These include a Polars GPU...
6 MIN READ

Jun 27, 2025
How to Work with Data Exceeding VRAM in the Polars GPU Engine
In high-stakes fields such as quant finance, algorithmic trading, and fraud detection, data practitioners frequently need to process hundreds of gigabytes (GB)...
4 MIN READ

Jun 18, 2025
AI in Manufacturing and Operations at NVIDIA: Accelerating ML Models with NVIDIA CUDA-X Data Science
NVIDIA leverages data science and machine learning to optimize chip manufacturing and operations workflows—from wafer fabrication and circuit probing to...
8 MIN READ

Jun 12, 2025
Driving Toward Billion-Cell Analysis and Biological Breakthroughs with RAPIDS-singlecell
The future of cell biology and virtual cell models depends on measuring and analyzing data at scale. Single-cell experiments have been growing at an...
7 MIN READ

Jun 05, 2025
Supercharge Tree-Based Model Inference with Forest Inference Library in NVIDIA cuML
Tree-ensemble models remain a go-to for tabular data because they're accurate, comparatively inexpensive to train, and fast. But deploying Python inference on...
11 MIN READ

Jun 02, 2025
Supercharging Fraud Detection in Financial Services with Graph Neural Networks (Updated)
Note: This blog post was originally published on Oct. 28, 2024, and has since been updated. Fraud in financial services is a massive problem....
10 MIN READ

May 29, 2025
RAPIDS Brings Zero-Code-Change Acceleration, IO Performance Gains, and Out-of-Core XGBoost
Over the past two releases, RAPIDS introduced zero-code-change acceleration for Python machine learning, huge IO performance improvements, larger-than-memory...
10 MIN READ

May 22, 2025
Grandmaster Pro Tip: Winning First Place in a Kaggle Competition with Stacking Using cuML
What does it take to win a Kaggle competition in 2025? In the April Playground challenge, the goal was to predict how long users would listen to a podcast—and...
7 MIN READ

May 19, 2025
Spotlight: Atgenomix SeqsLab Scales Health Omics Analysis for Precision Medicine
In traditional clinical medical practice, treatment decisions are often based on general guidelines, past experiences, and trial-and-error approaches. Today,...
9 MIN READ

May 15, 2025
Simplify Setup and Boost Data Science in the Cloud using NVIDIA CUDA-X and Coiled
Imagine analyzing millions of NYC ride-share journeys—tracking patterns across boroughs, comparing service pricing, or identifying profitable pickup...
10 MIN READ

May 15, 2025
Predicting Performance on Apache Spark with GPUs
The world of big data analytics is constantly seeking ways to accelerate processing and reduce infrastructure costs. Apache Spark has become a leading platform...
9 MIN READ

May 08, 2025
Accelerate Deep Learning and LLM Inference with Apache Spark in the Cloud
Apache Spark is an industry-leading platform for big data processing and analytics. With the increasing prevalence of unstructured data—documents, emails,...
10 MIN READ

May 07, 2025
Building Nemotron-CC, A High-Quality Trillion Token Dataset for LLM Pretraining from Common Crawl Using NVIDIA NeMo Curator
Curating high-quality pretraining datasets is critical for enterprise developers aiming to train state-of-the-art large language models (LLMs). To enable...
7 MIN READ

May 01, 2025
Stacking Generalization with HPO: Maximize Accuracy in 15 Minutes with NVIDIA cuML
Stacking generalization is a widely used technique among machine learning (ML) engineers, where multiple models are combined to boost overall predictive...
7 MIN READ

Apr 10, 2025
Efficiently Scaling Polars GPU Parquet Reader
When working with large datasets, the performance of your data processing tools becomes critical. Polars, an open-source library for data manipulation known for...
4 MIN READ

Apr 03, 2025
Accelerating Apache Parquet Scans on Apache Spark with GPUs
As data sizes have grown in enterprises across industries, Apache Parquet has become a prominent format for storing data. Apache Parquet is a columnar storage...
8 MIN READ