Data Science

May 15, 2025
Simplify Setup and Boost Data Science in the Cloud using NVIDIA CUDA-X and Coiled
Imagine analyzing millions of NYC ride-share journeys—tracking patterns across boroughs, comparing service pricing, or identifying profitable pickup...
10 MIN READ

May 15, 2025
Predicting Performance on Apache Spark with GPUs
The world of big data analytics is constantly seeking ways to accelerate processing and reduce infrastructure costs. Apache Spark has become a leading platform...
9 MIN READ

May 15, 2025
Accelerating Embedding Lookups with cuEmbed
NVIDIA recently released cuEmbed, a high-performance, header-only CUDA library that accelerates embedding lookups on NVIDIA GPUs. If you're building...
8 MIN READ

May 08, 2025
Accelerate Deep Learning and LLM Inference with Apache Spark in the Cloud
Apache Spark is an industry-leading platform for big data processing and analytics. With the increasing prevalence of unstructured data—documents, emails,...
10 MIN READ

May 08, 2025
Spotlight: Accelerating the Discovery of New Battery Materials with SES AI's Molecular Universe
From the Stone Age to the digital era, materials have been the foundation of our civilization across all epochs. Today, finding new materials leads to progress...
7 MIN READ

May 07, 2025
Building Nemotron-CC, A High-Quality Trillion Token Dataset for LLM Pretraining from Common Crawl Using NVIDIA NeMo Curator
Curating high-quality pretraining datasets is critical for enterprise developers aiming to train state-of-the-art large language models (LLMs). To enable...
7 MIN READ

May 07, 2025
Using Python to Automate 3D Workflows with OpenUSD
Universal Scene Description (OpenUSD) offers a powerful, open, and extensible ecosystem for describing, composing, simulating, and collaborating within complex...
7 MIN READ

May 02, 2025
An Even Easier Introduction to CUDA (Updated)
Note: This blog post was originally published on Jan 25, 2017, but has been edited to reflect new updates. This post is a super simple introduction to CUDA, the...
16 MIN READ

May 01, 2025
Stacking Generalization with HPO: Maximize Accuracy in 15 Minutes with NVIDIA cuML
Stacking generalization is a widely used technique among machine learning (ML) engineers, where multiple models are combined to boost overall predictive...
7 MIN READ

Apr 29, 2025
Structuring Applications to Secure the KV Cache
When interacting with transformer-based models like large language models (LLMs) and vision-language models (VLMs), the structure of the input shapes the...
11 MIN READ

Apr 29, 2025
Kaggle Grandmasters Unveil Winning Strategies for Data Science Superpowers
Kaggle Grandmasters David Austin and Chris Deotte from NVIDIA and Ruchi Bhatia from HP joined Brenda Flynn from Kaggle at this year’s Google Cloud Next...
9 MIN READ

Apr 23, 2025
NVIDIA cuPyNumeric 25.03 Now Fully Open Source with PIP and HDF5 Support
NVIDIA cuPyNumeric is a library that aims to provide a distributed and accelerated drop-in replacement for NumPy built on top of the Legate framework. It brings...
4 MIN READ

Apr 17, 2025
Grandmaster Pro Tip: Winning First Place in Kaggle Competition with Feature Engineering using NVIDIA cuDF-pandas
Feature engineering remains one of the most effective ways to improve model accuracy when working with tabular data. Unlike domains such as NLP and computer...
5 MIN READ

Apr 16, 2025
Efficient Federated Learning in the Era of LLMs with Message Quantization and Streaming
Federated learning (FL) has emerged as a promising approach for training machine learning models across distributed data sources while preserving data privacy....
8 MIN READ

Apr 15, 2025
NVIDIA Llama Nemotron Ultra Open Model Delivers Groundbreaking Reasoning Accuracy
AI is no longer just about generating text or images—it’s about deep reasoning, detailed problem-solving, and powerful adaptability for real-world...
8 MIN READ

Apr 11, 2025
Effortless Federated Learning on Mobile with NVIDIA FLARE and Meta ExecuTorch
NVIDIA and the PyTorch team at Meta announced a groundbreaking collaboration that brings federated learning (FL) capabilities to mobile devices through the...
12 MIN READ