Nick Becker – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-05-01T18:34:21Z http://www.open-lab.net/blog/feed/ Nick Becker <![CDATA[NVIDIA cuML Brings Zero Code Change Acceleration to scikit-learn]]> http://www.open-lab.net/blog/?p=97091 2025-04-23T00:22:52Z 2025-03-18T17:42:25Z

Scikit-learn, the most widely used ML library, is popular for processing tabular data because of its simple API, diversity of algorithms, and compatibility with popular Python libraries such as pandas and NumPy. NVIDIA cuML now enables you to continue using familiar scikit-learn APIs and Python libraries while enabling data scientists and machine learning engineers to harness the power of CUDA on…

]]> Nick Becker <![CDATA[RAPIDS 24.12 Introduces cuDF on PyPI, CUDA Unified Memory for Polars, and Faster GNNs]]> http://www.open-lab.net/blog/?p=94415 2024-12-19T21:46:07Z 2024-12-19T21:21:42Z

RAPIDS 24.12 introduces cuDF packages to PyPI, speeds up groupby aggregations and reading files from AWS S3, enables larger-than-GPU memory queries in the Polars GPU engine, and delivers faster graph neural network (GNN) training on real-world graphs. Starting with the 24.12 release of RAPIDS, CUDA 12 builds of , , , and all of their dependencies are now available on PyPI. As a result…

]]> Nick Becker <![CDATA[Harnessing GPU Acceleration for Multi-Label Classification with RAPIDS cuML]]> http://www.open-lab.net/blog/?p=93575 2024-12-12T19:17:22Z 2024-12-12T16:55:40Z

Modern classification workflows often require classifying individual records and data points into multiple categories instead of just assigning a single label. Open-source Python libraries like scikit-learn make it easier to build models for these multi-label problems. Several models have built-in support for multi-label datasets, and a simple scikit-learn utility function enables using those…
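
As a rough illustration of what "multiple categories per record" means in practice, multi-label targets are commonly represented as a binary indicator matrix with one column per label. The sketch below uses only the Python standard library; the function name `to_indicator_matrix` and the sample labels are hypothetical and are not taken from the post or from scikit-learn.

```python
def to_indicator_matrix(label_sets):
    """Turn a list of label sets into (classes, 0/1 matrix), one column per label.

    Hypothetical helper for illustration only; not a scikit-learn or cuML API.
    """
    # Collect every distinct label and fix a stable column order.
    classes = sorted({label for labels in label_sets for label in labels})
    # One row per record: 1 where the record carries the label, else 0.
    matrix = [[1 if c in labels else 0 for c in classes] for labels in label_sets]
    return classes, matrix


# Made-up records: each one may carry zero, one, or several labels.
records = [{"sports", "news"}, {"tech"}, set(), {"news", "tech"}]
classes, y = to_indicator_matrix(records)
```

Each column of `y` can then be treated as its own binary target, which is one common way to reduce a multi-label problem to several single-label ones.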

]]> Nick Becker <![CDATA[Faster Causal Inference on Large Datasets with NVIDIA RAPIDS]]> http://www.open-lab.net/blog/?p=91854 2024-11-18T20:15:01Z 2024-11-14T16:00:00Z

As consumer applications generate more data than ever before, enterprises are turning to causal inference methods for observational data to help shed light on how changes to individual components of their app impact key business metrics. Over the last decade, econometricians have developed a technique called double machine learning that brings the power of machine learning models to causal…

]]> Nick Becker <![CDATA[NVIDIA RAPIDS 24.10 Introduces Accelerated NetworkX with Zero Code Change, Updates for UMAP and cuDF-Pandas]]> http://www.open-lab.net/blog/?p=91788 2024-11-14T17:10:34Z 2024-11-13T22:37:14Z

The RAPIDS v24.10 release takes another step forward in bringing accelerated computing to data scientists and developers with a seamless user experience. This blog post highlights the new features including: NetworkX accelerated by RAPIDS cuGraph is now GA in the 24.10 release beginning with NetworkX 3.4. This release adds GPU-accelerated graph creation, a new user experience…

]]> Nick Becker <![CDATA[Even Faster and More Scalable UMAP on the GPU with RAPIDS cuML]]> http://www.open-lab.net/blog/?p=91198 2024-11-14T17:10:53Z 2024-10-31T20:24:07Z

UMAP is a popular dimension reduction algorithm used in fields like bioinformatics, NLP topic modeling, and ML preprocessing. It works by creating a k-nearest neighbors (k-NN) graph, which is known in literature as an all-neighbors graph, to build a fuzzy topological representation of the data, which is used to embed high-dimensional data into lower dimensions. RAPIDS cuML already contained…
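
To make the k-NN graph step concrete, here is a minimal brute-force sketch in plain Python. It is not the cuML or UMAP implementation, which rely on much faster GPU algorithms; the function name `knn_graph` and the sample points are assumptions made for illustration.

```python
import math


def knn_graph(points, k):
    """Return {i: [(distance, j), ...]} holding the k nearest neighbors of each point.

    Brute-force illustration only; not how cuML builds its graph.
    """
    if not 0 < k < len(points):
        raise ValueError("k must be between 1 and len(points) - 1")
    graph = {}
    for i, p in enumerate(points):
        # Measure the Euclidean distance from point i to every other point.
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        # Keep the k closest; ties are broken by the lower index.
        graph[i] = sorted(dists)[:k]
    return graph


# Made-up 2D points; real inputs would be high-dimensional.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
graph = knn_graph(points, k=2)
```

This version compares every pair of points, so its cost grows quadratically with the number of points, which is why this step is a natural target for acceleration on large datasets.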

]]> 2 Nick Becker <![CDATA[NVIDIA CUDA-X Now Accelerates the Polars Data Processing Library]]> http://www.open-lab.net/blog/?p=89963 2024-10-17T18:19:09Z 2024-10-08T15:00:00Z

Polars, one of the fastest-growing data analytics tools, has just crossed 9M monthly downloads. As a modern DataFrame library, it is designed for efficiently processing datasets that fit on a single machine, without the overhead and complexity of distributed computing systems that are required for massive-scale workloads. As enterprises grapple with complex data problems—ranging from…

]]> Nick Becker <![CDATA[RAPIDS cuDF Instantly Accelerates pandas up to 50x on Google Colab]]> http://www.open-lab.net/blog/?p=82534 2024-05-30T19:55:55Z 2024-05-14T20:30:00Z

At Google I/O’24, Laurence Moroney, head of AI Advocacy at Google, announced that RAPIDS cuDF is now integrated into Google Colab. Developers can now instantly accelerate pandas code up to 50x on Google Colab GPU instances, and continue using pandas as data grows—without sacrificing performance. RAPIDS cuDF is a GPU DataFrame library that accelerates the data processing tool pandas with zero…

]]> Nick Becker <![CDATA[RAPIDS cuDF Accelerates pandas Nearly 150x with Zero Code Changes]]> http://www.open-lab.net/blog/?p=72591 2024-05-15T15:55:04Z 2024-03-18T22:00:00Z

At NVIDIA GTC 2024, it was announced that RAPIDS cuDF can now bring GPU acceleration to 9.5 million pandas users without requiring them to change their code. Update: RAPIDS cuDF now instantly accelerates pandas with zero code changes in Google Colab. Try out the tutorial in a Colab notebook today. pandas, a flexible and powerful data analysis and manipulation library for Python…

]]> 5 Nick Becker <![CDATA[NVIDIA and Snowflake Collaboration Boosts Data Cloud AI Capabilities]]> http://www.open-lab.net/blog/?p=66965 2023-07-13T19:00:27Z 2023-06-27T16:00:00Z

NVIDIA and Snowflake announced a new partnership bringing accelerated computing to the Data Cloud with the new Snowpark Container Services (private preview), a runtime for developers to manage and deploy containerized workloads. By integrating the capabilities of GPUs and AI into the Snowflake platform, customers can enhance ML performance and efficiently fine-tune LLMs. They achieve this by…

]]> 1 Nick Becker <![CDATA[Faster HDBSCAN Soft Clustering with RAPIDS cuML]]> http://www.open-lab.net/blog/?p=58016 2023-07-11T23:26:06Z 2022-12-06T19:00:00Z

HDBSCAN is a state-of-the-art, density-based clustering algorithm that has become popular in domains as varied as topic modeling, genomics, and geospatial analytics. RAPIDS cuML has provided accelerated HDBSCAN since the 21.10 release in October 2021, as detailed in GPU-Accelerated Hierarchical DBSCAN with RAPIDS cuML – Let’s Get Back To The Future. However, support for soft clustering (also…

]]> 0 Nick Becker <![CDATA[Advancing the State of the Art in AutoML, Now 10x Faster with NVIDIA GPUs and RAPIDS]]> http://www.open-lab.net/blog/?p=32442 2022-08-21T23:51:49Z 2021-06-09T15:00:00Z

To achieve state-of-the-art machine learning (ML) solutions, data scientists often build complex ML models. However, these techniques are computationally expensive, and until recently required extensive background knowledge, experience, and human effort. Recently, at GTC 21, AWS Senior Data Scientist Nick Erickson gave a session sharing how the combination of AutoGluon, RAPIDS…

]]> 1 Nick Becker <![CDATA[10 Minutes to Data Science: Transitioning Between RAPIDS cuDF and CuPy Libraries]]> http://www.open-lab.net/blog/?p=23338 2025-05-01T18:34:21Z 2021-03-19T21:01:26Z

This post was originally published on the RAPIDS AI blog. RAPIDS is about creating bridges, connections, and clean handoffs between GPU PyData libraries. Interoperability with functionality is our goal. For example, if you’re working with RAPIDS cuDF but need a more linear-algebra oriented function that exists in CuPy, you can leverage the interoperability of the GPU PyData ecosystem to…

]]> 0