Read this tutorial on how to tap into GPUs by importing cuDF instead of pandas, with only a few code changes.
]]>This post is part of a series on accelerated data analytics. Digital advancements in climate modeling, healthcare, finance, and retail are generating unprecedented volumes and types of data. IDC says that by 2025, there will be 180 ZB of data compared to 64 ZB in 2020, scaling up the need for data analytics to turn all that data into insights. NVIDIA provides the RAPIDS suite of…
]]>This post is part of a series on accelerated data analytics. Update: The post below describes how to use GPU-only RAPIDS cuDF, which requires code changes. RAPIDS cuDF now offers CPU/GPU interoperability (cudf.pandas) that speeds up pandas code by up to 150x with zero code changes. At GTC 2024, NVIDIA announced that the cudf.pandas library is now GA. At Google I/O…
]]>If you work in data analytics, you know that data ingest is often the bottleneck of data preprocessing workflows. Getting data from storage and decoding it can often be one of the most time-consuming steps in the workflow because of the data volume and the complexity of commonly used formats. Optimizing data ingest can greatly reduce this bottleneck for data scientists working on large data sets.
]]>Over the past few releases, the NVIDIA cuDF team has added several new features to user-defined functions (UDFs) that can streamline the development process while improving overall performance. In this post, I walk through the new UDF enhancements and show how you can take advantage of them within your own applications. If you're not familiar with pandas, series apply is the main…
]]>Imagine you have just started a new data science project. The goal is to build a model predicting Y, the target variable. You have already received some data from the stakeholders or data engineers, done a thorough EDA, and selected some variables you believe are relevant to the problem at hand. Then you finally built your first model. The score is acceptable, but you believe you can do much better.
]]>This is the third installment of the series of introductions to the RAPIDS ecosystem. The series explores and discusses various aspects of RAPIDS that allow its users to solve ETL (Extract, Transform, Load) problems, build ML (Machine Learning) and DL (Deep Learning) models, explore expansive graphs, process geospatial, signal, and system log data, or use the SQL language via BlazingSQL to process data.
]]>This series on the RAPIDS ecosystem explores the various aspects that enable you to solve extract, transform, load (ETL) problems, build machine learning (ML) and deep learning (DL) models, explore expansive graphs, process signal and system logs, or use the SQL language through BlazingSQL to process data. For part 1, see Pandas DataFrame Tutorial: A Beginner's Guide to GPU Accelerated DataFrames…
]]>This post is the first installment of the series of introductions to the RAPIDS ecosystem. The series explores and discusses various aspects of RAPIDS that allow its users to solve ETL (Extract, Transform, Load) problems, build ML (Machine Learning) and DL (Deep Learning) models, explore expansive graphs, process geospatial, signal, and system log data, or use the SQL language via BlazingSQL to process…
]]>