Gregory Kimball

Gregory Kimball is a software engineering manager at NVIDIA working on the RAPIDS team. Gregory leads development for libcudf, the CUDA/C++ library for columnar data processing that powers RAPIDS cuDF. Gregory holds a PhD in applied physics from the California Institute of Technology.

Posts by Gregory Kimball

Data Center / Cloud

Efficient ETL with Polars and Apache Spark on NVIDIA Grace CPU

The NVIDIA Grace CPU Superchip delivers outstanding performance and best-in-class energy efficiency for CPU workloads in the data center and in the cloud. The... 7 MIN READ
Data Science

JSON Lines Reading with pandas 100x Faster Using NVIDIA cuDF

JSON is a widely adopted format for text-based information working interoperably between systems, most commonly in web applications and large language models... 10 MIN READ
Data Science

Supercharging Deduplication in pandas Using RAPIDS cuDF

A common operation in data analytics is to drop duplicate rows. Deduplication is critical in Extract, Transform, Load (ETL) workflows, where you might want to... 12 MIN READ
Data Science

Scaling Up to One Billion Rows of Data in pandas using RAPIDS cuDF

The One Billion Row Challenge is a fun benchmark to showcase basic data processing operations. It was originally launched as a pure-Java competition, and has... 11 MIN READ
Data Science

Encoding and Compression Guide for Parquet String Data Using RAPIDS

Parquet writers provide encoding and compression options that are turned off by default. Enabling these options may provide better lossless compression for your... 10 MIN READ
Data Science

Streamline ETL Workflows with Nested Data Types in RAPIDS libcudf

Nested data types are a convenient way to represent hierarchical relationships within columnar data. They are frequently used as part of extract, transform,... 10 MIN READ