Matt Ahrens – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
2025-05-15T19:07:19Z
http://www.open-lab.net/blog/feed/

Predicting Performance on Apache Spark with GPUs
Matt Ahrens
http://www.open-lab.net/blog/?p=100118
2025-05-15T19:07:19Z 2025-05-15T17:00:00Z

The world of big data analytics is constantly seeking ways to accelerate processing and reduce infrastructure costs. Apache Spark has become a leading platform for scale-out analytics, handling massive datasets for ETL, machine learning, and deep learning workloads. While traditionally CPU-based, the advent of GPU acceleration offers a compelling promise: significant speedups for data processing…

Accelerating Apache Parquet Scans on Apache Spark with GPUs
Matt Ahrens
http://www.open-lab.net/blog/?p=98350
2025-04-22T23:57:50Z 2025-04-03T16:18:03Z

As data sizes have grown in enterprises across industries, Apache Parquet has become a prominent format for storing data. Apache Parquet is a columnar storage format designed for efficient data processing at scale. By organizing data by columns rather than rows, Parquet enables high-performance querying and analysis, as it can read only the necessary columns for a query instead of scanning entire…

Accelerating JSON Processing on Apache Spark with GPUs
Matt Ahrens
http://www.open-lab.net/blog/?p=95298
2025-04-23T15:01:08Z 2025-01-29T22:10:22Z

JSON is a popular format for text-based data that allows for interoperability between systems in web applications as well as data management. The format has been in existence since the early 2000s and came from the need for communication between web servers and browsers. The standard JSON format consists of key-value pairs that can include nested objects. JSON has grown in usage for storing web…
