Lee Yang – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins
Feed: http://www.open-lab.net/blog/feed/ (updated 2025-05-29T19:05:15Z)

Predicting Performance on Apache Spark with GPUs
By Lee Yang | http://www.open-lab.net/blog/?p=100118 | Updated 2025-05-29T19:04:59Z | Published 2025-05-15T17:00:00Z

The world of big data analytics is constantly seeking ways to accelerate processing and reduce infrastructure costs. Apache Spark has become a leading platform for scale-out analytics, handling massive datasets for ETL, machine learning, and deep learning workloads. While Spark workloads have traditionally run on CPUs, GPU acceleration offers a compelling promise: significant speedups for data processing…

Accelerate Deep Learning and LLM Inference with Apache Spark in the Cloud
By Lee Yang | http://www.open-lab.net/blog/?p=99585 | Updated 2025-05-29T19:05:15Z | Published 2025-05-08T16:00:00Z

Apache Spark is an industry-leading platform for big data processing and analytics. With the increasing prevalence of unstructured data—documents, emails, multimedia content—deep learning (DL) and large language models (LLMs) have become core components of the modern data analytics pipeline. These models enable a variety of downstream tasks, such as image captioning, semantic tagging…

Distributed Deep Learning Made Easy with Spark 3.4
By Lee Yang | http://www.open-lab.net/blog/?p=66415 | Updated 2024-06-06T16:23:05Z | Published 2023-06-12T20:30:00Z

Apache Spark is an industry-leading platform for distributed extract, transform, and load (ETL) workloads on large-scale data. However, with the advent of deep learning (DL), many Spark practitioners have sought to add DL models to their data processing pipelines across a variety of use cases like sales predictions, content recommendations, sentiment analysis, and fraud detection. Yet…
