Wenwen Gao – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-05-15T19:07:45Z http://www.open-lab.net/blog/feed/

Wenwen Gao Run Hugging Face Models Instantly with Day-0 Support from NVIDIA NeMo Framework http://www.open-lab.net/blog/?p=99933 2025-05-15T19:07:31Z 2025-05-12T17:48:24Z

As organizations strive to maximize the value of their generative AI investments, accessing the latest model developments is crucial to continued success. By using state-of-the-art models on Day-0, teams can harness these innovations efficiently, maintain relevance, and be competitive. The past year has seen a flurry of exciting model series releases in the open-source community…

Wenwen Gao Turbocharge LLM Training Across Long-Haul Data Center Networks with NVIDIA NeMo Framework http://www.open-lab.net/blog/?p=99764 2025-05-15T19:07:37Z 2025-05-08T18:28:58Z

Multi-data center training is becoming essential for AI factories as pretraining scaling fuels the creation of even larger models, leading the demand for computing performance to outpace the capabilities of a single facility. By distributing workloads across multiple data centers, organizations can overcome limitations in power, cooling, and space, enabling the training of even larger…

Wenwen Gao LLM Inference Benchmarking Guide: NVIDIA GenAI-Perf and NIM http://www.open-lab.net/blog/?p=99180 2025-05-15T19:07:45Z 2025-05-06T17:35:39Z

This is the second post in the LLM Benchmarking series, which shows how to use GenAI-Perf to benchmark the Meta Llama 3 model when deployed with NVIDIA NIM. When building LLM-based applications, it is critical to understand the performance characteristics of these models on given hardware. This serves multiple purposes: As a client-side LLM-focused benchmarking tool…

Wenwen Gao LLM Inference Benchmarking: Fundamental Concepts http://www.open-lab.net/blog/?p=98215 2025-05-09T18:23:04Z 2025-04-02T17:00:00Z

This is the first post in the large language model latency-throughput benchmarking series, which aims to instruct developers on common metrics used for LLM benchmarking, fundamental concepts, and how to benchmark your LLM applications. The past few years have witnessed the rise in popularity of generative AI and large language models (LLMs), as part of a broad AI revolution.

Wenwen Gao Accelerate Custom Video Foundation Model Pipelines with New NVIDIA NeMo Framework Capabilities http://www.open-lab.net/blog/?p=94541 2025-03-20T16:23:00Z 2025-01-07T16:00:00Z

Generative AI has evolved from text-based models to multimodal models, with a recent expansion into video, opening up new potential uses across various industries. Video models can create new experiences for users or simulate scenarios for training autonomous agents at scale. They are helping revolutionize various industries including robotics, autonomous vehicles, and entertainment.

Wenwen Gao Fine-Tune and Align LLMs Easily with NVIDIA NeMo Customizer http://www.open-lab.net/blog/?p=80290 2025-02-17T05:26:51Z 2024-03-27T18:00:00Z

As large language models (LLMs) continue to gain traction in enterprise AI applications, the demand for custom models that can understand and integrate specific industry terminology, domain expertise, and unique organizational requirements becomes increasingly important. To address this growing need for customizing LLMs, the NVIDIA NeMo team has announced an early access program for NeMo…

Wenwen Gao Scaling Recommendation System Inference with NVIDIA Merlin Hierarchical Parameter Server http://www.open-lab.net/blog/?p=54195 2023-02-28T01:34:06Z 2022-08-31T18:00:00Z

Recommendation systems are widely used today to personalize user experiences and improve customer engagement in various settings like e-commerce, social media, and news feeds. Serving user requests with low latency and high accuracy is critical to sustaining user engagement. This includes performing high-speed lookups and computations while seamlessly refreshing models with the newest…

Wenwen Gao Fast, Terabyte-Scale Recommender Training Made Easy with NVIDIA Merlin Distributed-Embeddings http://www.open-lab.net/blog/?p=54372 2022-09-01T23:00:57Z 2022-08-31T16:00:00Z

Embeddings play a key role in deep learning recommender models. They are used to map encoded categorical inputs in data to numerical values that can be processed by the math layers or multilayer perceptrons (MLPs). Embeddings often constitute most of the parameters in deep learning recommender models and can be quite large, even reaching into the terabyte scale. It can be difficult to fit…
