Yiheng Zhang – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-05-29T19:05:05Z http://www.open-lab.net/blog/feed/ Yiheng Zhang <![CDATA[NVIDIA TensorRT Unlocks FP4 Image Generation for NVIDIA Blackwell GeForce RTX 50 Series GPUs]]> http://www.open-lab.net/blog/?p=99256 2025-05-29T19:05:05Z 2025-05-14T15:05:11Z The launch of the NVIDIA Blackwell platform ushered in a new era of improvements in generative AI technology. At its forefront is the newly launched GeForce RTX...]]>

The launch of the NVIDIA Blackwell platform ushered in a new era of improvements in generative AI technology. At its forefront are the newly launched GeForce RTX 50 Series GPUs for PCs and workstations, which boast fifth-generation Tensor Cores with 4-bit floating point compute (FP4), a must-have for accelerating advanced generative AI models like FLUX from Black Forest Labs. As the latest image…
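To give a sense of what 4-bit floating point compute means for model weights, here is a minimal sketch that simulates rounding values to an FP4 (E2M1) grid with a per-tensor scale. This is a hypothetical illustration only, not the TensorRT API; the function name and the per-tensor scaling scheme are assumptions for the example.

```python
# FP4 E2M1 has 1 sign bit, 2 exponent bits, 1 mantissa bit:
# the representable magnitudes are 0, 0.5, 1, 1.5, 2, 3, 4, 6.
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
FP4_GRID = sorted({s * m for m in FP4_MAGNITUDES for s in (-1.0, 1.0)})

def quantize_fp4(values, scale):
    """Simulate FP4 quantization: scale each value into the E2M1 range,
    snap it to the nearest representable code, then scale back."""
    out = []
    for v in values:
        scaled = v / scale
        nearest = min(FP4_GRID, key=lambda g: abs(g - scaled))
        out.append(nearest * scale)
    return out

# Example: quantize a few weights with a per-tensor scale chosen so the
# largest magnitude maps to the FP4 maximum of 6.
weights = [0.12, -0.7, 1.9, 3.2]
scale = max(abs(w) for w in weights) / 6.0
print(quantize_fp4(weights, scale))
```

Real deployments use finer-grained (e.g. per-block) scales and calibration to limit accuracy loss, but the round-to-nearest-code step above is the core idea behind running FLUX-class models at 4-bit precision.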

Source

]]>
Yiheng Zhang <![CDATA[NVIDIA Blackwell Platform Sets New LLM Inference Records in MLPerf Inference v4.1]]> http://www.open-lab.net/blog/?p=87957 2024-09-05T17:57:17Z 2024-08-28T15:00:00Z Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a...]]>

Large language model (LLM) inference is a full-stack challenge. Powerful GPUs, high-bandwidth GPU-to-GPU interconnects, efficient acceleration libraries, and a highly optimized inference engine are required for high-throughput, low-latency inference. MLPerf Inference v4.1 is the latest version of the popular and widely recognized MLPerf Inference benchmarks, developed by the MLCommons…

Source

]]>
Yiheng Zhang <![CDATA[NVIDIA H200 Tensor Core GPUs and NVIDIA TensorRT-LLM Set MLPerf LLM Inference Records]]> http://www.open-lab.net/blog/?p=80197 2024-11-14T15:53:12Z 2024-03-27T15:29:05Z Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI...]]>

Generative AI is unlocking new computing applications that greatly augment human capability, enabled by continued model innovation. Generative AI models—including large language models (LLMs)—are used for crafting marketing copy, writing computer code, rendering detailed images, composing music, generating videos, and more. The amount of compute required by the latest models is immense and…

Source

]]>
Yiheng Zhang <![CDATA[Leading MLPerf Inference v3.1 Results with NVIDIA GH200 Grace Hopper Superchip Debut]]> http://www.open-lab.net/blog/?p=70450 2023-09-22T16:17:33Z 2023-09-09T16:00:00Z AI is transforming computing, and inference is how the capabilities of AI are deployed in the world's applications. Intelligent chatbots, image and video...]]>

AI is transforming computing, and inference is how the capabilities of AI are deployed in the world’s applications. Intelligent chatbots, image and video synthesis from simple text prompts, personalized content recommendations, and medical imaging are just a few examples of AI-powered applications. Inference workloads are both computationally demanding and diverse, requiring that platforms be…

Source

]]>
Yiheng Zhang <![CDATA[Setting New Records in MLPerf Inference v3.0 with Full-Stack Optimizations for AI]]> http://www.open-lab.net/blog/?p=62958 2023-07-05T19:23:50Z 2023-04-05T19:10:55Z The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment...]]>

The most exciting computing applications currently rely on training and running inference on complex AI models, often in demanding, real-time deployment scenarios. High-performance, accelerated AI platforms are needed to meet the demands of these applications and deliver the best user experiences. New AI models are constantly being invented to enable new capabilities…

Source

]]>