Zhiyu Cheng – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2024-08-22T18:24:54Z http://www.open-lab.net/blog/feed/ Zhiyu Cheng <![CDATA[NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support]]> http://www.open-lab.net/blog/?p=87227 2024-08-22T18:24:54Z 2024-08-15T17:11:37Z

NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art toolkit of model optimization techniques including quantization, sparsity, and pruning. These techniques reduce model complexity and enable downstream inference frameworks like NVIDIA TensorRT-LLM and NVIDIA TensorRT to more efficiently optimize the inference speed of generative AI…

Zhiyu Cheng <![CDATA[NVIDIA TensorRT Accelerates Stable Diffusion Nearly 2x Faster with 8-bit Post-Training Quantization]]> http://www.open-lab.net/blog/?p=78835 2024-04-09T23:45:30Z 2024-03-08T01:17:34Z

In the dynamic realm of generative AI, diffusion models stand out as a leading architecture for generating high-quality images from text prompts. Models like Stable Diffusion have revolutionized creative applications. However, inference with diffusion models can be computationally intensive because of the iterative denoising steps required. This presents significant challenges…
