Wei-Ming Chen – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2024-08-22T18:24:54Z http://www.open-lab.net/blog/feed/ Wei-Ming Chen <![CDATA[NVIDIA TensorRT Model Optimizer v0.15 Boosts Inference Performance and Expands Model Support]]> http://www.open-lab.net/blog/?p=87227 2024-08-22T18:24:54Z 2024-08-15T17:11:37Z NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art quantization toolkit of model optimization techniques...]]>

NVIDIA has announced the latest v0.15 release of NVIDIA TensorRT Model Optimizer, a state-of-the-art quantization toolkit of model optimization techniques including quantization, sparsity, and pruning. These techniques reduce model complexity and enable downstream inference frameworks like NVIDIA TensorRT-LLM and NVIDIA TensorRT to more efficiently optimize the inference speed of generative AI…

Source

]]>
���˳���97caoporen����