Dheeraj Peri – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-07-24T22:40:55Z http://www.open-lab.net/blog/feed/ Dheeraj Peri <![CDATA[Double PyTorch Inference Speed for Diffusion Models Using Torch-TensorRT]]> http://www.open-lab.net/blog/?p=103677 2025-07-24T22:40:55Z 2025-07-24T20:13:37Z NVIDIA TensorRT is an AI inference library built to optimize machine learning models for deployment on NVIDIA GPUs. TensorRT targets dedicated hardware in...]]>

NVIDIA TensorRT is an AI inference library built to optimize machine learning models for deployment on NVIDIA GPUs. TensorRT targets dedicated hardware in modern architectures, such as NVIDIA Blackwell Tensor Cores, to accelerate common operations found in advanced machine learning models. It can also modify AI models to run more efficiently on specific hardware by using optimization techniques…
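The Torch-TensorRT workflow the post describes can be sketched as follows. This is a minimal illustration, not the post's actual diffusion pipeline: the tiny model, the `maybe_compile` helper, and the FP16 precision choice are all assumptions, and the sketch falls back to eager PyTorch when `torch_tensorrt` or a CUDA GPU is unavailable.

```python
import torch

# torch_tensorrt is an optional dependency; fall back to eager execution
# when it (or a CUDA GPU) is not available.
try:
    import torch_tensorrt
    HAVE_TRT = torch.cuda.is_available()
except ImportError:
    HAVE_TRT = False

def maybe_compile(model, example):
    """Compile with Torch-TensorRT when possible, else return the model as-is.
    (Illustrative helper, not part of the torch_tensorrt API.)"""
    if HAVE_TRT:
        return torch_tensorrt.compile(
            model.cuda(),
            inputs=[example.cuda()],
            enabled_precisions={torch.float16},  # allow FP16 Tensor Core kernels
        )
    return model

# Tiny stand-in model; the post applies this to diffusion-model components.
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.GELU()).eval()
x = torch.randn(1, 8)

trt_or_eager = maybe_compile(model, x)
with torch.no_grad():
    y = trt_or_eager(x.cuda() if HAVE_TRT else x)
```

On a machine with TensorRT installed, the compiled module is a drop-in replacement for the original: same inputs, same outputs, with supported subgraphs lowered to TensorRT engines.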

Source

]]>

Dheeraj Peri <![CDATA[Accelerating Quantized Networks with the NVIDIA QAT Toolkit for TensorFlow and NVIDIA TensorRT]]> http://www.open-lab.net/blog/?p=48838 2023-04-04T17:00:05Z 2022-06-16T17:28:18Z We're excited to announce the NVIDIA Quantization-Aware Training (QAT) Toolkit for TensorFlow 2 with the goal of accelerating quantized networks with...]]>

Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. We’re excited to announce the NVIDIA Quantization-Aware Training (QAT) Toolkit for TensorFlow 2, with the goal of accelerating quantized networks with NVIDIA TensorRT on NVIDIA GPUs. This toolkit provides an easy-to-use API to quantize…
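The core operation that QAT toolkits insert into a model's forward pass is quantize-dequantize ("fake quantization"): values are rounded to the int8 grid during training so the network learns to tolerate quantization error, while staying in floating point. A NumPy sketch of that operation, assuming symmetric per-tensor scaling; the function name `fake_quant` is illustrative and is not the toolkit's API:

```python
import numpy as np

def fake_quant(x, num_bits=8):
    """Quantize-dequantize: round values onto the int grid defined by a
    symmetric per-tensor scale, then map them back to float."""
    qmax = 2 ** (num_bits - 1) - 1           # 127 for int8
    scale = np.max(np.abs(x)) / qmax         # symmetric per-tensor scale
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return q * scale

w = np.array([0.02, -1.27, 0.5, 1.27])
wq = fake_quant(w)   # these values land exactly on the int8 grid
```

During QAT, gradients flow through this node via a straight-through estimator, and at deployment TensorRT replaces the simulated quantization with true int8 kernels.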

Source

]]>
Dheeraj Peri <![CDATA[Estimating Depth with ONNX Models and Custom Layers Using NVIDIA TensorRT]]> http://www.open-lab.net/blog/?p=20731 2022-10-10T19:00:08Z 2020-09-24T18:20:20Z TensorRT is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and a runtime that delivers low latency and...]]>

TensorRT is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and a runtime that delivers low latency and high throughput for deep learning applications. TensorRT uses the ONNX format as an intermediate representation when converting models from major frameworks such as TensorFlow and PyTorch. In this post, you learn how to convert PyTorch…

Source

]]>