NVIDIA TensorRT is a high-performance deep learning inference optimizer and runtime that delivers low latency, high-throughput inference for deep learning applications. NVIDIA released TensorRT last year with the goal of accelerating deep learning inference for production deployment. In this post we’ll introduce TensorRT 3, which improves performance versus previous versions and includes new…
]]>