This post was updated July 20, 2021, to reflect NVIDIA TensorRT 8.0 updates. Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. When deploying a neural network, it’s useful to think about how the network could be made to run faster or take less space. A more efficient network can make better…
Deep neural networks achieve outstanding performance in a variety of fields, such as computer vision, speech recognition, and natural language processing. The computational power needed to process these neural networks is rapidly increasing, so efficient models and computation are crucial. Neural network pruning, removing unnecessary model parameters to yield a sparse network, is a useful way to…