Denis Timonin – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2023-05-24T00:22:56Z http://www.open-lab.net/blog/feed/ Denis Timonin <![CDATA[Accelerated Inference for Large Transformer Models Using NVIDIA Triton Inference Server]]> http://www.open-lab.net/blog/?p=51300 2023-05-24T00:22:56Z 2022-08-03T17:00:00Z This is the first part of a two-part series discussing the NVIDIA Triton Inference Server's FasterTransformer (FT) library, one of the fastest libraries for...]]>

This is the first part of a two-part series discussing the NVIDIA Triton Inference Server’s FasterTransformer (FT) library, one of the fastest libraries for distributed inference of transformers of any size (up to trillions of parameters). It provides an overview of FasterTransformer, including the benefits of using the library. Join the NVIDIA Triton and NVIDIA TensorRT community to stay…


]]>
Denis Timonin <![CDATA[Deploying GPT-J and T5 with NVIDIA Triton Inference Server]]> http://www.open-lab.net/blog/?p=51318 2023-03-14T23:22:55Z 2022-08-03T17:00:00Z This is the second part of a two-part series about NVIDIA tools that allow you to run large transformer models for accelerated inference. For an introduction to...]]>

This is the second part of a two-part series about NVIDIA tools that allow you to run large transformer models for accelerated inference. For an introduction to the FasterTransformer library (Part 1), see Accelerated Inference for Large Transformer Models Using NVIDIA Triton Inference Server. Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates…


]]>