Jiahong Liu – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
http://www.open-lab.net/blog/feed/

NVIDIA TensorRT-LLM Now Accelerates Encoder-Decoder Models with In-Flight Batching
http://www.open-lab.net/blog/?p=93516 (published 2024-12-11, updated 2024-12-12)

NVIDIA recently announced that NVIDIA TensorRT-LLM now accelerates encoder-decoder model architectures. TensorRT-LLM is an open-source library that optimizes inference for a diverse range of model architectures. The addition of encoder-decoder model support further expands TensorRT-LLM's capabilities, providing highly optimized inference for an even broader range of…
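In-flight (continuous) batching, mentioned in the title, lets finished sequences leave the batch immediately so queued requests can join mid-flight, rather than waiting for the longest sequence in a static batch to complete. The sketch below is a simplified, pure-Python illustration of that scheduling idea — it is not the TensorRT-LLM API, and all names in it are hypothetical.

```python
from collections import deque

def inflight_batching(requests, max_batch_size):
    """Simulate in-flight batching: each request is (request_id, tokens_to_generate).
    Finished sequences free their batch slot immediately, and waiting requests
    are admitted at every decoding step. Returns (total_steps, completion_order)."""
    queue = deque(requests)   # requests waiting for a batch slot
    active = {}               # request_id -> tokens still to generate
    steps = 0
    completed = []
    while queue or active:
        # Admit new requests into any free batch slots before each step.
        while queue and len(active) < max_batch_size:
            rid, n = queue.popleft()
            active[rid] = n
        # One decoding step: every active sequence emits one token.
        for rid in list(active):
            active[rid] -= 1
            if active[rid] == 0:      # sequence reached its stop condition
                completed.append(rid)
                del active[rid]       # slot is freed immediately
        steps += 1
    return steps, completed
```

With requests `[("a", 2), ("b", 5), ("c", 3)]` and a batch size of 2, this finishes in 5 steps, because "c" slips into "a"'s slot as soon as "a" completes; a static scheduler would run the first batch for 5 steps and then "c" alone for 3 more.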

Source

Building a Speech-Enabled AI Virtual Assistant with NVIDIA Riva on Amazon EC2
http://www.open-lab.net/blog/?p=50606 (published 2022-07-28, updated 2023-03-14)

Speech AI can assist human agents in contact centers, power virtual assistants and digital avatars, generate live captioning in video conferencing, and much more. Under the hood, these voice-based technologies orchestrate a network of automatic speech recognition (ASR) and text-to-speech (TTS) pipelines to deliver intelligent, real-time responses.
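The ASR-to-TTS orchestration described above can be sketched as a simple composition of pipeline stages. This is a conceptual illustration only — the stage functions here are stand-ins for real service calls (e.g., to Riva's ASR and TTS endpoints), and every name is an assumption, not the Riva client API.

```python
def transcribe(audio):
    # Stand-in for an ASR service call: audio in, text out.
    return audio["spoken_text"]

def respond(text):
    # Stand-in for the assistant's dialog/NLU logic.
    return f"You said: {text}"

def synthesize(text):
    # Stand-in for a TTS service call: text in, audio out.
    return {"spoken_text": text}

def voice_assistant_turn(audio):
    """One turn of a voice assistant: speech -> ASR -> dialog -> TTS -> speech."""
    return synthesize(respond(transcribe(audio)))
```

In a real deployment each stage would be a streaming gRPC call to a Riva server, but the overall shape — a text bridge between two speech pipelines — stays the same.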

Source

Getting the Most Out of NVIDIA T4 on AWS G4 Instances
http://www.open-lab.net/blog/?p=31638 (published 2021-05-14, updated 2022-08-21)

As the explosive growth of AI models continues unabated, natural language processing and understanding are at its forefront. As the industry heads toward trillion-parameter models and beyond, accelerated AI inference is now a must-have. Many organizations deploy these services in the cloud and seek optimal performance and utility from every instance they rent.

Source
