Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and multifaceted nature of these systems. Unlike traditional machine learning (ML) models, LLMs generate a wide range of diverse and often unpredictable outputs, making standard evaluation metrics insufficient. Key challenges include the…
At NVIDIA, the Sales Operations team equips the Sales team with the tools and resources needed to bring cutting-edge hardware and software to market. Managing this across NVIDIA’s diverse technology is a complex challenge shared by many enterprises. Through collaboration with our Sales team, we found that they rely on internal and external documentation…
Data is the lifeblood of modern enterprises, fueling everything from innovation to strategic decision making. However, as organizations amass ever-growing volumes of information, from technical documentation to internal communications, they face a daunting challenge: how to extract meaningful insights and actionable structure from an overwhelming sea of unstructured data.
AI agents powered by large language models (LLMs) help organizations streamline and reduce manual workloads. These agents use multilevel, iterative reasoning to analyze problems, devise solutions, and execute tasks with various tools. Unlike traditional chatbots, LLM-powered agents automate complex tasks by effectively understanding and processing information. To avoid potential risks in specific…
NVIDIA NeMo has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry, particularly those topping the Hugging Face Open ASR Leaderboard. These NVIDIA NeMo ASR models that transcribe speech into text offer a range of architectures designed to optimize both speed and accuracy: Previously, these models faced speed performance…
NVIDIA NIM, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models across clouds, data centers, and workstations. NIM microservices for speech and translation are now available. The new speech and translation microservices leverage NVIDIA Riva and provide automatic speech recognition (ASR)…
Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition (ASR) family of models and the NVIDIA Canary multilingual, multitask ASR and translation model currently top the Hugging Face Open ASR Leaderboard. In addition, a multilingual P-Flow-based text-to-speech (TTS) model won the LIMMITS ’24…
The integration of speech and translation AI into our daily lives is rapidly reshaping our interactions, from virtual assistants to call centers and augmented reality experiences. Speech AI Day provided valuable insights into the latest advancements in speech AI, showcasing how this technology addresses real-world challenges. In this first of three Speech AI Day sessions…
The telecommunication industry has seen a proliferation of AI-powered technologies in recent years, with speech recognition and translation leading the charge. Multilingual AI virtual assistants, digital humans, chatbots, agent assists, and audio transcription are technologies that are revolutionizing the telco industry. Businesses are implementing AI in call centers to address incoming requests…
NVIDIA showed how AI workflows can be leveraged to help you accelerate the development of AI solutions to address a range of use cases at NVIDIA GTC 2023. AI workflows are cloud-native, packaged reference examples showing how NVIDIA AI frameworks can be used to efficiently build AI solutions such as intelligent virtual assistants, digital fingerprinting for cybersecurity…
Join this webinar on January 25 and learn how to build a voice-enabled intelligent virtual assistant to improve customer experiences at contact centers.
As the global service economy grows, companies rely increasingly on contact centers to drive better customer experiences, increase customer satisfaction, and lower costs with increased efficiencies. Customer demand has increased far more rapidly than contact center employment ever could. Combined with the high agent churn rate, customer demand creates a need for more automated real-time customer…
This post was updated in March 2023. Sign up for the latest Speech AI news from NVIDIA. Speech AI is used in a variety of applications, including contact centers’ agent assists for empowering human agents, voice interfaces for intelligent virtual assistants (IVAs), and live captioning in video conferencing. To support these features, speech AI technology includes automatic speech recognition…
Build better GPU-accelerated Speech AI applications with the latest NVIDIA Riva updates, including enterprise support.
Major updates to Riva, an SDK for building speech AI applications, and a paid Riva Enterprise offering were announced at NVIDIA GTC 2022 last week. Several key updates to the NeMo framework, a framework for training large language models, were also announced. Riva offers world-class accuracy for real-time automatic speech recognition (ASR) and text-to-speech (TTS) skills across multiple…
This month, NVIDIA released world-class speech-to-text models for Spanish, German, and Russian in Riva, powering enterprises to deploy speech AI applications globally. In addition, enterprises can now create expressive speech interfaces using Riva’s customizable text-to-speech pipeline. NVIDIA Riva is a GPU-accelerated speech AI SDK for developing real-time applications like live captioning…
Today, NVIDIA announced that it will help developers, researchers, and data scientists working with Graph Neural Networks (GNN) on large heterogeneous graphs with billions of edges by providing GPU-accelerated Deep Graph Library (DGL) containers. These containers will enable developers to work more efficiently in an integrated, GPU-accelerated environment that combines DGL and PyTorch.
The audio and video quality of real-time communication applications such as virtual collaboration and content creation applications is the true gauge of users’ real-time communication experience. They rely heavily on network bandwidth and user equipment quality. Narrow network bandwidth and low-quality equipment produce unstable and noisy audio and video outputs. This problem is often…
NVIDIA GTC is right around the corner! Join NVIDIA November 8-11, as we host sessions covering the latest breakthroughs in conversational AI, recommender systems, and video conferencing. Here’s a sneak peek at some of our top sessions: Conversational AI Demystified, and a Hands-on Walkthrough Presented by: NVIDIA and SVA System Vertrieb Alexander GmbH Thanks to new tools and…
Video conferencing, audio and video streaming, and telecommunications recently exploded due to pandemic-related closures and work-from-home policies. Businesses, educational institutions, and public-sector agencies are experiencing a skyrocketing demand for virtual collaboration and content creation applications. The crucial part of online communication is the video stream, whether it’s a simple…
With audio and video streaming, conferencing, and telecommunication on the rise, it has become essential for developers to build applications with outstanding audio quality and enable end users to communicate and collaborate effectively. Various background noises can disrupt communication, ranging from traffic and construction to dogs barking and babies crying. Moreover, a user could talk in a…