Sirisha Rella – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-01-23T19:24:23Z http://www.open-lab.net/blog/feed/

Sirisha Rella <![CDATA[Generative AI Agents Developer Contest: Top Tips for Getting Started]]> http://www.open-lab.net/blog/?p=82980 2024-10-18T20:21:31Z 2024-05-29T16:01:10Z

Join our contest, which runs through June 17, and showcase your innovation by building cutting-edge generative AI-powered applications with NVIDIA and LangChain technologies. To get you started, we explore a few applications to inspire your creative journey and share tips and best practices to help you succeed in the development process. There are many different practical applications…

Sirisha Rella <![CDATA[Speech AI Spotlight: Visualizing Spoken Language and Sounds on AR Glasses]]> http://www.open-lab.net/blog/?p=66701 2023-07-13T19:00:30Z 2023-06-23T15:00:00Z

Audio can include a wide range of sounds, from human speech to non-speech sounds like barking dogs and sirens. Accessible applications designed for people with hearing difficulties should be able to recognize sounds and understand speech. Such technology would help deaf or hard-of-hearing individuals visualize speech, like human conversations and non-speech…

Sirisha Rella <![CDATA[Exploring Unique Applications of Text-to-Speech Technology]]> http://www.open-lab.net/blog/?p=62914 2023-06-09T22:32:18Z 2023-04-19T18:29:00Z

When interacting with a virtual assistant, you give a command and receive a verbal response. The technology powering this generated voice response is known as text-to-speech (TTS). TTS applications are highly useful as they enable greater content accessibility for those who use assistive devices. With the latest TTS techniques, you can generate a synthetic voice from only a few minutes of…

Sirisha Rella <![CDATA[Speech AI Technology Enables Natural Interactions with Service Robots]]> http://www.open-lab.net/blog/?p=59175 2023-06-12T08:17:08Z 2022-12-17T00:23:07Z

From taking your order and serving you food in a restaurant to playing poker with you, service robots are becoming increasingly prevalent. Globally, you can find these service robots at hospitals, airports, and retail stores. According to Gartner, by 2030, 80% of humans will engage with smart robots daily, due to smart robot advancements in intelligence, social interactions…

Sirisha Rella <![CDATA[Deep Learning is Transforming ASR and TTS Algorithms]]> http://www.open-lab.net/blog/?p=59169 2023-04-04T21:25:25Z 2022-12-16T23:48:41Z

Speech is one of the primary means to communicate with an AI-powered application. From virtual assistants to digital avatars, voice-based interfaces are changing how we typically interact with smart devices. Deep learning techniques for speech recognition and speech synthesis are helping improve the user experience—think human-like responses and natural-sounding tones. If you plan to…

Sirisha Rella <![CDATA[Speech AI Spotlight: Reimagine Customer Service with Virtual Agents]]> http://www.open-lab.net/blog/?p=58387 2023-06-12T08:23:40Z 2022-12-14T18:00:00Z

Virtual agents or voice-enabled assistants have been around for quite some time. But in the last decade, their usefulness and popularity have exploded with the use of AI. According to Gartner, virtual assistants will automate up to 75% of tasks for call center agents by 2025, up from 30% in 2021. This translates to a better experience for both contact center agents and customers.

Sirisha Rella <![CDATA[Developing the Next Generation of Extended Reality Applications with Speech AI]]> http://www.open-lab.net/blog/?p=54831 2023-11-03T07:15:10Z 2022-09-14T16:00:00Z

Virtual reality (VR), augmented reality (AR), and mixed reality (MR) environments can feel incredibly real due to the physically immersive experience. Adding a...

Sirisha Rella <![CDATA[Essential Guide to Automatic Speech Recognition Technology]]> http://www.open-lab.net/blog/?p=51263 2023-06-12T09:10:15Z 2022-08-08T21:30:00Z

Over the past decade, AI-powered speech recognition systems have slowly become part of our everyday lives, from voice search to virtual assistants in contact centers, cars, hospitals, and restaurants. These speech recognition developments are made possible by deep learning advancements.

Sirisha Rella <![CDATA[Build Speech AI in Multiple Languages and Train Large Language Models with the Latest from Riva and NeMo Framework]]> http://www.open-lab.net/blog/?p=45648 2023-06-12T20:54:30Z 2022-03-28T16:00:00Z

Major updates to Riva, an SDK for building speech AI applications, and a paid Riva Enterprise offering were announced at NVIDIA GTC 2022 last week. Several key updates to the NeMo framework, which is used for training large language models, were also announced. Riva offers world-class accuracy for real-time automatic speech recognition (ASR) and text-to-speech (TTS) skills across multiple…

Sirisha Rella <![CDATA[Create Speech AI Applications in Multiple Languages and Customize Text-to-Speech with Riva]]> http://www.open-lab.net/blog/?p=43993 2023-03-14T18:55:13Z 2022-02-07T17:00:00Z

This month, NVIDIA released world-class speech-to-text models for Spanish, German, and Russian in Riva, powering enterprises to deploy speech AI applications globally. In addition, enterprises can now create expressive speech interfaces using Riva’s customizable text-to-speech pipeline. NVIDIA Riva is a GPU-accelerated speech AI SDK for developing real-time applications like live captioning…

Sirisha Rella <![CDATA[ICYMI: New AI Tools and Technologies Announced at NVIDIA GTC Keynote]]> http://www.open-lab.net/blog/?p=39300 2023-03-22T01:16:48Z 2021-11-09T19:08:00Z

At NVIDIA GTC this November, new software tools were announced that help developers build real-time speech applications, optimize inference for a variety of use cases, optimize open-source interoperability for recommender systems, and more. Watch the keynote from CEO Jensen Huang to learn about the latest NVIDIA breakthroughs. Today, NVIDIA unveiled a new version of NVIDIA Riva with a…

Sirisha Rella <![CDATA[Speech Recognition: Deploying Models to Production]]> http://www.open-lab.net/blog/?p=39744 2023-12-30T01:51:25Z 2021-11-09T09:37:00Z

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Domain-Specific Audio Transcriptions Using NVIDIA Riva. For part 2, see Speech Recognition: Customizing Models to Your Domain Using Transfer Learning. NVIDIA Riva is an AI speech SDK for developing real-time applications like transcription, virtual assistants…

Sirisha Rella <![CDATA[Speech Recognition: Customizing Models to Your Domain Using Transfer Learning]]> http://www.open-lab.net/blog/?p=39742 2023-03-22T01:16:53Z 2021-11-09T09:36:00Z

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Transcriptions Using NVIDIA Riva. For part 3, see Speech Recognition: Deploying Models to Production. Creating a new AI deep learning model from scratch is an extremely time- and resource-intensive process. A common solution to this problem is to employ…

Sirisha Rella <![CDATA[Speech Recognition: Generating Accurate Domain-Specific Audio Transcriptions Using NVIDIA Riva]]> http://www.open-lab.net/blog/?p=39715 2025-01-23T19:24:23Z 2021-11-09T09:35:00Z

This post is part of a series about generating accurate speech transcription. For part 2, see Speech Recognition: Customizing Models to Your Domain Using Transfer Learning. For part 3, see Speech Recognition: Deploying Models to Production. Every day, millions of minutes of audio are produced across industries such as Telecommunications, Finance, and Unified Communications as a Service…

Sirisha Rella <![CDATA[NVIDIA at INTERSPEECH 2021]]> http://www.open-lab.net/blog/?p=36357 2022-08-21T23:52:31Z 2021-08-18T22:02:56Z

Researchers from around the world working on speech applications are gathering this month for INTERSPEECH, a conference focused on the latest research and technologies in speech processing. NVIDIA researchers will present papers on groundbreaking research in speech recognition and speech synthesis. Conversational AI research is fueling innovations in speech processing that help computers…

Sirisha Rella <![CDATA[Speeding Up Deep Learning Inference Using NVIDIA TensorRT (Updated)]]> http://www.open-lab.net/blog/?p=34881 2022-10-10T18:51:45Z 2021-07-20T13:00:00Z

This post was updated July 20, 2021 to reflect NVIDIA TensorRT 8.0 updates. NVIDIA TensorRT is an SDK for deep learning inference. TensorRT provides APIs and parsers to import trained models from all major deep learning frameworks. It then generates optimized runtime engines deployable in the datacenter as well as in automotive and embedded environments. This post provides a simple…

Sirisha Rella <![CDATA[NVIDIA Accelerates Conversational AI from Research to Production with Latest Updates in NVIDIA NeMo and NVIDIA Riva]]> http://www.open-lab.net/blog/?p=32530 2022-08-21T23:51:51Z 2021-06-04T18:04:00Z

NVIDIA recently released NVIDIA Riva with world-class speech recognition capability for enterprises to generate highly accurate transcriptions and NVIDIA NeMo 1.0, which includes new state-of-the-art speech and language models for democratizing and accelerating conversational AI research. NVIDIA Riva world-class speech recognition is an out-of-the-box speech service that can be easily…

Sirisha Rella <![CDATA[Announcing Megatron for Training Trillion Parameter Models and NVIDIA Riva Availability]]> http://www.open-lab.net/blog/?p=30236 2023-12-30T00:45:19Z 2021-04-12T19:38:00Z

Conversational AI is opening new ways for enterprises to interact with customers in every industry using applications like real-time transcription, translation, chatbots, and virtual assistants. Building domain-specific interactive applications requires state-of-the-art models, optimizations for real-time performance, and tools to adapt those models with your data. This week at GTC…

Sirisha Rella <![CDATA[Integrating with Data Generation and Labeling Tools for Accurate AI Training]]> http://www.open-lab.net/blog/?p=30162 2023-03-22T01:11:53Z 2021-04-12T19:31:00Z

Data plays a crucial role in creating intelligent applications. To create an efficient AI/ML app, you must train machine learning models with high-quality, labeled datasets. Generating and labeling such data from scratch has been a critical bottleneck for enterprises. Many companies prefer a one-stop solution to support their AI/ML workflow from data generation, data labeling, model training/

Sirisha Rella <![CDATA[Speeding Up Development of Speech and Language Models with NVIDIA NeMo]]> http://www.open-lab.net/blog/?p=17649 2023-03-22T01:09:09Z 2020-10-05T13:00:00Z

This is an updated version of Neural Modules for Fast Development of Speech and Language Models. This post upgrades the NeMo diagram with PyTorch and PyTorch Lightning support and updates the tutorial with the new code base. As a researcher building state-of-the-art speech and language models, you must be able to quickly experiment with novel network architectures.

Sirisha Rella <![CDATA[Speeding Up Deep Learning Inference Using TensorRT]]> http://www.open-lab.net/blog/?p=17026 2022-10-10T18:51:44Z 2020-04-22T00:39:30Z

Looking for more? Check out the hands-on DLI training course: Optimization and Deployment of TensorFlow Models with TensorRT. This...
