Rafael Valle – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins
2024-09-19T19:34:33Z
http://www.open-lab.net/blog/feed/

Achieving State-of-the-Art Zero-Shot Waveform Audio Generation across Audio Types
Rafael Valle
http://www.open-lab.net/blog/?p=88329
2024-09-19T19:34:33Z 2024-09-05T20:30:00Z

Stunning audio content is an essential component of virtual worlds. Audio generative AI plays a key role in creating this content, and NVIDIA is continuously pushing the limits in this field of research. BigVGAN, developed in collaboration with the NVIDIA Applied Deep Learning Research and NVIDIA NeMo teams, is a generative AI model specialized in audio waveform synthesis that achieves state-of…

Training Your Own Voice Font Using Flowtron
Rafael Valle
http://www.open-lab.net/blog/?p=20673
2023-07-27T20:00:22Z 2020-10-03T23:40:08Z

Recent conversational AI research has demonstrated the automatic generation of high-quality, human-like audio from text. For example, you can use Tacotron 2 and WaveGlow to convert text into high-quality, natural-sounding speech in real time. You can also use FastPitch to generate mel spectrograms in parallel, which is faster than Tacotron 2. However, current text-to-speech models…
