NVIDIA NeMo is an end-to-end platform for developing multimodal generative AI models at scale anywhere, on any cloud and on-premises. The NeMo team just released Canary, a multilingual model that transcribes speech in English, Spanish, German, and French with punctuation and capitalization. Canary also provides bi-directional translation between English and the three other supported…
Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition (ASR) family of models and the NVIDIA Canary multilingual, multitask ASR and translation model currently top the Hugging Face Open ASR Leaderboard. In addition, a multilingual P-Flow-based text-to-speech (TTS) model won the LIMMITS ’24…