The newest generation of the popular Llama AI models is here with Llama 4 Scout and Llama 4 Maverick. Accelerated by NVIDIA open-source software, they can achieve over 40K output tokens per second on NVIDIA Blackwell B200 GPUs, and are available to try as NVIDIA NIM microservices. The Llama 4 models are now natively multimodal and multilingual using a mixture-of-experts (MoE) architecture.
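A mixture-of-experts layer routes each token through only a small subset of expert sub-networks chosen by a learned router, rather than through every parameter. As a rough, dependency-free sketch of that routing rule (the expert count, sizes, and router here are illustrative toys, not Llama 4's actual configuration):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(token, router_weights, experts, top_k=2):
    """Route a token through the top-k experts by router score.

    token          : list[float], the token's hidden vector
    router_weights : one score vector per expert (dot-product router)
    experts        : list of callables, each mapping a vector to a vector
    """
    scores = [sum(t * w for t, w in zip(token, wv)) for wv in router_weights]
    gates = softmax(scores)
    # Keep only the top-k experts and renormalize their gate weights.
    top = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:top_k]
    norm = sum(gates[i] for i in top)
    out = [0.0] * len(token)
    for i in top:
        expert_out = experts[i](token)
        out = [o + (gates[i] / norm) * e for o, e in zip(out, expert_out)]
    return out

# Toy usage: 4 "experts" that just scale the input by different factors.
experts = [lambda v, s=s: [s * x for x in v] for s in (1.0, 2.0, 3.0, 4.0)]
router = [[0.1, 0.0], [0.9, 0.1], [0.0, 0.2], [0.2, 0.8]]
print(moe_layer([1.0, 1.0], router, experts, top_k=2))  # → [3.0, 3.0]
```

Because only `top_k` experts run per token, total parameter count can grow with the number of experts while per-token compute stays roughly constant.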
Building AI systems with foundation models requires a delicate balance of resources such as memory, latency, storage, compute, and more. One size does not fit all for developers managing cost and user experience when bringing generative AI capability to the rapidly growing ecosystem of AI-powered applications. You need options for high-quality, customizable models that can support large…
Mathstral, an advanced AI model developed from the ground up, can deliver superior performance for enhanced learning of math, engineering, and science.
Domain-adaptive pretraining (DAPT) of large language models (LLMs) is an important step towards building domain-specific models. These models demonstrate greater capabilities in domain-specific tasks compared to their off-the-shelf open or commercial counterparts. Recently, NVIDIA published a paper about ChipNeMo, a family of foundation models that are geared toward industrial chip design…
NVIDIA TAO is a framework designed to simplify and accelerate the development and deployment of AI models. It enables you to use pretrained models, fine-tune them with your own data, and optimize the models for specific use cases without needing deep AI expertise. TAO integrates seamlessly with the NVIDIA hardware and software ecosystem, providing tools for efficient AI model training…
NVIDIA NeMo is an end-to-end platform for the development of multimodal generative AI models at scale anywhere: on any cloud and on-premises. The NeMo team just released Canary, a multilingual model that transcribes speech in English, Spanish, German, and French with punctuation and capitalization. Canary also provides bi-directional translation between English and the three other supported…
NVIDIA NeMo, an end-to-end platform for developing multimodal generative AI models at scale anywhere, on any cloud and on-premises, recently released Parakeet-TDT. This new addition to the NeMo ASR Parakeet model family boasts better accuracy and 64% greater speed over the previously best model, Parakeet-RNNT-1.1B. This post explains Parakeet-TDT and how to use it to generate highly accurate…
Generative AI is transforming computing, paving new avenues for humans to interact with computers in natural, intuitive ways. For enterprises, the prospect of generative AI is vast. Businesses can tap into their rich datasets to streamline time-consuming tasks, from text summarization and translation to insight prediction and content generation. But they must also navigate adoption challenges.
Speech and translation AI models developed at NVIDIA are pushing the boundaries of performance and innovation. The NVIDIA Parakeet automatic speech recognition (ASR) family of models and the NVIDIA Canary multilingual, multitask ASR and translation model currently top the Hugging Face Open ASR Leaderboard. In addition, a multilingual P-Flow-based text-to-speech (TTS) model won the LIMMITS ’24…
Breaking barriers in speech recognition, NVIDIA NeMo proudly presents pretrained models tailored for Dutch and Persian, languages often overlooked in the AI landscape. These models leverage the recently introduced FastConformer architecture and were trained simultaneously with CTC and transducer objectives to maximize each model’s accuracy. Automatic speech recognition (ASR) is a…
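A CTC-trained model like these emits one label (or a special blank) per audio frame; greedy decoding then collapses consecutive repeats and drops blanks. A minimal illustration of that decoding rule (the toy vocabulary and frame outputs below are made up for demonstration):

```python
def ctc_greedy_decode(frame_ids, blank_id=0):
    """Collapse repeated labels, then remove blanks (CTC greedy decoding)."""
    out = []
    prev = None
    for i in frame_ids:
        if i != prev and i != blank_id:
            out.append(i)
        prev = i
    return out

# Toy vocabulary: 0 = blank, 1 = 'h', 2 = 'e', 3 = 'l', 4 = 'o'
frames = [1, 1, 0, 2, 2, 3, 3, 0, 3, 4, 0]
ids = ctc_greedy_decode(frames)
vocab = {1: "h", 2: "e", 3: "l", 4: "o"}
print("".join(vocab[i] for i in ids))  # → hello
```

Note how the blank between the two runs of `3` is what lets the decoder recover the double "l"; without blanks, repeated characters would always collapse into one.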
Deep neural networks (DNNs) are the go-to model for learning functions from data, such as image classifiers or language models. In recent years, deep models have become popular for representing the data samples themselves. For example, a deep model can be trained to represent an image, a 3D object, or a scene, an approach called Implicit Neural Representations. (See also Neural Radiance Fields and…
Optical Character Detection (OCD) and Optical Character Recognition (OCR) are computer vision techniques used to extract text from images. Use cases vary across industries and include extracting data from scanned documents or forms with handwritten texts, automatically recognizing license plates, sorting boxes or objects in a fulfillment center based on serial numbers…
NVIDIA Triton Inference Server streamlines and standardizes AI inference by enabling teams to deploy, run, and scale trained ML or DL models from any framework on any GPU- or CPU-based infrastructure. It helps developers deliver high-performance inference across cloud, on-premises, edge, and embedded devices. The nvOCDR library is integrated into Triton for inference.
For more information about NVIDIA NeMo, see Develop Custom Enterprise Generative AI with NVIDIA NeMo. Generative AI has introduced a new era in computing, one promising to revolutionize human-computer interaction. At the forefront of this technological marvel are large language models (LLMs), empowering enterprises to recognize, summarize, translate, predict, and generate content using large…
NVIDIA TAO Toolkit provides a low-code AI framework to accelerate vision AI model development suitable for all skill levels, from novices to expert data scientists. With the TAO Toolkit, developers can use the power and efficiency of transfer learning to achieve state-of-the-art accuracy and production-class throughput in record time with adaptation and optimization.
The analysis of 3D medical images is crucial for advancing clinical responses, disease tracking, and overall patient survival. Deep learning models form the backbone of modern 3D medical representation learning, enabling precise spatial context measurements that are essential for clinical decision-making. These 3D representations are highly sensitive to the physiological properties of medical…
NVIDIA PhysicsNeMo is a framework for building, training, and fine-tuning deep learning models for physical systems, otherwise known as physics-informed machine learning (physics-ML) models. PhysicsNeMo is available as OSS (Apache 2.0 license) to support the growing physics-ML community. The latest PhysicsNeMo software update, version 23.05, brings together new capabilities…
Training AI models requires mountains of data. Acquiring large sets of training data can be difficult, time-consuming, and expensive. Also, the data collected may not be able to cover various corner cases, preventing the AI model from accurately predicting a wide variety of scenarios. Synthetic data offers an alternative to real-world data, enabling AI researchers and engineers to bootstrap…
AI is impacting every industry, from improving customer service and streamlining supply chains to accelerating cancer research. As enterprises invest in AI to stay ahead of the competition, they often struggle with finding the strategy and infrastructure for success. Many AI projects are rapidly evolving, which makes production at scale especially challenging. We believe in developing…
Have you ever tried to fine-tune a speech recognition system on your accent only to find that, while it recognizes your voice well, it fails to detect words spoken by others? This is common in speech recognition systems that have been trained on hundreds of thousands of hours of speech. In large-scale automatic speech recognition (ASR), a system may perform well in many but not all scenarios.
Multilingual automatic speech recognition (ASR) models have gained significant interest because of their ability to transcribe speech in more than one language. This is fueled by the growing multilingual communities as well as by the need to reduce complexity. You only need one model to handle multiple languages. This post explains how to use pretrained multilingual NeMo ASR models from the…
Leveraging image classification, object detection, automatic speech recognition (ASR), and other forms of AI can fuel massive transformation within companies and business sectors. However, building AI and deep learning models from scratch is a daunting task. A common prerequisite for building these models is having a large amount of high-quality training data and the right expertise to…
Retail shrinkage is on the rise with industry losses totaling $100B in 2021 and growing, due to inflationary pressures. To help software developers accelerate the development of retail loss prevention solutions, NVIDIA is releasing a suite of microservices as part of NVIDIA Metropolis, along with retail AI workflows. These AI workflows deliver pretrained AI models along with the applications…
A pretrained AI model is a deep learning model that’s trained on large datasets to accomplish a specific task, and it can be used as is or customized to suit application requirements across multiple industries.
2022 has thus far been a momentous, thrilling, and overwhelming year for AI aficionados. Get3D is pushing the boundaries of generative 3D modeling, an AI model can now diagnose breast cancer from MRIs as accurately as board-certified radiologists, and state-of-the-art speech AI models have widened their horizons to extended reality. Pretrained models from NVIDIA have redefined…
Large language models (LLMs) are some of the most advanced deep learning algorithms that are capable of understanding written language. Many modern LLMs are built using the transformer network introduced by Google in 2017 in the Attention Is All You Need research paper. NVIDIA NeMo framework is an end-to-end GPU-accelerated framework for training and deploying transformer-based LLMs up to a…
New SDKs are available in the NGC catalog, a hub of GPU-optimized deep learning, machine learning, and HPC applications. With highly performant software containers, pretrained models, industry-specific SDKs, and Jupyter notebooks available, AI developers and data scientists can simplify and reduce complexities in their end-to-end workflows. This post provides an overview of new and updated…
Accurate, fast object detection is an important task in robotic navigation and collision avoidance. Autonomous agents need a clear map of their surroundings to navigate to their destination while avoiding collisions. For example, in warehouses that use autonomous mobile robots (AMRs) to transport objects, avoiding hazardous machines that could potentially damage robots has become a challenging…
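Detection quality in pipelines like these is conventionally scored with intersection-over-union (IoU) between predicted and ground-truth bounding boxes. A minimal sketch of that standard computation, assuming the common `(x1, y1, x2, y2)` corner format:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two 2x2 boxes overlapping in a 1x1 region: IoU = 1 / 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A predicted box is typically counted as a true positive when its IoU with a ground-truth box exceeds a threshold such as 0.5.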
Automatic speech recognition (ASR) research generally focuses on high-resource languages such as English, which is supported by hundreds of thousands of hours of speech. Recent literature has renewed focus on more complex languages, such as Japanese. Like other Asian languages, Japanese has a vast base character set (upwards of 3,000 unique characters are used in common vernacular)…
No one likes standing around and waiting for the bus to arrive, especially when you need to be somewhere on time. Wouldn’t it be great if you could predict when the next bus is due to arrive? At the beginning of this year, Armenian developer Edgar Gomtsyan had some time to spare, and he puzzled over this very question. Rather than waiting for a government entity to implement a solution…
They say “imitation is the sincerest form of flattery.” Well, in the case of a robotics project by Polish-based developer Tomasz Tomanek, imitation, or mimicry, is the goal of his robot named Mariola. In this latest Jetson Project of the Month, Tomanek has developed a funky little robot using pretrained machine learning models to make human-robot interactions come to life.
NVIDIA Metropolis partner Sighthound, formerly Boulder AI, is helping cities improve traffic management and pedestrian safety with software and hardware that bring cloud-native solutions to edge data intelligence. To design efficient, equitable, and sustainable infrastructure, city planners rely on accurate roadway usage data. Sighthound builds edge-enabled…
Join NVIDIA experts and Metropolis partners on Sept. 22 for webinars exploring developer SDKs, GPUs, go-to-market opportunities, and more. All three sessions, each with unique speakers and content, will be recorded and available for on-demand viewing later. Register Now >> Wednesday, September 22, 2021, 1 PM PDT…
The NVIDIA NGC team is hosting a webinar with live Q&A to dive into this Jupyter notebook available from the NGC catalog. Learn how to use these resources to kickstart your AI journey. Register now: NVIDIA NGC Jupyter Notebook Day: Medical Imaging Segmentation. Image segmentation partitions a digital image into multiple segments by changing the representation into something more meaningful…
Today, NVIDIA announced new pretrained models and the general availability of TAO Toolkit 3.0, a core component of the NVIDIA Train, Adapt, and Optimize (TAO) platform-guided workflow for creating AI. The new release includes a variety of highly accurate and performant pretrained models in computer vision and conversational AI, as well as a set of powerful productivity features that boost AI…
You can save time and produce a more accurate result when processing audio data with automated speech recognition (ASR) models from NVIDIA NeMo and Label Studio. NVIDIA NeMo provides reusable neural modules that make it easy to create new neural network architectures, including prebuilt modules and ready-to-use models for ASR. With the power of NVIDIA NeMo, you can get audio transcriptions…
One of the main challenges and goals when creating an AI application is producing a robust model that is performant with high accuracy. Building such a deep learning model is time consuming. It can take weeks or months of retraining, fine-tuning, and optimizing until the model satisfies the necessary requirements. For many developers, building a deep learning AI pipeline from scratch is not a…
Automatic license plate recognition (ALPR) on stationary to fast-moving vehicles is one of the common intelligent video analytics applications for smart cities. Some of the common use cases include parking assistance systems, automated toll booths, vehicle registration and identification for delivery and logistics at ports, and medical supply transporting warehouses. Being able to do this in real…
Intelligent vision and speech-enabled services have now become mainstream, impacting almost every aspect of our everyday life. AI-enabled video and audio analytics are enhancing applications from consumer products to enterprise services. Smart speakers at home. Smart kiosks or chatbots in retail stores. Interactive robots on factory floors. Intelligent patient monitoring systems at hospitals.
AI workflows are complex. Building an AI application is no trivial task, as it takes various stakeholders with domain expertise to develop and deploy the application at scale. Data scientists and developers need easy access to software building blocks, such as models and containers, that are not only secure and highly performant, but which have the necessary underlying architecture to build their…
This is an updated version of Neural Modules for Fast Development of Speech and Language Models. This post upgrades the NeMo diagram with PyTorch and PyTorch Lightning support and updates the tutorial with the new code base. As a researcher building state-of-the-art speech and language models, you must be able to quickly experiment with novel network architectures.
With the advent of new deep learning approaches based on transformer architecture, natural language processing (NLP) techniques have undergone a revolution in performance and capabilities. Cutting-edge NLP models are becoming the core of modern search engines, voice assistants, chatbots, and more. Modern NLP models can synthesize human-like text and answer questions posed in natural language.
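At the heart of the transformer architecture is scaled dot-product attention, softmax(QKᵀ/√d)·V. A dependency-free sketch for a single head (toy dimensions, no batching or masking, matrices as lists of rows):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Single-head scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query against every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Weighted mix of the value rows.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[10.0, 0.0], [0.0, 10.0]]
print(attention(Q, K, V))  # output leans toward the first value row
```

Because the query aligns with the first key, the output mixes the value rows with more weight on the first; stacking many such heads plus feed-forward layers is what these NLP models are built from.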
Deep neural network (DNN) models are routinely used in applications requiring analysis of video stream content. These may include object detection, classification, and segmentation. Typically, these models are trained on servers with high-end GPUs, either on stand-alone servers, such as NVIDIA DGX-1, or on servers available in data centers or private or public clouds. Such systems often use…
Deep neural networks (DNNs) have been successfully applied to volume segmentation and other medical imaging tasks. They are capable of achieving state-of-the-art accuracy and can augment the medical imaging workflow with AI-powered insights. However, training robust AI models for medical imaging analysis is time-consuming and tedious and requires iterative experimentation with parameter…
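Segmentation models like these are commonly evaluated (and often trained) with the Dice coefficient, which measures the overlap between a predicted mask and the reference mask. A minimal sketch over flattened binary masks, with a small epsilon to keep empty masks well-defined:

```python
def dice(pred, truth, eps=1e-7):
    """Dice coefficient 2*|A ∩ B| / (|A| + |B|) for flat binary masks."""
    assert len(pred) == len(truth)
    inter = sum(p * t for p, t in zip(pred, truth))
    return (2.0 * inter + eps) / (sum(pred) + sum(truth) + eps)

# One of two predicted foreground voxels matches the single true one: ~0.667.
print(dice([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1.0 means perfect overlap and 0.0 means none; in volumetric segmentation the masks would simply be flattened 3D label arrays.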
The process of building an AI-powered solution from start to finish can be daunting. First, datasets must be curated and pre-processed. Next, models need to be trained and tested for inference performance, and then finally deployed into a usable, customer-facing application. At each step along the way, developers constantly face time-consuming challenges, such as building efficient…