Generative AI – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-05-16T23:50:38Z http://www.open-lab.net/blog/feed/

Chintan Patel <![CDATA[Build Agents and Understand Long Docs with Mistral Medium 3 and NVIDIA NIM]]> http://www.open-lab.net/blog/?p=99879 2025-05-16T15:48:30Z 2025-05-16T15:48:28Z

Developers building powerful multimodal applications now have a new state-of-the-art model designed for enterprise-scale performance with Mistral Medium 3. Mistral Medium 3 combines high performance, efficiency, and versatility in a compact deployment footprint. Designed for commercial and on-prem use cases, this dense model runs efficiently on NVIDIA Hopper GPUs…

]]> Elias Wolfberg <![CDATA[AI Helps Uncover Potential Alzheimer’s Cause and Treatment]]> http://www.open-lab.net/blog/?p=100058 2025-05-15T19:07:22Z 2025-05-15T15:00:30Z

A gene that can be an early indicator for Alzheimer’s disease actually is a cause of the degenerative-brain disorder, said researchers at the University of California, San Diego. That finding, which they discovered using AI, could result in new treatment options. In a paper published in April in the scientific journal Cell, a team at UCSD found that the gene PHGDH—previously considered a…

]]> Vinh Nguyen <![CDATA[Build Custom Reasoning Models with Advanced, Open Post-Training Datasets]]> http://www.open-lab.net/blog/?p=98680 2025-05-15T19:07:23Z 2025-05-14T16:33:26Z

Synthetic data has become a standard part of large language model (LLM) post-training procedures. Using a large number of synthetically generated examples from either a single open-source, commercially permissible LLM or a cohort of them, a base LLM is finetuned either with supervised finetuning or RLHF to gain instruction-following and reasoning skills. This process can be seen as a knowledge…

]]> Brad Nemire <![CDATA[Get Trained and Certified at GTC Paris at VivaTech 2025]]> http://www.open-lab.net/blog/?p=100034 2025-05-15T19:07:25Z 2025-05-14T16:16:06Z

Join us at GTC Paris on June 10th and choose from six full-day, instructor-led workshops.

]]> Alex Zeltov <![CDATA[Accelerated AI Inference with NVIDIA NIM on Azure AI Foundry]]> http://www.open-lab.net/blog/?p=99911 2025-05-15T19:07:29Z 2025-05-12T17:59:36Z

The integration of NVIDIA NIM microservices into Azure AI Foundry marks a major leap forward in enterprise AI development. By combining NIM microservices with Azure’s scalable, secure infrastructure, organizations can now deploy powerful, ready-to-use AI models more efficiently than ever before. NIM microservices are containerized for GPU-accelerated inferencing for pretrained and customized…

]]> Shashank Verma <![CDATA[Run Hugging Face Models Instantly with Day-0 Support from NVIDIA NeMo Framework]]> http://www.open-lab.net/blog/?p=99933 2025-05-15T19:07:31Z 2025-05-12T17:48:24Z

As organizations strive to maximize the value of their generative AI investments, accessing the latest model developments is crucial to continued success. By using state-of-the-art models on Day-0, teams can harness these innovations efficiently, maintain relevance, and be competitive. The past year has seen a flurry of exciting model series releases in the open-source community…

]]> Rucha Apte <![CDATA[Applying Specialized LLMs with Reasoning Capabilities to Accelerate Battery Research]]> http://www.open-lab.net/blog/?p=99794 2025-05-15T19:07:33Z 2025-05-09T16:00:00Z

Scientific research in complex fields like battery innovation is often slowed by manual evaluation of materials, limiting progress to just dozens of candidates per day. In this blog post, we explore how domain-adapted large language models (LLMs), enhanced with reasoning capabilities, are transforming scientific research, especially in high-stakes, complex domains like battery innovation.

]]> Wenqi Glantz <![CDATA[Extending the NVIDIA Agent Intelligence Toolkit to Support New Agentic Frameworks]]> http://www.open-lab.net/blog/?p=99799 2025-05-15T19:07:35Z 2025-05-08T18:30:00Z

NVIDIA Agent Intelligence toolkit is an open-source library for efficiently connecting and optimizing teams of AI agents. It focuses on enabling developers to quickly build, evaluate, profile, and accelerate complex agentic AI workflows, that is, systems in which multiple AI agents collaborate to perform tasks. The Agent Intelligence toolkit acts as a unifying framework that integrates existing…

]]> Ruilong Li <![CDATA[Revolutionizing Neural Reconstruction and Rendering in gsplat with 3DGUT]]> http://www.open-lab.net/blog/?p=99680 2025-05-15T19:07:38Z 2025-05-08T16:09:03Z

Realistic 3D simulation is becoming a cornerstone of modern AI and graphics, from training autonomous vehicles (AV) to powering robotics and digital twins. Neural rendering techniques like NeRFs and 3D Gaussian Splatting (3DGS) have revolutionized how 3D scenes are reconstructed and visualized from raw sensor data. In this post, we introduce the implementation of 3D Gaussian Unscented…

]]> Camden Spehl <![CDATA[Concept-Driven AI Teaching Assistant Guides Students to Deeper Insights]]> http://www.open-lab.net/blog/?p=99719 2025-05-15T19:07:42Z 2025-05-07T20:57:51Z

In today’s educational landscape, generative AI tools have become both a blessing and a challenge. While these tools offer unprecedented access to information, they’ve also created new concerns about academic integrity. Increasingly, students rely on AI to generate direct answers to homework questions, often at the expense of developing critical thinking skills and mastering core concepts.

]]> Nirmal Kumar Juluru <![CDATA[Building Nemotron-CC, A High-Quality Trillion Token Dataset for LLM Pretraining from Common Crawl Using NVIDIA NeMo Curator]]> http://www.open-lab.net/blog/?p=99540 2025-05-15T19:07:43Z 2025-05-07T16:22:31Z

Curating high-quality pretraining datasets is critical for enterprise developers aiming to train state-of-the-art large language models (LLMs). To enable developers to build highly accurate LLMs, NVIDIA previously released Nemotron-CC, a 6.3-trillion-token English language Common Crawl (CC) dataset. Today, the NVIDIA NeMo Curator team is excited to share that the pipeline used to build the…

]]> Vinh Nguyen <![CDATA[LLM Inference Benchmarking Guide: NVIDIA GenAI-Perf and NIM]]> http://www.open-lab.net/blog/?p=99180 2025-05-15T19:07:45Z 2025-05-06T17:35:39Z

This is the second post in the LLM Benchmarking series, which shows how to use GenAI-Perf to benchmark the Meta Llama 3 model when deployed with NVIDIA NIM. When building LLM-based applications, it is critical to understand the performance characteristics of these models on the given hardware. This serves multiple purposes: As a client-side LLM-focused benchmarking tool…

]]> Ankit Patel <![CDATA[Integrate and Deploy Tongyi Qwen3 Models into Production Applications with NVIDIA]]> http://www.open-lab.net/blog/?p=99462 2025-05-15T19:07:50Z 2025-05-02T22:00:00Z

Alibaba recently released Tongyi Qwen3, a family of open-source hybrid-reasoning large language models (LLMs). The Qwen3 family consists of two MoE models, 235B-A22B (235B total parameters and 22B active parameters) and 30B-A3B, and six dense models: 0.6B, 1.7B, 4B, 8B, 14B, and 32B. With ultra-fast token generation, developers can efficiently integrate and deploy Qwen3…

]]> Brad Nemire <![CDATA[HackAI Challenge Winners Announced]]> http://www.open-lab.net/blog/?p=99563 2025-05-15T19:08:27Z 2025-05-02T16:31:11Z

Explore the groundbreaking projects and real-world impacts of the HackAI Challenge powered by NVIDIA AI Workbench and Dell Precision.

]]> Babak Hejazi <![CDATA[Boosting Matrix Multiplication Speed and Flexibility with NVIDIA cuBLAS 12.9]]> http://www.open-lab.net/blog/?p=99184 2025-05-15T19:08:28Z 2025-05-01T20:00:00Z

The NVIDIA CUDA-X math libraries empower developers to build accelerated applications for AI, scientific computing, data processing, and more. Two of the most important applications of CUDA-X libraries are LLM training and inference, whether for use in everyday consumer applications or highly specialized scientific domains like drug discovery. Multiple CUDA-X libraries are indispensable…

]]> Jonathan Bikoff <![CDATA[Spotlight: Personal AI Brings AI Receptionists to Small Business Owners with NVIDIA Riva]]> http://www.open-lab.net/blog/?p=99402 2025-05-16T23:50:38Z 2025-04-29T22:44:07Z

It’s 10 p.m. on a Tuesday when the phone rings at the Sapochnick Law Firm, a specialized law practice in San Diego, California. The caller, a client of the firm, is anxious as the phone rings. They have received an important letter containing potentially life-changing news, and have urgent questions for their lawyer. The client quickly realizes the Sapochnick team likely left the office hours ago…

]]> 1 Joseph Lucas <![CDATA[Structuring Applications to Secure the KV Cache]]> http://www.open-lab.net/blog/?p=99425 2025-05-15T19:08:32Z 2025-04-29T22:43:01Z

When interacting with transformer-based models like large language models (LLMs) and vision-language models (VLMs), the structure of the input shapes the model’s output. But prompts are often more than a simple user query. In practice, they optimize the response by dynamically assembling data from various sources such as system instructions, context data, and user input.
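
The assembly step described above can be illustrated with a minimal Python sketch. It is not taken from the post: the `PromptParts` class, the `assemble_prompt` function, and the `[SYSTEM]`, `[CONTEXT]`, and `[USER]` section labels are hypothetical names chosen for illustration. It only shows one way the three named sources (system instructions, context data, user input) might be joined in a fixed order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptParts:
    """Hypothetical container for the three prompt sources."""
    system_instructions: str
    context_data: List[str]
    user_input: str


def assemble_prompt(parts: PromptParts) -> str:
    """Join the parts in a fixed order: system, then context, then user."""
    context_block = "\n".join(f"- {item}" for item in parts.context_data)
    sections = [
        f"[SYSTEM]\n{parts.system_instructions}",
        f"[CONTEXT]\n{context_block}",
        f"[USER]\n{parts.user_input}",
    ]
    return "\n\n".join(sections)


# Illustrative values only.
example = PromptParts(
    system_instructions="Answer using only the context.",
    context_data=["Office hours are 9 a.m. to 5 p.m."],
    user_input="When does the office open?",
)
prompt = assemble_prompt(example)
print(prompt)
```

Keeping each source in its own labeled section makes it easier to see which part of the final prompt came from which source.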

]]> Sama Bali <![CDATA[Choosing Your First Local AI Project?]]> http://www.open-lab.net/blog/?p=99361 2025-05-15T19:08:34Z 2025-04-29T17:00:00Z

AI is rapidly moving beyond centralized cloud and data centers, becoming a powerful tool deployable directly on professional workstations. Thanks to advanced hardware and optimized software, you can build, run, and experiment with sophisticated AI models at your desk or on the go. Welcome to the world of local AI development! Running and developing AI locally on a workstation offers…

]]> Meenakshi Kaushik <![CDATA[NVIDIA NIM Operator 2.0 Boosts AI Deployment with NVIDIA NeMo Microservices Support]]> http://www.open-lab.net/blog/?p=99309 2025-05-15T19:08:34Z 2025-04-29T16:00:00Z

The first release of NVIDIA NIM Operator simplified the deployment and lifecycle management of inference pipelines for NVIDIA NIM microservices, reducing the workload for MLOps, LLMOps engineers, and Kubernetes admins. It enabled easy and fast deployment, auto-scaling, and upgrading of NIM on Kubernetes clusters. Learn more about the first release. Our customers and partners have been using…

]]> Hsin Chen <![CDATA[Advancing Cybersecurity Operations with Agentic AI Systems]]> http://www.open-lab.net/blog/?p=99329 2025-05-15T19:08:37Z 2025-04-28T19:52:53Z

The age of passive AI is over. A new era is beginning, where AI doesn’t just respond—it thinks, plans, and acts. The rapid advancement of large language models (LLMs) has unlocked the potential of agentic AI systems, enabling the automation of tedious tasks across many fields, including cybersecurity. Traditionally, AI applications in cybersecurity have focused primarily on detecting…

]]> Davide Paglieri <![CDATA[Benchmarking Agentic LLM and VLM Reasoning for Gaming with NVIDIA NIM]]> http://www.open-lab.net/blog/?p=99202 2025-05-15T19:08:40Z 2025-04-24T17:00:00Z

Researchers from the University College London (UCL) Deciding, Acting, and Reasoning with Knowledge (DARK) Lab leverage NVIDIA NIM microservices in their new game-based benchmark suite, Benchmarking Agentic LLM and VLM Reasoning On Games…

]]> Amit Bleiweiss <![CDATA[Spotlight: Qodo Innovates Efficient Code Search with NVIDIA DGX]]> http://www.open-lab.net/blog/?p=99041 2025-05-15T19:08:41Z 2025-04-23T22:23:32Z

Large language models (LLMs) have enabled AI tools that help you write more code faster, but as we ask these tools to take on more and more complex tasks, there are limitations that become apparent. Challenges such as understanding the nuances of programming languages, complex dependencies, and adapting to codebase-specific context can lead to lower-quality code and cause bottlenecks down the line.

]]> Shashank Verma <![CDATA[Enhance Your AI Agent with Data Flywheels Using NVIDIA NeMo Microservices]]> http://www.open-lab.net/blog/?p=98721 2025-05-15T19:08:45Z 2025-04-23T13:00:00Z

Enterprise data is constantly changing. This presents significant challenges for maintaining AI system accuracy over time. As organizations increasingly rely on agentic AI systems to optimize business processes, keeping these systems aligned with evolving business needs and new data becomes crucial. This post dives into how to build an iteration of a data flywheel using NVIDIA NeMo…

]]> Brad Nemire <![CDATA[NVIDIA GTC Training Labs Now Available On Demand]]> http://www.open-lab.net/blog/?p=99074 2025-05-15T19:08:47Z 2025-04-22T17:26:28Z

Missed GTC? This year’s training labs are now available on demand to watch anywhere, anytime.

]]> Maximilian Müller <![CDATA[Optimizing Transformer-Based Diffusion Models for Video Generation with NVIDIA TensorRT]]> http://www.open-lab.net/blog/?p=98927 2025-05-15T19:08:48Z 2025-04-21T18:44:38Z

State-of-the-art image diffusion models take tens of seconds to process a single image. This makes video diffusion even more challenging, requiring significant computational resources and high costs. By leveraging the latest FP8 quantization features on NVIDIA Hopper GPUs with NVIDIA TensorRT, it’s possible to significantly reduce inference costs and serve more users with fewer GPUs.

]]> Elias Wolfberg <![CDATA[AI Inspires Artists and Industrialists to Reimagine their Crafts]]> http://www.open-lab.net/blog/?p=99010 2025-05-15T19:08:50Z 2025-04-21T18:14:32Z

AI has become nearly synonymous with innovation. As it rushes onto the world stage, AI is seeding inspiration in creators and problem-solvers of all stripes—from artists to more traditional industrial inventors. One of the world’s leading AI-first artists, Alexander Reben, has spent his career integrating AI into different artistic mediums. His current work explores AI and robotics and…

]]> Bartley Richardson https://www.linkedin.com/in/bartleyrichardson/%20 <![CDATA[Upcoming Event: NVIDIA Agent Toolkit Hackathon]]> http://www.open-lab.net/blog/?p=98965 2025-05-15T19:08:50Z 2025-04-18T17:06:38Z

Build a high-performance agentic AI system using the open-source NVIDIA Agent Intelligence toolkit — contest runs May 12 to May 23.

]]> Daniel Rodriguez <![CDATA[Announcing ComputeEval, an Open-Source Framework for Evaluating LLMs on CUDA]]> http://www.open-lab.net/blog/?p=98885 2025-05-15T19:08:55Z 2025-04-16T16:48:07Z

Large language models (LLMs) are revolutionizing how developers code and how they learn to code. For seasoned and junior developers alike, today’s state-of-the-art models can generate Python scripts, React-based websites, and more. In the future, powerful AI models will assist developers in writing high-performance GPU code. This raises an important question: How can it be determined whether an LLM…

]]> Ziyue Xu <![CDATA[Efficient Federated Learning in the Era of LLMs with Message Quantization and Streaming]]> http://www.open-lab.net/blog/?p=98553 2025-05-01T18:35:52Z 2025-04-16T16:00:00Z

Federated learning (FL) has emerged as a promising approach for training machine learning models across distributed data sources while preserving data privacy. However, FL faces significant challenges related to communication overhead and local resource constraints when balancing model requirements and communication capabilities. Particularly in the current era of large language models…

]]> 1 Nirmal Kumar Juluru <![CDATA[NVIDIA Llama Nemotron Ultra Open Model Delivers Groundbreaking Reasoning Accuracy]]> http://www.open-lab.net/blog/?p=98855 2025-05-05T22:33:12Z 2025-04-15T18:00:00Z

AI is no longer just about generating text or images—it’s about deep reasoning, detailed problem-solving, and powerful adaptability for real-world applications in business and in financial, customer, and healthcare services. Available today, the latest Llama Nemotron Ultra reasoning model from NVIDIA delivers leading accuracy among open-source models across intelligence and coding benchmarks…

]]> Tanya Lenz <![CDATA[Event: Data Filtering Challenge for Training Edge Language Models]]> http://www.open-lab.net/blog/?p=98542 2025-05-01T18:35:54Z 2025-04-15T15:00:00Z

You’re invited to join the challenge. Develop and apply innovative data filtering techniques to curate datasets that enhance edge LM performance.

]]> Chintan Patel <![CDATA[Just Released: NVIDIA Llama Nemotron Ultra as NVIDIA NIM]]> http://www.open-lab.net/blog/?p=98656 2025-04-17T19:49:32Z 2025-04-10T18:59:20Z

Try NVIDIA Llama Nemotron Ultra as an NVIDIA NIM microservice. At only 253B total parameters, it offers reasoning performance that meets or beats top open reasoning models like DeepSeek-R1 while offering considerably higher throughput due to its optimized sizing, and retaining excellent tool calling capabilities.

]]> Shai Shen-Orr <![CDATA[Curating Biological Findings from Scientific Literature with NVIDIA NIM]]> http://www.open-lab.net/blog/?p=98526 2025-04-28T23:18:36Z 2025-04-10T18:30:00Z

Scientific papers are highly heterogeneous, often employing diverse terminologies for the same entities, using varied methodologies to study biological phenomena, and presenting findings within distinct contexts. Extracting meaningful insights from these papers requires a profound understanding of biology, a critical evaluation of methodologies, and the ability to discern robust findings from…

]]> Tyler Whitehouse <![CDATA[Just Released: NVIDIA AI Workbench 2025.03.10]]> http://www.open-lab.net/blog/?p=98549 2025-04-17T19:35:34Z 2025-04-09T18:45:41Z

NVIDIA AI Workbench 2025.03.10 features streamlined onboarding and enhanced UX for multicontainer projects.

]]> Chris Alexiuk <![CDATA[Build Enterprise AI Agents with Advanced Open NVIDIA Llama Nemotron Reasoning Models]]> http://www.open-lab.net/blog/?p=97155 2025-05-05T16:01:49Z 2025-04-08T22:05:00Z

This updated post was originally published on March 18, 2025. Organizations are embracing AI agents to enhance productivity and streamline operations. To maximize their impact, these agents need strong reasoning abilities to navigate complex problems, uncover hidden connections, and make logical decisions autonomously in dynamic environments. Due to their ability to tackle complex…

]]> Vinay Raman <![CDATA[Evaluating and Enhancing RAG Pipeline Performance Using Synthetic Data]]> http://www.open-lab.net/blog/?p=97927 2025-05-15T06:26:42Z 2025-04-07T18:39:06Z

As large language models (LLMs) gain popularity in various question-answering systems, retrieval-augmented generation (RAG) pipelines have also become a focal point. RAG pipelines combine the generation power of LLMs with external data sources and retrieval mechanisms, enabling models to access domain-specific information that may not have existed during fine-tuning.
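
The retrieval half of a RAG pipeline can be illustrated with a toy Python sketch. It assumes a simple bag-of-words cosine similarity in place of a real embedding model, and the function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) and sample documents are hypothetical. It is not the pipeline evaluated in the post.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> Counter:
    """Lowercase the text and count word occurrences."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    q = tokenize(query)
    ranked = sorted(documents, key=lambda d: cosine(q, tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 1) -> str:
    """Place the retrieved text ahead of the question for the generator."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Illustrative documents only.
docs = [
    "The warranty covers battery defects for two years.",
    "Shipping takes five business days.",
]
top = retrieve("How long is the battery warranty?", docs)
rag_prompt = build_prompt("How long is the battery warranty?", docs)
print(rag_prompt)
```

In a real pipeline, the scoring function would typically be swapped for an embedding model and a vector index, and the resulting prompt would be sent to an LLM.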

]]> Elias Wolfberg <![CDATA[Startups Use AI to Deliver Better Maternal and Newborn Care]]> http://www.open-lab.net/blog/?p=98486 2025-04-22T23:55:26Z 2025-04-07T17:55:39Z

Nearly 300,000 women across the globe die each year due to complications arising from pregnancy or childbirth. The number of stillbirths and babies who die within their first month is nearly 4M every year. April 7 marks World Health Day, which this year focuses on raising awareness about efforts to end preventable maternal and newborn deaths. Giving women and infants better access to…

]]> Sama Bali <![CDATA[Event: HP & NVIDIA Developer Challenge]]> http://www.open-lab.net/blog/?p=98487 2025-04-17T19:35:39Z 2025-04-07T17:54:00Z

Join the hackathon to build open-source AI solutions, optimize models, enhance workflows, connect with peers, and win prizes.

]]> Anu Srivastava <![CDATA[NVIDIA Accelerates Inference on Meta Llama 4 Scout and Maverick]]> http://www.open-lab.net/blog/?p=98468 2025-04-22T23:57:03Z 2025-04-06T02:18:34Z

The newest generation of the popular Llama AI models is here with Llama 4 Scout and Llama 4 Maverick. Accelerated by NVIDIA open-source software, they can achieve over 40K output tokens per second on NVIDIA Blackwell B200 GPUs, and are available to try as NVIDIA NIM microservices. The Llama 4 models are now natively multimodal and multilingual using a mixture-of-experts (MoE) architecture.

]]> 1 Ashraf Eassa <![CDATA[NVIDIA Blackwell Delivers Massive Performance Leaps in MLPerf Inference v5.0]]> http://www.open-lab.net/blog/?p=98367 2025-04-23T19:41:12Z 2025-04-02T18:14:48Z

The compute demands for large language model (LLM) inference are growing rapidly, fueled by the combination of growing model sizes, real-time latency requirements, and, most recently, AI reasoning. At the same time, as AI adoption grows, the ability of an AI factory to serve as many users as possible, all while maintaining good per-user experiences, is key to maximizing the value it generates.

]]> Vinh Nguyen <![CDATA[LLM Inference Benchmarking: Fundamental Concepts]]> http://www.open-lab.net/blog/?p=98215 2025-05-09T18:23:04Z 2025-04-02T17:00:00Z

This is the first post in the large language model latency-throughput benchmarking series, which aims to instruct developers on common metrics used for LLM benchmarking, fundamental concepts, and how to benchmark your LLM applications. The past few years have witnessed the rise in popularity of generative AI and large language models (LLMs), as part of a broad AI revolution.
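
As a rough illustration of latency and throughput measurement, the Python sketch below derives a few commonly reported per-request numbers from token arrival timestamps. The function name, the dictionary keys, and the exact definitions used here are assumptions for illustration. Benchmarking tools differ in how they define these metrics, and the post's own definitions are not reproduced here.

```python
from typing import Dict, List


def summarize_request(start: float, token_times: List[float]) -> Dict[str, float]:
    """Compute simple latency and throughput figures for one request.

    start: time the request was sent (seconds).
    token_times: arrival time of each output token (seconds, ascending).
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    n = len(token_times)
    time_to_first_token = token_times[0] - start
    end_to_end_latency = token_times[-1] - start
    # Mean gap between consecutive tokens after the first one.
    if n > 1:
        inter_token_latency = (token_times[-1] - token_times[0]) / (n - 1)
    else:
        inter_token_latency = 0.0
    if end_to_end_latency > 0:
        tokens_per_second = n / end_to_end_latency
    else:
        tokens_per_second = float("inf")
    return {
        "time_to_first_token": time_to_first_token,
        "end_to_end_latency": end_to_end_latency,
        "inter_token_latency": inter_token_latency,
        "tokens_per_second": tokens_per_second,
    }


# Illustrative timestamps only: first token at 0.5 s, then one every 0.25 s.
stats = summarize_request(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(stats)
```

Latency figures describe the experience of a single user, while tokens per second describes how much work the system completes, which is why the two are usually reported together.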

]]> Arun Raman <![CDATA[Deploying the NVIDIA AI Blueprint for Cost-Efficient LLM Routing]]> http://www.open-lab.net/blog/?p=98006 2025-04-23T00:01:08Z 2025-03-26T22:01:20Z

Since the release of ChatGPT in November 2022, the capabilities of large language models (LLMs) have surged, and the number of available models has grown exponentially. With this expansion, LLMs now vary widely in cost, performance, and specialization. For example, straightforward tasks like text summarization can be efficiently handled by smaller, general-purpose models. In contrast…

]]> Cole Swain <![CDATA[Spotlight: Tomorrow.io Transforms Global Weather Resilience with NVIDIA AI]]> http://www.open-lab.net/blog/?p=98023 2025-04-03T18:46:17Z 2025-03-26T21:19:34Z

From hyperlocal forecasts that guide daily operations to planet-scale models illuminating new climate insights, the world is entering a new frontier in weather and climate resilience. The combination of space-based observations and GPU-accelerated AI delivers near-instant, context-rich insights to enterprises, governments, researchers, and solution providers worldwide. It also marks a rare…

]]> Wen Jie Ong <![CDATA[Accelerating the Future of Transportation with SES AI’s NVIDIA-Powered Innovation for Electric Vehicles]]> http://www.open-lab.net/blog/?p=97805 2025-04-23T00:04:13Z 2025-03-25T16:00:00Z

Electric vehicles (EVs) are transforming transportation, but challenges such as cost, longevity, and range remain barriers to widespread adoption. At the heart of these challenges lies battery technology—specifically, the electrolyte, a critical component that enables energy storage and delivery. The electrolyte’s properties directly impact a battery’s charging speed, power output, stability…

]]> 1 Annamalai Chockalingam <![CDATA[Kickstart Your AI Journey on RTX AI PCs and Workstations with NVIDIA NIM Microservices]]> http://www.open-lab.net/blog/?p=97991 2025-04-03T18:47:34Z 2025-03-25T13:00:00Z

With emerging use cases such as digital humans, agents, podcasts, images, and video generation, generative AI is changing the way we interact with PCs. This paradigm shift calls for new ways of interfacing with and programming generative AI models. However, getting started can be daunting for PC developers and AI enthusiasts. Today, NVIDIA released a suite of NVIDIA NIM microservices on…

]]> Uttara Kumar <![CDATA[Boost Llama Model Performance on Microsoft Azure AI Foundry with NVIDIA TensorRT-LLM]]> http://www.open-lab.net/blog/?p=97008 2025-04-23T00:07:01Z 2025-03-20T15:00:00Z

Microsoft, in collaboration with NVIDIA, announced transformative performance improvements for the Meta Llama family of models on its Azure AI Foundry platform. These advancements, enabled by NVIDIA TensorRT-LLM optimizations, deliver significant gains in throughput, reduced latency, and improved cost efficiency, all while preserving the quality of model outputs. With these improvements…

]]> Dave Salvator <![CDATA[NVIDIA Blackwell Ultra for the Era of AI Reasoning]]> http://www.open-lab.net/blog/?p=96761 2025-03-20T22:34:30Z 2025-03-19T18:00:15Z

For years, advancements in AI have followed a clear trajectory through pretraining scaling: larger models, more data, and greater computational resources lead to breakthrough capabilities. In the last 5 years, pretraining scaling has increased compute requirements by an incredible 50 million times. However, building more intelligent systems is no longer just about pretraining bigger models.

]]> Jonathan Ferrer Mestres <![CDATA[NVIDIA Earth-2 Powers Regional AI Weather Forecasting in the United Arab Emirates]]> http://www.open-lab.net/blog/?p=97074 2025-04-23T00:27:21Z 2025-03-19T16:01:00Z

In the United Arab Emirates (UAE), extreme weather events disrupt daily life, delaying flights, endangering transportation, and complicating urban planning....]]>

In the United Arab Emirates (UAE), extreme weather events disrupt daily life, delaying flights, endangering transportation, and complicating urban planning. High daytime temperatures limit human activity outdoors, while dense nighttime fog is a frequent cause of severe and often fatal car crashes. Meanwhile, 2024 saw the heaviest precipitation event in the country in 75 years…

]]> Michael Zephyr <![CDATA[MONAI Integrates Advanced Agentic Architectures to Establish Multimodal Medical AI Ecosystem]]> http://www.open-lab.net/blog/?p=97638 2025-04-23T00:26:59Z 2025-03-19T16:00:00Z

The growing volume and complexity of medical data, and the pressing need for early disease diagnosis and improved healthcare efficiency, are driving...]]>

The growing volume and complexity of medical data—and the pressing need for early disease diagnosis and improved healthcare efficiency—are driving unprecedented advancements in medical AI. Among the most transformative innovations in this field are multimodal AI models that simultaneously process text, images, and video. These models offer a more comprehensive understanding of patient data than…

]]> Kyle Tretina <![CDATA[Guiding Generative Molecular Design with Experimental Feedback Using Oracles]]> http://www.open-lab.net/blog/?p=96966 2025-03-25T17:23:57Z 2025-03-19T15:00:00Z

Generative chemistry with AI has the potential to revolutionize how scientists approach drug discovery and development, health, and materials science and...]]>

Generative chemistry with AI has the potential to revolutionize how scientists approach drug discovery and development, health, and materials science and engineering. Instead of manually designing molecules with “chemical intuition” or screening millions of existing chemicals, researchers can train neural networks to propose novel molecular structures tailored to the desired properties.

]]> TJ Chen <![CDATA[Shrink Genomics and Single-Cell Analysis Time to Minutes with NVIDIA Parabricks and NVIDIA AI Blueprints]]> http://www.open-lab.net/blog/?p=96979 2025-03-20T18:33:12Z 2025-03-19T15:00:00Z

NVIDIA Parabricks is a scalable genomics analysis software suite that solves omics challenges with accelerated computing and deep learning to unlock new...]]>

NVIDIA Parabricks is a scalable genomics analysis software suite that solves omics challenges with accelerated computing and deep learning to unlock new scientific breakthroughs. Released at NVIDIA GTC 2025, NVIDIA Parabricks v4.5 supports the growing quantity of data by including support for the latest NVIDIA GPU architectures, and improved alignment and variant calling with the…

]]> Hao Wang <![CDATA[Petabyte-Scale Video Processing with NVIDIA NeMo Curator on NVIDIA DGX Cloud]]> http://www.open-lab.net/blog/?p=97031 2025-03-20T17:07:03Z 2025-03-18T19:22:51Z

With the rise of physical AI, video content generation has surged exponentially. A single camera-equipped autonomous vehicle can generate more than 1 TB of...]]>

With the rise of physical AI, video content generation has surged exponentially. A single camera-equipped autonomous vehicle can generate more than 1 TB of video daily, while a robotics-powered manufacturing facility may produce 1 PB of data daily. To leverage this data for training and fine-tuning world foundation models (WFMs), you must first process it efficiently.

]]> 3 Ruchika Kharwar <![CDATA[NVIDIA NeMo Retriever Delivers Accurate Multimodal PDF Data Extraction 15x Faster]]> http://www.open-lab.net/blog/?p=97161 2025-04-23T00:13:16Z 2025-03-18T19:20:51Z

Enterprises are generating and storing more multimodal data than ever before, yet traditional retrieval systems remain largely text-focused. While they can...]]>

Enterprises are generating and storing more multimodal data than ever before, yet traditional retrieval systems remain largely text-focused. While they can surface insights from written content, they aren’t extracting critical information embedded in tables, charts, and infographics—often the most information-dense elements of a document. Without a multimodal retrieval system…

]]> Christian Munley <![CDATA[Improve AI Code Generation Using NVIDIA Agent Intelligence Toolkit]]> http://www.open-lab.net/blog/?p=96937 2025-04-23T00:14:38Z 2025-03-18T19:07:50Z

With the release of NVIDIA Agent Intelligence toolkit (an open-source library for connecting and optimizing teams of AI agents), developers, professionals, and...]]>

With the release of NVIDIA Agent Intelligence toolkit—an open-source library for connecting and optimizing teams of AI agents—developers, professionals, and researchers can create their own agentic AI applications. This tutorial shows you how to develop apps in the Agent Intelligence toolkit through an example of AI code generation. We build a test-driven coding agent using LangGraph and reasoning…

]]> 1 Sylendran Arunagiri <![CDATA[Maximize AI Agent Performance with Data Flywheels Using NVIDIA NeMo Microservices]]> http://www.open-lab.net/blog/?p=97046 2025-04-23T00:15:03Z 2025-03-18T19:05:30Z

As agentic AI systems evolve and become essential for optimizing business processes, it is crucial for developers to update them regularly to stay aligned with...]]>

As agentic AI systems evolve and become essential for optimizing business processes, it is crucial for developers to update them regularly to stay aligned with ever-changing business and user needs. Continuously refining these agents with AI and human feedback ensures that they remain effective and relevant. NVIDIA NeMo microservices is a fully accelerated, enterprise-grade solution designed…

]]> Amr Elmeleegy <![CDATA[Introducing NVIDIA Dynamo, A Low-Latency Distributed Inference Framework for Scaling Reasoning AI Models]]> http://www.open-lab.net/blog/?p=95274 2025-04-23T00:15:55Z 2025-03-18T17:50:00Z

NVIDIA announced the release of NVIDIA Dynamo today at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency open-source inference serving framework for...]]>

NVIDIA announced the release of NVIDIA Dynamo today at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency open-source inference serving framework for deploying generative AI and reasoning models in large-scale distributed environments. The framework boosts the number of requests served by up to 30x when running the open-source DeepSeek-R1 models on NVIDIA Blackwell.

]]> 1 Ashraf Eassa <![CDATA[NVIDIA Blackwell Delivers World-Record DeepSeek-R1 Inference Performance]]> http://www.open-lab.net/blog/?p=97352 2025-04-23T00:23:25Z 2025-03-18T17:41:42Z

NVIDIA announced world-record DeepSeek-R1 inference performance at NVIDIA GTC 2025. A single NVIDIA DGX system with eight NVIDIA Blackwell GPUs can achieve over...]]>

NVIDIA announced world-record DeepSeek-R1 inference performance at NVIDIA GTC 2025. A single NVIDIA DGX system with eight NVIDIA Blackwell GPUs can achieve over 250 tokens per second per user or a maximum throughput of over 30,000 tokens per second on the massive, state-of-the-art 671-billion-parameter DeepSeek-R1 model. These rapid advancements in performance at both ends of the performance…

]]> 1 Pranjali Joshi <![CDATA[Scale Synthetic Data and Physical AI Reasoning with NVIDIA Cosmos World Foundation Models]]> http://www.open-lab.net/blog/?p=97132 2025-04-23T00:31:38Z 2025-03-18T16:00:47Z

The next generation of AI-driven robots like humanoids and autonomous vehicles depends on high-fidelity, physics-aware training data. Without diverse and...]]>

The next generation of AI-driven robots like humanoids and autonomous vehicles depends on high-fidelity, physics-aware training data. Without diverse and representative datasets, these systems don’t get proper training and face testing risks due to poor generalization, limited exposure to real-world variations, and unpredictable behavior in edge cases. Collecting massive real-world datasets for…

]]> Allyson Vasquez <![CDATA[NVIDIA RTX Advances with Neural Rendering and Digital Human Technologies at GDC 2025]]> http://www.open-lab.net/blog/?p=97390 2025-04-23T20:53:36Z 2025-03-18T00:00:00Z

AI is transforming how we experience our favorite games. It is unlocking new levels of visuals, performance, and gameplay possibilities with neural rendering...]]>

AI is transforming how we experience our favorite games. It is unlocking new levels of visuals, performance, and gameplay possibilities with neural rendering and generative AI-powered characters. With game development becoming more complex, AI is also playing a role in helping artists and engineers realize their creative visions. At GDC 2025, NVIDIA is building upon NVIDIA RTX Kit…

]]> 4 Anu Srivastava <![CDATA[Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance]]> http://www.open-lab.net/blog/?p=96770 2025-04-23T00:33:31Z 2025-03-12T08:45:00Z

Building AI systems with foundation models requires a delicate balancing of resources such as memory, latency, storage, compute, and more. One size does not fit...]]>

Building AI systems with foundation models requires a delicate balancing of resources such as memory, latency, storage, compute, and more. One size does not fit all for developers managing cost and user experience when bringing generative AI capability to the rapidly growing ecosystem of AI-powered applications. You need options for high-quality, customizable models that can support large…

]]> Shubham Agrawal <![CDATA[Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization]]> http://www.open-lab.net/blog/?p=96842 2025-03-12T22:08:59Z 2025-03-11T17:30:00Z

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of...]]>

With the recent advancements in generative AI and vision foundational models, VLMs present a new wave of visual computing wherein the models are capable of highly sophisticated perception and deep contextual understanding. These intelligent solutions offer a promising means of enhancing semantic comprehension in XR settings. By integrating VLMs, developers can significantly improve how XR…

]]> Chen Fu <![CDATA[Streamline LLM Deployment for Autonomous Vehicle Applications with NVIDIA DriveOS LLM SDK]]> http://www.open-lab.net/blog/?p=96776 2025-03-07T20:13:46Z 2025-03-10T19:30:00Z

Large language models (LLMs) have shown remarkable generalization capabilities in natural language processing (NLP). They are used in a wide range of...]]>

Large language models (LLMs) have shown remarkable generalization capabilities in natural language processing (NLP). They are used in a wide range of applications, including translation, digital assistants, recommendation systems, context analysis, code generation, cybersecurity, and more. In automotive applications, there is growing demand for LLM-based solutions for both autonomous driving and…

]]> 2 Shelby Thomas <![CDATA[Ensuring Reliable Model Training on NVIDIA DGX Cloud]]> http://www.open-lab.net/blog/?p=96789 2025-03-24T18:36:43Z 2025-03-10T16:26:44Z

Training AI models on massive GPU clusters presents significant challenges for model builders. Because manual intervention becomes impractical as job scale...]]>

Training AI models on massive GPU clusters presents significant challenges for model builders. Because manual intervention becomes impractical as job scale increases, automation is critical to maintaining high GPU utilization and training productivity. An exceptional training experience requires resilient systems that provide low-latency error attribution and automatic failover based on root…

]]> Michelle Horton <![CDATA[Top Agentic AI Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96836 2025-03-07T00:33:46Z 2025-03-07T00:33:44Z

Learn from and connect with leading AI developers building the next generation of AI agents.]]>

Learn from and connect with leading AI developers building the next generation of AI agents.

]]> Tanay Varshney <![CDATA[How Using a Reranking Microservice Can Improve Accuracy and Costs of Information Retrieval]]> http://www.open-lab.net/blog/?p=96363 2025-03-06T20:05:47Z 2025-03-06T18:33:38Z

Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents,...]]>

Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents, and AI assistants. These systems demand retrieval processes that are accurate and computationally efficient to deliver precise insights, enhance user experiences, and maintain scalability. Retrieval-augmented generation (RAG) is used to…

]]> Michelle Horton <![CDATA[Top Physical AI and Robotics Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96765 2025-03-06T19:26:33Z 2025-03-06T00:59:23Z

Join these sessions to learn how accelerated computing, generative AI, and physics-based world simulation are advancing physical and embodied AI.]]>

Join these sessions to learn how accelerated computing, generative AI, and physics-based world simulation are advancing physical and embodied AI.

]]> Michelle Horton <![CDATA[Top Generative AI Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96689 2025-03-06T19:51:57Z 2025-03-03T23:45:42Z

Discover cutting-edge AI and data science innovations from top generative AI teams at NVIDIA GTC 2025.]]>

Discover cutting-edge AI and data science innovations from top generative AI teams at NVIDIA GTC 2025.

]]> Aditi Bodhankar <![CDATA[Measuring the Effectiveness and Performance of AI Guardrails in Generative AI Applications]]> http://www.open-lab.net/blog/?p=96562 2025-04-23T02:40:19Z 2025-03-03T17:22:09Z

Safeguarding AI agents and other conversational AI applications to ensure safe, on-brand and reliable behavior is essential for enterprises. NVIDIA NeMo...]]>

Safeguarding AI agents and other conversational AI applications to ensure safe, on-brand and reliable behavior is essential for enterprises. NVIDIA NeMo Guardrails offers robust protection with AI guardrails for content safety, topic control, jailbreak detection, and more to evaluate and optimize guardrail performance. In this post, we explore techniques for measuring and optimizing your AI…

]]> 1 Mehran Maghoumi <![CDATA[Build an AI Agent with Expert Reasoning Capabilities Using the DeepSeek-R1 NIM]]> http://www.open-lab.net/blog/?p=96030 2025-03-06T19:52:48Z 2025-02-28T20:23:51Z

AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on...]]>

AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on expert reasoning, enabling smarter planning and efficient execution. Agentic AI applications could benefit from the capabilities of models such as DeepSeek-R1. Built for solving problems that require advanced AI reasoning…

]]> Sangjune Park <![CDATA[Spotlight: NAVER Place Optimizes SLM-Based Vertical Services with NVIDIA TensorRT-LLM]]> http://www.open-lab.net/blog/?p=96279 2025-04-23T02:32:43Z 2025-02-28T17:57:49Z

NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of...]]>

As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. NAVER is a popular South Korean search engine company that offers Naver Place, a geo-based service that provides detailed information about millions of businesses and points of interest across Korea. Users can search for different places, leave reviews, and place bookings or orders in real time.

]]> Anu Srivastava <![CDATA[Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs]]> http://www.open-lab.net/blog/?p=96519 2025-04-23T02:39:30Z 2025-02-26T22:05:00Z

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size, they are not practical...]]>

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, due to their massive size, they are not practical for the current resource constraints that many companies have. The rise of small language models (SLMs) bridges quality and cost by creating models with a smaller resource footprint. SLMs are a subset of language models that tend to…

]]> Yifan Wu <![CDATA[Accelerating Scientific Literature Reviews with NVIDIA NIM Microservices for LLMs]]> http://www.open-lab.net/blog/?p=96324 2025-04-23T02:38:59Z 2025-02-26T17:00:00Z

A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to this field, it provides a...]]>

A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to this field, it provides a structured overview of the domain. For experts, it refines their understanding and sparks new ideas. In 2024 alone, 218,650 review articles were indexed in the Web of Science database, highlighting the importance of these resources in research.

]]> Francesco Ciannella <![CDATA[Building a Simple VLM-Based Multimodal Information Retrieval System with NVIDIA NIM]]> http://www.open-lab.net/blog/?p=96151 2025-03-06T19:26:45Z 2025-02-26T17:00:00Z

In today’s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined,...]]>

In today’s data-driven world, the ability to retrieve accurate information from even modest amounts of data is vital for developers seeking streamlined, effective solutions for quick deployments, prototyping, or experimentation. One of the key challenges in information retrieval is managing the diverse modalities in unstructured datasets, including text, PDFs, images, tables, audio, video…

]]> 1 Shubham Agrawal <![CDATA[Vision Language Model Prompt Engineering Guide for Image and Video Understanding]]> http://www.open-lab.net/blog/?p=96229 2025-04-23T02:38:32Z 2025-02-26T16:25:34Z

Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual...]]>

Vision language models (VLMs) are evolving at a breakneck speed. In 2020, the first VLMs revolutionized the generative AI landscape by bringing visual understanding to large language models (LLMs) through the use of a vision encoder. These initial VLMs were limited in their abilities, only able to understand text and single image inputs. Fast-forward a few years and VLMs are now capable of…

]]> Mark Ren <![CDATA[Configurable Graph-Based Task Solving with the Marco Multi-AI Agent Framework for Chip Design]]> http://www.open-lab.net/blog/?p=96209 2025-04-23T02:38:21Z 2025-02-25T22:17:28Z

Chip and hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turn-around...]]>

Chip and hardware design presents numerous challenges stemming from its complexity and advancing technologies. These challenges result in longer turn-around time (TAT) for optimizing performance, power, area, and cost (PPAC) during synthesis, verification, physical design, and reliability loops. Large language models (LLMs) have shown a remarkable capacity to comprehend and generate natural…

]]> Leon Derczynski <![CDATA[Defining LLM Red Teaming]]> http://www.open-lab.net/blog/?p=96239 2025-04-23T02:37:15Z 2025-02-25T18:49:26Z

There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to...]]>

There is an activity where people provide inputs to generative AI technologies, such as large language models (LLMs), to see if the outputs can be made to deviate from acceptable standards. This use of LLMs began in 2023 and has rapidly evolved to become a common industry practice and a cornerstone of trustworthy AI. How can we standardize and define LLM red teaming?

]]> Rich Harang <![CDATA[Agentic Autonomy Levels and Security]]> http://www.open-lab.net/blog/?p=96341 2025-04-23T02:36:53Z 2025-02-25T18:45:05Z

Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable...]]>

Agentic workflows are the next evolution in AI-powered tools. They enable developers to chain multiple AI models together to perform complex activities, enable AI models to use tools to access additional data or automate user actions, and enable AI models to operate autonomously, analyzing and performing complex tasks with a minimum of human involvement or interaction. Because of their power…

]]> Joe Bungo <![CDATA[NVIDIA Deep Learning Institute Releases New Generative AI Teaching Kit]]> http://www.open-lab.net/blog/?p=88388 2025-04-23T02:35:50Z 2025-02-25T17:47:49Z

Generative AI, powered by advanced machine learning models and deep neural networks, is revolutionizing industries by generating novel content and driving...]]>

Generative AI, powered by advanced machine learning models and deep neural networks, is revolutionizing industries by generating novel content and driving innovation in fields like healthcare, finance, and entertainment. NVIDIA is leading this transformation with its cutting-edge GPU architectures and software ecosystems, such as the H100 Tensor Core GPU and CUDA platform…

]]> 6 Charu Chaubal <![CDATA[NVIDIA AI Enterprise Adds Support for NVIDIA H200 NVL]]> http://www.open-lab.net/blog/?p=96424 2025-04-23T02:34:39Z 2025-02-24T22:37:47Z

NVIDIA AI Enterprise is the cloud-native software platform for the development and deployment of production-grade AI solutions. The latest release of the NVIDIA...]]>

NVIDIA AI Enterprise is the cloud-native software platform for the development and deployment of production-grade AI solutions. The latest release of the NVIDIA AI Enterprise infrastructure software collection adds support for the latest NVIDIA data center GPU, NVIDIA H200 NVL, giving your enterprise new options for powering cutting-edge use cases such as agentic and generative AI with some of the…

]]> Sama Bali <![CDATA[Transforming Product Design Workflows in Manufacturing with Generative AI]]> http://www.open-lab.net/blog/?p=96242 2025-04-23T02:42:26Z 2025-02-20T19:32:11Z

Traditional design and engineering workflows in the manufacturing industry have long been characterized by a sequential, iterative approach that is often...]]>

Traditional design and engineering workflows in the manufacturing industry have long been characterized by a sequential, iterative approach that is often time-consuming and resource intensive. These conventional methods typically involve stages such as requirement gathering, conceptual design, detailed design, analysis, prototyping, and testing, with each phase dependent on the results of previous…

]]> Sven Chilton <![CDATA[Deploying NVIDIA Riva Multilingual ASR with Whisper and Canary Architectures While Selectively Deactivating NMT]]> http://www.open-lab.net/blog/?p=95339 2025-04-23T02:42:38Z 2025-02-20T18:54:48Z

NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a...]]>

NVIDIA has consistently developed automatic speech recognition (ASR) models that set the benchmark in the industry. Earlier versions of NVIDIA Riva, a collection of GPU-accelerated speech and translation AI microservices for ASR, TTS, and NMT, support English-Spanish and English-Japanese code-switching ASR models based on the Conformer architecture, along with a model supporting multiple…

]]> Tanya Lenz <![CDATA[Upcoming Livestream: Using the NVIDIA AI Blueprint for PDF to Podcast]]> http://www.open-lab.net/blog/?p=96307 2025-02-20T18:11:39Z 2025-02-20T18:11:37Z

Join us on February 27 to learn how to transform PDFs into AI podcasts using the NVIDIA AI Blueprint.]]>

Join us on February 27 to learn how to transform PDFs into AI podcasts using the NVIDIA AI Blueprint.

]]> Allyson Vasquez <![CDATA[Bring NVIDIA ACE AI Characters to Games with the New In-Game Inferencing SDK]]> http://www.open-lab.net/blog/?p=96051 2025-04-23T02:43:15Z 2025-02-20T17:00:00Z

NVIDIA ACE is a suite of digital human technologies that bring game characters and digital assistants to life with generative AI. ACE on-device models enable...]]>

]]> Nitzan Simchi <![CDATA[Spotlight: Drug Discovery Startup Protai Advances Complex Structure Prediction with AlphaFold, Proteomics, and NVIDIA NIM]]> http://www.open-lab.net/blog/?p=96107 2025-04-23T02:44:24Z 2025-02-19T17:30:00Z

Generative AI, especially with breakthroughs like AlphaFold and RosettaFold, is transforming drug discovery and how biotech companies and research laboratories...]]>

Generative AI, especially with breakthroughs like AlphaFold and RosettaFold, is transforming drug discovery and how biotech companies and research laboratories study protein structures, unlocking groundbreaking insights into protein interactions. Proteins are dynamic entities. It has been postulated that a protein’s native state is known by its sequence of amino acids alone…

]]> Kyle Tretina <![CDATA[Understanding the Language of Life’s Biomolecules Across Evolution at a New Scale with Evo 2]]> http://www.open-lab.net/blog/?p=95589 2025-04-23T02:44:28Z 2025-02-19T17:14:51Z

AI has evolved from an experimental curiosity to a driving force within biological research. The convergence of deep learning algorithms, massive omics...]]>

AI has evolved from an experimental curiosity to a driving force within biological research. The convergence of deep learning algorithms, massive omics datasets, and automated laboratory workflows has allowed scientists to tackle problems once thought intractable—from rapid protein structure prediction to generative drug design, increasing the need for AI literacy among scientists.

]]> Brad Nemire <![CDATA[Featured Sessions for Students at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=96181 2025-02-20T15:52:32Z 2025-02-15T02:00:58Z

Learn from researchers, scientists, and industry leaders across a variety of topics including AI, robotics, and data science.]]>

Learn from researchers, scientists, and industry leaders across a variety of topics including AI, robotics, and data science.

]]> Anjali Shah <![CDATA[Optimizing Qwen2.5-Coder Throughput with NVIDIA TensorRT-LLM Lookahead Decoding]]> http://www.open-lab.net/blog/?p=96010 2025-04-23T02:44:36Z 2025-02-14T18:19:37Z

Large language models (LLMs) that specialize in coding have been steadily adopted into developer workflows. From pair programming to self-improving AI agents,...]]>

Large language models (LLMs) that specialize in coding have been steadily adopted into developer workflows. From pair programming to self-improving AI agents, these models assist developers with various tasks, including enhancing code, fixing bugs, generating tests, and writing documentation. To promote the development of open-source LLMs, the Qwen team recently released Qwen2.5-Coder…

]]> 1 Joanne Chang <![CDATA[Upcoming Webinar: Unlocking Video Analytics With AI Agents]]> http://www.open-lab.net/blog/?p=96135 2025-02-20T15:52:55Z 2025-02-13T22:05:57Z

Master prompt engineering, fine-tuning, and customization to build video analytics AI agents.]]>

Master prompt engineering, fine-tuning, and customization to build video analytics AI agents.

]]> Terry Chen <![CDATA[Automating GPU Kernel Generation with DeepSeek-R1 and Inference Time Scaling]]> http://www.open-lab.net/blog/?p=95998 2025-04-23T02:45:39Z 2025-02-12T18:00:00Z

As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is...]]>

As AI models extend their capabilities to solve more sophisticated challenges, a new scaling law known as test-time scaling or inference-time scaling is emerging. Also known as AI reasoning or long-thinking, this technique improves model performance by allocating additional computational resources during inference to evaluate multiple possible outcomes and then selecting the best one…

]]> 2 Gomathy Venkata Krishnan <![CDATA[LLM Model Pruning and Knowledge Distillation with NVIDIA NeMo Framework]]> http://www.open-lab.net/blog/?p=93451 2025-04-23T02:53:00Z 2025-02-12T17:54:52Z

Model pruning and knowledge distillation are powerful cost-effective strategies for obtaining smaller language models from an initial larger sibling. ...]]>

Model pruning and knowledge distillation are powerful cost-effective strategies for obtaining smaller language models from an initial larger sibling. The post "How to Prune and Distill Llama-3.1 8B to an NVIDIA Llama-3.1-Minitron 4B Model" discussed the best practices of using large language models (LLMs) that combine depth, width, attention, and MLP pruning with knowledge distillation…

]]> Emily Potyraj <![CDATA[NVIDIA DGX Cloud Introduces Ready-To-Use Templates to Benchmark AI Platform Performance]]> http://www.open-lab.net/blog/?p=95558 2025-05-06T17:01:29Z 2025-02-11T17:00:00Z

In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a...]]>

In the rapidly evolving landscape of AI systems and workloads, achieving optimal model training performance extends far beyond chip speed. It requires a comprehensive evaluation of the entire stack, from compute to networking to model framework. Navigating the complexities of AI system performance can be difficult. There are many application changes that you can make…

]]> Brad Nemire <![CDATA[Featured Researcher and Educator Sessions at NVIDIA GTC 2025]]> http://www.open-lab.net/blog/?p=95817 2025-02-06T19:33:45Z 2025-02-05T23:03:06Z

Explore the latest advancements in academia, including advanced research, innovative teaching methods, and the future of learning and technology.]]>

Explore the latest advancements in academia, including advanced research, innovative teaching methods, and the future of learning and technology.

]]> Cheng-Han (Hank) Du <![CDATA[Improving Translation Quality with Domain-Specific Fine-Tuning and NVIDIA NIM]]> http://www.open-lab.net/blog/?p=95756 2025-04-23T02:50:50Z 2025-02-05T21:30:00Z

Translation plays an essential role in enabling companies to expand across borders, with requirements varying significantly in terms of tone, accuracy, and...]]>

Translation plays an essential role in enabling companies to expand across borders, with requirements varying significantly in terms of tone, accuracy, and technical terminology handling. The emergence of sovereign AI has highlighted critical challenges in large language models (LLMs), particularly their struggle to capture nuanced cultural and linguistic contexts beyond English-dominant…

]]> 1 Shruthii Sathyanarayanan <![CDATA[Streamline Collaboration Across Local and Cloud Systems with NVIDIA AI Workbench]]> http://www.open-lab.net/blog/?p=95720 2025-04-23T02:48:08Z 2025-02-05T18:00:00Z

NVIDIA AI Workbench is a free development environment manager to develop, customize, and prototype AI applications on your GPUs. AI Workbench provides a...]]>

NVIDIA AI Workbench is a free development environment manager to develop, customize, and prototype AI applications on your GPUs. AI Workbench provides a frictionless experience across PCs, workstations, servers, and cloud for AI, data science, and machine learning (ML) projects. The user experience includes: This post provides details about the January 2025 release of NVIDIA AI Workbench…

]]> Pradeep Ramani <![CDATA[OpenAI Triton on NVIDIA Blackwell Boosts AI Performance and Programmability]]> http://www.open-lab.net/blog/?p=95388 2025-04-23T02:48:06Z 2025-02-05T18:00:00Z

Matrix multiplication and attention mechanisms are the computational backbone of modern AI workloads. While libraries like NVIDIA cuDNN provide highly optimized...]]>

Matrix multiplication and attention mechanisms are the computational backbone of modern AI workloads. While libraries like NVIDIA cuDNN provide highly optimized implementations, and frameworks such as CUTLASS offer deep customization, many developers and researchers need a middle ground that combines performance with programmability. The open-source Triton compiler on the NVIDIA Blackwell…

]]> Isabel Hulseman <![CDATA[New NVIDIA AI Blueprint: Build a Customizable RAG Pipeline]]> http://www.open-lab.net/blog/?p=95614 2025-02-13T20:44:16Z 2025-01-30T22:26:12Z

Connect AI applications to enterprise data using embedding and reranking models for information retrieval.]]>


]]> Eric Phan <![CDATA[How to Integrate NVIDIA DLSS 4 into Your Game with NVIDIA Streamline]]> http://www.open-lab.net/blog/?p=95492 2025-04-23T15:00:36Z 2025-01-30T14:00:00Z

NVIDIA DLSS 4 is the latest iteration of DLSS introduced with the NVIDIA GeForce RTX 50 Series GPUs. It includes several new features: DLSS Multi Frame...]]>

NVIDIA DLSS 4 is the latest iteration of DLSS introduced with the NVIDIA GeForce RTX 50 Series GPUs. It includes several new features: Here’s how you can get started with DLSS 4 in your integrations. This post focuses on the Streamline SDK, which provides a plug-and-play framework for simplified plugin integration. The NVIDIA Streamline SDK is an open-source framework that…

]]> Annamalai Chockalingam <![CDATA[New AI SDKs and Tools Released for NVIDIA Blackwell GeForce RTX 50 Series GPUs]]> http://www.open-lab.net/blog/?p=95526 2025-04-23T15:00:41Z 2025-01-30T14:00:00Z

NVIDIA recently announced a new generation of PC GPUs, the GeForce RTX 50 Series, alongside new AI-powered SDKs and tools for developers. Powered by the...]]>

NVIDIA recently announced a new generation of PC GPUs—the GeForce RTX 50 Series—alongside new AI-powered SDKs and tools for developers. Powered by the NVIDIA Blackwell architecture, fifth-generation Tensor Cores and fourth-generation RT Cores, the GeForce RTX 50 Series delivers breakthroughs in AI-driven rendering, including neural shaders, digital human technologies, geometry and lighting.

]]> Amit Bleiweiss <![CDATA[Mastering LLM Techniques: Evaluation]]> http://www.open-lab.net/blog/?p=95447 2025-04-23T15:01:33Z 2025-01-29T20:44:06Z

Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and...]]>

Evaluating large language models (LLMs) and retrieval-augmented generation (RAG) systems is a complex and nuanced process, reflecting the sophisticated and multifaceted nature of these systems. Unlike traditional machine learning (ML) models, LLMs generate a wide range of diverse and often unpredictable outputs, making standard evaluation metrics insufficient. Key challenges include the…

]]> Edoardo Maria Ponti <![CDATA[Dynamic Memory Compression]]> http://www.open-lab.net/blog/?p=93500 2025-04-23T15:01:58Z 2025-01-24T17:43:42Z

Despite the success of large language models (LLMs) as general-purpose AI tools, their high demand for computational resources makes their deployment challenging...]]>

Despite the success of large language models (LLMs) as general-purpose AI tools, their high demand for computational resources makes their deployment challenging in many real-world scenarios. The sizes of the model and conversation state are bounded by the available high-bandwidth memory, which limits the number of users that can be served and the maximum conversation length. At present…

]]>