The first release of NVIDIA NIM Operator simplified the deployment and lifecycle management of inference pipelines for NVIDIA NIM microservices, reducing the workload for MLOps and LLMOps engineers and Kubernetes admins. It enabled easy and fast deployment, auto-scaling, and upgrading of NIM on Kubernetes clusters. Learn more about the first release. Our customers and partners have been using…
Today, NVIDIA announced the open-source release of the KAI Scheduler, a Kubernetes-native GPU scheduling solution, now available under the Apache 2.0 license. Originally developed within the Run:ai platform, KAI Scheduler is now open to the community while continuing to be packaged and delivered as part of the NVIDIA Run:ai platform. This initiative underscores NVIDIA's commitment to…
At NVIDIA, we take pride in tackling complex infrastructure challenges with precision and innovation. When Volcano faced GPU underutilization in their NVIDIA DGX Cloud-provisioned Kubernetes cluster, we stepped in to deliver a solution that not only met but exceeded expectations. By combining advanced scheduling techniques with a deep understanding of distributed workloads…
Advanced AI models such as DeepSeek-R1 are proving that enterprises can now build cutting-edge AI models specialized with their own data and expertise. These models can be tailored to unique use cases, tackling diverse challenges like never before. Based on the success of early AI adopters, many organizations are shifting their focus to full-scale production AI factories. Yet the process of…
Attending KubeCon? Meet NVIDIA at booth S750, join our startup mixer, or stop by our 15+ sessions.
NVIDIA Holoscan for Media is an NVIDIA-accelerated platform designed for multi-vendor live production and AI. It will be showcased at GTC, highlighting NVIDIA NIM, AI SDKs, and microservices that enhance live production workflows. Built on Kubernetes, the container orchestration platform, it simplifies media timing, synchronization, and management through NVIDIA components such as the GPU and…
As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. NVIDIA NIM microservices are model inference containers that can be deployed on Kubernetes. In a production environment, it's important to understand the compute and memory profile of these microservices to set up a successful autoscaling plan. In this post, we describe how to set up and use the Kubernetes Horizontal Pod…
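To make that setup concrete, here is a minimal sketch that creates a CPU-based HorizontalPodAutoscaler for a NIM Deployment using the official Kubernetes Python client. The deployment name, namespace, and thresholds are hypothetical placeholders, and production NIM autoscaling typically keys off custom GPU or latency metrics exposed through a metrics adapter rather than the CPU metric shown here.

```python
# Minimal sketch: CPU-based HPA for a hypothetical NIM Deployment,
# created with the official Kubernetes Python client (pip install kubernetes).
from kubernetes import client, config

config.load_kube_config()  # use config.load_incluster_config() inside a pod

hpa = client.V1HorizontalPodAutoscaler(
    metadata=client.V1ObjectMeta(name="nim-llm-hpa", namespace="nim"),
    spec=client.V1HorizontalPodAutoscalerSpec(
        scale_target_ref=client.V1CrossVersionObjectReference(
            api_version="apps/v1", kind="Deployment", name="nim-llm"),
        min_replicas=1,
        max_replicas=4,
        # autoscaling/v1 supports CPU only; GPU or latency metrics require
        # autoscaling/v2 plus a custom metrics adapter
        target_cpu_utilization_percentage=70,
    ),
)

client.AutoscalingV1Api().create_namespaced_horizontal_pod_autoscaler(
    namespace="nim", body=hpa)
```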
Organizations are increasingly turning to accelerated computing to meet the demands of generative AI, 5G telecommunications, and sovereign clouds. NVIDIA has unveiled the DOCA Platform Framework (DPF), providing foundational building blocks to unlock the power of NVIDIA BlueField DPUs and optimize GPU-accelerated computing platforms. Serving as both an orchestration framework and an implementation…
As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. The demand for AI-enabled services continues to grow rapidly, placing increasing pressure on IT and infrastructure teams. These teams are tasked with provisioning the necessary hardware and software to meet that demand while simultaneously balancing cost efficiency with optimal user experience. This challenge was faced by the…
As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. Large language models (LLMs) have been widely used for chatbots, content generation, summarization, classification, translation, and more. State-of-the-art LLMs and foundation models, such as Llama, Gemma, GPT, and Nemotron, have demonstrated human-like understanding and generative abilities. Thanks to these models…
The rapid evolution of AI models has driven the need for more efficient and scalable inferencing solutions. As organizations strive to harness the power of AI, they face challenges in deploying, managing, and scaling AI inference workloads. NVIDIA NIM and Google Kubernetes Engine (GKE) together offer a powerful solution to address these challenges. NVIDIA has collaborated with Google Cloud to…
In the rapidly evolving landscape of AI and data science, the demand for scalable, efficient, and flexible infrastructure has never been higher. Traditional infrastructure often struggles to meet the demands of modern AI workloads, leading to bottlenecks in development and deployment processes. As organizations strive to deploy AI models and data-intensive applications at scale…
Developers have shown a lot of excitement for NVIDIA NIM microservices, a set of easy-to-use cloud-native microservices that shortens time-to-market and simplifies the deployment of generative AI models anywhere: across clouds, data centers, and GPU-accelerated workstations. To meet the demands of diverse use cases, NVIDIA is bringing to market a variety of AI models…
Step-by-step guide to building robust, scalable RAG apps with Haystack and NVIDIA NIM microservices on Kubernetes.
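As a hedged illustration of the generation step in such a pipeline, the sketch below calls a NIM endpoint's OpenAI-compatible chat completions API directly with requests; the in-cluster service URL and model name are assumptions, and a real Haystack pipeline would wrap this step in its own generator component.

```python
# Minimal sketch: the generation step of a RAG app against a NIM LLM
# endpoint. NIM exposes an OpenAI-compatible HTTP API; the URL and model
# name below are hypothetical placeholders for an in-cluster deployment.
import requests

NIM_URL = "http://nim-llm.nim.svc.cluster.local:8000/v1/chat/completions"

def generate_answer(question: str, retrieved_docs: list[str]) -> str:
    context = "\n\n".join(retrieved_docs)
    resp = requests.post(NIM_URL, json={
        "model": "meta/llama3-8b-instruct",
        "messages": [
            {"role": "system",
             "content": f"Answer using only this context:\n{context}"},
            {"role": "user", "content": question},
        ],
        "max_tokens": 256,
    }, timeout=60)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```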
As large language models (LLMs) continue to gain traction in enterprise AI applications, the demand for custom models that can understand and integrate specific industry terminology, domain expertise, and unique organizational requirements becomes increasingly important. To address this growing need for customizing LLMs, the NVIDIA NeMo team has announced an early access program for NeMo…
Generative AI has the potential to transform every industry. Human workers are already using large language models (LLMs) to explain, reason about, and solve difficult cognitive tasks. Retrieval-augmented generation (RAG) connects LLMs to data, expanding the usefulness of LLMs by giving them access to up-to-date and accurate information. Many enterprises have already started to explore how…
NVIDIA Holoscan for Media is a software-defined platform for building and deploying applications for live media. Recent updates introduce a user-friendly developer interface and new capabilities for application deployment to the platform. Holoscan for Media now includes Helm Dashboard, which delivers an intuitive user interface for orchestrating and managing Helm charts.
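For readers who want to try the same chart-management UI outside the platform, here is a sketch that installs the upstream open-source Helm Dashboard plugin with the helm CLI, driven from Python; it assumes helm is configured for your cluster and that the bundled tool corresponds to the komodorio project.

```python
# Minimal sketch: install and launch the open-source Helm Dashboard
# plugin via the helm CLI (assumes helm is installed and configured).
import subprocess

subprocess.run(
    ["helm", "plugin", "install",
     "https://github.com/komodorio/helm-dashboard.git"],
    check=True,
)
# Starts a local web UI for browsing and managing installed Helm releases
subprocess.run(["helm", "dashboard"], check=True)
```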
The broadcast industry is undergoing a transformation in how content is created, managed, distributed, and consumed. This transformation includes a shift from traditional linear workflows bound by fixed-function devices to flexible and hybrid, software-defined systems that enable the future of live streaming. Developers can now apply to join the early access program for NVIDIA Holoscan for…
The Dataiku platform for everyday AI simplifies deep learning. Use cases are far-reaching, from image classification to object detection and natural language processing (NLP). Dataiku helps you with labeling, model training, explainability, model deployment, and centralized management of code and code environments. This post dives into high-level Dataiku and NVIDIA integrations for image…
NVIDIA Base Command Platform provides the capabilities to confidently develop complex software that meets the performance standards required by scientific computing workflows. The platform enables both cloud-hosted and on-premises solutions for AI development by providing developers with the tools needed to efficiently configure and manage AI workflows. Integrated data and user management simplify…
Speech AI applications, from call centers to virtual assistants, rely heavily on automatic speech recognition (ASR) and text-to-speech (TTS). ASR can process the audio signal and transcribe the audio to text. Speech synthesis or TTS can generate high-quality, natural-sounding audio from the text in real time. The challenge of speech AI is to achieve high accuracy and meet the latency requirements…
Many organizations today run applications in containers to take advantage of the powerful orchestration and management provided by cloud-native platforms based on Kubernetes. However, virtual machines remain the predominant data center infrastructure platform for enterprises, and not all applications can be easily modified to run in containers. For example…
For scalable data center performance, NVIDIA GPUs have become a must-have. NVIDIA GPU parallel processing capabilities, supported by thousands of computing cores, are essential to accelerating a wide variety of applications across different industries. Today, the most compute-intensive applications across diverse industries run on GPUs, and different applications across this spectrum can…
The IT world is moving to cloud, and cloud is built on containers managed with Kubernetes. We believe the next logical step is to accelerate this infrastructure with data processing units (DPUs) for greater performance, efficiency, and security. Red Hat and NVIDIA are building an integrated cloud-ready infrastructure solution with the management and automation of Red Hat OpenShift combined…
The new year has been off to a great start with NVIDIA AI Enterprise 1.1 providing production support for container orchestration and Kubernetes cluster management using VMware vSphere with Tanzu 7.0 update 3c, delivering AI/ML workloads to every business in VMs, containers, or Kubernetes. New NVIDIA AI Enterprise labs for IT admins and MLOps are available on NVIDIA LaunchPad…
Deploying AI-powered services like voice-based assistants, e-commerce product recommendations, and contact-center automation into production at scale is challenging. Delivering the best end-user experience while reducing operational costs requires accounting for multiple factors. These include composition and performance of underlying infrastructure, flexibility to scale resources based on user…
Editor's note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, "How to Easily use GPUs with Kubernetes". NVIDIA GPU Operator allows organizations to easily scale NVIDIA GPUs on Kubernetes. By simplifying the deployment and management of GPUs with Kubernetes, the GPU Operator enables infrastructure teams to scale GPU applications error-free, within minutes…
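For a sense of how little is involved, here is a sketch of the documented Helm-based install, driven from Python via subprocess; the release and namespace names are illustrative, and chart values vary by cluster (for example, whether nodes already have drivers preinstalled).

```python
# Minimal sketch: install the NVIDIA GPU Operator with Helm from Python.
# Assumes helm and kubectl are configured for the target cluster.
import subprocess

def run(*args: str) -> None:
    print("+", " ".join(args))
    subprocess.run(args, check=True)

# Add the NVIDIA Helm repository and refresh the local chart index
run("helm", "repo", "add", "nvidia", "https://helm.ngc.nvidia.com/nvidia")
run("helm", "repo", "update")

# The operator then deploys the driver, container toolkit, device plugin,
# and monitoring components across the cluster's GPU nodes
run("helm", "install", "gpu-operator", "nvidia/gpu-operator",
    "--namespace", "gpu-operator", "--create-namespace")
```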
Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. NVIDIA Triton Inference Server is an open-source AI model serving software that simplifies the deployment of trained AI models at scale in production.
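As a quick taste of the serving workflow, the sketch below sends an image-shaped inference request to a running Triton server with the official Python HTTP client; the model and tensor names are hypothetical placeholders and must match the deployed model's config.pbtxt.

```python
# Minimal sketch: query a running Triton server over HTTP
# (pip install "tritonclient[http]" numpy).
import numpy as np
import tritonclient.http as httpclient

triton = httpclient.InferenceServerClient(url="localhost:8000")

# One 224x224 RGB image; names/shapes must match the model's config.pbtxt
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
infer_input = httpclient.InferInput("input__0", batch.shape, "FP32")
infer_input.set_data_from_numpy(batch)

result = triton.infer(model_name="resnet50", inputs=[infer_input])
print(result.as_numpy("output__0").shape)
```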
Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. The rapid growth in artificial intelligence is driving up the size of data sets, as well as the size and complexity of networks. AI-enabled applications like e-commerce product recommendations, voice-based assistants, and contact center automation…
Editor's note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, "How to Easily use GPUs with Kubernetes". In the last post, we looked at how the GPU Operator has evolved, adding a rich feature set to handle GPU discovery, support for the new Multi-Instance GPU (MIG) capability of the NVIDIA Ampere Architecture, vGPU, and certification for use with Red Hat OpenShift.
The NGC team is hosting a webinar and live Q&A. Topics include how to use containers from the NGC catalog, deployed from Google Cloud Marketplace to GKE, a managed Kubernetes service on Google Cloud, to easily build, deploy, and run AI solutions. Building a Computer Vision Service Using NVIDIA NGC and Google Cloud, August 25 at 10 a.m. PT. Organizations are using computer vision to…
Kubernetes is an open-source container-orchestration system for automating computer application deployment, scaling, and management. It's an extremely popular tool and can be used for automated rollouts and rollbacks, horizontal scaling, storage orchestration, and more. For many organizations, Kubernetes is a key component of their infrastructure. A critical step to installing and scaling…
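Once GPUs are schedulable, workloads request them as an extended resource. The sketch below submits a one-shot GPU pod with the Kubernetes Python client, assuming the NVIDIA device plugin (or GPU Operator) is advertising nvidia.com/gpu on the nodes; the CUDA image tag is illustrative.

```python
# Minimal sketch: request one NVIDIA GPU for a pod via the Kubernetes
# Python client. Assumes nvidia.com/gpu is advertised by the device plugin.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="cuda-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[client.V1Container(
            name="cuda",
            image="nvcr.io/nvidia/cuda:12.4.1-base-ubuntu22.04",  # illustrative tag
            command=["nvidia-smi"],
            # GPUs are requested through resource limits as an extended resource
            resources=client.V1ResourceRequirements(
                limits={"nvidia.com/gpu": "1"}),
        )],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```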
Using the same orchestration on premises and in the public cloud allows a high level of agility and ease of operations: you can use the same API across bare metal and public clouds. Kubernetes is an open-source, container-orchestration system for automating the deployment, scaling, and management of containerized applications. It was originally designed by Google and is now maintained by the Cloud…
The growing prevalence of GPU-accelerated computing in the cloud, enterprise, and at the edge increasingly relies on robust and powerful network infrastructures. NVIDIA ConnectX SmartNICs and NVIDIA BlueField DPUs provide high-throughput, low-latency connectivity that enables the scaling of GPU resources across a fleet of nodes. To address the demand for cloud-native AI workloads…
Editor's note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, "How to Easily use GPUs with Kubernetes". Reliably provisioning servers with GPUs in Kubernetes can quickly become complex as multiple components must be installed and managed to use GPUs. The GPU Operator, based on the Operator Framework, simplifies the initial deployment and management of GPU…
Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. High-energy physics research aims to understand the mysteries of the universe by describing the fundamental constituents of matter and the interactions between them. Diverse experiments exist on Earth to re-create the first instants of the universe.
Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. AI and machine learning are unlocking breakthrough applications in fields such as online product recommendations, image classification, chatbots, forecasting, and manufacturing quality inspection. There are two parts to AI: training and inference.
Editor's note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, "How to Easily use GPUs with Kubernetes". For many years, Docker was the only container runtime supported by Kubernetes. Over time, support for other runtimes has not only become possible but often preferred, as standardization around a common container runtime interface (CRI) has solidified in the…
Editor's note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, "How to Easily use GPUs with Kubernetes". Reliably provisioning servers with GPUs can quickly become complex as multiple components must be installed and managed to use GPUs with Kubernetes. The GPU Operator simplifies the initial deployment and management and is based on the Operator Framework.
In the world of machine learning, models are trained using existing data sets and then deployed to do inference on new data. In a previous post, Simplifying and Scaling Inference Serving with NVIDIA Triton 2.3, we discussed inference workflow and the need for an efficient inference serving solution. In that post, we introduced Triton Inference Server and its benefits and looked at the new features…
Conversational AI solutions such as chatbots are now deployed in the data center, on the cloud, and at the edge to deliver lower latency and high quality of service while meeting an ever-increasing demand. The strategic decision to run AI inference on any or all these compute platforms varies not only by the use case but also evolves over time with the business. Hence…
AI, machine learning (ML), and deep learning (DL) are effective tools for solving diverse computing problems such as product recommendations, customer interactions, financial risk assessment, manufacturing defect detection, and more. Using an AI model in production, called inference serving, is the most complex part of incorporating AI in applications. Triton Inference Server takes care of all the…
Edge computing takes place close to the data source to reduce network stress and improve latency. GPUs are an ideal compute engine for edge computing because they are programmable and deliver phenomenal performance per dollar. However, the complexity associated with managing a fleet of edge devices can erode the GPU's favorable economics. In 2019, NVIDIA introduced the GPU Operator to…
Seamlessly deploying AI services at scale in production is as critical as creating the most accurate AI model. Conversational AI services, for example, need multiple models handling functions of automatic speech recognition (ASR), natural language understanding (NLU), and text-to-speech (TTS) to complete the application pipeline. To provide real-time conversation to users…
Modern expectations for agile capabilities and constant innovation, with zero downtime, call for a change in how software for embedded and edge devices is developed and deployed. Adopting cloud-native paradigms like microservices, containerization, and container orchestration at the edge is the way forward, but the complexity of deployment, management, and security gets in the way of scaling.
This post was originally published on the Mellanox blog. In my previous Kubernetes post, Provision Bare-Metal Kubernetes Like a Cloud Giant!, I discussed the benefits of using BlueField DPU-programmable SmartNICs to simplify provisioning of Kubernetes clusters in bare-metal infrastructures. A key takeaway from this post was the current rapid shift toward bare metal Kubernetes…
Editor's note: Interested in GPU Operator? Register for our upcoming webinar on January 20th, "How to Easily use GPUs with Kubernetes". Over the last few years, NVIDIA has leveraged GPU containers in a variety of ways for testing, development, and running AI workloads in production at scale. Containers optimized for NVIDIA GPUs and systems such as the DGX and OEM NGC-Ready servers are available…
Data collected on a vast scale has fundamentally changed the way organizations do business, driving demand for teams to provide meaningful data science, machine learning, and deep learning-based business insights quickly. Data science leaders, plus the DevOps and IT teams supporting them, constantly look for ways to make their teams productive while optimizing their costs and minimizing…
The software industry has recently seen a huge shift in how software deployments are done, thanks to technologies such as containers and orchestrators. While container technologies have been around for a while, credit goes to Docker for making containers mainstream by greatly simplifying the process of creating, managing, and deploying containerized applications. We're now seeing a similar paradigm shift for…
For the latest information about how to deploy Kubernetes on NVIDIA GPUs, see the Kubernetes section of the NVIDIA Data Center documentation. NVIDIA GPU Cloud (NGC) provides access to a number of containers for deep learning, HPC, and HPC visualization, as well as containers with applications from our NVIDIA partners, all optimized for NVIDIA GPUs and DGX systems.
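Before moving to Kubernetes, the simplest way to exercise an NGC container is directly on a GPU host with Docker. The sketch below assumes Docker 19.03+ with the NVIDIA Container Toolkit installed; the image tag is illustrative, so browse ngc.nvidia.com for current tags.

```python
# Minimal sketch: run an NGC deep learning container on a GPU host.
# Assumes Docker 19.03+ and the NVIDIA Container Toolkit; tag is illustrative.
import subprocess

subprocess.run([
    "docker", "run", "--rm", "--gpus", "all",
    "nvcr.io/nvidia/pytorch:24.05-py3",
    "python", "-c", "import torch; print(torch.cuda.is_available())",
], check=True)
```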