
    NVIDIA NIM for Developers

    NVIDIA NIM™, part of NVIDIA AI Enterprise, provides containers to self-host GPU-accelerated inferencing microservices for pretrained and customized AI models across clouds, data centers, and workstations. Deployed with a single command, NIM microservices expose industry-standard APIs for simple integration into AI applications, development frameworks, and workflows. Built on pre-optimized inference engines from NVIDIA and the community, including NVIDIA® TensorRT™ and TensorRT-LLM, NIM microservices automatically optimize response latency and throughput for each combination of foundation model and GPU system detected at runtime. NIM containers also provide standard observability data feeds and built-in support for GPU autoscaling on Kubernetes.
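
Once a NIM container is running, its industry-standard API can be called like any OpenAI-compatible chat-completions endpoint. Below is a minimal sketch using only the Python standard library; the URL, port, and model name are assumptions for a hypothetical local deployment, not fixed values.

```python
import json
import urllib.request

# Hypothetical local NIM endpoint; the actual port and model identifier
# depend on the container you deploy.
NIM_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "meta/llama3-8b-instruct"  # example model name (assumption)

def build_chat_request(prompt, model=MODEL, max_tokens=64):
    """Build an OpenAI-style chat-completions payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def chat(prompt):
    """POST the request to a running NIM microservice (requires a live deployment)."""
    payload = json.dumps(build_chat_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        NIM_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage against a live deployment:
#   answer = chat("Summarize NVIDIA NIM in one sentence.")
```

Because the payload follows the OpenAI schema, existing client libraries and frameworks can usually be pointed at a NIM endpoint by changing only the base URL.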

    Try NVIDIA-Hosted APIs
    Get Started With NIM


    How It Works

    NVIDIA NIM helps overcome the challenges of building AI applications, providing developers with industry-standard APIs for building powerful copilots, chatbots, and AI assistants while making it easy for IT and DevOps teams to self-host AI models in their own managed environments. Built on robust foundations, including inference engines like TensorRT, TensorRT-LLM, and PyTorch, NIM is engineered to facilitate seamless AI inferencing at scale.

    Watch Video

    NVIDIA NIM inference microservices stack diagram

    Introductory Blog

    Learn about NIM’s architecture, key features, and components.

    Documentation

    Access guides, API reference information, and release notes.

    Introductory Video

    Learn how to deploy NIM on your infrastructure using a single command.

    Deployment Guide

    Get step-by-step instructions for self-hosting NIM on any NVIDIA accelerated infrastructure.


    Build With NVIDIA NIM

    Get Superior Model Performance

    Improve AI application performance and efficiency with accelerated engines from NVIDIA and the community, including TensorRT, TensorRT-LLM, and more—prebuilt and optimized for low-latency, high-throughput inferencing on specific NVIDIA GPU systems.

    Run AI Models Anywhere

    Maintain security and control of applications and data with prebuilt microservices that can be deployed on NVIDIA GPUs anywhere—workstation, data center, or cloud. Download NIM inference microservices for self-hosted deployment, or take advantage of dedicated endpoints on Hugging Face to spin up instances in your preferred cloud.

    Customize AI Models for Your Use Case

    Improve accuracy for specific use cases by deploying NIM inference microservices for models fine-tuned with your own data.

    Maximize Operationalization and Scale

    Get detailed observability metrics for dashboarding, and access Helm charts and guides for scaling NIM on Kubernetes.
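
Observability feeds like these are typically scraped in Prometheus text format, which standard dashboarding tools consume directly. As an illustrative sketch (the metric names and values below are made up, not actual NIM metrics), here is a minimal parser for that format:

```python
def parse_prometheus_text(text):
    """Parse Prometheus text-format metrics into {metric_name: value}.

    Minimal sketch: skips HELP/TYPE comment lines, drops label sets,
    and keeps the last sample seen for each metric name.
    """
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and metadata
        name_part, _, value = line.rpartition(" ")
        # Strip any {label="..."} block from the metric name.
        name = name_part.split("{", 1)[0]
        try:
            metrics[name] = float(value)
        except ValueError:
            pass  # ignore malformed samples
    return metrics

# Example scrape output (illustrative names and values only):
sample = """\
# HELP requests_running Number of requests currently running.
# TYPE requests_running gauge
requests_running 2
gpu_cache_usage_ratio{gpu="0"} 0.37
"""
```

A real deployment would scrape the container's metrics endpoint on an interval and feed the parsed values into a time-series database rather than a dict.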


    NVIDIA NIM Examples and Blueprints

    Build RAG Applications With Standard APIs

    Get started prototyping your AI application with NIM hosted in the NVIDIA API catalog. Using generative AI examples from GitHub, see how to easily deploy a retrieval-augmented generation (RAG) pipeline for chat Q&A using hosted endpoints. Developers can get 1,000 inference credits free on any of the available models to begin developing their application.
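
A RAG pipeline like the one in these examples pairs a retriever with a generation endpoint. Here is a toy sketch of the retrieval-and-prompt-assembly half, using naive word overlap as a stand-in for a real embedding-based retriever and vector database:

```python
def retrieve(query, documents, k=2):
    """Rank documents by naive word overlap with the query.

    A toy stand-in for embedding similarity search; real pipelines use
    a vector database and an embedding model.
    """
    q = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

def build_rag_prompt(query, documents):
    """Join the retrieved context and the question into one prompt,
    ready to send to a chat-completions endpoint."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "NIM microservices expose OpenAI-compatible APIs.",
    "Helm charts are provided for Kubernetes deployment.",
    "TensorRT-LLM accelerates LLM inference on NVIDIA GPUs.",
]
prompt = build_rag_prompt("Which APIs do NIM microservices expose?", docs)
```

The resulting prompt would then be posted to a hosted or self-hosted model endpoint, grounding the model's answer in the retrieved passages.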

    Explore RAG LLM Generative AI Examples

    Jump-Start Development With NIM Blueprints

    NVIDIA NIM Agent Blueprints are reference workflows for canonical generative AI use cases. Enterprises can build and operationalize custom AI applications, creating data-driven AI flywheels, using NIM Agent Blueprints along with NIM microservices and the NeMo framework, all part of the NVIDIA AI Enterprise platform. NIM Agent Blueprints also include partner microservices, one or more AI agents, reference code, customization documentation, and a Helm chart for deployment.

    Explore NVIDIA NIM Agent Blueprints

    Deploy NIM on Cloud via Hugging Face

    Simplify and accelerate the deployment of generative AI models on Hugging Face with NIM. With just a few clicks, deploy optimized models like Llama 3 on preferred cloud platforms.

    Deploy NIM on Hugging Face

    Get Started With NVIDIA NIM

    Explore different options for building and deploying optimized AI applications using the latest models with NVIDIA NIM.


    Try

    Begin building your AI application with NVIDIA-hosted NIM APIs.

    Visit the NVIDIA API Catalog

    Develop

    Get free access to NIM for research, development, and testing through the NVIDIA Developer Program. Questions? Check out the FAQ.

    Join and Get Access to Self-Hosting NIM

    Deploy

    Move from pilot to production with the assurance of security, API stability, and support with NVIDIA AI Enterprise.

    Request a Free 90-Day NVIDIA AI Enterprise License

    NVIDIA NIM Learning Library

    Getting Started Blog

    Learn how to use NIM microservices APIs across the most popular generative AI application frameworks like Haystack, LangChain, and LlamaIndex.

    Benchmarking Guide

    Learn how to benchmark LLM deployments, covering popular metrics and parameters, with a step-by-step guide.
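
LLM benchmarks commonly report time to first token (TTFT) and token throughput. As a sketch of how these two metrics are computed from recorded timestamps (the timings below are synthetic, not measured results):

```python
def ttft(request_start, first_token_time):
    """Time to first token in seconds: a core LLM latency metric."""
    return first_token_time - request_start

def throughput(num_tokens, start, end):
    """Generated tokens per second over the whole response."""
    return num_tokens / (end - start)

# Synthetic timings for one request (seconds since an arbitrary epoch):
start, first_tok, done, n_tokens = 0.0, 0.25, 2.0, 100
# ttft(start, first_tok)            -> 0.25 s
# throughput(n_tokens, start, done) -> 50.0 tokens/s
```

Real benchmarking guides also sweep parameters such as concurrency, input/output sequence lengths, and batch size, and report distributions (p50/p99) rather than single values.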

    Documentation

    Learn more about high-performance features, applications, architecture, release notes, and more for NVIDIA NIM for LLMs.


    More Resources

    Community

    Training and Certification

    Inception for Startups

    Tech Blogs


    Ethical AI

    NVIDIA’s platforms and application frameworks enable developers to build a wide array of AI applications. Consider potential algorithmic bias when choosing or creating the models being deployed. Work with the model’s developer to ensure that it meets the requirements for the relevant industry and use case; that the necessary instruction and documentation are provided to understand error rates, confidence intervals, and results; and that the model is being used under the conditions and in the manner intended.

    Learn about the latest NVIDIA NIM models, applications, and tools.

    Sign Up
