AI agents powered by large language models are transforming enterprise workflows, but high inference costs and latency can limit their scalability and user experience. To address this, NVIDIA recently announced the NVIDIA AI Blueprint for Building Data Flywheels. It's an enterprise-ready workflow that helps optimize AI agents through automated experimentation to find efficient models that reduce…
Join us on June 26 to learn how to distill cost-efficient models with the NVIDIA Data Flywheel Blueprint.
As enterprises generate and consume increasing volumes of diverse data, extracting insights from multimodal documents, like PDFs and presentations, has become a major challenge. Traditional text-only extraction and basic retrieval-augmented generation (RAG) pipelines fall short, failing to capture the full value of these complex documents. The result? Missed insights, inefficient workflows…
A chunking strategy is the method of breaking down large documents into smaller, manageable pieces for AI retrieval. It determines how effectively relevant information is fetched for accurate AI responses; poor chunking leads to irrelevant results, inefficiency, and reduced business value. With so many options available (page-level, section-level, or token-based chunking at various sizes), how do…
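To make the trade-offs concrete, here is a minimal sketch of token-based chunking with overlap. The tokenizer, chunk size, and overlap below are illustrative assumptions, not values prescribed by any particular blueprint:

```python
# Minimal token-based chunking with overlap (a sketch; tokenizer and sizes are illustrative).
from transformers import AutoTokenizer

def chunk_by_tokens(text: str, tokenizer, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping windows of at most chunk_size tokens."""
    token_ids = tokenizer.encode(text, add_special_tokens=False)
    chunks, step = [], chunk_size - overlap
    for start in range(0, len(token_ids), step):
        window = token_ids[start:start + chunk_size]
        chunks.append(tokenizer.decode(window))
        if start + chunk_size >= len(token_ids):
            break
    return chunks

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
document = "..."  # the full document text you want to index
print(len(chunk_by_tokens(document, tokenizer)))
```

Page-level or section-level chunking follows the same pattern, but splits on document structure instead of a fixed token budget.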
Large language models (LLMs) have created unprecedented opportunities across various industries. However, moving LLMs from research and development into reliable, scalable, and maintainable production systems presents unique operational challenges. LLMOps, or large language model operations, is designed to address these challenges. Building upon the principles of traditional machine…
Generalist robotics has arrived, powered by advances in mechatronics and robot AI foundation models. But a key bottleneck remains: robots need vast training data for skills like assembly and inspection, and manual demonstrations aren't scalable. The NVIDIA Isaac GR00T-Dreams blueprint, built on NVIDIA Cosmos, solves this challenge by generating massive synthetic trajectory data from just a single…
Biomedical research and drug discovery have long been constrained by labor-intensive processes. To kick off a drug discovery campaign, researchers typically comb through numerous scientific papers for details about known protein targets and small molecule pairs. Reading (and deeply comprehending) a single paper takes one to six hours, while summarizing findings without AI assistance…
Enterprise data is exploding: petabytes of emails, reports, Slack messages, and databases pile up faster than anyone can read. Employees are left searching for answers in a sea of information, as "68% of available data in an organization goes unused," according to market researcher Gartner. Tapping that data is now possible with today's availability of AI-Q, an open-source NVIDIA Blueprint that puts your…
As enterprise adoption of agentic AI accelerates, teams face a growing challenge: scaling intelligent applications while managing inference costs. Large language models (LLMs) offer strong performance but come with substantial computational demands, often resulting in high latency and costs. At the same time, many development workflows, such as evaluation, data curation…
Vision language models (VLMs) have transformed video analytics by enabling broader perception and richer contextual understanding compared to traditional computer vision (CV) models. However, challenges like limited context length and lack of audio transcription still exist, restricting how much video a VLM can process at a time. To overcome this, the NVIDIA AI Blueprint for video search and…
Announced at COMPUTEX 2025, the NVIDIA Omniverse Blueprint for AI factory digital twins has expanded to support OpenUSD schemas. The blueprint features new tools to simulate more aspects of data center design across power, cooling, and networking infrastructure. Engineering teams can now design and test entire AI factories in a realistic virtual world, helping to catch issues early so they can…
In today's educational landscape, generative AI tools have become both a blessing and a challenge. While these tools offer unprecedented access to information, they've also created new concerns about academic integrity. Increasingly, students rely on AI to generate direct answers to homework questions, often at the expense of developing critical thinking skills and mastering core concepts.
Missed GTC? This year's training labs are now available on demand to watch anywhere, anytime.
Build a high-performance agentic AI system using the open-source NVIDIA Agent Intelligence toolkit; the contest runs May 12 to May 23.
The worldwide adoption of generative AI has driven massive demand for accelerated compute hardware. In enterprises, this has sped up the deployment of accelerated private cloud infrastructure. At the regional level, this demand for compute infrastructure has given rise to a new category of cloud providers who offer accelerated compute (GPU) capacity for AI workloads, also known as GPU…
Industrial enterprises are embracing physical AI and autonomous systems to transform their operations. This involves deploying heterogeneous robot fleets that include mobile robots, humanoid assistants, intelligent cameras, and AI agents throughout factories and warehouses. To harness the full potential of these physical AI-enabled systems, companies rely on digital twins of their facilities…
Since the release of ChatGPT in November 2022, the capabilities of large language models (LLMs) have surged, and the number of available models has grown exponentially. With this expansion, LLMs now vary widely in cost, performance, and specialization. For example, straightforward tasks like text summarization can be efficiently handled by smaller, general-purpose models. In contrast…
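As a rough illustration of that idea, the sketch below routes simple tasks to a smaller model and everything else to a larger one. The endpoint, model names, and task categories are placeholder assumptions, not a specific product's routing logic:

```python
# Illustrative task-based router: simple tasks go to a small model, the rest to a larger one.
# The endpoint and model names are placeholders for whatever OpenAI-compatible service you run.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

SMALL_MODEL = "meta/llama-3.1-8b-instruct"   # placeholder
LARGE_MODEL = "meta/llama-3.1-70b-instruct"  # placeholder

def route(task_type: str) -> str:
    """Pick a model by task complexity."""
    return SMALL_MODEL if task_type in {"summarize", "classify"} else LARGE_MODEL

def run(task_type: str, prompt: str) -> str:
    response = client.chat.completions.create(
        model=route(task_type),
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(run("summarize", "Summarize this paragraph in one sentence: ..."))
```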
With emerging use cases such as digital humans, agents, podcasts, images, and video generation, generative AI is changing the way we interact with PCs. This paradigm shift calls for new ways of interfacing with and programming generative AI models. However, getting started can be daunting for PC developers and AI enthusiasts. Today, NVIDIA released a suite of NVIDIA NIM microservices on…
Generative chemistry with AI has the potential to revolutionize how scientists approach drug discovery and development, health, and materials science and engineering. Instead of manually designing molecules with "chemical intuition" or screening millions of existing chemicals, researchers can train neural networks to propose novel molecular structures tailored to the desired properties.
NVIDIA DGX Cloud Serverless Inference is an auto-scaling AI inference solution that enables application deployment with speed and reliability. Powered by NVIDIA Cloud Functions (NVCF), DGX Cloud Serverless Inference abstracts multi-cluster infrastructure setups across multi-cloud and on-premises environments for GPU-accelerated workloads. Whether managing AI workloads…
Enterprises are generating and storing more multimodal data than ever before, yet traditional retrieval systems remain largely text-focused. While they can surface insights from written content, they aren't extracting critical information embedded in tables, charts, and infographics, often the most information-dense elements of a document. Without a multimodal retrieval system…
With recent advancements in generative AI and vision foundation models, VLMs represent a new wave of visual computing in which models are capable of highly sophisticated perception and deep contextual understanding. These intelligent solutions offer a promising means of enhancing semantic comprehension in XR settings. By integrating VLMs, developers can significantly improve how XR…
Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents, and AI assistants. These systems demand retrieval processes that are accurate and computationally efficient to deliver precise insights, enhance user experiences, and maintain scalability. Retrieval-augmented generation (RAG) is used to…
AI agents are transforming business operations by automating processes, optimizing decision-making, and streamlining actions. Their effectiveness hinges on expert reasoning, enabling smarter planning and efficient execution. Agentic AI applications could benefit from the capabilities of models such as DeepSeek-R1. Built for solving problems that require advanced AI reasoning…
A well-crafted systematic review is often the initial step for researchers exploring a scientific field. For scientists new to the field, it provides a structured overview of the domain. For experts, it refines their understanding and sparks new ideas. In 2024 alone, 218,650 review articles were indexed in the Web of Science database, highlighting the importance of these resources in research.
Agentic workflows are the next evolution in AI-powered tools. They let developers chain multiple AI models together to perform complex activities, give models tools to access additional data or automate user actions, and allow models to operate autonomously, analyzing and performing complex tasks with minimal human involvement or interaction. Because of their power…
Join us on February 27 to learn how to transform PDFs into AI podcasts using the NVIDIA AI Blueprint.
Generative AI, especially with breakthroughs like AlphaFold and RosettaFold, is transforming drug discovery and how biotech companies and research laboratories study protein structures, unlocking groundbreaking insights into protein interactions. Proteins are dynamic entities. It has been postulated that a protein's native state is determined by its sequence of amino acids alone…
Connect AI applications to enterprise data using embedding and reranking models for information retrieval.
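A minimal sketch of that embed-then-rerank pattern is shown below, using sentence-transformers as a stand-in; the model names and sample documents are illustrative assumptions, and production services expose the same two-stage idea through their own APIs:

```python
# Embed-then-rerank sketch: retrieve candidates by vector similarity,
# then reorder them with a cross-encoder. Model names and docs are illustrative.
from sentence_transformers import SentenceTransformer, CrossEncoder, util

docs = [
    "Q3 revenue grew 12% year over year.",
    "The support portal is down for maintenance.",
    "Employee travel policy was updated in May.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How did revenue change last quarter?"
doc_emb = embedder.encode(docs, convert_to_tensor=True)
query_emb = embedder.encode(query, convert_to_tensor=True)

# Stage 1: coarse retrieval by cosine similarity.
hits = util.semantic_search(query_emb, doc_emb, top_k=3)[0]
candidates = [docs[h["corpus_id"]] for h in hits]

# Stage 2: rerank the candidates with the cross-encoder.
scores = reranker.predict([(query, c) for c in candidates])
ranked = [c for _, c in sorted(zip(scores, candidates), reverse=True)]
print(ranked[0])
```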
AI agents present a significant opportunity for businesses to scale and elevate customer service and support interactions. By automating routine inquiries and improving response times, these agents improve efficiency and customer satisfaction, helping organizations stay competitive. However, alongside these benefits, AI agents come with risks. Large language models (LLMs) are vulnerable to…
Designing a therapeutic protein that specifically binds its target in drug discovery is a staggering challenge. Traditional workflows are often a painstaking trial-and-error process: iterating through thousands of candidates, with each synthesis and validation round taking months, if not years. Considering the average human protein is 430 amino acids long, the number of possible designs translates to…
This post was originally published on July 29, 2024, but has been extensively revised with NVIDIA AI Blueprint information. Traditional video analytics applications and their development workflow are typically built on fixed-function, limited models that are designed to detect and identify only a select set of predefined objects. With generative AI, NVIDIA NIM microservices…
In today's fast-paced business environment, providing exceptional customer service is no longer just a nice-to-have; it's a necessity. Whether addressing technical issues, resolving billing questions, or providing service updates, customers expect quick, accurate, and personalized responses at their convenience. However, achieving this level of service comes with significant challenges.
The evolution of modern application development has led to a significant shift toward microservice-based architectures. This approach offers great flexibility and scalability, but it also introduces new complexities, particularly in the realm of security. In the past, engineering teams were responsible for a handful of security aspects in their monolithic applications. Now, with microservices…
Today, brands and their creative agencies are under huge strain to create and deliver high-quality, accurate product images at scale, from campaign key visuals to packshots for e-commerce. Audience-targeted content, such as personalized and localized visual variations, adds further layers of complexity to production. Production costs, short timelines, resources…
Everything that is manufactured is first simulated with advanced physics solvers. Real-time digital twins (RTDTs) are the cutting edge of computer-aided engineering (CAE) simulation because they enable immediate feedback in the engineering design loop. They empower engineers to innovate freely and rapidly explore new designs by experiencing, in real time, the effects of any change in the simulation.
NVIDIA AI Workbench is a free development environment manager that streamlines data science, AI, and machine learning (ML) projects on systems of choice. The goal is to provide a frictionless way to create, compute, and collaborate on and across PCs, workstations, data centers, and clouds. The basic user experience is straightforward. This post explores highlights of the October release…
Addressing software security issues is becoming more challenging as the number of vulnerabilities reported in the CVE database continues to grow at an accelerated pace. Assessing a single container for vulnerabilities requires the collection, comprehension, and synthesis of hundreds of pieces of information. With over 200K vulnerabilities reported by the end of 2023, the traditional approach to…
Providing customers with quality service remains a top priority for businesses across industries, from answering questions and troubleshooting issues to facilitating online orders. As businesses scale operations and expand offerings globally to compete, the demand for seamless customer service grows exponentially. Searching knowledge base articles or navigating complex phone trees can be a…
Trillions of PDF files are generated every year, each file likely consisting of multiple pages filled with various content types, including text, images, charts, and tables. This goldmine of data can only be used as quickly as humans can read and understand it. But with generative AI and retrieval-augmented generation (RAG), this untapped data can be used to uncover business insights that…
Now available: NIM Agent Blueprints for digital humans, multimodal PDF data extraction, and drug discovery.