Tanay Varshney – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins 2025-05-23T19:07:01Z http://www.open-lab.net/blog/feed/ Tanay Varshney <![CDATA[An Easy Introduction to LLM Reasoning, AI Agents, and Test Time Scaling]]> http://www.open-lab.net/blog/?p=98984 2025-05-23T19:07:01Z 2025-05-23T19:06:39Z

Agents have been the primary drivers of applying large language models (LLMs) to solve complex problems. Since AutoGPT in 2023, various techniques have been...]]>

Agents have been the primary drivers of applying large language models (LLMs) to solve complex problems. Since AutoGPT in 2023, various techniques have been developed to build reliable agents across industries. The discourse around agentic reasoning and AI reasoning models further adds a layer of nuance when designing these applications. The rapid pace of this development also makes it hard for…

]]> Tanay Varshney <![CDATA[Build Enterprise AI Agents with Advanced Open NVIDIA Llama Nemotron Reasoning Models]]> http://www.open-lab.net/blog/?p=97155 2025-05-05T16:01:49Z 2025-04-08T22:05:00Z

This updated post was originally published on March 18, 2025. Organizations are embracing AI agents to enhance productivity and streamline operations. To...]]>

This updated post was originally published on March 18, 2025. Organizations are embracing AI agents to enhance productivity and streamline operations. To maximize their impact, these agents need strong reasoning abilities to navigate complex problems, uncover hidden connections, and make logical decisions autonomously in dynamic environments. Due to their ability to tackle complex…

]]> Tanay Varshney <![CDATA[NVIDIA NeMo Retriever Delivers Accurate Multimodal PDF Data Extraction 15x Faster]]> http://www.open-lab.net/blog/?p=97161 2025-04-23T00:13:16Z 2025-03-18T19:20:51Z

Enterprises are generating and storing more multimodal data than ever before, yet traditional retrieval systems remain largely text-focused. While they can...]]>

Enterprises are generating and storing more multimodal data than ever before, yet traditional retrieval systems remain largely text-focused. While they can surface insights from written content, they aren’t extracting critical information embedded in tables, charts, and infographics—often the most information-dense elements of a document. Without a multimodal retrieval system…

]]> Tanay Varshney <![CDATA[How Using a Reranking Microservice Can Improve Accuracy and Costs of Information Retrieval]]> http://www.open-lab.net/blog/?p=96363 2025-03-06T20:05:47Z 2025-03-06T18:33:38Z

Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents,...]]>

Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents, and AI assistants. These systems demand retrieval processes that are accurate and computationally efficient to deliver precise insights, enhance user experiences, and maintain scalability. Retrieval-augmented generation (RAG) is used to…

]]> Tanay Varshney <![CDATA[An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio]]> http://www.open-lab.net/blog/?p=93893 2024-12-16T21:53:48Z 2024-12-16T17:00:00Z

Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information from across...]]>

Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information from across multiple modalities, including text, images, tables, audio, video, and more. In our previous post, An Easy Introduction to Multimodal Retrieval-Augmented Generation, we discussed how to tackle text and images. This post extends this conversation…

]]> Tanay Varshney <![CDATA[Build an Enterprise-Scale Multimodal PDF Data Extraction Pipeline with an NVIDIA AI Blueprint]]> http://www.open-lab.net/blog/?p=87948 2024-11-14T04:04:51Z 2024-08-28T15:00:00Z

Trillions of PDF files are generated every year, each file likely consisting of multiple pages filled with various content types, including text, images,...]]>

Trillions of PDF files are generated every year, each file likely consisting of multiple pages filled with various content types, including text, images, charts, and tables. This goldmine of data can only be used as quickly as humans can read and understand it. But with generative AI and retrieval-augmented generation (RAG), this untapped data can be used to uncover business insights that…

]]> Tanay Varshney <![CDATA[Develop Production-Grade Text Retrieval Pipelines for RAG with NVIDIA NeMo Retriever]]> http://www.open-lab.net/blog/?p=85762 2024-10-28T21:50:54Z 2024-07-23T15:15:00Z

Enterprises are sitting on a goldmine of data waiting to be used to improve efficiency, save money, and ultimately enable higher productivity. With generative...]]>

Enterprises are sitting on a goldmine of data waiting to be used to improve efficiency, save money, and ultimately enable higher productivity. With generative AI, developers can build and deploy an agentic flow or a retrieval-augmented generation (RAG) chatbot, while ensuring the insights provided are based on the most accurate and up-to-date information. Building these solutions requires not…

]]> Tanay Varshney <![CDATA[Creating Synthetic Data Using Llama 3.1 405B]]> http://www.open-lab.net/blog/?p=85922 2024-08-08T18:48:35Z 2024-07-23T15:15:00Z

Synthetic data isn't about creating new information. It's about transforming existing information to create different variants. For over a decade, synthetic...]]>

Synthetic data isn’t about creating new information. It’s about transforming existing information to create different variants. For over a decade, synthetic data has been used to improve model accuracy across the board—whether it is transforming images to improve object detection models, strengthening fraudulent credit card detection, or improving BERT models for QA. What’s new?

]]> Tanay Varshney <![CDATA[NVIDIA Text Embedding Model Tops MTEB Leaderboard]]> http://www.open-lab.net/blog/?p=83571 2024-10-28T21:57:46Z 2024-06-10T17:00:00Z

The latest embedding model from NVIDIA, NV-Embed, set a new record for embedding accuracy with a score of 69.32 on the Massive Text Embedding Benchmark...]]>

The latest embedding model from NVIDIA—NV-Embed—set a new record for embedding accuracy with a score of 69.32 on the Massive Text Embedding Benchmark (MTEB), which covers 56 embedding tasks. Highly accurate and effective models like NV-Embed are key to transforming vast amounts of data into actionable insights. NVIDIA provides top-performing models through the NVIDIA API catalog.

]]> Tanay Varshney <![CDATA[An Easy Introduction to Multimodal Retrieval-Augmented Generation]]> http://www.open-lab.net/blog/?p=79351 2024-12-16T17:22:56Z 2024-03-20T18:00:00Z

A retrieval-augmented generation (RAG) application has exponentially higher utility if it can work with a wide variety of data types: tables, graphs, charts,...]]>

A retrieval-augmented generation (RAG) application has exponentially higher utility if it can work with a wide variety of data types—tables, graphs, charts, and diagrams—and not just text. This requires a framework that can understand and generate responses by coherently interpreting textual, visual, and other forms of information. In this post, we discuss the challenges of tackling multiple…

]]> 7 Tanay Varshney <![CDATA[Translate Your Enterprise Data into Actionable Insights with NVIDIA NeMo Retriever]]> http://www.open-lab.net/blog/?p=79433 2024-10-28T21:47:47Z 2024-03-18T22:00:00Z

Across every industry, and every job function, generative AI is activating the potential within organizations, turning data into knowledge and empowering...]]>

Across every industry, and every job function, generative AI is activating the potential within organizations—turning data into knowledge and empowering employees to work more efficiently. Accurate, relevant information is critical for making data-backed decisions. For this reason, enterprises continue to invest in ways to improve how business data is stored, indexed, and accessed.

]]> 2 Tanay Varshney <![CDATA[Evaluating Retriever for Enterprise-Grade RAG]]> http://www.open-lab.net/blog/?p=78222 2024-10-28T21:59:05Z 2024-02-23T19:02:26Z

The conversation about designing and evaluating Retrieval-Augmented Generation (RAG) systems is a long, multi-faceted discussion. Even when we look at retrieval...]]>

The conversation about designing and evaluating Retrieval-Augmented Generation (RAG) systems is a long, multi-faceted discussion. Even when we look at retrieval on its own, developers selectively employ many techniques, such as query decomposition, re-writing, building soft filters, and more, to increase the accuracy of their RAG pipelines. While the techniques vary from system to system…

]]> 0 Tanay Varshney <![CDATA[Build an LLM-Powered API Agent for Task Execution]]> http://www.open-lab.net/blog/?p=77925 2024-05-02T16:46:58Z 2024-02-21T21:30:00Z

Developers have long been building interfaces like web apps to enable users to leverage the core products being built. To learn how to work with data in your...]]>

Developers have long been building interfaces like web apps to enable users to leverage the core products being built. To learn how to work with data in your large language model (LLM) application, see my previous post, Build an LLM-Powered Data Agent for Data Analysis. In this post, I discuss a method to add free-form conversation as another interface with APIs. It works toward a solution that…

]]> 0 Tanay Varshney <![CDATA[Build an LLM-Powered Data Agent for Data Analysis]]> http://www.open-lab.net/blog/?p=77831 2024-02-22T19:58:53Z 2024-02-20T19:30:00Z

An AI agent is a system consisting of planning capabilities, memory, and tools to perform tasks requested by a user. For complex tasks such as data analytics or...]]>

An AI agent is a system consisting of planning capabilities, memory, and tools to perform tasks requested by a user. For complex tasks such as data analytics or interacting with complex systems, your application may depend on collaboration among different types of agents. For more context, see Introduction to LLM Agents and Building Your First LLM Agent Application. This post explains the…

]]> 1 Tanay Varshney <![CDATA[Building Your First LLM Agent Application]]> http://www.open-lab.net/blog/?p=74179 2025-01-09T03:33:26Z 2023-11-30T19:12:44Z

When building a large language model (LLM) agent application, there are four key components you need: an agent core, a memory module, agent tools, and a...]]>

When building a large language model (LLM) agent application, there are four key components you need: an agent core, a memory module, agent tools, and a planning module. Whether you are designing a question-answering agent, multi-modal agent, or swarm of agents, you can consider many implementation frameworks—from open-source to production-ready. For more information, see Introduction to LLM…

]]> 0 Tanay Varshney <![CDATA[Introduction to LLM Agents]]> http://www.open-lab.net/blog/?p=74178 2024-06-24T16:23:00Z 2023-11-30T17:00:00Z

Consider a large language model (LLM) application that is designed to help financial analysts answer questions about the performance of a company. With a...]]>

Consider a large language model (LLM) application that is designed to help financial analysts answer questions about the performance of a company. With a well-designed retrieval augmented generation (RAG) pipeline, analysts can answer questions like, “What was X corporation’s total revenue for FY 2022?” This information can be easily extracted from financial statements by a seasoned analyst.

]]> 0 Tanay Varshney <![CDATA[An Introduction to Large Language Models: Prompt Engineering and P-Tuning]]> http://www.open-lab.net/blog/?p=63707 2023-11-28T19:18:25Z 2023-04-26T16:00:00Z

ChatGPT has made quite an impression. Users are excited to use the AI chatbot to ask questions, write poems, imbue a persona for interaction, act as a personal...]]>

ChatGPT has made quite an impression. Users are excited to use the AI chatbot to ask questions, write poems, imbue a persona for interaction, act as a personal assistant, and more. Large language models (LLMs) power ChatGPT, and these models are the topic of this post. Before considering LLMs more carefully, we would first like to establish what a language model does. A language model gives…

]]> 0 Tanay Varshney <![CDATA[NVIDIA Enables Trustworthy, Safe, and Secure Large Language Model Conversational Systems]]> http://www.open-lab.net/blog/?p=63745 2024-11-20T23:04:35Z 2023-04-25T13:00:00Z

Large language models (LLMs) are incredibly powerful and capable of answering complex questions, performing feats of creative writing, developing, debugging...]]>

Large language models (LLMs) are incredibly powerful and capable of answering complex questions, performing feats of creative writing, developing, debugging source code, and so much more. You can build incredibly sophisticated LLM applications by connecting them to external tools, for example reading data from a real-time source, or enabling an LLM to decide what action to take given a user’s…

]]> 1 Tanay Varshney <![CDATA[Optimizing and Serving Models with NVIDIA TensorRT and NVIDIA Triton]]> http://www.open-lab.net/blog/?p=50553 2025-03-18T18:23:55Z 2022-07-20T16:00:00Z

Imagine that you have trained your model with PyTorch, TensorFlow, or the framework of your choice, are satisfied with its accuracy, and are considering...]]>

Join the NVIDIA Triton and NVIDIA TensorRT community to stay current on the latest product updates, bug fixes, content, best practices, and more. As of 3/18/25, NVIDIA Triton Inference Server is now NVIDIA Dynamo. Imagine that you have trained your model with PyTorch, TensorFlow, or the framework of your choice, are satisfied with its accuracy, and are considering deploying it as a…

]]> 1 Tanay Varshney <![CDATA[Accelerating AI Inference Workloads with NVIDIA A30 GPU]]> http://www.open-lab.net/blog/?p=47944 2022-08-30T18:58:43Z 2022-05-11T22:43:14Z

NVIDIA A30 GPU is built on the latest NVIDIA Ampere Architecture to accelerate diverse workloads like AI inference at scale, enterprise training, and HPC...]]>

NVIDIA A30 GPU is built on the latest NVIDIA Ampere Architecture to accelerate diverse workloads like AI inference at scale, enterprise training, and HPC applications for mainstream servers in data centers. The A30 PCIe card combines the third-generation Tensor Cores with large HBM2 memory (24 GB) and fast GPU memory bandwidth (933 GB/s) in a low-power envelope (maximum 165 W).

]]> 1 Tanay Varshney <![CDATA[Building and Deploying Conversational AI Models Using NVIDIA TAO Toolkit]]> http://www.open-lab.net/blog/?p=24079 2023-03-22T01:16:50Z 2021-11-09T16:15:24Z

Sign up for the latest Speech AI news from NVIDIA. Conversational AI is a set of technologies enabling human-like interactions between humans and devices based...]]>

Sign up for the latest Speech AI news from NVIDIA. Conversational AI is a set of technologies enabling human-like interactions between humans and devices based on the most natural interfaces for us: speech and natural language. Systems based on conversational AI can understand commands by recognizing speech and text, translating on-the-fly between different languages…

]]> 2 Tanay Varshney <![CDATA[Speech Recognition: Deploying Models to Production]]> http://www.open-lab.net/blog/?p=39744 2023-12-30T01:51:25Z 2021-11-09T09:37:00Z

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Domain-Specific Audio...]]>

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Domain-Specific Audio Transcriptions Using NVIDIA Riva. For part 2, see Speech Recognition: Customizing Models to Your Domain Using Transfer Learning. NVIDIA Riva is an AI speech SDK for developing real-time applications like transcription, virtual assistants…

]]> 0 Tanay Varshney <![CDATA[Speech Recognition: Customizing Models to Your Domain Using Transfer Learning]]> http://www.open-lab.net/blog/?p=39742 2023-03-22T01:16:53Z 2021-11-09T09:36:00Z

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Transcriptions Using...]]>

This post is part of a series about generating accurate speech transcription. For part 1, see Speech Recognition: Generating Accurate Transcriptions Using NVIDIA Riva. For part 3, see Speech Recognition: Deploying Models to Production. Creating a new AI deep learning model from scratch is an extremely time- and resource-intensive process. A common solution to this problem is to employ…

]]> 0 Tanay Varshney <![CDATA[Speech Recognition: Generating Accurate Domain-Specific Audio Transcriptions Using NVIDIA Riva]]> http://www.open-lab.net/blog/?p=39715 2025-01-23T19:24:23Z 2021-11-09T09:35:00Z

This post is part of a series about generating accurate speech transcription. For part 2, see Speech Recognition: Customizing Models to Your Domain Using...]]>

This post is part of a series about generating accurate speech transcription. For part 2, see Speech Recognition: Customizing Models to Your Domain Using Transfer Learning. For part 3, see Speech Recognition: Deploying Models to Production. Every day millions of audio minutes are produced across several industries such as Telecommunications, Finance, and Unified Communications as a Service…

]]> 2 Tanay Varshney <![CDATA[Improving Real-Time Communication Experiences with NVIDIA Maxine]]> http://www.open-lab.net/blog/?p=39258 2023-11-02T20:14:10Z 2021-10-28T16:00:00Z

The audio and video quality of real-time communication applications such as virtual collaboration and content creation applications is the true gauge of...]]>

The audio and video quality of real-time communication applications such as virtual collaboration and content creation applications is the true gauge of users’ real-time communication experience. They rely heavily on network bandwidth and user equipment quality. Narrow network bandwidth and low-quality equipment produce unstable and noisy audio and video outputs. This problem is often…

]]> 0 Tanay Varshney <![CDATA[Transforming Noisy Low-Resolution into High-Quality Videos for Captivating End-User Experiences]]> http://www.open-lab.net/blog/?p=37627 2023-11-03T07:15:12Z 2021-09-21T20:05:12Z

Video conferencing, audio and video streaming, and telecommunications recently exploded due to pandemic-related closures and work-from-home policies....]]>

Video conferencing, audio and video streaming, and telecommunications recently exploded due to pandemic-related closures and work-from-home policies. Businesses, educational institutions, and public-sector agencies are experiencing a skyrocketing demand for virtual collaboration and content creation applications. The crucial part of online communication is the video stream, whether it’s a simple…

]]> 0 Tanay Varshney <![CDATA[Achieving Noise-Free Audio for Virtual Collaboration and Content Creation Applications]]> http://www.open-lab.net/blog/?p=37611 2023-11-03T07:15:12Z 2021-09-21T19:41:05Z

With audio and video streaming, conferencing, and telecommunication on the rise, it has become essential for developers to build applications with outstanding...]]>

With audio and video streaming, conferencing, and telecommunication on the rise, it has become essential for developers to build applications with outstanding audio quality and enable end users to communicate and collaborate effectively. Various background noises can disrupt communication, ranging from traffic and construction to dogs barking and babies crying. Moreover, a user could talk in a…

]]> 1 Tanay Varshney <![CDATA[SoftBank Solves Key Mobile Edge Computing Challenges Using NVIDIA Maxine]]> http://www.open-lab.net/blog/?p=35334 2023-10-25T23:51:31Z 2021-08-14T00:57:00Z

SoftBank is a global technology player that aspires to drive the Information Revolution. The company operates in broadband, fixed-line telecommunications,...]]>

SoftBank is a global technology player that aspires to drive the Information Revolution. The company operates in broadband, fixed-line telecommunications, ecommerce, information technology, finance, media, and marketing. To improve their users’ communication experience, and overcome the 5G capacity and coverage issues, SoftBank has used NVIDIA Maxine GPU-accelerated SDKs with state-of-the-art AI…

]]> 0