Annie Surla – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-06-12T18:51:02Z http://www.open-lab.net/blog/feed/

Annie Surla <![CDATA[New NVIDIA Llama Nemotron Nano Vision Language Model Tops OCR Benchmark for Accuracy]]> http://www.open-lab.net/blog/?p=100840 2025-06-12T18:50:47Z 2025-06-03T21:36:50Z Documents such as PDFs, graphs, charts, and dashboards are rich sources of data that, when extracted and organized, provide informative decision-making...]]>

Documents such as PDFs, graphs, charts, and dashboards are rich sources of data that, when extracted and organized, provide informative decision-making insights. From automating financial statement processing to improving business intelligence workflows, intelligent document processing is becoming a core component of AI solutions in enterprises. Organizations can accelerate the AI…

Annie Surla <![CDATA[An Easy Introduction to LLM Reasoning, AI Agents, and Test Time Scaling]]> http://www.open-lab.net/blog/?p=98984 2025-06-12T18:51:02Z 2025-05-23T19:06:39Z Agents have been the primary drivers of applying large language models (LLMs) to solve complex problems. Since AutoGPT in 2023, various techniques have been...]]>

Agents have been the primary drivers of applying large language models (LLMs) to solve complex problems. Since AutoGPT in 2023, various techniques have been developed to build reliable agents across industries. The discourse around agentic reasoning and AI reasoning models further adds a layer of nuance when designing these applications. The rapid pace of this development also makes it hard for…

Annie Surla <![CDATA[How Using a Reranking Microservice Can Improve Accuracy and Costs of Information Retrieval]]> http://www.open-lab.net/blog/?p=96363 2025-03-06T20:05:47Z 2025-03-06T18:33:38Z Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents,...]]>

Applications requiring high-performance information retrieval span a wide range of domains, including search engines, knowledge management systems, AI agents, and AI assistants. These systems demand retrieval processes that are accurate and computationally efficient to deliver precise insights, enhance user experiences, and maintain scalability. Retrieval-augmented generation (RAG) is used to…

Annie Surla <![CDATA[An Easy Introduction to Multimodal Retrieval-Augmented Generation for Video and Audio]]> http://www.open-lab.net/blog/?p=93893 2024-12-16T21:53:48Z 2024-12-16T17:00:00Z Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information from across...]]>

Building a multimodal retrieval-augmented generation (RAG) system is challenging. The difficulty comes from capturing and indexing information from across multiple modalities, including text, images, tables, audio, video, and more. In our previous post, An Easy Introduction to Multimodal Retrieval-Augmented Generation, we discussed how to tackle text and images. This post extends this conversation…

Annie Surla <![CDATA[An Introduction to Model Merging for LLMs]]> http://www.open-lab.net/blog/?p=90842 2024-10-31T18:33:13Z 2024-10-28T18:30:00Z One challenge organizations face when customizing large language models (LLMs) is the need to run multiple experiments, which produces only one useful model....]]>

One challenge organizations face when customizing large language models (LLMs) is the need to run multiple experiments, which produces only one useful model. While the cost of experimentation is typically low, and the results well worth the effort, this experimentation process does involve “wasted” resources, such as compute assets spent without their product being utilized…

Annie Surla <![CDATA[Build an Enterprise-Scale Multimodal PDF Data Extraction Pipeline with an NVIDIA AI Blueprint]]> http://www.open-lab.net/blog/?p=87948 2024-11-14T04:04:51Z 2024-08-28T15:00:00Z Trillions of PDF files are generated every year, each file likely consisting of multiple pages filled with various content types, including text, images,...]]>

Trillions of PDF files are generated every year, each file likely consisting of multiple pages filled with various content types, including text, images, charts, and tables. This goldmine of data can only be used as quickly as humans can read and understand it. But with generative AI and retrieval-augmented generation (RAG), this untapped data can be used to uncover business insights that…

Annie Surla <![CDATA[An Easy Introduction to Multimodal Retrieval-Augmented Generation]]> http://www.open-lab.net/blog/?p=79351 2024-12-16T17:22:56Z 2024-03-20T18:00:00Z A retrieval-augmented generation (RAG) application has exponentially higher utility if it can work with a wide variety of data types: tables, graphs, charts,...]]>

A retrieval-augmented generation (RAG) application has exponentially higher utility if it can work with a wide variety of data types—tables, graphs, charts, and diagrams—and not just text. This requires a framework that can understand and generate responses by coherently interpreting textual, visual, and other forms of information. In this post, we discuss the challenges of tackling multiple…

Annie Surla <![CDATA[How to Get Better Outputs from Your Large Language Model]]> http://www.open-lab.net/blog/?p=66169 2023-11-03T07:14:59Z 2023-06-14T16:18:05Z Large language models (LLMs) have generated excitement worldwide due to their ability to understand and process human language at a scale that is unprecedented....]]>

Large language models (LLMs) have generated excitement worldwide due to their ability to understand and process human language at a scale that is unprecedented. It has transformed the way that we interact with technology. Having been trained on a vast corpus of text, LLMs can manipulate and generate text for a wide variety of applications without much instruction or training. However…

Annie Surla <![CDATA[An Introduction to Large Language Models: Prompt Engineering and P-Tuning]]> http://www.open-lab.net/blog/?p=63707 2023-11-28T19:18:25Z 2023-04-26T16:00:00Z ChatGPT has made quite an impression. Users are excited to use the AI chatbot to ask questions, write poems, imbue a persona for interaction, act as a personal...]]>

ChatGPT has made quite an impression. Users are excited to use the AI chatbot to ask questions, write poems, imbue a persona for interaction, act as a personal assistant, and more. Large language models (LLMs) power ChatGPT, and these models are the topic of this post. Before considering LLMs more carefully, we would first like to establish what a language model does. A language model gives…
