Anu Srivastava – NVIDIA Technical Blog

News and tutorials for developers, data scientists, and IT admins. Feed: http://www.open-lab.net/blog/feed/ (updated 2025-04-23T02:39:30Z)

NVIDIA Accelerates Inference on Meta Llama 4 Scout and Maverick
Anu Srivastava, http://www.open-lab.net/blog/?p=98468, published 2025-04-06T02:18:34Z, updated 2025-04-22T23:57:03Z

The newest generation of the popular Llama AI models is here with Llama 4 Scout and Llama 4 Maverick. Accelerated by NVIDIA open-source software, they can achieve over 40K output tokens per second on NVIDIA Blackwell B200 GPUs, and are available to try as NVIDIA NIM microservices. The Llama 4 models are natively multimodal and multilingual, and they use a mixture-of-experts (MoE) architecture.

Lightweight, Multimodal, Multilingual Gemma 3 Models Are Streamlined for Performance
Anu Srivastava, http://www.open-lab.net/blog/?p=96770, published 2025-03-12T08:45:00Z, updated 2025-04-23T00:33:31Z

Building AI systems with foundation models requires carefully balancing resources such as memory, latency, storage, compute, and more. One size does not fit all for developers who must manage cost and user experience when bringing generative AI capability to the rapidly growing ecosystem of AI-powered applications. You need options for high-quality, customizable models that can support large…

Latest Multimodal Addition to Microsoft Phi SLMs Trained on NVIDIA GPUs
Anu Srivastava, http://www.open-lab.net/blog/?p=96519, published 2025-02-26T22:05:00Z, updated 2025-04-23T02:39:30Z

Large language models (LLMs) have permeated every industry and changed the potential of technology. However, their massive size makes them impractical under the resource constraints that many companies currently face. The rise of small language models (SLMs) bridges quality and cost by offering models with a smaller resource footprint. SLMs are a subset of language models that tend to…

Boost Llama 3.3 70B Inference Throughput 3x with NVIDIA TensorRT-LLM Speculative Decoding
Anu Srivastava, http://www.open-lab.net/blog/?p=94146, published 2024-12-17T17:00:00Z, updated 2024-12-19T23:03:40Z

Meta’s Llama collection of open large language models (LLMs) continues to grow with the recent addition of Llama 3.3 70B, a text-only instruction-tuned model. Llama 3.3 provides enhanced performance relative to the older Llama 3.1 70B model and can even match the capabilities of the larger, more computationally expensive Llama 3.1 405B model on several tasks, including math, reasoning, coding…
