Sergio Perez – NVIDIA Technical Blog
News and tutorials for developers, data scientists, and IT admins
Feed: http://www.open-lab.net/blog/feed/ (updated 2025-06-26)

Benchmarking LLM Inference Costs for Smarter Scaling and Deployment
http://www.open-lab.net/blog/?p=102298
Published: 2025-06-18

This is the third post in the large language model latency-throughput benchmarking series, which aims to show developers how to determine the cost of LLM inference by estimating the total cost of ownership (TCO). See LLM Inference Benchmarking: Fundamental Concepts for background on common benchmarking metrics and parameters. See LLM Inference Benchmarking Guide: NVIDIA…
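As a rough illustration of the kind of TCO arithmetic the series covers, the sketch below derives a cost per million tokens from an assumed GPU hourly price and a measured throughput. All numbers and variable names here are hypothetical placeholders, not figures from the post.

```python
# Back-of-the-envelope serving-cost estimate (illustrative numbers only).
gpu_cost_per_hour = 3.50           # assumed cloud price per GPU, USD
gpus_per_replica = 8               # assumed tensor-parallel degree
throughput_tokens_per_sec = 12000  # assumed measured tokens/s per replica

replica_cost_per_hour = gpu_cost_per_hour * gpus_per_replica
tokens_per_hour = throughput_tokens_per_sec * 3600
cost_per_million_tokens = replica_cost_per_hour / tokens_per_hour * 1e6

print(f"Cost per 1M tokens: ${cost_per_million_tokens:.3f}")
```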

Floating-Point 8: An Introduction to Efficient, Lower-Precision AI Training
http://www.open-lab.net/blog/?p=101197
Published: 2025-06-04

With the growth of large language models (LLMs), deep learning is advancing in both model architecture design and computational efficiency. Mixed precision training, which strategically employs lower-precision formats such as brain floating point 16 (BF16) for computationally intensive operations while retaining the stability of 32-bit floating point (FP32) where needed, has been a key strategy for…
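As a minimal sketch of the mixed-precision pattern described here (BF16 compute with FP32 master weights), the PyTorch snippet below uses torch.autocast. The model, data, and hyperparameters are placeholders, and FP8 training itself requires dedicated library support (for example, Transformer Engine) that is not shown.

```python
import torch

# Illustrative BF16 mixed-precision training step (assumes a CUDA GPU
# with BF16 support); model, optimizer, and data are stand-ins.
model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
x = torch.randn(32, 1024, device="cuda")
target = torch.randn(32, 1024, device="cuda")

optimizer.zero_grad()
# Matmul-heavy ops inside autocast run in BF16; parameters stay FP32.
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
    loss = torch.nn.functional.mse_loss(model(x), target)
loss.backward()   # gradients are produced in FP32 for the FP32 params
optimizer.step()
```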

Continued Pretraining of State-of-the-Art LLMs for Sovereign AI and Regulated Industries with Domyn and NVIDIA DGX Cloud
http://www.open-lab.net/blog/?p=95012
Published: 2025-01-16

In recent years, large language models (LLMs) have achieved extraordinary progress in areas such as reasoning, code generation, machine translation, and summarization. However, despite their advanced capabilities, foundation models have limitations when it comes to domain-specific expertise, such as finance or healthcare, or to capturing cultural and language nuances beyond English.
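For orientation, continued pretraining typically means resuming causal-language-model training of an existing checkpoint on a domain corpus. The sketch below shows that general pattern with the Hugging Face Trainer; the model ID, corpus file, and hyperparameters are hypothetical and unrelated to the Domyn/DGX Cloud setup covered in the post.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

# Placeholder base checkpoint and domain corpus, for illustration only.
model_name = "base-llm-checkpoint"   # hypothetical model ID
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

raw = load_dataset("text", data_files={"train": "domain_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=2048)

train = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# mlm=False makes the collator build next-token (causal LM) labels.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

args = TrainingArguments(output_dir="ckpt",
                         per_device_train_batch_size=1,
                         gradient_accumulation_steps=16,
                         learning_rate=1e-5,
                         num_train_epochs=1,
                         bf16=True)

Trainer(model=model, args=args, train_dataset=train,
        data_collator=collator).train()
```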
