NVIDIA announced the release of NVIDIA Dynamo today at GTC 2025. NVIDIA Dynamo is a high-throughput, low-latency open-source inference serving framework for deploying generative AI and reasoning models in large-scale distributed environments. The framework boosts the number of requests served by up to 30x when running the open-source DeepSeek-R1 models on NVIDIA Blackwell.
Mixture of experts (MoE) large language model (LLM) architectures have recently emerged, both in proprietary LLMs such as GPT-4 and in community models, with the open-source release of Mistral Mixtral 8x7B. The strong relative performance of the Mixtral model has raised much interest and numerous questions about MoE and its use in LLM architectures. So, what is MoE and why is it important?
From credit card transactions, social networks, and recommendation systems to transportation networks and protein-protein interactions in biology, graphs are the go-to data structure for modeling and analyzing intricate connections. Graph neural networks (GNNs), with their ability to learn and reason over graph-structured data, have emerged as a game-changer across various domains. However…
Fraud is a major problem for many financial services firms, costing billions of dollars each year, according to a recent Federal Trade Commission report. Financial fraud, fake reviews, bot attacks, account takeovers, and spam are all examples of online fraud and harmful activity. Although these firms employ techniques to combat online fraud, those methods can have severe limitations.
In this post, we detail the recently released NVIDIA Time Series Prediction Platform (TSPP), a tool designed to make it easy to compare and experiment with arbitrary combinations of forecasting models, time-series datasets, and other configurations. The TSPP also provides functionality to explore the hyperparameter search space, run accelerated model training using distributed training and Automatic Mixed…
Recommender systems drive engagement on many of the most popular online platforms. As data volume grows exponentially, data scientists increasingly turn from traditional machine learning methods to highly expressive, deep learning models to improve recommendation quality. Often, the recommendations are framed as modeling the completion of a user-item matrix, in which the user-item entry is the…