As AI workloads grow in complexity and scale, from large language models (LLMs) to agentic AI reasoning and physical AI, the demand for faster, more scalable compute infrastructure has never been greater. Meeting these demands requires rethinking system architecture from the ground up. NVIDIA is advancing platform architecture with NVIDIA ConnectX-8 SuperNICs, the industry's first SuperNIC to…
High-performance computing and deep learning workloads are extremely sensitive to latency. Packet loss forces retransmission or stalls in the communication pipeline, which directly increases latency and disrupts the synchronization between GPUs. This can degrade the performance of collective operations such as all-reduce or broadcast, where every GPU's participation is required before progressing.
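The straggler effect described above can be sketched in a few lines of Python. This is a toy model, not a real collective implementation: it only illustrates that a synchronizing collective finishes no sooner than its slowest participant, so one retransmission delay stalls every GPU.

```python
def allreduce_step_time(per_gpu_latency_ms):
    # A collective such as all-reduce cannot complete until the slowest
    # participant contributes, so one delayed GPU stalls the whole group.
    return max(per_gpu_latency_ms)

# Eight GPUs, each with a nominal 0.1 ms network latency
latencies = [0.1] * 8
print(allreduce_step_time(latencies))   # all GPUs arrive together

# A single packet loss forces a retransmission that delays one GPU by 5 ms
latencies[3] += 5.0
print(allreduce_step_time(latencies))   # the entire collective now waits
```

In a real fabric the fix is to avoid the loss in the first place, which is why lossless, congestion-managed networks matter for distributed training.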
NVIDIA is breaking new ground by integrating silicon photonics directly with its NVIDIA Quantum and NVIDIA Spectrum switch ICs. At GTC 2025, we announced the world's most advanced Silicon Photonics Switch systems, powered by cutting-edge 200G SerDes technology. This innovation, known as co-packaged silicon photonics, delivers significant benefits such as 3.5x lower power consumption…
Advanced AI models such as DeepSeek-R1 are proving that enterprises can now build cutting-edge AI models specialized with their own data and expertise. These models can be tailored to unique use cases, tackling diverse challenges like never before. Based on the success of early AI adopters, many organizations are shifting their focus to full-scale production AI factories. Yet the process of…
Explore the latest advancements in AI infrastructure, acceleration, and security from March 17-21.
NVIDIA Enterprise Reference Architectures (Enterprise RAs) can reduce the time and cost of deploying AI infrastructure solutions. They provide a streamlined approach for building flexible and cost-effective accelerated infrastructure while ensuring compatibility and interoperability. The latest Enterprise RA details an optimized cluster configuration for systems integrated with NVIDIA GH200…
AI factories rely on more than just compute fabrics. While the East-West network connecting the GPUs is critical to AI application performance, the storage fabric, which connects high-speed storage arrays, is equally important. Storage performance plays a key role across several stages of the AI lifecycle, including training checkpointing, inference techniques such as retrieval-augmented generation…
Last month at the Supercomputing 2024 conference, NVIDIA announced the availability of NVIDIA H200 NVL, the latest NVIDIA Hopper platform. Optimized for enterprise workloads, NVIDIA H200 NVL is a versatile platform that delivers accelerated performance for a wide range of AI and HPC applications. With its dual-slot PCIe form-factor and 600W TGP, the H200 NVL enables flexible configuration options…
In the era of generative AI, accelerated networking is essential to build high-performance computing fabrics for massively distributed AI workloads. NVIDIA continues to lead in this space, offering state-of-the-art Ethernet and InfiniBand solutions that maximize the performance and efficiency of AI factories and cloud data centers. At the core of these solutions are NVIDIA SuperNICs, a new…
In today's rapidly evolving technological landscape, staying ahead of the curve is not just a goal; it's a necessity. The surge of innovations, particularly in AI, is driving dramatic changes across the technology stack. One area witnessing profound transformation is Ethernet networking, a cornerstone of digital communication that has been foundational to enterprise and data center…
The new release includes support for Spectrum-X 1.1 RA and new features for AI Cloud Data Centers.
The NVIDIA DOCA acceleration framework empowers developers with extensive libraries, drivers, and APIs to create high-performance applications and services for NVIDIA BlueField DPUs and SuperNICs. DOCA 2.7 is a comprehensive, feature-rich release that further underpins the scope and value of the DOCA software framework. It offers several new libraries, turn-key applications…
Join Pure Storage and NVIDIA on April 25 to discover the benefits of enhancing LLMs with RAG for enterprise-scale generative AI applications.
In the era of generative AI, where machines are not just learning from data but generating human-like text, images, video, and more, retrieval-augmented generation (RAG) stands out as a groundbreaking approach. A RAG workflow builds on large language models (LLMs), which can understand queries and generate responses. However, LLMs have limitations, including training complexity and a lack of…
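The core RAG loop, retrieve relevant documents, then ground the LLM's prompt in them, can be sketched in plain Python. This is a minimal illustration with a toy bag-of-words similarity; a production pipeline would use a trained embedding model, a vector database, and an actual LLM call, none of which are shown here.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; real RAG uses a learned embedding model
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, k=1):
    # Rank documents by similarity to the query and keep the top k
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query, documents):
    # Ground the LLM's answer in the retrieved context
    context = "\n".join(retrieve(query, documents, k=2))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "NVIDIA BlueField DPUs offload networking tasks.",
    "RAG augments an LLM with retrieved enterprise documents.",
    "Storage fabrics connect high-speed storage arrays.",
]
print(build_prompt("How does RAG help an LLM?", docs))
```

The point of the pattern is that the model answers from retrieved, up-to-date enterprise data rather than only from what it memorized during training.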
NVIDIA Spectrum-X is swiftly gaining traction as the leading networking platform tailored for AI in hyperscale cloud infrastructures. Spectrum-X networking technologies help enterprise customers accelerate generative AI workloads. NVIDIA announced significant OEM adoption of the platform in a November 2023 press release, along with an update on the NVIDIA Israel-1 Supercomputer powered by Spectrum…
Accelerated networking combines CPUs, GPUs, DPUs (data processing units), or SuperNICs into an accelerated computing fabric specifically designed to optimize networking workloads. It uses specialized hardware to offload demanding tasks to enhance server capabilities. As AI and other new workloads continue to grow in complexity and scale, the need for accelerated networking becomes paramount.
Traditional cloud data centers have served as the bedrock of computing infrastructure for over a decade, catering to a diverse range of users and applications. However, data centers have evolved in recent years to keep up with advancements in technology and the surging demand for AI-driven computing. This post explores the pivotal role that networking plays in shaping the future of data centers…
As data generation continues to increase, linear performance scaling has become an absolute requirement for scale-out storage. Storage networks are like car roadway systems: if the road is not built for speed, the potential speed of a car does not matter. Even a Ferrari is slow on an unpaved dirt road full of obstacles. Scale-out storage performance can be hindered by the Ethernet fabric…
Large language models (LLMs) and AI applications such as ChatGPT and DALL-E have recently seen rapid growth. Thanks to GPUs, CPUs, DPUs, high-speed storage, and AI-optimized software innovations, AI is now widely accessible. You can even deploy AI in the cloud or on-premises. Yet AI applications can be very taxing on the network, and this growth is burdening CPU and GPU servers…
AI has seamlessly integrated into our lives and changed us in ways we couldn't even imagine just a few years ago. In the past, the perception of AI was something futuristic and complex. Only giant corporations used AI on their supercomputers with HPC technologies to forecast weather and make breakthrough discoveries in healthcare and science. Today, thanks to GPUs, CPUs, high-speed storage…
Everyone agrees that open solutions are the best solutions, but there are few truly open operating systems for Ethernet switches. At NVIDIA, we embraced open source for our Ethernet switches. Besides supporting SONiC, we have contributed many innovations to open-source community projects. This post was originally published on the Mellanox blog in June 2018 but has been updated.
Enterprises of all sizes are increasingly leveraging virtualization and hyperconverged infrastructure (HCI). This technology delivers reliable and secure compute resources for operations while reducing data center footprint. HCI clusters rely on robust, feature-rich networking fabrics to deliver on-premises solutions that can seamlessly connect to the cloud. Microsoft Azure Stack HCI is a…
PTP uses an algorithm and method for synchronizing clocks on various devices across packet-based networks to provide submicrosecond accuracy. NVIDIA Spectrum supports PTP in both one-step and two-step modes and can serve either as a boundary or a transparent clock. Here's how the switch calculates and synchronizes time in one-step mode when acting as a transparent clock. Later in this post…
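The essence of a one-step transparent clock can be shown with a small sketch. A transparent clock measures how long a PTP event message resides inside the switch (egress timestamp minus ingress timestamp) and adds that residence time to the message's correctionField as the packet leaves, so downstream clocks can subtract switch delay. This simplified model works in whole nanoseconds and ignores the 2^16 sub-nanosecond scaling the real PTP correctionField uses; in one-step hardware the update happens on the fly, with no follow-up message.

```python
def update_correction_field(correction_ns, ingress_ts_ns, egress_ts_ns):
    # Residence time: how long the packet spent inside the switch
    residence_ns = egress_ts_ns - ingress_ts_ns
    # A transparent clock accumulates residence time into correctionField
    return correction_ns + residence_ns

# A Sync message arrives carrying correctionField = 500 ns and spends
# 1200 ns inside the switch before egress
print(update_correction_field(500, 1_000_000, 1_001_200))  # 1700
```

Because each transparent clock along the path adds its own residence time, the end station can compute the pure link delay regardless of how long packets queued inside intermediate switches.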
Does the switch matter? The network fabric is key to the performance of modern data centers. There are many requirements for data center switches, but the most basic is to provide equal amounts of bandwidth to all clients so that resources are shared evenly. Without fair networking, all workloads experience unpredictable performance due to throughput deterioration, delay…
This blog post was updated on 9/23/2024. NVIDIA is committed to your success when you choose SONiC (Software for Open Networking in the Cloud), the free, community-developed, Linux-based network operating system (NOS) hardened in the data centers of some of the largest cloud service providers. SONiC is an ideal choice for centers looking for a low-cost, scalable, and fully controllable NOS…