
    NVIDIA ConnectX-8 SuperNICs Advance AI Platform Architecture with PCIe Gen6 Connectivity

    As AI workloads grow in complexity and scale—from large language models (LLMs) to agentic AI reasoning and physical AI—the demand for faster, more scalable compute infrastructure has never been greater. Meeting these demands requires rethinking system architecture from the ground up.

NVIDIA is advancing platform architecture with NVIDIA ConnectX-8 SuperNICs, the industry's first SuperNICs to integrate a PCIe Gen6-capable switch and ultra-high-speed networking in a single device. Designed for modern AI infrastructure, ConnectX-8 delivers higher throughput while simplifying system design and improving power and cost efficiency.

    Readying for PCIe Gen6 connectivity

In PCIe-based platforms, particularly those with eight or more GPUs, PCIe switches are critical for maximizing inter-GPU communication bandwidth and enabling scalable GPU topologies. Existing designs depend on standalone PCIe switches, which often introduce additional design complexity and can limit performance and efficiency.

ConnectX-8 addresses this by providing 48 lanes of PCIe Gen6 connectivity through an integrated PCIe Gen6 switch, consolidating GPU-to-GPU and GPU-to-NIC communication into a single, high-performance device. This eliminates the need for discrete PCIe switches, reducing component count and simplifying board design, and results in a more cost-effective, scalable architecture for AI infrastructure.

    Additionally, with native PCIe Gen6 support, ConnectX-8 accommodates the growing IO demands of next-generation GPUs, CPUs, and IO accelerators. It enables system architects to design forward-compatible platforms capable of fully utilizing the bandwidth of emerging high-throughput PCIe Gen6 devices.
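
To put Gen6 in perspective, a quick back-of-envelope calculation of raw per-direction bandwidth for an x16 link across recent PCIe generations is sketched below. This is an approximation only; delivered throughput is lower once encoding and protocol overhead are accounted for.

```python
# Approximate one-directional bandwidth of a PCIe link, ignoring
# encoding (128b/130b, FLIT) and protocol overhead.
# Transfer rates per lane follow the PCI-SIG specifications.
TRANSFER_RATE_GT_S = {"Gen4": 16, "Gen5": 32, "Gen6": 64}

def raw_bandwidth_gb_s(gen: str, lanes: int = 16) -> float:
    # Each transfer moves ~1 bit per lane; divide by 8 to get bytes.
    return TRANSFER_RATE_GT_S[gen] * lanes / 8

for gen in TRANSFER_RATE_GT_S:
    print(f"PCIe {gen} x16: ~{raw_bandwidth_gb_s(gen):.0f} GB/s per direction")
# PCIe Gen4 x16: ~32 GB/s per direction
# PCIe Gen5 x16: ~64 GB/s per direction
# PCIe Gen6 x16: ~128 GB/s per direction
```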

    Accelerating enterprise workloads with NVIDIA RTX PRO Servers

    ConnectX-8 SuperNICs are now in full production and integrated with NVIDIA HGX B300 and NVIDIA GB300 NVL72 systems. Announced at COMPUTEX 2025, ConnectX-8 is featured in NVIDIA RTX PRO Servers from global system partners, supporting configurations with up to eight NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs.

    Figure 1 compares two server architectures: a traditional design with discrete PCIe switches and an optimized configuration of NVIDIA RTX PRO Servers using NVIDIA ConnectX-8 SuperNICs with integrated PCIe Gen6 switches.

    Side-by-side comparison of server designs: the traditional setup uses discrete PCIe switches to connect CPUs, GPUs, and NICs, while the optimized design uses ConnectX-8 SuperNICs that integrate PCIe switching and networking to directly connect CPUs and GPUs.
    Figure 1. A comparison of traditional (left) and optimized (right) server design with ConnectX-8 SuperNICs

    With the traditional design, the server layout includes two CPUs, eight GPUs (NVIDIA L40, for example), and five NICs, comprising four NVIDIA ConnectX-7 NICs and one NVIDIA BlueField-3 DPU. This setup also requires two to four discrete PCIe switches to enable GPU-to-GPU and GPU-to-NIC connectivity, adding complexity and increasing component count.

The optimized design replaces dedicated PCIe switches with ConnectX-8 SuperNICs, combining PCIe Gen6 switching and 800 Gb/s networking in a single device. This streamlined architecture supports up to 400 Gb/s of network bandwidth per NVIDIA RTX PRO 6000 Blackwell GPU (based on a 2:1 GPU-to-NIC ratio), while significantly reducing system complexity.
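
As a sanity check, the per-GPU figure follows directly from the NIC speed and the sharing ratio. The sketch below is back-of-envelope arithmetic, ignoring Ethernet/RoCE protocol overhead:

```python
# One 800 Gb/s ConnectX-8 port shared by two GPUs (the 2:1 ratio above).
nic_bandwidth_gbps = 800
gpus_per_nic = 2

per_gpu_gbps = nic_bandwidth_gbps / gpus_per_nic  # 400 Gb/s per GPU
per_gpu_gBps = per_gpu_gbps / 8                   # ~50 GB/s per GPU

print(f"~{per_gpu_gbps:.0f} Gb/s (~{per_gpu_gBps:.0f} GB/s) per GPU")
```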

Compared with the traditional design, this doubles the network bandwidth per GPU, helping eliminate IO bottlenecks and enabling faster data movement between GPUs, NICs, and storage. As a result, this NVIDIA RTX PRO Server platform delivers up to 2x higher NCCL all-to-all performance, accelerating the collective communication patterns critical for multi-GPU and multi-node workloads and improving scalability across AI factories.
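
For readers who want to exercise the same communication pattern, the sketch below runs an NCCL-backed all-to-all using PyTorch's distributed API. It is an illustrative example only, not NVIDIA's benchmark harness; for measured numbers, the NCCL tests suite (alltoall_perf) is the usual tool.

```python
# Minimal NCCL all-to-all sketch (illustrative; sizes are arbitrary).
# Launch with: torchrun --nproc_per_node=8 all_to_all_demo.py
import os

import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK in the environment.
    dist.init_process_group(backend="nccl")
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))
    rank = dist.get_rank()
    world = dist.get_world_size()

    # Each rank sends an equal slice of its buffer to every other rank.
    elems_per_peer = 1 << 20  # 1M floats (~4 MB) per destination rank
    send = torch.randn(world * elems_per_peer, device="cuda")
    recv = torch.empty_like(send)

    # NCCL picks the transport (NVLink, PCIe, or the NIC) per GPU pair.
    dist.all_to_all_single(recv, send)
    torch.cuda.synchronize()

    if rank == 0:
        print(f"all-to-all complete across {world} ranks")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```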

    Building on Figure 1, Figure 2 offers a closer look at the server architectures, illustrating how the optimized design improves connectivity across three primary GPU communication paths:

1. GPU-to-GPU across two CPU sockets: In the traditional design, this path can encounter host CPU and inter-socket bottlenecks, limiting bandwidth to 25 GB/s or less depending on inter-CPU link utilization. In contrast, the optimized ConnectX-8-based design enables up to 50 GB/s of IO bandwidth per GPU for all inter-GPU communication within the cluster, as NCCL routes this traffic directly through the network.
2. GPU-to-NIC communication: The optimized architecture provides each GPU with 50 GB/s of bandwidth in a 2:1 GPU-to-NIC configuration, regardless of whether the GPU or host system supports PCIe Gen5 or Gen6.
3. GPU-to-GPU through the same PCIe switch: Systems equipped with PCIe Gen6 benefit from double the bandwidth of Gen5, significantly accelerating peer-to-peer GPU transfers through the same PCIe switch; a quick way to check which GPU pairs support this direct path is sketched below, after Figure 2.
    Side-by-side comparison of traditional and optimized server architectures. The traditional design (left) shows discrete PCIe switches connecting CPUs, GPUs, and NICs. The optimized design (right) features ConnectX-8 SuperNICs acting as both NICs and PCIe switches, streamlining connections across three GPU communication paths: GPU-to-GPU across CPU sockets, GPU-to-NIC, and GPU-to-GPU through the same PCIe switch.
    Figure 2. A comparison of traditional (left) and optimized (right) server design with ConnectX-8 SuperNICs, highlighting three key GPU communication paths
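
To see which GPU pairs on a given server can use the direct peer-to-peer path, a minimal check with PyTorch's CUDA utilities looks like the following. This is a convenience sketch; nvidia-smi topo -m reports the full topology, including which devices share a PCIe switch.

```python
# List GPU pairs that support direct peer-to-peer access on this host.
import torch

n = torch.cuda.device_count()
for a in range(n):
    for b in range(n):
        if a != b and torch.cuda.can_device_access_peer(a, b):
            print(f"GPU {a} <-> GPU {b}: peer access available")
```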

    By integrating PCIe switching directly into the SuperNIC, ConnectX-8 also simplifies board design, improves airflow, and enhances serviceability. This results in a more compact, power-efficient, and cost-effective platform. Supported by NVIDIA reference designs, this innovation helps system builders scale faster with better performance and lower TCO.

    The future of PCIe-based AI infrastructure

    NVIDIA ConnectX-8 is redefining what’s possible for PCIe-based systems. By combining a PCIe Gen6 switch and high-performance SuperNIC into a single, integrated device, ConnectX-8 streamlines server design, cuts component count, and unlocks the high-bandwidth communication paths required for modern AI workloads. The result is a simpler, more power-efficient platform with lower TCO and exceptional performance scalability.

Additionally, ConnectX-8 SuperNICs enable enhanced confidential computing capabilities in multi-GPU platforms.

    At COMPUTEX 2025, leading data center partners—including ASRock Rack, ASUS, Compal, Foxconn, GIGABYTE, Inventec, MiTAC, MSI, Pegatron, QCT, Supermicro, Wistron and Wiwynn—are showcasing advanced AI platform architectures powered by NVIDIA ConnectX-8 SuperNICs, integrated into NVIDIA RTX PRO Servers. To see how these innovations are shaping next-generation infrastructure, join NVIDIA founder and CEO Jensen Huang for the COMPUTEX 2025 keynote.

    To learn more about the role of ConnectX-8 SuperNICs within the NVIDIA Spectrum-X platform and how they accelerate modern AI infrastructure, see Powering Next-Generation AI Networking with NVIDIA SuperNICs.
