The exponential growth of AI workloads is increasing data center power demands. Traditional 54 V in-rack power distribution, designed for kilowatt (KW)-scale racks, isn’t designed to support the megawatt (MW)-scale racks coming soon to modern AI factories. NVIDIA is leading the transition to 800 V HVDC data center power infrastructure to support 1 MW IT racks and beyond, starting in 2027.
]]>Data centers are being re-architected for efficient delivery of AI workloads. This is a hugely complicated endeavor, and NVIDIA is now delivering AI factories based on the NVIDIA rack-scale architecture. To deliver the best performance for the AI factory, many accelerators need to work together at rack-scale with maximal bandwidth and minimal latency to support the largest number of users in the…
]]>As AI workloads grow in complexity and scale—from large language models (LLMs) to agentic AI reasoning and physical AI—the demand for faster, more scalable compute infrastructure has never been greater. Meeting these demands requires rethinking system architecture from the ground up. NVIDIA is advancing platform architecture with NVIDIA ConnectX-8 SuperNICs, the industry’s first SuperNIC to…
]]>The exponential growth of generative AI, large language models (LLMs), and high-performance computing has created unprecedented demands on data center infrastructure. Traditional server architectures struggle to accommodate the power density, thermal requirements, and rapid iteration cycles of modern accelerated computing. This post explains the benefits of NVIDIA MGX…
]]>The NVIDIA Grace CPU Superchip delivers outstanding performance and best-in-class energy efficiency for CPU workloads in the data center and in the cloud. The benefits of NVIDIA Grace include high-performance Arm Neoverse V2 cores, fast NVIDIA-designed Scalable Coherency Fabric, and low-power high-bandwidth LPDDR5X memory. These features make the Grace CPU ideal for data processing with…
]]>The NVIDIA Grace CPU is transforming data center design by offering a new level of power-efficient performance. Built specifically for data center scale, the Grace CPU is designed to handle demanding workloads while consuming less power. NVIDIA believes in the benefit of leveraging GPUs to accelerate every workload. However, not all workloads are accelerated. This is especially true for those…
]]>Accelerated computing is enabling giant leaps in performance and energy efficiency compared to traditional CPU computing. Delivering these advancements requires full-stack innovation at data-center scale, spanning chips, systems, networking, software, and algorithms. Choosing the right architecture for the right workload with the best energy efficiency is critical to maximizing the performance and…
]]>NVIDIA designed the NVIDIA Grace CPU to be a new kind of high-performance, data center CPU—one built to deliver breakthrough energy efficiency and optimized for performance at data center scale. Accelerated computing is enabling giant leaps in performance and energy efficiency compared to traditional CPU computing. To deliver these speedups, full-stack innovation at data center scale is…
]]>Many of the most exciting applications of large language models (LLMs), such as interactive speech bots, coding co-pilots, and search, need to begin responding to user queries quickly to deliver positive user experiences. The time that it takes for an LLM to ingest a user prompt (and context, which can be sizable) and begin outputting a response is called time to first token (TTFT).
]]>In the latest round of MLPerf Inference – a suite of standardized, peer-reviewed inference benchmarks – the NVIDIA platform delivered outstanding performance across the board. Among the many submissions made using the NVIDIA platform were results using the NVIDIA GH200 Grace Hopper Superchip. GH200 tightly couples an NVIDIA Grace CPU with an NVIDIA Hopper GPU using NVIDIA NVLink-C2C…
]]>Reservoir simulation helps reservoir engineers optimize their resource exploration approach by simulating complex scenarios and comparing with real-world field data. This extends to simulation of depleted reservoirs that could be repurposed for carbon storage from operations. Reservoir simulation is crucial for energy companies aiming to enhance operational efficiency in exploration and production.
]]>With the rapid growth of generative AI, CIOs and IT leaders are looking for ways to reclaim data center resources to accommodate new AI use cases that promise greater return on investment without impacting current operations. This is leading IT decision makers to reassess past infrastructure decisions and explore strategies to consolidate traditional workloads into fewer…
]]>The exponential growth in data processing demand is projected to reach 175 zettabytes by 2025. This contrasts sharply with the slowing pace of CPU performance improvements. For more than a decade, semiconductor advancements have not kept up with the pace predicted by Moore’s Law, leading to a pressing need for more efficient computing solutions. NVIDIA GPUs have emerged as the most efficient…
]]>What is the interest in trillion-parameter models? We know many of the use cases today and interest is growing due to the promise of an increased capacity for: The benefits are great, but training and deploying large models can be computationally expensive and resource-intensive. Computationally efficient, cost-effective, and energy-efficient systems, architected to deliver real-time…
]]>Large language model (LLM) applications are essential in enhancing productivity across industries through natural language. However, their effectiveness is often limited by the extent of their training data, resulting in poor performance when dealing with real-time events and new knowledge the LLM isn’t trained on. Retrieval-augmented generation (RAG) solves these problems.
]]>At AWS re:Invent 2023, AWS and NVIDIA announced that AWS will be the first cloud provider to offer NVIDIA GH200 Grace Hopper Superchips interconnected with NVIDIA NVLink technology through NVIDIA DGX Cloud and running on Amazon Elastic Compute Cloud (Amazon EC2). This is a game-changing technology for cloud computing. The NVIDIA GH200 NVL32, a rack-scale solution within NVIDIA DGX Cloud or an…
]]>The NVIDIA Grace CPU is the first data center CPU developed by NVIDIA. Combining NVIDIA expertise with Arm processors, on-chip fabrics, system-on-chip (SoC) design, and resilient high-bandwidth low-power memory technologies, the Grace CPU was built from the ground up to create the world’s first superchip for computing. At the heart of the superchip, lies the NVLink Chip-2-Chip (C2C).
]]>MLPerf is an industry-wide AI consortium that has developed a suite of performance benchmarks covering a range of leading AI workloads that are widely in use today. The latest MLPerf v0.7 training submission includes vision, language, recommenders, and reinforcement learning. NVIDIA submitted MLPerf v0.7 training results for all eight tests and the NVIDIA platform set records in all…
]]>