The NVIDIA Grace CPU is transforming data center design by offering a new level of power-efficient performance. Built specifically for data center scale, the Grace CPU is designed to handle demanding workloads while consuming less power. NVIDIA believes in the benefit of leveraging GPUs to accelerate every workload. However, not all workloads are accelerated. This is especially true for those…
The NVIDIA Grace CPU is the first data center CPU developed by NVIDIA. Combining NVIDIA expertise with Arm processors, on-chip fabrics, system-on-chip (SoC) design, and resilient high-bandwidth, low-power memory technologies, the Grace CPU was built from the ground up to create the world's first superchip for computing. At the heart of the superchip lies the NVLink Chip-2-Chip (C2C) interconnect.
The NVIDIA Arm HPC Developer Kit is an integrated hardware and software platform for creating, evaluating, and benchmarking HPC, AI, and scientific computing applications on a heterogeneous GPU- and CPU-accelerated computing system. NVIDIA announced its availability in March of 2021. The kit is designed as a stepping stone to the next-generation NVIDIA Grace Hopper Superchip for HPC and AI…
This version 22.9 update to the NVIDIA HPC SDK includes fixes and minor enhancements.
Organizations are rapidly becoming more advanced in the use of AI, and many are looking to leverage the latest technologies to maximize workload performance and efficiency. One of the most prevalent trends today is the use of CPUs based on Arm architecture to build data center servers. To ensure that these new systems are enterprise-ready and optimally configured, NVIDIA has approved the…
AI processing requires full-stack innovation across hardware and software platforms to address the growing computational demands of neural networks. A key area to drive efficiency is using lower precision number formats to improve computational efficiency, reduce memory usage, and optimize for interconnect bandwidth. To realize these benefits, the industry has moved from 32-bit precisions to…
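To make the idea concrete, here is a minimal sketch, not taken from the article itself, of storing and processing data in 16-bit floating point on the GPU, which halves the per-element memory footprint and bandwidth relative to FP32. The kernel and buffer names are illustrative, and half-precision intrinsics such as __hmul require a GPU of compute capability 5.3 or later.

    // Build (hypothetical file name): nvcc -arch=sm_70 fp16_scale.cu
    #include <cuda_fp16.h>

    // Scale a half-precision array by a scalar.
    __global__ void scale_fp16(const __half* in, __half* out, float alpha, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) {
            // __hmul multiplies two FP16 values natively on sm_53+ GPUs.
            out[i] = __hmul(in[i], __float2half(alpha));
        }
    }

    int main() {
        const int n = 1 << 20;
        __half *d_in, *d_out;
        cudaMalloc(&d_in,  n * sizeof(__half));   // 2 bytes per element vs. 4 for float
        cudaMalloc(&d_out, n * sizeof(__half));
        scale_fp16<<<(n + 255) / 256, 256>>>(d_in, d_out, 0.5f, n);
        cudaDeviceSynchronize();
        cudaFree(d_in);
        cudaFree(d_out);
        return 0;
    }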
This release includes enhancements, fixes, and new support for Arm SVE, Rocky Linux OS, and Amazon EC2 C7g instances, powered by the latest-generation AWS Graviton3 processors.
Today at AWS re:Invent 2021, AWS announced the general availability of Amazon EC2 G5g instances, bringing the first NVIDIA GPU-accelerated Arm-based instance to the AWS cloud. The new EC2 G5g instance features AWS Graviton2 processors, based on 64-bit Arm Neoverse cores, and NVIDIA T4G Tensor Core GPUs, enhanced for graphics-intensive applications. This powerful combination creates an…
In July of 2021, NVIDIA announced the availability of the NVIDIA Arm HPC Developer Kit for preordering, along with the NVIDIA HPC SDK. Since then, NVIDIA and its partners have been working hard to get units into the hands of developers, increase global availability, and enhance the software stack. The NVIDIA Arm HPC Developer Kit is based on the GIGABYTE G242-P32 2U server.
AI continues to drive breakthrough innovation across industries, including consumer Internet, healthcare and life sciences, financial services, retail, manufacturing, and supercomputing. Researchers continue to push the boundaries of what's possible with rapidly evolving models that are growing in size, complexity, and diversity. In addition, many of these complex, large-scale models need to…
Today NVIDIA announced the availability of the NVIDIA Arm HPC Developer Kit with the NVIDIA HPC SDK version 21.7. The DevKit is an integrated hardware-software platform for creating, evaluating, and benchmarking HPC, AI, and scientific computing applications for Arm server-based accelerated platforms. The HPC SDK v21.7 is the latest update of the software development kit and fully supports the…
Get the latest resources and news about the NVIDIA technologies that are accelerating the latest innovations in HPC from industry leaders and developers. Explore sessions and demos across a variety of HPC topics, ranging from weather forecasting and energy exploration to computational chemistry and molecular dynamics. The developer resources listed below are exclusively available to NVIDIA…
Researchers are harnessing the power of NVIDIA GPUs more than ever before to find a cure for COVID-19. Leveraging popular molecular dynamics and quantum chemistry HPC applications, they are running thousands of experiments to predict which compounds can effectively bind with proteins and block the virus from infecting our cells. NGC has recently introduced updated versions of these popular…
The world's ultimate embedded solution for AI developers, Jetson AGX Xavier, is now shipping as standalone production modules from NVIDIA. A member of NVIDIA's AGX Systems for autonomous machines, Jetson AGX Xavier is ideal for deploying advanced AI and computer vision to the edge, enabling robotic platforms in the field with workstation-level performance and the ability to operate fully…
NVIDIA Nsight Eclipse Edition is a full-featured, integrated development environment that lets you easily develop CUDA applications for either your local (x86) system or a remote (x86 or Arm) target. In this post, I will walk you through the process of remote-developing CUDA applications for the NVIDIA Jetson TX2, an Arm-based development kit. Note that this how-to also applies to Jetson TX1 and…
Today at an AI meetup in San Francisco, NVIDIA launched Jetson TX2 and the JetPack 3.0 AI SDK. Jetson is the world's leading low-power embedded platform, enabling server-class AI compute performance for edge devices everywhere. Jetson TX2 features an integrated 256-core NVIDIA Pascal GPU, a hex-core ARMv8 64-bit CPU complex, and 8GB of LPDDR4 memory with a 128-bit interface.
GPUs have quickly become the go-to platform for accelerating machine learning applications for training and classification. Deep Neural Networks (DNNs) have grown in importance for many applications, from image classification and natural language processing to robotics and UAVs. To help researchers focus on solving core problems, NVIDIA introduced a library of primitives for deep neural networks…
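As a rough sketch of what programming against such a primitives library looks like, the snippet below runs a ReLU activation forward pass over a 4D tensor using the cuDNN API. The tensor shape and pointer names are illustrative, not from the announcement.

    // Build (hypothetical file name): nvcc relu.cu -lcudnn
    #include <cudnn.h>

    int main() {
        // One handle per host thread; all cuDNN calls go through it.
        cudnnHandle_t handle;
        cudnnCreate(&handle);

        // Describe a 1x32x64x64 NCHW float tensor.
        cudnnTensorDescriptor_t desc;
        cudnnCreateTensorDescriptor(&desc);
        cudnnSetTensor4dDescriptor(desc, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT, 1, 32, 64, 64);

        // Configure a ReLU activation.
        cudnnActivationDescriptor_t act;
        cudnnCreateActivationDescriptor(&act);
        cudnnSetActivationDescriptor(act, CUDNN_ACTIVATION_RELU, CUDNN_PROPAGATE_NAN, 0.0);

        size_t bytes = 1 * 32 * 64 * 64 * sizeof(float);
        float *d_x, *d_y;
        cudaMalloc(&d_x, bytes);
        cudaMalloc(&d_y, bytes);

        // y = relu(x), computed on the GPU by the library.
        const float alpha = 1.0f, beta = 0.0f;
        cudnnActivationForward(handle, act, &alpha, desc, d_x, &beta, desc, d_y);

        cudaFree(d_x);
        cudaFree(d_y);
        cudnnDestroyActivationDescriptor(act);
        cudnnDestroyTensorDescriptor(desc);
        cudnnDestroy(handle);
        return 0;
    }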
Today we're excited to announce the release of the CUDA Toolkit version 6.5. CUDA 6.5 adds a number of features and improvements to the CUDA platform, including support for CUDA Fortran in developer tools, user-defined callback functions in cuFFT, new occupancy calculator APIs, and more. Last year we introduced CUDA on Arm, and in March we released the Jetson TK1 developer board…
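The occupancy calculator APIs mentioned above let a program pick launch parameters at run time instead of hard-coding them. A minimal sketch, using an illustrative SAXPY kernel:

    // Build (hypothetical file name): nvcc occupancy.cu
    #include <cstdio>

    __global__ void saxpy(float a, const float* x, float* y, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) y[i] = a * x[i] + y[i];
    }

    int main() {
        int minGridSize = 0, blockSize = 0;
        // Suggest a block size that maximizes occupancy for this kernel.
        cudaOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, saxpy, 0, 0);

        int numBlocks = 0;
        // How many blocks of that size can be resident per multiprocessor?
        cudaOccupancyMaxActiveBlocksPerMultiprocessor(&numBlocks, saxpy, blockSize, 0);

        printf("suggested block size: %d, max active blocks/SM: %d\n", blockSize, numBlocks);
        return 0;
    }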
NVIDIA's Tegra K1 (TK1) is the first Arm system-on-chip (SoC) with integrated CUDA. With 192 Kepler GPU cores and four Arm Cortex-A15 cores delivering a total of 327 GFLOPS of compute performance, TK1 has the capacity to process lots of data with CUDA while typically drawing less than 6W of power (including the SoC and DRAM). This brings game-changing performance to low-SWaP (Size…
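For context, that peak figure lines up with straightforward arithmetic, assuming the TK1's published maximum GPU clock of roughly 852 MHz: 192 CUDA cores × 2 floating-point operations per fused multiply-add per clock × 0.852 GHz ≈ 327 single-precision GFLOPS.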
NVIDIA Nsight Eclipse Edition is a full-featured, integrated development environment that lets you easily develop CUDA applications for either your local (x86) system or a remote (x86 or Arm) target. In this post, I will walk you through the process of remote-developing CUDA applications for the NVIDIA Jetson TK1, an Arm-based development kit. Nsight supports two remote development modes: cross…
In CUDACast #5, we saw how to use the new NVIDIA RPM and Debian packages to install the CUDA toolkit, samples, and driver on a supported Linux OS with a standard package manager. With CUDA 5.5, it is now possible to compile and run CUDA applications on Arm-based systems such as the Kayla development platform. In addition to native compilation on an Arm-based CPU system, it is also possible to…
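For reference, a self-contained program like the following is the kind of thing that can then be compiled and run natively on an Arm system once the toolkit is installed. The file and kernel names are illustrative, and a plain nvcc invocation is assumed.

    // Build (hypothetical file name): nvcc hello.cu -o hello
    #include <cstdio>

    __global__ void add_one(int* data, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) data[i] += 1;
    }

    int main() {
        const int n = 16;
        int h[n] = {0};
        int* d = nullptr;
        cudaMalloc(&d, n * sizeof(int));
        cudaMemcpy(d, h, n * sizeof(int), cudaMemcpyHostToDevice);
        add_one<<<1, n>>>(d, n);
        cudaMemcpy(h, d, n * sizeof(int), cudaMemcpyDeviceToHost);
        printf("h[0] = %d\n", h[0]);  // expect 1
        cudaFree(d);
        return 0;
    }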