• <xmp id="om0om">


    Modern HPC data centers are key to solving some of the world’s most important scientific and engineering challenges. NVIDIA Data Center GPUs fundamentally change the economics of the data center, delivering breakthrough performance with dramatically fewer servers, lower power consumption, and reduced networking overhead, for total cost savings of 5X-10X.

    The number of CPU-only servers replaced by a single GPU-accelerated server is called the node replacement factor (NRF). To arrive at the NRF, we measure application performance on up to 8 CPU-only servers, then extrapolate linearly beyond 8 servers. The NRF varies by application.
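
As a rough illustration of the NRF arithmetic described above, the sketch below assumes that linear scaling means comparing the GPU server's result against the per-server CPU result: a throughput ratio for "bigger is better" metrics, and a time ratio otherwise. The function name `node_replacement_factor` and its parameters are hypothetical and not part of any NVIDIA tool. The published NRF values are rounded and may reflect measured multi-server CPU scaling, so they will not always match this simple ratio.

```python
def node_replacement_factor(gpu_value, cpu_value, cpu_servers=1, bigger_is_better=True):
    """Estimate how many CPU-only servers one GPU-accelerated server replaces.

    Assumption: CPU performance scales linearly with the number of servers,
    so a result measured on `cpu_servers` servers can be normalized to one server.
    """
    if gpu_value <= 0 or cpu_value <= 0 or cpu_servers <= 0:
        raise ValueError("all inputs must be positive")
    if bigger_is_better:
        # Throughput metric (e.g. ns/day): one CPU server delivers 1/N of the total.
        per_server_throughput = cpu_value / cpu_servers
        return gpu_value / per_server_throughput
    # Time metric (e.g. seconds): one CPU server would take N times as long.
    one_server_time = cpu_value * cpu_servers
    return one_server_time / gpu_value


# Figures taken from the H200 tables on this page.
amber = node_replacement_factor(327, 11.71)  # ns/day, 1x H200 vs. CPU-only
milc = node_replacement_factor(981, 13735, bigger_is_better=False)  # seconds
print(round(amber), round(milc))  # prints "28 14"
```

Both rounded results agree with the 1x H200 NRF entries listed for AMBER [PME-Cellulose_NPT_4fs] and MILC; other rows, especially multi-GPU ones, can differ from this ratio.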


    Detailed H200 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    24-AT_24

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

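    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 11.71 | 327 | 652 | 1,333 | 2,664 | 293 | 588 | 1,176 | 2,359 |
    | AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 28x | 56x | 114x | 227x | 25x | 50x | 100x | 201x |
    | AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 11.69 | 330 | 669 | 1,395 | 2,782 | 299 | 596 | 1,193 | 2,398 |
    | AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 28x | 57x | 119x | 238x | 26x | 51x | 102x | 205x |
    | AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 93.36 | 1,406 | 2,852 | 5,690 | 12,468 | 1,263 | 2,527 | 5,055 | 10,180 |
    | AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 15x | 31x | 61x | 134x | 14x | 27x | 54x | 109x |
    | AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 99.50 | 1,430 | 2,897 | 5,863 | 11,854 | 1,289 | 2,581 | 5,226 | 10,422 |
    | AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 14x | 29x | 59x | 119x | 13x | 26x | 53x | 105x |
    | AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 377.04 | 4,689 | 9,485 | 19,479 | 37,687 | 4,250 | 8,422 | 17,056 | 31,382 |
    | AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 12x | 25x | 52x | 100x | 11x | 22x | 45x | 83x |
    | AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 397.04 | 4,851 | 9,692 | 19,759 | 38,246 | 4,337 | 8,640 | 17,269 | 32,541 |
    | AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 12x | 24x | 50x | 96x | 11x | 22x | 43x | 82x |
    | AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.69 | 94 | 187 | 375 | 749 | 91 | 182 | 364 | 728 |
    | AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 25x | 51x | 102x | 203x | 25x | 49x | 99x | 197x |
    | AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.07 | 200 | 400 | 799 | 1,599 | 182 | 364 | 728 | 1,456 |
    | AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 8x | 16x | 32x | 64x | 7x | 15x | 29x | 58x |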

    AMBER is measured by running multiple independent instances using MPS


    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V2025.01

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition
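
    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Chroma | Final Timestep Time (Sec) | HMC Medium | no | 10,037 | 153 | 88 | 53 | 35 | 160 | 93 | 59 | 46 |
    | Chroma | NRF | HMC Medium | yes | 1x | 65x | 116x | 193x | 289x | 63x | 110x | 175x | 224x |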

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    14.1

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

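    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 112 | 24 | 14 | 9 | 7 | 25 | 15 | 9 | 8 |
    | Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 5x | 10x | 16x | 20x | 4x | 10x | 15x | 18x |
    | Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 154 | 33 | 19 | 11 | 8 | 35 | 20 | 11 | 9 |
    | Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 5x | 13x | 23x | 31x | 4x | 12x | 21x | 26x |
    | Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 474 | 91 | 48 | 26 | 16 | 97 | 51 | 28 | 19 |
    | Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 5x | 15x | 27x | 44x | 5x | 14x | 25x | 36x |
    | Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 628 | - | - | 36 | 20 | - | - | 38 | 25 |
    | Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 23x | 41x | - | - | 22x | 34x |
    | Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 2,011 | - | - | 102 | 54 | - | - | 109 | 65 |
    | Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | 29x | 55x | - | - | 27x | 45x |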

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    h-bond - 2025-rc

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
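
    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 362 | 857 | 1,627 | 2,673 | 5,330 | 773 | 1,450 | 2,700 | 5,430 |
    | GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 4x | 7x | 15x | 2x | 4x | 7x | 15x |
    | GROMACS [STMV] | ns/day | STMV | yes | 20 | 44 | 76 | 131 | 198 | 41 | 70 | 123 | 153 |
    | GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 4x | 8x | 13x | 2x | 3x | 7x | 10x |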

    GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V4.5 updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

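    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | GTC | Mpush/Sec | mpi#proc.in | yes | 146 | 821 | 1,532 | 2,999 | 5,408 | 749 | 1,407 | 2,691 | 4,780 |
    | GTC | NRF | mpi#proc.in | yes | 1x | 6x | 11x | 22x | 40x | 5x | 10x | 20x | 35x |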

    ICON

    Weather and Climate

    A global unified atmosphere model for numerical weather prediction and climate modeling research

    VERSION

    2024.8_RC

    ACCELERATED FEATURES

    • Full model of dynamics and physics

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://code.mpimet.mpg.de/projects/iconpublic

    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 587 | 171 | 141 | 113 | 98 | 180 | 148 | 116 |
    | ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 3x | 4x | 5x | 6x | 3x | 4x | 5x |
    | ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | QUBICC 160 km resolution | no | 466 | 143 | 102 | 79 | 67 | 150 | 107 | 82 |
    | ICON [QUBICC 160 km resolution] | NRF | QUBICC 160 km resolution | yes | 1x | 3x | 5x | 6x | 7x | 3x | 4x | 6x |

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    patch_4Feb2025

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials
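
    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.95E+08 | 1.44E+09 | 2.69E+09 | 4.72E+09 | 7.80E+09 | 1.32E+09 | 2.45E+09 | 3.78E+09 | 6.33E+09 |
    | LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 4x | 7x | 13x | 21x | 3x | 6x | 10x | 17x |
    | LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.44E+08 | 5.75E+08 | 1.09E+09 | 1.95E+09 | 3.17E+09 | 5.28E+08 | 1.00E+09 | 1.70E+09 | 2.52E+09 |
    | LAMMPS [EAM] | NRF | EAM | yes | 1x | 4x | 8x | 14x | 23x | 4x | 7x | 12x | 18x |
    | LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.93E+06 | 1.15E+07 | 2.05E+07 | 3.33E+07 | 4.96E+07 | 1.06E+07 | 1.91E+07 | 2.98E+07 | 4.32E+07 |
    | LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 6x | 16x | 26x | 38x | 6x | 15x | 23x | 33x |
    | LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.61E+06 | 4.24E+06 | 8.49E+06 | 1.69E+07 | 3.36E+07 | 3.88E+06 | 7.71E+06 | 1.53E+07 | 3.05E+07 |
    | LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 3x | 7x | 12x | 24x | 2x | 6x | 11x | 22x |
    | LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.21E+08 | 1.03E+09 | 1.91E+09 | 3.46E+09 | 5.89E+09 | 9.43E+08 | 1.75E+09 | 3.04E+09 | 4.93E+09 |
    | LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 5x | 10x | 18x | 31x | 4x | 9x | 16x | 26x |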

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    develop_cde2498

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

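    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | MILC | Total Time (sec) | Apex Medium | no | 13,735 | 981 | 534 | 305 | 191 | 1,018 | 580 | 334 | 263 |
    | MILC | NRF | Apex Medium | yes | 1x | 14x | 23x | 40x | 64x | 13x | 21x | 37x | 46x |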

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    3

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

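    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | NAMD [LaINDY ColVars] | ns/day | LaINDY ColVars | yes | 50.56 | 89 | 177 | 352 | 698 | 84 | 164 | 327 | 651 |
    | NAMD [LaINDY ColVars] | NRF | LaINDY ColVars | yes | 1x | 2x | 4x | 7x | 14x | 2x | 3x | 6x | 13x |
    | NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 108.79 | 392 | 784 | 1,545 | 3,017 | 357 | 700 | 1,414 | 2,804 |
    | NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 7x | 14x | 28x | 3x | 6x | 13x | 26x |
    | NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 10.53 | 25 | 51 | 102 | 203 | 23 | 46 | 93 | 185 |
    | NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 2x | 5x | 10x | 19x | 2x | 4x | 9x | 18x |
    | NAMD [COVID-19 Spike Assembly] | ns/day | COVID-19 Spike Assembly | yes | 0.75 | 3 | 6 | 11 | 18 | 3 | 5 | 8 | - |
    | NAMD [COVID-19 Spike Assembly] | NRF | COVID-19 Spike Assembly | yes | 1x | 4x | 8x | 15x | 24x | 4x | 6x | 11x | - |
    | NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 10.87 | 32 | 64 | 128 | 257 | 29 | 58 | 116 | 232 |
    | NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 3x | 6x | 12x | 24x | 3x | 5x | 11x | 21x |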

    NAMD is measured by running multiple independent instances using MPS, except for the NAMD [COVID-19 Spike Assembly] dataset
    Trifan A, Gorgun D, Salim M, et al. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications. 2022;36(5-6):603-623. doi:10.1177/10943420221113513
    D. B. Sauer, N. Trebesch, J. J. Marden, N. Cocco, J. Song, A. Koide, S. Koide, E. Tajkhorshid, and D.-N. Wang. "Structural basis for the reaction cycle of DASS dicarboxylate transporters." eLife. 9, e61350 (2020). https://doi.org/10.7554/eLife.61350


    Quantum Espresso

    Material Science (Quantum Chemistry)

    An open-source suite of computer codes for electronic structure calculations and materials modeling at the nanoscale

    VERSION

    V7.4

    ACCELERATED FEATURES

    • linear algebra (matrix multiply)
    • explicit computational kernels
    • 3D FFTs

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.quantum-espresso.org

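    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 2x H200 | 4x H200 | 8x H200 | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Quantum Espresso | Total CPU Time (Sec) | GRIR443 | no | 784 | 114 | 89 | 50 | 116 | 77 | 54 |
    | Quantum Espresso | NRF | GRIR443 | yes | 1x | 12x | 16x | 28x | 12x | 19x | 26x |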

    RELION

    Microscopy

    Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

    VERSION

    5.0.0

    ACCELERATED FEATURES

    • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation

    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosime on Relion-3.0 | no | 8,981 | 2,355 | 1,231 | 1,051 | 2,355 | 1,231 | 1,051 | 977 |
    | Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosime on Relion-3.0 | yes | 1x | 4x | 7x | 9x | 4x | 7x | 9x | 9x |

    RTM

    Geoscience

    Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

    VERSION

    nvidia_2024_01

    ACCELERATED FEATURES

    • Batch algorithm

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.tsunamidevelopment.com/assets/rtm.pdf

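    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | RTM [Isotropic Radius 4] | Mcell/s | Isotropic Radius 4 | yes | 21,047 | 194,141 | 385,616 | 770,672 | 1,545,937 | 184,860 | 368,249 | 736,485 | 1,476,228 |
    | RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 9x | 18x | 37x | 73x | 9x | 17x | 35x | 70x |
    | RTM [TTI Radius 8 1-pass] | Mcell/s | TTI Radius 8 1-pass | yes | 7,213 | 31,581 | 62,562 | 125,334 | 250,342 | 25,816 | 51,604 | 103,104 | 205,718 |
    | RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 4x | 9x | 17x | 35x | 4x | 7x | 14x | 29x |
    | RTM [TTI RX 2Pass mgpu] | Mcell/s | TTI RX 2Pass mgpu | yes | 7,213 | 30,527 | 59,893 | 119,536 | 238,880 | 28,738 | 57,080 | 113,564 | 227,150 |
    | RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 4x | 8x | 17x | 33x | 4x | 8x | 16x | 31x |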

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    4.1.1

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

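    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 186 | 38 | 21 | 12 | 9 | 41 | 22 | 12 | 8 |
    | SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 5x | 10x | 18x | 24x | 4x | 9x | 17x | 25x |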


    Detailed GH200 96GB application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    24-AT_24

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.40 | 305 | 1,296 |
    | AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 29x | 125x |
    | AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.43 | 307 | 1,302 |
    | AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 29x | 125x |
    | AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 82.11 | 1,339 | 5,510 |
    | AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 16x | 67x |
    | AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 90.62 | 1,370 | 5,642 |
    | AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 15x | 62x |
    | AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 358.07 | 4,827 | 18,286 |
    | AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 13x | 51x |
    | AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 365.31 | 4,916 | 18,673 |
    | AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 13x | 51x |
    | AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.28 | 101 | - |
    | AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 31x | - |
    | AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 23.08 | 205 | - |
    | AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 9x | - |

    AMBER is measured by running multiple independent instances using MPS


    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V2024.10

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition
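
    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | Chroma | Final Timestep Time (Sec) | HMC Medium | no | 9,240 | 164 | 61 |
    | Chroma | NRF | HMC Medium | yes | 1x | 58x | 155x |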

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    14.0.1

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 127 | 24 | 10 |
    | Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 7x | 17x |
    | Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 179 | 36 | 13 |
    | Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 8x | 21x |
    | Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 498 | 105 | 38 |
    | Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 7x | 19x |
    | Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 682 | - | 48 |
    | Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | 19x |
    | Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 2,155 | - | 138 |
    | Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | 23x |

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2024.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
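
    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 370 | 834 | 3,293 |
    | GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 9x |
    | GROMACS [STMV] | ns/day | STMV | yes | 19 | 47 | 120 |
    | GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 8x |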

    GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V4.5 updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

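    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | GTC | Mpush/Sec | mpi#proc.in | yes | 136 | 812 | 2,874 |
    | GTC | NRF | mpi#proc.in | yes | 1x | 6x | 22x |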

    ICON

    Weather and Climate

    A global unified atmosphere model for numerical weather prediction and climate modeling research

    VERSION

    2024.8_RC

    ACCELERATED FEATURES

    • Full model of dynamics and physics

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://code.mpimet.mpg.de/projects/iconpublic

    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 575 | 175 | 108 |
    | ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 3x | 5x |
    | ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | QUBICC 160 km resolution | no | 459 | 147 | 81 |
    | ICON [QUBICC 160 km resolution] | NRF | QUBICC 160 km resolution | yes | 1x | 3x | 6x |

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    stable_29Aug2024

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB |
    | --- | --- | --- | --- | --- | --- |
    | LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.28E+08 | 1.56E+09 |
    | LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 5x |
    | LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.33E+08 | 6.10E+08 |
    | LAMMPS [EAM] | NRF | EAM | yes | 1x | 5x |
    | LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.84E+06 | 1.14E+07 |
    | LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 9x |
    | LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.53E+06 | 3.83E+06 |
    | LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 3x |
    | LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 1.99E+08 | 1.08E+09 |
    | LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 6x |

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    develop_cde2498

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

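    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | MILC | Total Time (sec) | Apex Medium | no | 16,570 | 935 | 306 |
    | MILC | NRF | Apex Medium | yes | 1x | 16x | 48x |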

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    3

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | NAMD [LaINDY ColVars] | ns/day | LaINDY ColVars | yes | 44.89 | 114 | 441 |
    | NAMD [LaINDY ColVars] | NRF | LaINDY ColVars | yes | 1x | 3x | 10x |
    | NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 97.16 | 392 | 1,505 |
    | NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 15x |
    | NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 10.06 | 26 | 102 |
    | NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 3x | 10x |
    | NAMD [COVID-19 Spike Assembly] | ns/day | COVID-19 Spike Assembly | yes | 0.78 | 3 | 11 |
    | NAMD [COVID-19 Spike Assembly] | NRF | COVID-19 Spike Assembly | yes | 1x | 4x | 14x |
    | NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 10.49 | 32 | 126 |
    | NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 3x | 12x |

    NAMD is measured by running multiple independent instances using MPS, except for the NAMD [COVID-19 Spike Assembly] dataset
    Trifan A, Gorgun D, Salim M, et al. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications. 2022;36(5-6):603-623. doi:10.1177/10943420221113513
    D. B. Sauer, N. Trebesch, J. J. Marden, N. Cocco, J. Song, A. Koide, S. Koide, E. Tajkhorshid, and D.-N. Wang. "Structural basis for the reaction cycle of DASS dicarboxylate transporters." eLife. 9, e61350 (2020). https://doi.org/10.7554/eLife.61350


    RTM

    Geoscience

    Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

    VERSION

    nvidia_2024_01

    ACCELERATED FEATURES

    • Batch algorithm

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.tsunamidevelopment.com/assets/rtm.pdf

    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | RTM [Isotropic Radius 4] | Mcell/s | Isotropic Radius 4 | yes | 21,047 | 178,321 | 708,595 |
    | RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 8x | 34x |
    | RTM [TTI Radius 8 1-pass] | Mcell/s | TTI Radius 8 1-pass | yes | 7,213 | 31,584 | 124,223 |
    | RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 4x | 17x |
    | RTM [TTI RX 2Pass mgpu] | Mcell/s | TTI RX 2Pass mgpu | yes | 7,213 | 29,320 | 115,804 |
    | RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 4x | 16x |

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    4.1.1

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

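    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 199 | 41 | 13 |
    | SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 4x | 18x |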


    Detailed H100 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    24-AT_24

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

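    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 11.71 | 308 | 616 | 1,262 | 2,476 | 281 | 555 | 1,109 | 2,456 |
    | AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 26x | 53x | 108x | 211x | 24x | 47x | 95x | 210x |
    | AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 11.69 | 314 | 629 | 1,269 | 2,595 | 285 | 563 | 1,125 | 2,367 |
    | AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 27x | 54x | 109x | 222x | 24x | 48x | 96x | 202x |
    | AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 93.36 | 1,335 | 2,664 | 5,397 | 11,295 | 1,236 | 2,454 | 4,898 | 9,766 |
    | AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 14x | 29x | 58x | 121x | 13x | 26x | 52x | 105x |
    | AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 99.50 | 1,365 | 2,740 | 5,606 | 11,840 | 1,254 | 2,513 | 5,246 | 9,974 |
    | AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 14x | 28x | 56x | 119x | 13x | 25x | 53x | 100x |
    | AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 377.04 | 4,573 | 9,286 | 18,515 | 36,090 | 4,239 | 8,453 | 17,804 | 32,754 |
    | AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 12x | 25x | 49x | 96x | 11x | 22x | 47x | 87x |
    | AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 397.04 | 4,729 | 9,395 | 19,265 | 38,119 | 4,293 | 8,528 | 17,029 | 33,107 |
    | AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 12x | 24x | 49x | 96x | 11x | 21x | 43x | 83x |
    | AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.69 | 89 | 178 | 357 | 713 | 92 | 184 | 368 | 736 |
    | AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 24x | 48x | 97x | 193x | 25x | 50x | 100x | 199x |
    | AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.07 | 193 | 386 | 771 | 1,543 | 181 | 362 | 723 | 1,446 |
    | AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 8x | 15x | 31x | 62x | 7x | 14x | 29x | 58x |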

    AMBER is measured by running multiple independent instances using MPS


    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V2025.01

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition
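
    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Chroma | Final Timestep Time (Sec) | HMC Medium | no | 10,037 | 261 | 106 | 63 | 40 | 190 | 109 | 68 | 49 |
    | Chroma | NRF | HMC Medium | yes | 1x | 38x | 96x | 164x | 256x | 53x | 94x | 151x | 209x |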

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    14.1

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

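    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 112 | 27 | 16 | 10 | 8 | 29 | 17 | 10 | 10 |
    | Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 4x | 9x | 15x | 19x | 4x | 9x | 14x | 15x |
    | Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 154 | 38 | 21 | 12 | 8 | 40 | 22 | 12 | 10 |
    | Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 4x | 12x | 20x | 29x | 4x | 11x | 20x | 25x |
    | Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 474 | 104 | 54 | 29 | 18 | 110 | 58 | 30 | 20 |
    | Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 5x | 13x | 24x | 40x | 4x | 12x | 23x | 35x |
    | Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 628 | - | - | 41 | 23 | - | - | 43 | 26 |
    | Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 20x | 37x | - | - | 19x | 33x |
    | Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 2,011 | - | - | 116 | 61 | - | - | 125 | 68 |
    | Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | 25x | 48x | - | - | 24x | 43x |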

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    h-bond - 2025-rc

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
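
    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 362 | 823 | 1,540 | 2,700 | 5,295 | 767 | 1,432 | 2,625 | 5,326 |
    | GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 4x | 7x | 15x | 2x | 4x | 7x | 15x |
    | GROMACS [STMV] | ns/day | STMV | yes | 20 | 44 | 75 | 130 | 200 | 41 | 70 | 121 | 144 |
    | GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 4x | 8x | 13x | 2x | 3x | 7x | 9x |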

    GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V4.5 updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

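    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | GTC | Mpush/Sec | mpi#proc.in | yes | 146 | 769 | 1,436 | 2,805 | 5,235 | 741 | 1,396 | 2,679 | 4,819 |
    | GTC | NRF | mpi#proc.in | yes | 1x | 5x | 10x | 20x | 38x | 5x | 10x | 20x | 35x |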

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    patch_4Feb2025

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials
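
    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.95E+08 | 1.33E+09 | 2.47E+09 | 4.38E+09 | 7.42E+09 | 1.16E+09 | 1.90E+09 | 3.39E+09 | 6.04E+09 |
    | LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 3x | 6x | 12x | 20x | 3x | 5x | 9x | 16x |
    | LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.44E+08 | 5.34E+08 | 1.02E+09 | 1.82E+09 | 3.02E+09 | 5.10E+08 | 8.51E+08 | 1.49E+09 | 2.49E+09 |
    | LAMMPS [EAM] | NRF | EAM | yes | 1x | 4x | 7x | 13x | 22x | 4x | 6x | 11x | 18x |
    | LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.93E+06 | 1.07E+07 | 1.93E+07 | 3.15E+07 | 4.77E+07 | 9.49E+06 | 1.72E+07 | 2.89E+07 | 4.23E+07 |
    | LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 6x | 15x | 24x | 37x | 5x | 13x | 22x | 33x |
    | LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.61E+06 | 4.16E+06 | 8.35E+06 | 1.65E+07 | 3.29E+07 | 3.65E+06 | 6.37E+06 | 1.20E+07 | 2.68E+07 |
    | LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 3x | 7x | 12x | 24x | 2x | 5x | 9x | 19x |
    | LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.21E+08 | 1.00E+09 | 1.79E+09 | 3.35E+09 | 5.69E+09 | 8.68E+08 | 1.49E+09 | 2.84E+09 | - |
    | LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 5x | 10x | 18x | 30x | 4x | 7x | 15x | - |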

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    develop_cde2498

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

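    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | MILC | Total Time (sec) | Apex Medium | no | 13,735 | 1,173 | 632 | 356 | 216 | 1,212 | 679 | 373 | 266 |
    | MILC | NRF | Apex Medium | yes | 1x | 12x | 19x | 34x | 57x | 11x | 18x | 33x | 46x |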

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    3

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

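    | Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 100.72 | 299 | 596 | 1,181 | 2,300 | 273 | 550 | 1,106 | 2,209 |
    | NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 3x | 6x | 12x | 23x | 3x | 5x | 11x | 22x |
    | NAMD [LaINDY ColVars] | ns/day | LaINDY ColVars | yes | 50.56 | 87 | 174 | 346 | 689 | 84 | 162 | 325 | 646 |
    | NAMD [LaINDY ColVars] | NRF | LaINDY ColVars | yes | 1x | 2x | 3x | 7x | 14x | 2x | 3x | 6x | 13x |
    | NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 108.79 | 381 | 757 | 1,494 | 2,935 | 353 | 706 | 1,412 | 2,737 |
    | NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 7x | 14x | 27x | 3x | 6x | 13x | 25x |
    | NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 10.53 | 24 | 49 | 97 | 196 | 23 | 46 | 92 | 184 |
    | NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 2x | 5x | 9x | 19x | 2x | 4x | 9x | 18x |
    | NAMD [COVID-19 Spike Assembly] | ns/day | COVID-19 Spike Assembly | yes | 0.75 | 3 | 6 | 11 | 18 | 3 | 5 | 8 | - |
    | NAMD [COVID-19 Spike Assembly] | NRF | COVID-19 Spike Assembly | yes | 1x | 4x | 8x | 14x | 24x | 4x | 6x | 10x | - |
    | NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 10.87 | 31 | 62 | 123 | 247 | 29 | 57 | 114 | 227 |
    | NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 3x | 6x | 11x | 23x | 3x | 5x | 10x | 21x |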

    NAMD is measured by running multiple independent instances using MPS, except for the NAMD [COVID-19 Spike Assembly] dataset
    Trifan A, Gorgun D, Salim M, et al. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications. 2022;36(5-6):603-623. doi:10.1177/10943420221113513
    D. B. Sauer, N. Trebesch, J. J. Marden, N. Cocco, J. Song, A. Koide, S. Koide, E. Tajkhorshid, and D.-N. Wang. "Structural basis for the reaction cycle of DASS dicarboxylate transporters." eLife. 9, e61350 (2020). https://doi.org/10.7554/eLife.61350


    RELION

    Microscopy

    Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

    VERSION

    5.0.0

    ACCELERATED FEATURES

    • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL
    Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosome on Relion-3.0 | no | 8,981 | 2,137 | 1,288 | 1,059 | 1,005 | 2,458 | 1,219 | 1,035
    Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosome on Relion-3.0 | yes | 1x | 4x | 7x | 8x | 9x | 4x | 7x | 9x

    RTM

    Geoscience

    Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

    VERSION

    nvidia_2024_01

    ACCELERATED FEATURES

    • Batch algorithm

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.tsunamidevelopment.com/assets/rtm.pdf

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL
    RTM [Isotropic Radius 4] | Mcell/s | Isotropic Radius 4 | yes | 21,047 | 157,252 | 313,545 | 625,242 | 1,250,439 | 153,662 | 292,630 | 589,562 | 1,214,770
    RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 7x | 15x | 30x | 59x | 7x | 14x | 28x | 58x
    RTM [TTI Radius 8 1-pass] | Mcell/s | TTI Radius 8 1-pass | yes | 7,213 | 30,824 | 61,529 | 122,504 | 244,246 | 25,597 | 49,607 | 94,039 | 197,162
    RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 4x | 9x | 17x | 34x | 4x | 7x | 13x | 27x
    RTM [TTI RX 2Pass mgpu] | Mcell/s | TTI RX 2Pass mgpu | yes | 7,213 | 26,711 | 53,090 | 105,394 | 210,086 | 23,978 | 46,001 | 92,576 | 186,835
    RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 4x | 7x | 15x | 29x | 3x | 6x | 13x | 26x

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    4.1.1

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL
    SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 186 | 46 | 24 | 14 | 10 | 50 | 26 | 14 | 9
    SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 4x | 9x | 16x | 22x | 4x | 6x | 15x | 24x


    Detailed L40S application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    24-AT_24

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
    AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 11.71 | 179 | 356 | 728 | 1,582
    AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 15x | 30x | 62x | 135x
    AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 11.69 | 183 | 372 | 739 | 1,580
    AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 16x | 32x | 63x | 135x
    AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 93.36 | 977 | 2,004 | 4,017 | 8,935
    AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 10x | 21x | 43x | 96x
    AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 99.50 | 1,020 | 2,060 | 4,166 | 9,026
    AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 10x | 21x | 42x | 91x
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 377.04 | 4,150 | 8,389 | 17,112 | 35,769
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 11x | 22x | 45x | 95x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 397.04 | 4,240 | 8,706 | 17,762 | -
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 11x | 22x | 45x | -
    AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.69 | 74 | 148 | 296 | 592
    AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 20x | 40x | 80x | 160x
    AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.07 | 194 | 388 | 776 | 1,552
    AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 8x | 15x | 31x | 62x

    AMBER is measured by running multiple independent instances using MPS
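
    For the throughput (ns/day) rows above, the listed NRF values appear to line up with the GPU throughput divided by the CPU-only throughput, rounded to the nearest whole number. The sketch below checks that reading against the AMBER [PME-STMV_NPT_4fs] row. It is an assumption made for illustration, not the published NRF procedure (which measures up to 8 CPU-only servers and then scales linearly), and rows for other metrics, especially time-based ones, may not follow it exactly. The names `throughput_ratio`, `GPU_NS_PER_DAY`, and `LISTED_NRF` are hypothetical.

```python
# Values copied from the AMBER [PME-STMV_NPT_4fs] L40S rows (ns/day, bigger is better).
CPU_NS_PER_DAY = 3.69
GPU_NS_PER_DAY = {1: 74, 2: 148, 4: 296, 8: 592}   # keyed by number of L40S GPUs
LISTED_NRF = {1: 20, 2: 40, 4: 80, 8: 160}          # NRF row from the same table


def throughput_ratio(gpu_value: float, cpu_value: float) -> float:
    """GPU-server throughput divided by CPU-only server throughput.

    Assumption: for bigger-is-better metrics this ratio approximates the NRF.
    """
    return gpu_value / cpu_value


ratios = {n: throughput_ratio(v, CPU_NS_PER_DAY) for n, v in GPU_NS_PER_DAY.items()}
rounded = {n: round(r) for n, r in ratios.items()}

for n in sorted(ratios):
    print(f"{n}x L40S: ratio {ratios[n]:.1f} -> {rounded[n]}x (listed {LISTED_NRF[n]}x)")
```

    For this row the rounded ratios agree with the listed NRF values at every GPU count.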


    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V2025.01

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 2x L40S | 4x L40S | 8x L40S
    Chroma | Final Timestep Time (Sec) | HMC Medium | no | 10,037 | 367 | 343 | 152
    Chroma | NRF | HMC Medium | yes | 1x | 28x | 30x | 67x

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for modeling fluid flow in aeronautics and space technology applications

    VERSION

    14.1

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 4x L40S | 8x L40S
    Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 154 | 41 | 23
    Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 6x | 10x
    Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 474 | 105 | 57
    Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 7x | 12x
    Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 628 | 165 | 89
    Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | 5x | 9x
    Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 2,011 | - | 237
    Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | 12x

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    h-bond - 2025-rc

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
    GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 362 | 640 | 1,353 | 2,712 | 5,520
    GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 4x | 7x | 15x
    GROMACS [STMV] | ns/day | STMV | yes | 20 | 44 | 73 | 113 | -
    GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 4x | 6x | -

    GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V4.5 updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
    GTC | Mpush/Sec | mpi#proc.in | yes | 146 | 439 | 726 | 1,583 | 3,007
    GTC | NRF | mpi#proc.in | yes | 1x | 3x | 5x | 12x | 22x

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    develop_cde2498

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S
    MILC | Total Time (sec) | Apex Medium | no | 13,735 | 4,046 | 2,047 | 1,438
    MILC | NRF | Apex Medium | yes | 1x | 3x | 6x | 8x

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    3

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
    NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 100.72 | 230 | 457 | 900 | 1,816
    NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 2x | 5x | 9x | 18x
    NAMD [LaINDY ColVars] | ns/day | LaINDY ColVars | yes | 50.56 | 62 | 125 | 248 | 496
    NAMD [LaINDY ColVars] | NRF | LaINDY ColVars | yes | 1x | 1x | 2x | 5x | 10x
    NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 108.79 | 300 | 597 | 1,200 | 2,354
    NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 3x | 5x | 11x | 22x
    NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 10.53 | 17 | 34 | 68 | 136
    NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 2x | 3x | 6x | 13x
    NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 10.87 | 23 | 46 | 92 | 183
    NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 2x | 4x | 8x | 17x

    NAMD is measured by running multiple independent instances using MPS, except for the NAMD [COVID-19 Spike Assembly] dataset
    Trifan A, Gorgun D, Salim M, et al. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications. 2022;36(5-6):603-623. doi:10.1177/10943420221113513
    D. B. Sauer, N. Trebesch, J. J. Marden, N. Cocco, J. Song, A. Koide, S. Koide, E. Tajkhorshid, and D.-N. Wang. "Structural basis for the reaction cycle of DASS dicarboxylate transporters." eLife. 9, e61350 (2020). https://doi.org/10.7554/eLife.61350


    RTM

    Geoscience

    Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

    VERSION

    nvidia_2024_01

    ACCELERATED FEATURES

    • Batch algorithm

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.tsunamidevelopment.com/assets/rtm.pdf

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S
    RTM [Isotropic Radius 4] | Mcell/s | Isotropic Radius 4 | yes | 21,047 | 42,366 | 84,432 | 168,028 | 336,068
    RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 2x | 4x | 8x | 16x
    RTM [TTI Radius 8 1-pass] | Mcell/s | TTI Radius 8 1-pass | yes | 7,213 | 14,644 | 28,937 | 57,176 | 114,205
    RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 2x | 4x | 8x | 16x
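
    One way to read the multi-GPU columns is as a scaling-efficiency check: throughput on N GPUs divided by N times the single-GPU throughput. The sketch below applies that to the RTM [Isotropic Radius 4] Mcell/s row above. The efficiency metric and the helper name `scaling_efficiency` are illustrative assumptions and are not part of the published data.

```python
from __future__ import annotations

# Mcell/s values copied from the RTM [Isotropic Radius 4] L40S row, keyed by GPU count.
THROUGHPUT = {1: 42_366, 2: 84_432, 4: 168_028, 8: 336_068}


def scaling_efficiency(throughput: dict[int, float]) -> dict[int, float]:
    """Return throughput(N) / (N * throughput(1)) for each GPU count N.

    A value of 1.0 means perfectly linear scaling relative to one GPU.
    """
    base = throughput[1]
    return {n: value / (n * base) for n, value in throughput.items()}


efficiency = scaling_efficiency(THROUGHPUT)

for n in sorted(efficiency):
    print(f"{n}x L40S: {efficiency[n]:.1%} of linear scaling")
```

    For this row the efficiency stays above 99% through 8 GPUs, which matches the NRF row doubling at each step.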


    Detailed L4 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    24-AT_24

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L4 | 2x L4 | 4x L4 | 8x L4
    AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 11.71 | 55 | 109 | 220 | 440
    AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 5x | 9x | 19x | 38x
    AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 11.69 | 56 | 111 | 220 | 442
    AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 5x | 10x | 19x | 38x
    AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 93.36 | 266 | 536 | 1,065 | 2,145
    AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 3x | 6x | 11x | 23x
    AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 99.50 | 272 | 544 | 1,093 | 2,231
    AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 3x | 5x | 11x | 22x
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 377.04 | 1,281 | 2,519 | 5,144 | 10,383
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 3x | 7x | 14x | 28x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 397.04 | 1,280 | 2,567 | 5,176 | 10,395
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 3x | 6x | 13x | 26x
    AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.69 | 21 | 41 | 83 | 166
    AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 6x | 11x | 22x | 45x
    AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.07 | 113 | 226 | 451 | 902
    AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 4x | 9x | 18x | 36x

    AMBER is measured by running multiple independent instances using MPS


    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V4.5 updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 4x L4 | 8x L4
    GTC | Mpush/Sec | mpi#proc.in | yes | 136 | 657 | 1,244
    GTC | NRF | mpi#proc.in | yes | 1x | 5x | 10x

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    develop_cde2498

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

    Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 2x L4 | 4x L4 | 8x L4
    MILC | Total Time (sec) | Apex Medium | no | 16,570 | 5,873 | 3,000 | 1,618
    MILC | NRF | Apex Medium | yes | 1x | 3x | 5x | 9x