• <xmp id="om0om">
  • For Deep Learning performance, please go here.


    Modern HPC data centers are key to solving some of the world’s most important scientific and engineering challenges. NVIDIA data center GPUs change the economics of the data center: they deliver breakthrough performance with far fewer servers, lower power consumption, and less networking overhead, for total cost savings of 5X-10X.

    The number of CPU-only servers replaced by a single GPU-accelerated server is called the node replacement factor (NRF). To arrive at the NRF, we measure application performance on up to 8 CPU-only servers, then extrapolate linearly beyond 8 servers. The NRF varies by application.
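As a rough illustration of the NRF definition above, the sketch below estimates an NRF from one GPU-server measurement and one CPU-only-server measurement. It assumes CPU-only performance scales perfectly linearly with server count from a single server; the function name `node_replacement_factor` and its parameters are hypothetical, not part of any NVIDIA tool. The published NRF values are derived from measurements on up to 8 CPU-only servers, so they can differ from this single-server ratio for some applications.

```python
def node_replacement_factor(gpu_server_perf, cpu_server_perf, bigger_is_better=True):
    """Estimate how many CPU-only servers one GPU server replaces.

    Assumption (simplification): CPU-only performance scales linearly with
    the number of servers, starting from a single-server measurement.
    """
    if gpu_server_perf <= 0 or cpu_server_perf <= 0:
        raise ValueError("performance values must be positive")
    if bigger_is_better:
        # Throughput metrics such as ns/day: a higher value is faster.
        ratio = gpu_server_perf / cpu_server_perf
    else:
        # Time metrics such as seconds: a lower value is faster.
        ratio = cpu_server_perf / gpu_server_perf
    return round(ratio)


# AMBER PME-Cellulose_NPT_4fs: 4.13 ns/day CPU-only, 318 ns/day on 1x H100 SXM.
print(f"{node_replacement_factor(318, 4.13)}x")  # prints "77x"
```

For time-based metrics, pass `bigger_is_better=False` so that the ratio is inverted.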

    Engineering

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Intel SPR 8480C@2GHz with 4x NVIDIA H100 SXM 80GB | FUN3D Benchmark: dpw_wbt0_crs-3.6Mn_5, CUDA Version: 11.8

    Geoscience

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual SPR 8480C@2GHz with 4x NVIDIA H100 SXM 80GB | ICON Benchmark: QUBICC 160 km resolution, CUDA Version: 11.8 | RTM Benchmark: Isotropic Radius 4, CUDA Version: 11.8 | SPECFEM3D Benchmark: four_material_simple_model, CUDA Version: 11.8

    Molecular Dynamics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual SPR 8480C@2GHz with 4x NVIDIA H100 SXM 80GB | AMBER Benchmark: DC-Cellulose_NPT, CUDA Version: 11.8 | GROMACS Benchmark: STMV, CUDA Version: 11.8 | LAMMPS Benchmark: SNAP, CUDA Version: 11.8 | NAMD Benchmark: apoa1_nve_cuda, CUDA Version: 11.8

    Physics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual SPR 8480C@2GHz with 4x NVIDIA H100 SXM 80GB | Chroma Benchmark: szscl21_24_128, CUDA Version: 11.3.1 | GTC Benchmark: moi#proc.in, CUDA Version: 11.8 | MILC Benchmark: Apex Medium, CUDA Version: 11.8


    Detailed H100 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    22.0-AT_22.3

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 4.13 | 318 | 633 | 1,264
    AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 77x | 153x | 306x
    AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 4.12 | 313 | 649 | 1,260
    AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 76x | 157x | 306x
    AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 20.71 | 1,326 | 2,680 | 5,330
    AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 64x | 129x | 257x
    AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 20.95 | 1,356 | 2,721 | 5,416
    AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 65x | 130x | 259x
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 84.61 | 4,540 | 9,197 | 17,967
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 54x | 109x | 212x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 85.16 | 4,590 | 9,301 | 20,152
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 54x | 109x | 237x
    AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 1.38 | 85 | 170 | 340
    AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 62x | 123x | 246x
    AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 9.89 | 194 | 388 | 776
    AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 20x | 39x | 78x
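The multi-GPU columns in a table like the one above can also be read as scaling efficiency: how close the 2x and 4x results come to a perfect multiple of the 1x result. The following is a minimal sketch for throughput metrics (bigger is better), using the AMBER PME-Cellulose_NPT_4fs ns/day figures from this table; the helper name `scaling_efficiency` is hypothetical.

```python
def scaling_efficiency(perf_by_gpu_count):
    """Return {gpu_count: efficiency} relative to the smallest GPU count measured.

    An efficiency of 1.0 means throughput grew in exact proportion to the
    number of GPUs. Only meaningful for bigger-is-better metrics.
    """
    base_count = min(perf_by_gpu_count)
    base_perf = perf_by_gpu_count[base_count]
    return {
        n: perf / (base_perf * n / base_count)
        for n, perf in sorted(perf_by_gpu_count.items())
    }


# ns/day for AMBER PME-Cellulose_NPT_4fs on 1x, 2x, and 4x H100 SXM.
amber_cellulose_npt = {1: 318, 2: 633, 4: 1264}
for n, eff in scaling_efficiency(amber_cellulose_npt).items():
    print(f"{n}x H100 SXM: {eff:.1%}")
```

For time-based metrics (bigger is better: no), invert each value first so that a larger number means faster.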

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    13.7 (update 1)

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    FUN3D | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 495 | 30 | 17 | 10
    FUN3D | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 21x | 37x | 59x

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2022.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 67 | 626 | 723 | 896
    GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 12x | 14x | 18x
    GROMACS [Cellulose] | ns/day | Cellulose | yes | 19 | 189 | 246 | 350
    GROMACS [Cellulose] | NRF | Cellulose | yes | 1x | 14x | 19x | 27x
    GROMACS [STMV] | ns/day | STMV | yes | 4 | 42 | 68 | 115
    GROMACS [STMV] | NRF | STMV | yes | 1x | 10x | 17x | 28x

    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V 4.5 Updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    GTC | Mpush/Sec | moi#proc.in | yes | 35 | 758 | 1,370 | 2,441
    GTC | NRF | moi#proc.in | yes | 1x | 22x | 40x | 71x

    ICON

    Weather and Climate

    A global unified atmosphere model for numerical weather prediction and climate modeling research

    VERSION

    2.6.5_RC

    ACCELERATED FEATURES

    • Full model of dynamics and physics

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://code.mpimet.mpg.de/projects/iconpublic

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 2,431 | 204 | 149 | 113
    ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 12x | 16x | 22x
    ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution with radiation | no | 2,213 | 188 | 133 | 101
    ICON [QUBICC 160 km resolution] | NRF | SLAM 191 levels 160 km resolution with radiation | yes | 1x | 12x | 17x | 22x

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    stable_23Jun2022_update1

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 1.11E+08 | 1.07E+09 | 1.93E+09 | 3.35E+09
    LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 10x | 18x | 31x
    LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 5.33E+07 | 5.13E+08 | 9.12E+08 | 1.60E+09
    LAMMPS [EAM] | NRF | EAM | yes | 1x | 10x | 18x | 31x
    LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 4.45E+05 | 1.02E+07 | 1.84E+07 | 3.04E+07
    LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 31x | 57x | 94x
    LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.08E+05 | 3.87E+06 | 7.69E+06 | 1.52E+07
    LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 37x | 74x | 147x
    LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.77E+07 | 9.16E+08 | 1.64E+09 | 2.95E+09
    LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 34x | 60x | 108x

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    feature/gauge-action-quda_16a2d47119

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    MILC | Total Time (Sec) | Apex Medium | no | 71,595 | 1,172 | 634 | 355
    MILC | NRF | Apex Medium | yes | 1x | 67x | 124x | 222x

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    GPU, AMD CPU V 3.0a13 ; Intel CPU V 2.15a AVX512

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    NAMD [apoa1_npt_cuda] | Ave ns/day | apoa1_npt_cuda | yes | 19.15 | 284 | 549 | 1,048
    NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 15x | 29x | 55x
    NAMD [apoa1_nptsr_cuda] | Ave ns/day | apoa1_nptsr_cuda | yes | 19.59 | 291 | 570 | 1,124
    NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 15x | 29x | 57x
    NAMD [apoa1_nve_cuda] | Ave ns/day | apoa1_nve_cuda | yes | 20.75 | 363 | 698 | 1,386
    NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 17x | 34x | 67x
    NAMD [stmv_npt_cuda] | Ave ns/day | stmv_npt_cuda | yes | 1.87 | 23 | 45 | 89
    NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 12x | 24x | 48x
    NAMD [stmv_nptsr_cuda] | Ave ns/day | stmv_nptsr_cuda | yes | 1.81 | 23 | 46 | 92
    NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 13x | 26x | 51x
    NAMD [stmv_nve_cuda] | Ave ns/day | stmv_nve_cuda | yes | 1.94 | 27 | 54 | 108
    NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 14x | 28x | 56x

    RTM

    Geoscience

    Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

    VERSION

    nvidia_2021_05

    ACCELERATED FEATURES

    • Batch algorithm

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.tsunamidevelopment.com/assets/rtm.pdf

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    RTM [Isotropic Radius 4] | Mcells/s | Isotropic Radius 4 | yes | 11,318 | 124,975 | 249,251 | 498,066
    RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 11x | 22x | 44x
    RTM [TTI Radius 8 1-pass] | Mcells/s | TTI Radius 8 1-pass | yes | 3,773 | 22,109 | 44,135 | 88,094
    RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 6x | 12x | 23x
    RTM [TTI RX 2Pass mgpu] | Mcells/s | TTI RX 2Pass mgpu | yes | 3,773 | 21,704 | 43,088 | 85,784
    RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 6x | 11x | 23x

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    devel_fef2ace9

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM
    SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 1,268 | 46 | 24 | 14
    SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 32x | 59x | 105x

    Engineering

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7763@2.45GHz with 4x NVIDIA L40 | FUN3D Benchmark: dpw_wbt0_crs-3.6Mn_5, CUDA Version: 11.8

    Molecular Dynamics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7763@2.45GHz with 4x NVIDIA L40 | AMBER Benchmark: DC-STMV_NPT, CUDA Version: 11.8 | GROMACS Benchmark: STMV, CUDA Version: 11.8 | LAMMPS Benchmark: SNAP, CUDA Version: 11.8 | NAMD Benchmark: apoa1_nve_cuda, CUDA Version: 11.8


    Detailed L40 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    22.0-AT_22.3

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x L40 | 2x L40 | 4x L40 | 8x L40
    AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 20.71 | 779 | 1,575 | 3,147 | 6,303
    AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 38x | 76x | 152x | 304x
    AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 20.95 | 797 | 1,613 | 3,182 | 6,428
    AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 38x | 77x | 152x | 307x
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 84.61 | 3,270 | 6,561 | 12,901 | 26,316
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 39x | 78x | 152x | 311x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 85.16 | 3,297 | 6,656 | 13,250 | 26,477
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 39x | 78x | 156x | 311x
    AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 1.38 | 62 | 124 | 248 | 497
    AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 45x | 90x | 180x | 360x

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    13.7 (update 1)

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x L40 | 2x L40 | 4x L40 | 8x L40
    FUN3D | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 495 | 119 | 61 | 32 | 19
    FUN3D | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 5x | 10x | 19x | 32x

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2022.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x L40 | 2x L40 | 4x L40
    GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 67 | 566 | - | -
    GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 11x | - | -
    GROMACS [Cellulose] | ns/day | Cellulose | yes | 19 | 161 | - | 212
    GROMACS [Cellulose] | NRF | Cellulose | yes | 1x | 12x | - | 16x
    GROMACS [STMV] | ns/day | STMV | yes | 4 | 33 | 55 | 81
    GROMACS [STMV] | NRF | STMV | yes | 1x | 8x | 13x | 20x

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    stable_23Jun2022_update1

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x L40 | 2x L40 | 4x L40 | 8x L40
    LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 4.45E+05 | 1.51E+06 | 2.89E+06 | 5.35E+06 | 8.11E+06
    LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 4x | 9x | 16x | 25x
    LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.08E+05 | 5.82E+05 | 1.16E+06 | 2.32E+06 | 4.58E+06
    LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 6x | 11x | 22x | 44x
    LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.77E+07 | 1.23E+08 | 2.40E+08 | 4.63E+08 | 7.01E+08
    LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 4x | 9x | 17x | 26x

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    GPU, AMD CPU V 3.0a13 ; Intel CPU V 2.15a AVX512

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x L40 | 2x L40 | 4x L40 | 8x L40
    NAMD [apoa1_npt_cuda] | Ave ns/day | apoa1_npt_cuda | yes | 19.15 | 188 | 386 | 761 | 1,542
    NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 10x | 20x | 40x | 81x
    NAMD [apoa1_nptsr_cuda] | Ave ns/day | apoa1_nptsr_cuda | yes | 19.59 | 191 | 388 | 767 | 1,571
    NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 10x | 20x | 39x | 80x
    NAMD [apoa1_nve_cuda] | Ave ns/day | apoa1_nve_cuda | yes | 20.75 | 240 | 481 | 970 | 1,917
    NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 12x | 23x | 47x | 92x
    NAMD [stmv_npt_cuda] | Ave ns/day | stmv_npt_cuda | yes | 1.87 | 15 | 30 | 59 | 120
    NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 8x | 16x | 32x | 64x
    NAMD [stmv_nptsr_cuda] | Ave ns/day | stmv_nptsr_cuda | yes | 1.81 | 15 | 31 | 62 | 123
    NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 8x | 17x | 34x | 68x
    NAMD [stmv_nve_cuda] | Ave ns/day | stmv_nve_cuda | yes | 1.94 | 18 | 35 | 70 | 142
    NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 9x | 18x | 36x | 73x

    Molecular Dynamics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7763@2.45GHz with 4x NVIDIA L4 | AMBER Benchmark: DC-JAC_NPT, CUDA Version: 11.8 | GROMACS Benchmark: STMV, CUDA Version: 11.8 | NAMD Benchmark: apoa1_nve_cuda, CUDA Version: 11.8


    Detailed L4 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    22.0-AT_22.3

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x L4 | 2x L4 | 4x L4 | 8x L4
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 84.61 | 1,146 | 2,323 | 4,731 | 9,554
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 14x | 27x | 56x | 113x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 85.16 | 1,162 | 2,366 | 4,811 | 9,666
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 14x | 28x | 56x | 113x

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2022.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x L4 | 2x L4 | 4x L4 | 8x L4
    GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 67 | 209 | 346 | 464 | -
    GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 4x | 7x | 9x | -
    GROMACS [Cellulose] | ns/day | Cellulose | yes | 19 | 57 | 94 | 133 | 162
    GROMACS [Cellulose] | NRF | Cellulose | yes | 1x | 3x | 6x | 10x | 12x
    GROMACS [STMV] | ns/day | STMV | yes | 4 | 12 | 22 | 43 | 63
    GROMACS [STMV] | NRF | STMV | yes | 1x | 3x | 5x | 10x | 15x

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    GPU, AMD CPU V 3.0a13 ; Intel CPU V 2.15a AVX512

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x L4 | 2x L4 | 4x L4 | 8x L4
    NAMD [apoa1_npt_cuda] | Ave ns/day | apoa1_npt_cuda | yes | 19.15 | 63 | 128 | 260 | 520
    NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 3x | 7x | 14x | 27x
    NAMD [apoa1_nptsr_cuda] | Ave ns/day | apoa1_nptsr_cuda | yes | 19.59 | - | 131 | 266 | 535
    NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | - | 7x | 14x | 27x
    NAMD [apoa1_nve_cuda] | Ave ns/day | apoa1_nve_cuda | yes | 20.75 | 86 | 172 | 347 | 701
    NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 8x | 17x | 34x
    NAMD [stmv_npt_cuda] | Ave ns/day | stmv_npt_cuda | yes | 1.87 | 5 | 9 | 18 | 37
    NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 2x | 5x | 10x | 20x
    NAMD [stmv_nptsr_cuda] | Ave ns/day | stmv_nptsr_cuda | yes | 1.81 | 5 | - | 19 | 39
    NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 3x | - | 11x | 22x
    NAMD [stmv_nve_cuda] | Ave ns/day | stmv_nve_cuda | yes | 1.94 | 6 | 12 | 24 | 47
    NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 3x | 6x | 12x | 24x

    Engineering

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A100 SXM 80GB | FUN3D Benchmark: dpw_wbt0_crs-3.6Mn_5, CUDA Version: 11.8

    Geoscience

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A100 SXM 80GB | ICON Benchmark: QUBICC 160 km resolution, CUDA Version: 11.8 | RTM Benchmark: Isotropic Radius 4, CUDA Version: 11.8 | SPECFEM3D Benchmark: four_material_simple_model, CUDA Version: 11.8

    Molecular Dynamics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A100 SXM 80GB | AMBER Benchmark: DC-Cellulose_NVE, CUDA Version: 11.8 | GROMACS Benchmark: STMV, CUDA Version: 11.8 | LAMMPS Benchmark: SNAP, CUDA Version: 11.8 | NAMD Benchmark: apoa1_nve_cuda, CUDA Version: 11.8 | Relion Benchmark: Plasmodium Ribosome (2D), CUDA Version: 11.4.2

    Physics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A100 SXM 80GB | Chroma Benchmark: szscl21_24_128, CUDA Version: 11.3.1 | MILC Benchmark: Apex Medium, CUDA Version: 11.8

    Quantum Mechanics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A100 SXM 80GB | Quantum Espresso Benchmark: AUSURF112-jR, CUDA Version: 11.8


    Detailed A100 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    22.0-AT_22.3

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 4.13 | 182 | 364 | 726 | 1,456 | 172 | 334 | 674 | 1,375
    AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 44x | 88x | 176x | 353x | 42x | 81x | 163x | 333x
    AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 4.12 | 185 | 371 | 739 | 1,483 | 176 | 340 | 686 | 1,366
    AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 45x | 90x | 179x | 360x | 43x | 83x | 167x | 331x
    AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 20.71 | 796 | 1,594 | 3,175 | 6,383 | 769 | 1,525 | 3,054 | 6,139
    AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 38x | 77x | 153x | 308x | 37x | 74x | 147x | 296x
    AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 20.95 | 813 | 1,631 | 3,257 | 6,532 | 781 | 1,514 | 3,132 | 6,303
    AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 39x | 78x | 155x | 312x | 37x | 72x | 149x | 301x
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 84.61 | 2,883 | 5,761 | 11,512 | 23,433 | 2,819 | 5,476 | 11,121 | 23,249
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 34x | 68x | 136x | 277x | 33x | 65x | 131x | 275x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 85.16 | 2,953 | 5,894 | 11,693 | 23,935 | 2,900 | 5,787 | 11,396 | 23,903
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 35x | 69x | 137x | 281x | 34x | 68x | 134x | 281x
    AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 1.38 | 54 | 107 | 214 | 429 | 53 | 107 | 214 | 427
    AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 39x | 78x | 155x | 311x | 39x | 77x | 155x | 310x
    AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 9.89 | 133 | 266 | 533 | 1,066 | 134 | 268 | 536 | 1,073
    AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 13x | 27x | 54x | 108x | 14x | 27x | 54x | 108x

    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V 2021.08

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    Chroma | Total Time (Sec) | szscl21_24_128 | no | 1,115 | 36 | 20 | 11 | 7 | 44 | 25 | 13 | 9
    Chroma | NRF | szscl21_24_128 | yes | 1x | 32x | 55x | 99x | 163x | 26x | 46x | 84x | 129x

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    13.7 (update 1)

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    FUN3D | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 495 | 52 | 28 | 16 | 11 | 54 | 29 | 16 | 13
    FUN3D | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 12x | 22x | 39x | 55x | 11x | 21x | 39x | 49x

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2022.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 67 | 372 | 506 | 677 | - | 389 | - | 518 | -
    GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 7x | 10x | 13x | - | 8x | - | 10x | -
    GROMACS [Cellulose] | ns/day | Cellulose | yes | 19 | 108 | 174 | 254 | 290 | 108 | 122 | 183 | -
    GROMACS [Cellulose] | NRF | Cellulose | yes | 1x | 8x | 13x | 19x | 22x | 8x | 9x | 14x | -
    GROMACS [STMV] | ns/day | STMV | yes | 4 | 24 | 44 | 80 | 128 | 24 | 39 | 65 | 92
    GROMACS [STMV] | NRF | STMV | yes | 1x | 5x | 11x | 20x | 31x | 5x | 9x | 16x | 22x

    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V 4.5 Updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    GTC | Mpush/Sec | moi#proc.in | yes | 35 | 472 | 898 | 3,622 | 478 | 909 | 1,755 | 2,706
    GTC | NRF | moi#proc.in | yes | 1x | 14x | 26x | 105x | 14x | 26x | 51x | 79x

    ICON

    Weather and Climate

    A global unified atmosphere model for numerical weather prediction and climate modeling research

    VERSION

    2.6.5_RC

    ACCELERATED FEATURES

    • Full model of dynamics and physics

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://code.mpimet.mpg.de/projects/iconpublic

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB
    ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 2,431 | 317 | 218 | 158 | 134 | 318 | 224 | 165
    ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 8x | 11x | 15x | 18x | 8x | 11x | 15x
    ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution with radiation | no | 2,213 | 293 | 197 | 144 | 120 | 291 | 192 | 140
    ICON [QUBICC 160 km resolution] | NRF | SLAM 191 levels 160 km resolution with radiation | yes | 1x | 8x | 11x | 15x | 18x | 8x | 12x | 16x

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    stable_23Jun2022_update1

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 1.11E+08 | 6.00E+08 | 1.12E+09 | 2.01E+09 | 3.66E+09 | 6.00E+08 | 1.07E+09 | 1.81E+09 | -
    LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 6x | 10x | 19x | 34x | 6x | 10x | 17x | -
    LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 5.33E+07 | 2.93E+08 | 5.35E+08 | 9.23E+08 | 1.58E+09 | 2.88E+08 | 5.04E+08 | 8.48E+08 | -
    LAMMPS [EAM] | NRF | EAM | yes | 1x | 6x | 10x | 18x | 31x | 5x | 10x | 17x | -
    LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 4.45E+05 | 5.24E+06 | 9.68E+06 | 1.70E+07 | 2.69E+07 | 5.28E+06 | 9.53E+06 | 1.62E+07 | 1.97E+07
    LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 16x | 30x | 52x | 83x | 16x | 29x | 50x | 61x
    LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.08E+05 | 2.21E+06 | 4.39E+06 | 8.73E+06 | 1.67E+07 | 2.11E+06 | 4.09E+06 | 8.12E+06 | 1.58E+07
    LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 21x | 42x | 85x | 162x | 20x | 40x | 79x | 153x
    LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.77E+07 | 5.28E+08 | 9.81E+08 | 1.75E+09 | 2.99E+09 | 5.09E+08 | 8.74E+08 | 1.40E+09 | -
    LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 19x | 36x | 64x | 110x | 19x | 32x | 51x | -

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    feature/gauge-action-quda_16a2d47119

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB
    MILC | Total Time (Sec) | Apex Medium | no | 71,595 | 2,029 | 1,184 | 629 | 361 | 2,088 | 1,111 | 614
    MILC | NRF | Apex Medium | yes | 1x | 39x | 67x | 125x | 218x | 38x | 71x | 128x

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    GPU, AMD CPU V 3.0a13 ; Intel CPU V 2.15a AVX512

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    NAMD [apoa1_npt_cuda] | Ave ns/day | apoa1_npt_cuda | yes | 19.15 | 175 | 347 | 689 | 1,368 | 172 | 341 | 693 | 1,372
    NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 9x | 18x | 36x | 71x | 9x | 18x | 36x | 72x
    NAMD [apoa1_nptsr_cuda] | Ave ns/day | apoa1_nptsr_cuda | yes | 19.59 | 178 | 357 | 714 | 1,389 | 178 | 354 | 711 | 1,399
    NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 9x | 18x | 36x | 71x | 9x | 18x | 36x | 71x
    NAMD [apoa1_nve_cuda] | Ave ns/day | apoa1_nve_cuda | yes | 20.75 | 215 | 436 | 870 | 1,731 | 214 | 424 | 851 | 1,714
    NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 10x | 21x | 42x | 83x | 10x | 20x | 41x | 83x
    NAMD [stmv_npt_cuda] | Ave ns/day | stmv_npt_cuda | yes | 1.87 | 14 | 27 | 43 | 65 | 13 | 27 | 53 | 104
    NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 7x | 14x | 23x | 35x | 7x | 14x | 29x | 56x
    NAMD [stmv_nptsr_cuda] | Ave ns/day | stmv_nptsr_cuda | yes | 1.81 | 14 | 28 | 56 | 66 | 14 | 26 | 55 | 110
    NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 8x | 15x | 31x | 36x | 8x | 15x | 31x | 61x
    NAMD [stmv_nve_cuda] | Ave ns/day | stmv_nve_cuda | yes | 1.94 | 16 | 32 | 50 | 128 | 16 | 31 | 61 | 127
    NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 8x | 17x | 26x | 66x | 8x | 16x | 31x | 65x

    Quantum Espresso

    Material Science (Quantum Chemistry)

    An open-source suite of computer codes for electronic structure calculations and materials modeling at the nanoscale

    VERSION

    V7.0 CPU; V7.1 GPU

    ACCELERATED FEATURES

    • linear algebra (matrix multiply)
    • explicit computational kernels
    • 3D FFTs

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.quantum-espresso.org

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    Quantum Espresso | Total CPU Time (Sec) | AUSURF112-jR | no | 718 | 111 | 71 | 47 | 36 | 114 | 70 | 49 | 39
    Quantum Espresso | NRF | AUSURF112-jR | yes | 1x | 7x | 11x | 17x | 22x | 7x | 11x | 16x | 20x

    RELION

    Microscopy

    Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

    VERSION

    3.1.3

    ACCELERATED FEATURES

    • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB
    Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosome on Relion-3.0 | no | 12,742 | 2,736 | 1,627 | 1,439 | 2,601 | 1,523 | 1,383
    Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosome on Relion-3.0 | yes | 1x | 5x | 8x | 9x | 5x | 8x | 9x

    RTM

    Geoscience

    Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

    VERSION

    nvidia_2021_05

    ACCELERATED FEATURES

    • Batch algorithm

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.tsunamidevelopment.com/assets/rtm.pdf

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    RTM [Isotropic Radius 4] | Mcells/s | Isotropic Radius 4 | yes | 11,318 | 89,561 | 178,511 | 356,907 | 713,883 | 89,536 | 178,551 | 339,823 | 713,096
    RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 8x | 16x | 32x | 63x | 8x | 16x | 30x | 63x
    RTM [TTI Radius 8 1-pass] | Mcells/s | TTI Radius 8 1-pass | yes | 3,773 | 12,903 | 25,764 | 51,122 | 102,187 | 12,901 | 25,796 | 51,402 | 102,510
    RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 3x | 7x | 14x | 27x | 3x | 7x | 14x | 27x
    RTM [TTI RX 2Pass mgpu] | Mcells/s | TTI RX 2Pass mgpu | yes | 3,773 | 13,957 | 27,664 | 54,933 | 108,607 | 13,743 | 27,265 | 53,741 | 107,880
    RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 4x | 7x | 15x | 29x | 4x | 7x | 14x | 29x

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    devel_fef2ace9

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A100 SXM4 80GB | 2x A100 SXM4 80GB | 4x A100 SXM4 80GB | 8x A100 SXM4 80GB | 1x A100 PCIe 80GB | 2x A100 PCIe 80GB | 4x A100 PCIe 80GB | 8x A100 PCIe 80GB
    SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 1,268 | 77 | 40 | 21 | 13 | 78 | 41 | 22 | 15
    SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 19x | 36x | 68x | 116x | 19x | 35x | 67x | 100x

    Engineering

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A30 | FUN3D Benchmark: dpw_wbt0_crs-3.6Mn_5, CUDA Version: 11.8

    Geoscience

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A30 | ICON Benchmark: QUBICC 160 km resolution, CUDA Version: 11.8 | RTM Benchmark: Isotropic Radius 4, CUDA Version: 11.8 | SPECFEM3D Benchmark: four_material_simple_model, CUDA Version: 11.8

    Molecular Dynamics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A30 | AMBER Benchmark: DC-Cellulose_NVE, CUDA Version: 11.8 | GROMACS Benchmark: STMV, CUDA Version: 11.8 | LAMMPS Benchmark: SNAP, CUDA Version: 11.8 | NAMD Benchmark: apoa1_nve_cuda, CUDA Version: 11.8 | Relion Benchmark: Plasmodium Ribosome (2D), CUDA Version: 11.4.2

    Physics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A30 | Chroma Benchmark: szscl21_24_128, CUDA Version: 11.3.1 | GTC Benchmark: moi#proc.in, CUDA Version: 11.8 | MILC Benchmark: Apex Medium, CUDA Version: 11.8

    Quantum Mechanics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A30 | Quantum Espresso Benchmark: AUSURF112-jR, CUDA Version: 11.8


    Detailed A30 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    22.0-AT_22.3

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 4.13 | 89 | 177 | 355 | 714
    AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 22x | 43x | 86x | 173x
    AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 4.12 | 91 | 181 | 362 | 727
    AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 22x | 44x | 88x | 176x
    AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 20.71 | 406 | 811 | 1,616 | 3,241
    AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 20x | 39x | 78x | 156x
    AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 20.95 | 418 | 826 | 1,651 | 3,311
    AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 20x | 39x | 79x | 158x
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 84.61 | 1,503 | 2,989 | 5,973 | 11,932
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 18x | 35x | 71x | 141x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 85.16 | 1,531 | 3,045 | 6,077 | 12,277
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 18x | 36x | 71x | 144x
    AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 1.38 | 29 | 58 | 116 | 233
    AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 21x | 42x | 84x | 169x
    AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 9.89 | 99 | 198 | 395 | 790
    AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 10x | 20x | 40x | 80x
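
    As a rough cross-check of the AMBER NRF rows above, the GPU throughput can be divided by the CPU-only baseline. The sketch below is only an approximation under that assumption: the published NRF is derived from measurements on up to 8 CPU-only servers with linear scaling beyond that, so it need not equal a plain ratio. The helper name `approx_nrf` is hypothetical and not part of any NVIDIA tool.

```python
def approx_nrf(cpu_value, gpu_value, bigger_is_better=True):
    """Approximate the node replacement factor as a plain performance ratio.

    Assumption: one GPU server is compared against one CPU-only server.
    For throughput metrics (ns/day) the ratio is gpu / cpu; for time
    metrics (seconds) it is cpu / gpu.
    """
    if cpu_value <= 0 or gpu_value <= 0:
        raise ValueError("metric values must be positive")
    if bigger_is_better:
        return gpu_value / cpu_value
    return cpu_value / gpu_value


# AMBER [PME-Cellulose_NPT_4fs] on A30, ns/day, taken from the table above.
cpu_ns_day = 4.13
gpu_ns_day = {"1x A30": 89, "2x A30": 177, "4x A30": 355, "8x A30": 714}

amber_nrf = {cfg: round(approx_nrf(cpu_ns_day, v)) for cfg, v in gpu_ns_day.items()}
print(amber_nrf)
```

    For this AMBER row the rounded ratios agree with the published 22x, 43x, 86x, and 173x. For time-based rows (pass `bigger_is_better=False`), the published NRF can differ from the plain ratio, so treat the result as an estimate only.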

    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V 2021.08

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x A30 | 4x A30 | 8x A30
    Chroma | Total Time (Sec) | szscl21_24_128 | no | 1,115 | 35 | 18 | 11
    Chroma | NRF | szscl21_24_128 | yes | 1x | 33x | 62x | 103x

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    13.7 (update 1)

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    FUN3D | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 495 | 111 | 55 | 29 | 18
    FUN3D | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 5x | 11x | 21x | 34x

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2022.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 67 | 201 | 287 | 378 | -
    GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 3x | 6x | 7x | -
    GROMACS [Cellulose] | ns/day | Cellulose | yes | 19 | 60 | 91 | 119 | 147
    GROMACS [Cellulose] | NRF | Cellulose | yes | 1x | 3x | 5x | 9x | 11x
    GROMACS [STMV] | ns/day | STMV | yes | 4 | 12 | 22 | 41 | 59
    GROMACS [STMV] | NRF | STMV | yes | 1x | 3x | 5x | 10x | 14x

    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V 4.5 Updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    GTC | Mpush/Sec | moi#proc.in | yes | 35 | 285 | 531 | 1,049 | 1,774
    GTC | NRF | moi#proc.in | yes | 1x | 8x | 15x | 31x | 52x

    ICON

    Weather and Climate

    A global unified atmosphere model for numerical weather prediction and climate modeling research

    VERSION

    2.6.5_RC

    ACCELERATED FEATURES

    • Full model of dynamics and physics

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://code.mpimet.mpg.de/projects/iconpublic

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 2,431 | 571 | 354 | 233 | 206
    ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 4x | 7x | 10x | 12x
    ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution with radiation | no | 2,213 | 502 | 302 | 193 | 164
    ICON [QUBICC 160 km resolution] | NRF | SLAM 191 levels 160 km resolution with radiation | yes | 1x | 4x | 7x | 11x | 13x

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    stable_23Jun2022_update1

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 1.11E+08 | 3.09E+08 | 5.94E+08 | 1.10E+09 | 1.46E+09
    LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 3x | 5x | 10x | 14x
    LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 5.33E+07 | 1.37E+08 | 2.58E+08 | 4.70E+08 | 7.30E+08
    LAMMPS [EAM] | NRF | EAM | yes | 1x | 3x | 5x | 9x | 14x
    LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 4.45E+05 | 2.88E+06 | 5.52E+06 | 9.98E+06 | 1.41E+07
    LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 9x | 17x | 31x | 44x
    LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.08E+05 | 1.11E+06 | 2.19E+06 | 4.37E+06 | 8.54E+06
    LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 11x | 21x | 42x | 83x
    LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.77E+07 | 2.51E+08 | 4.37E+08 | 7.96E+08 | 1.03E+09
    LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 9x | 16x | 29x | 38x

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    feature/gauge-action-quda_16a2d47119

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    MILC | Total Time (Sec) | Apex Medium | no | 71,595 | 4,710 | 2,025 | 1,087 | 697
    MILC | NRF | Apex Medium | yes | 1x | 17x | 39x | 72x | 113x

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    GPU, AMD CPU V 3.0a13 ; Intel CPU V 2.15a AVX512

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    NAMD [apoa1_npt_cuda] | Ave ns/day | apoa1_npt_cuda | yes | 19.15 | 91 | 181 | 362 | 726
    NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 5x | 9x | 19x | 38x
    NAMD [apoa1_nptsr_cuda] | Ave ns/day | apoa1_nptsr_cuda | yes | 19.59 | 94 | 187 | 371 | 745
    NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 5x | 10x | 19x | 38x
    NAMD [apoa1_nve_cuda] | Ave ns/day | apoa1_nve_cuda | yes | 20.75 | 111 | 221 | 441 | 882
    NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 5x | 11x | 21x | 42x
    NAMD [stmv_npt_cuda] | Ave ns/day | stmv_npt_cuda | yes | 1.87 | 7 | 14 | 29 | 58
    NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 4x | 8x | 15x | 31x
    NAMD [stmv_nptsr_cuda] | Ave ns/day | stmv_nptsr_cuda | yes | 1.81 | 7 | 15 | 30 | 59
    NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 4x | 8x | 16x | 33x
    NAMD [stmv_nve_cuda] | Ave ns/day | stmv_nve_cuda | yes | 1.94 | 8 | 16 | 32 | 65
    NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 4x | 8x | 17x | 34x

    RELION

    Microscopy

    Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

    VERSION

    3.1.3

    ACCELERATED FEATURES

    • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosome on Relion-3.0 | no | 12,742 | 3,417 | 1,861 | 1,423 | 1,297
    Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosome on Relion-3.0 | yes | 1x | 4x | 7x | 9x | 10x

    RTM

    Geoscience

    Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

    VERSION

    nvidia_2021_05

    ACCELERATED FEATURES

    • Batch algorithm

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.tsunamidevelopment.com/assets/rtm.pdf

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    RTM [Isotropic Radius 4] | Mcells/s | Isotropic Radius 4 | yes | 11,318 | 44,051 | 87,806 | 175,438 | 350,760
    RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 4x | 8x | 16x | 31x
    RTM [TTI Radius 8 1-pass] | Mcells/s | TTI Radius 8 1-pass | yes | 3,773 | 6,757 | 13,361 | 26,710 | 53,281
    RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 2x | 4x | 7x | 14x
    RTM [TTI RX 2Pass mgpu] | Mcells/s | TTI RX 2Pass mgpu | yes | 3,773 | 7,026 | 13,899 | 27,642 | 55,140
    RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 2x | 4x | 7x | 15x
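
    The RTM throughput rows above roughly double with each doubling of the GPU count. One way to quantify that is a parallel scaling efficiency relative to the 1-GPU run. The sketch below is an illustrative calculation, not part of the published methodology, and the helper name `scaling_efficiency` is hypothetical.

```python
def scaling_efficiency(throughput_by_gpus):
    """Return measured speedup divided by ideal speedup for each GPU count.

    throughput_by_gpus maps GPU count -> throughput (bigger is better).
    The smallest GPU count is used as the baseline, so its efficiency is 1.0.
    """
    base_gpus = min(throughput_by_gpus)
    base = throughput_by_gpus[base_gpus]
    return {
        n: (t / base) / (n / base_gpus)
        for n, t in sorted(throughput_by_gpus.items())
    }


# RTM [Isotropic Radius 4] on A30, Mcells/s, taken from the table above.
rtm_iso_a30 = {1: 44_051, 2: 87_806, 4: 175_438, 8: 350_760}

eff = scaling_efficiency(rtm_iso_a30)
for gpus, value in eff.items():
    print(f"{gpus}x A30: {value:.3f}")
```

    Values close to 1.0 indicate near-linear scaling across GPUs for this benchmark. The same helper can be applied to any other throughput row; time-based rows would first need to be inverted.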

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    devel_fef2ace9

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A30 | 2x A30 | 4x A30 | 8x A30
    SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 1,268 | 156 | 80 | 41 | 23
    SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 9x | 18x | 35x | 64x

    Engineering

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A40 | FUN3D Benchmark: dpw_wbt0_crs-3.6Mn_5, CUDA Version: 11.8

    Geoscience

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A40 | ICON Benchmark: QUBICC 160 km resolution, CUDA Version: 11.8 | SPECFEM3D Benchmark: four_material_simple_model, CUDA Version: 11.8

    Molecular Dynamics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A40 | AMBER Benchmark: DC-Cellulose_NVE, CUDA Version: 11.8 | GROMACS Benchmark: STMV, CUDA Version: 11.8 | LAMMPS Benchmark: SNAP, CUDA Version: 11.8 | NAMD Benchmark: apoa1_nve_cuda, CUDA Version: 11.8 | Relion Benchmark: Plasmodium Ribosome (2D), CUDA Version: 11.4.2

    Physics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual EPYC 7742@2.25GHz with 4x NVIDIA A40 | Chroma Benchmark: szscl21_24_128, CUDA Version: 11.3.1 | GTC Benchmark: moi#proc.in, CUDA Version: 11.8 | MILC Benchmark: Apex Medium, CUDA Version: 11.8


    Detailed A40 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    22.0-AT_22.3

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 4.13 | 97 | 195 | 390 | 781
    AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 23x | 47x | 94x | 189x
    AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 4.12 | 98 | 198 | 396 | 794
    AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 24x | 48x | 96x | 193x
    AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 20.71 | 486 | 984 | 1,965 | 3,954
    AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 23x | 48x | 95x | 191x
    AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 20.95 | 497 | 1,006 | 2,015 | 4,022
    AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 24x | 48x | 96x | 192x
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 84.61 | 1,922 | 3,889 | 7,780 | 15,568
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 23x | 46x | 92x | 184x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 85.16 | 1,948 | 3,946 | 7,906 | 16,037
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 23x | 46x | 93x | 188x
    AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 1.38 | 32 | 63 | 127 | 254
    AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 23x | 46x | 92x | 184x
    AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 9.89 | 116 | 232 | 463 | 926
    AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 12x | 23x | 47x | 94x

    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V 2021.08

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    Chroma | Total Time (Sec) | szscl21_24_128 | no | 1,115 | 78 | 41 | 22 | 13
    Chroma | NRF | szscl21_24_128 | yes | 1x | 15x | 28x | 52x | 89x

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    13.7 (update 1)

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    FUN3D | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 495 | 231 | 117 | 59 | 32
    FUN3D | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 2x | 5x | 10x | 19x

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2022.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 67 | 340 | 379 | 505 | -
    GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 7x | 8x | 10x | -
    GROMACS [Cellulose] | ns/day | Cellulose | yes | 19 | 77 | 110 | 160 | 177
    GROMACS [Cellulose] | NRF | Cellulose | yes | 1x | 5x | 8x | 12x | 13x
    GROMACS [STMV] | ns/day | STMV | yes | 4 | 20 | 38 | 61 | 75
    GROMACS [STMV] | NRF | STMV | yes | 1x | 5x | 9x | 15x | 18x

    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V 4.5 Updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    GTC | Mpush/Sec | moi#proc.in | yes | 35 | 305 | 563 | 1,112 | 1,854
    GTC | NRF | moi#proc.in | yes | 1x | 9x | 16x | 32x | 54x

    ICON

    Weather and Climate

    A global unified atmosphere model for numerical weather prediction and climate modeling research

    VERSION

    2.6.5_RC

    ACCELERATED FEATURES

    • Full model of dynamics and physics

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://code.mpimet.mpg.de/projects/iconpublic

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 2,431 | 741 | 420 | 262 | 223
    ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 3x | 6x | 9x | 11x
    ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution with radiation | no | 2,213 | 747 | 415 | 253 | 192
    ICON [QUBICC 160 km resolution] | NRF | SLAM 191 levels 160 km resolution with radiation | yes | 1x | 3x | 5x | 9x | 12x

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    stable_23Jun2022_update1

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 4.45E+05 | 6.85E+05 | 1.32E+06 | 2.50E+06 | 4.28E+06
    LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 2x | 3x | 7x | 13x
    LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.08E+05 | 2.43E+05 | 4.87E+05 | 9.74E+05 | 1.93E+06
    LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 2x | 5x | 9x | 19x
    LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.77E+07 | 5.23E+07 | 1.03E+08 | 2.02E+08 | 3.51E+08
    LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 2x | 4x | 7x | 13x

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    feature/gauge-action-quda_16a2d47119

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    MILC | Total Time (Sec) | Apex Medium | no | 71,595 | 6,005 | 3,094 | 1,762 | 1,074
    MILC | NRF | Apex Medium | yes | 1x | 13x | 25x | 45x | 73x

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    GPU, AMD CPU V 3.0a13 ; Intel CPU V 2.15a AVX512

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    NAMD [apoa1_npt_cuda] | Ave ns/day | apoa1_npt_cuda | yes | 19.15 | 103 | 208 | 416 | 835
    NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 5x | 11x | 22x | 44x
    NAMD [apoa1_nptsr_cuda] | Ave ns/day | apoa1_nptsr_cuda | yes | 19.59 | 109 | 220 | 440 | 882
    NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 6x | 11x | 22x | 45x
    NAMD [apoa1_nve_cuda] | Ave ns/day | apoa1_nve_cuda | yes | 20.75 | 144 | 292 | 585 | 1,172
    NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 7x | 14x | 28x | 56x
    NAMD [stmv_npt_cuda] | Ave ns/day | stmv_npt_cuda | yes | 1.87 | 8 | 15 | 30 | 61
    NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 4x | 8x | 16x | 32x
    NAMD [stmv_nptsr_cuda] | Ave ns/day | stmv_nptsr_cuda | yes | 1.81 | 8 | 16 | 32 | 64
    NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 4x | 9x | 18x | 35x
    NAMD [stmv_nve_cuda] | Ave ns/day | stmv_nve_cuda | yes | 1.94 | 10 | 20 | 39 | 79
    NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 5x | 10x | 20x | 41x

    RELION

    Microscopy

    Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

    VERSION

    3.1.3

    ACCELERATED FEATURES

    • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosome on Relion-3.0 | no | 12,742 | 3,207 | 1,716 | 1,344 | 1,323
    Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosome on Relion-3.0 | yes | 1x | 4x | 7x | 9x | 10x

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    devel_fef2ace9

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x A40 | 2x A40 | 4x A40 | 8x A40
    SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 1,268 | 203 | 103 | 53 | 29
    SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 6x | 14x | 27x | 50x

    Engineering

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Xeon Gold 6240@2.60GHz with 4x NVIDIA V100 SXM2 | FUN3D Benchmark: dpw_wbt0_crs-3.6Mn_5, CUDA Version: 11.8

    Geoscience

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Xeon Gold 6240@2.60GHz with 4x NVIDIA V100 SXM2 | ICON Benchmark: QUBICC 160 km resolution, CUDA Version: 11.8 | RTM Benchmark: Isotropic Radius 4, CUDA Version: 11.8 | SPECFEM3D Benchmark: four_material_simple_model, CUDA Version: 11.8

    Microscopy and Molecular Dynamics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Xeon Gold 6240@2.60GHz with 4x NVIDIA V100 SXM2 | AMBER Benchmark: DC-Cellulose_NVE, CUDA Version: 11.8 | GROMACS Benchmark: STMV, CUDA Version: 11.8 | LAMMPS Benchmark: SNAP, CUDA Version: 11.8 | NAMD Benchmark: apoa1_nve_cuda, CUDA Version: 11.8

    Physics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Xeon Gold 6240@2.60GHz with 4x NVIDIA V100 SXM2 | Chroma Benchmark: szscl21_24_128, CUDA Version: 11.3.1 | GTC Benchmark: moi#proc.in, CUDA Version: 11.8 | MILC Benchmark: Apex Medium, CUDA Version: 11.8

    Quantum Mechanics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Xeon Gold 6240@2.60GHz with 4x NVIDIA V100 SXM2 | Quantum Espresso Benchmark: AUSURF112-jR, CUDA Version: 11.8


    Detailed V100 application performance data is located below in alphabetical order.

    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    22.0-AT_22.3

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB
    AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 4.13 | 100 | 202 | 406 | 805 | 97 | 199 | 400 | 808
    AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 24x | 49x | 98x | 195x | 24x | 48x | 97x | 196x
    AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 4.12 | 101 | 205 | 412 | 818 | 99 | 202 | 406 | 815
    AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 25x | 50x | 100x | 198x | 24x | 49x | 98x | 198x
    AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 20.71 | 483 | 953 | 1,915 | 3,787 | 470 | 936 | 1,873 | 3,784
    AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 23x | 46x | 92x | 183x | 23x | 45x | 90x | 183x
    AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 20.95 | 496 | 978 | 1,964 | 3,892 | 475 | 959 | 1,926 | 3,869
    AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 24x | 47x | 94x | 186x | 23x | 46x | 92x | 185x
    AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 84.61 | 1,870 | 3,293 | 6,613 | 13,031 | 1,789 | 3,293 | 6,598 | 13,149
    AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 22x | 39x | 78x | 154x | 21x | 39x | 78x | 155x
    AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 85.16 | 1,907 | 3,389 | 6,795 | 13,371 | 1,822 | 3,387 | 6,779 | 13,533
    AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 22x | 40x | 80x | 157x | 21x | 40x | 80x | 159x
    AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 1.38 | 31 | 62 | 125 | 249 | 28 | 57 | 113 | 226
    AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 23x | 45x | 90x | 180x | 20x | 41x | 82x | 164x
    AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 9.89 | 120 | 240 | 480 | 960 | 122 | 245 | 489 | 979
    AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 12x | 24x | 49x | 97x | 12x | 25x | 49x | 99x

    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V 2021.08

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB
    Chroma | Total Time (Sec) | szscl21_24_128 | no | 1,115 | 165 | 31 | 17 | 10 | 142 | 28 | 15 | 13
    Chroma | NRF | szscl21_24_128 | yes | 1x | 7x | 37x | 68x | 111x | 8x | 41x | 77x | 85x

    FUN3D

    Engineering

    Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

    VERSION

    13.7 (update 1)

    ACCELERATED FEATURES

    • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://fun3d.larc.nasa.gov

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB
    FUN3D | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 495 | 99 | 50 | 26 | 15 | 88 | 45 | 23 | 14
    FUN3D | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 5x | 12x | 24x | 41x | 6x | 14x | 26x | 44x

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2022.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
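    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 1x RTX6000 | 2x RTX6000 | 4x RTX6000 | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB
    GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 67 | 266 | 311 | 472 | 251 | 296 | - | 270 | 288 | 330
    GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 5x | 6x | 9x | 5x | 6x | - | 5x | 6x | 7x
    GROMACS [Cellulose] | ns/day | Cellulose | yes | 19 | 71 | 103 | 156 | 60 | 83 | - | 73 | 98 | -
    GROMACS [Cellulose] | NRF | Cellulose | yes | 1x | 4x | 6x | 12x | 3x | 5x | - | 4x | 6x | -
    GROMACS [STMV] | ns/day | STMV | yes | 4 | 16 | 30 | 53 | 13 | 25 | 32 | 16 | 29 | 38
    GROMACS [STMV] | NRF | STMV | yes | 1x | 3x | 7x | 13x | 3x | 6x | 7x | 3x | 7x | 9x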

    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V 4.5 Updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB
    GTC | Mpush/Sec | moi#proc.in | yes | 35 | 271 | 510 | 1,011 | 1,796 | 298 | 552 | 1,081 | 1,945
    GTC | NRF | moi#proc.in | yes | 1x | 8x | 15x | 29x | 52x | 9x | 16x | 31x | 57x

    ICON

    Weather and Climate

    A global unified atmosphere model for numerical weather prediction and climate modeling research

    VERSION

    2.6.5_RC

    ACCELERATED FEATURES

    • Full model of dynamics and physics

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://code.mpimet.mpg.de/projects/iconpublic

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB
    ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 2,431 | 591 | 353 | 223 | 167 | 819 | 578 | 248
    ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 4x | 7x | 11x | 15x | 3x | 4x | 10x
    ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution with radiation | no | 2,213 | 514 | 304 | 192 | 143 | 697 | 438 | 215
    ICON [QUBICC 160 km resolution] | NRF | SLAM 191 levels 160 km resolution with radiation | yes | 1x | 4x | 7x | 12x | 16x | 3x | 5x | 10x

    LAMMPS

    Molecular Dynamics

    Classical molecular dynamics package

    VERSION

    stable_23Jun2022_update1

    ACCELERATED FEATURES

    • Lennard-Jones, Gay-Berne, Tersoff, many more potentials
    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB
    LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 1.11E+08 | 3.41E+08 | 6.34E+08 | 1.24E+09 | 2.24E+09 | 3.45E+08 | 6.23E+08 | 1.15E+09 | 1.87E+09
    LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 3x | 6x | 11x | 21x | 3x | 6x | 11x | 17x
    LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 5.33E+07 | 1.23E+08 | 2.67E+08 | 5.39E+08 | 9.74E+08 | 1.25E+08 | 2.66E+08 | 5.15E+08 | 8.23E+08
    LAMMPS [EAM] | NRF | EAM | yes | 1x | 2x | 5x | 11x | 19x | 2x | 5x | 10x | 16x
    LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 4.45E+05 | 3.23E+06 | 6.09E+06 | 1.14E+07 | 1.94E+07 | 3.44E+06 | 6.42E+06 | 1.19E+07 | 1.91E+07
    LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 10x | 19x | 35x | 60x | 11x | 20x | 37x | 59x
    LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.08E+05 | 1.42E+06 | 2.86E+06 | 5.69E+06 | 1.14E+07 | 1.40E+06 | 2.80E+06 | 5.58E+06 | 1.12E+07
    LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 14x | 28x | 55x | 111x | 14x | 27x | 54x | 108x
    LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.77E+07 | 2.71E+08 | 4.95E+08 | 9.62E+08 | 1.80E+09 | 2.81E+08 | 5.18E+08 | 9.83E+08 | 1.56E+09
    LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 10x | 18x | 35x | 66x | 10x | 19x | 36x | 57x

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elemental particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

    VERSION

    feature/gauge-action-quda_16a2d47119

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB
    MILC | Total Time (Sec) | Apex Medium | no | 71,595 | 4,737 | 2,347 | 1,229 | 689 | 3,864 | 2,020 | 1,103 | 1,068
    MILC | NRF | Apex Medium | yes | 1x | 17x | 34x | 64x | 114x | 20x | 39x | 71x | 74x

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    GPU, AMD CPU V 3.0a13 ; Intel CPU V 2.15a AVX512

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

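    Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x RTX6000 | 2x RTX6000 | 4x RTX6000 | 8x RTX6000 | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB
    NAMD [apoa1_npt_cuda] | Ave ns/day | apoa1_npt_cuda | yes | 19.15 | 111 | 223 | 449 | 890 | 66 | 133 | 266 | 532 | 114 | 227 | 455 | 905
    NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 6x | 12x | 23x | 46x | 3x | 7x | 14x | 28x | 6x | 12x | 24x | 47x
    NAMD [apoa1_nptsr_cuda] | Ave ns/day | apoa1_nptsr_cuda | yes | 19.59 | 116 | 235 | 470 | 935 | 70 | 141 | 282 | 562 | 119 | 236 | 473 | 943
    NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 6x | 12x | 24x | 48x | 4x | 7x | 14x | 29x | 6x | 12x | 24x | 48x
    NAMD [apoa1_nve_cuda] | Ave ns/day | apoa1_nve_cuda | yes | 20.75 | 142 | 285 | 571 | 1,148 | 89 | 179 | 358 | 717 | 144 | 286 | 573 | 1,145
    NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 7x | 14x | 28x | 55x | 4x | 9x | 17x | 35x | 7x | 14x | 28x | 55x
    NAMD [stmv_npt_cuda] | Ave ns/day | stmv_npt_cuda | yes | 1.87 | 8 | 17 | 34 | 68 | 5 | 10 | 21 | 41 | 9 | 18 | 35 | 70
    NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 5x | 9x | 18x | 36x | 3x | 6x | 11x | 22x | 5x | 9x | 19x | 38x
    NAMD [stmv_nptsr_cuda] | Ave ns/day | stmv_nptsr_cuda | yes | 1.81 | 9 | 18 | 36 | 71 | 5 | 11 | 22 | 44 | 9 | 18 | 36 | 72
    NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 5x | 10x | 20x | 39x | 3x | 6x | 12x | 24x | 5x | 10x | 20x | 40x
    NAMD [stmv_nve_cuda] | Ave ns/day | stmv_nve_cuda | yes | 1.94 | 10 | 20 | 40 | 79 | 6 | 13 | 26 | 51 | 10 | 20 | 40 | 80
    NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 5x | 10x | 20x | 41x | 3x | 7x | 13x | 26x | 5x | 10x | 21x | 41x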

    NV-WRFg

    Numerical Weather Prediction

    Numerical weather prediction system designed for both atmospheric research and operational forecasting applications

    VERSION

    3.8.1 NCAR (CPU) / 3.8.1 WRFg 10_28 (GPU)

    ACCELERATED FEATURES

    • Dynamics modules
    • Several Physics modules

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    https://wrfg.net/wrfg-description/

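    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 4x V100 SXM2 32GB | 4x V100S PCIe 32GB |
    | --- | --- | --- | --- | --- | --- | --- |
    | NV-WRFg | Seconds / Timestamps | Conus_2.5k_JA | no | 6 | 0.62 | 0.68 |
    | NV-WRFg | NRF | Conus_2.5k_JA | yes | 1x | 10x | 9x |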

    Quantum Espresso

    Material Science (Quantum Chemistry)

    An open-source suite of computer codes for electronic structure calculations and materials modeling at the nanoscale

    VERSION

    V7.0 CPU; V7.1 GPU

    ACCELERATED FEATURES

    • linear algebra (matrix multiply)
    • explicit computational kernels
    • 3D FFTs

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.quantum-espresso.org

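    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Quantum Espresso | Total CPU Time (Sec) | AUSURF112-jR | no | 718 | 270 | 133 | 82 | 58 | 260 | 130 | 88 | 69 |
    | Quantum Espresso | NRF | AUSURF112-jR | yes | 1x | 3x | 6x | 10x | 14x | 3x | 6x | 9x | 12x |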

    RELION

    Microscopy

    Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

    VERSION

    3.1.3

    ACCELERATED FEATURES

    • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation

    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosome on Relion-3.0 | no | 12,742 | 3,417 | 2,095 | 3,443 | 2,083 |
    | Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosome on Relion-3.0 | yes | 1x | 4x | 6x | 4x | 6x |

    RTM

    Geoscience

    Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

    VERSION

    nvidia_2021_05

    ACCELERATED FEATURES

    • Batch algorithm

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    http://www.tsunamidevelopment.com/assets/rtm.pdf

    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | RTM [Isotropic Radius 4] | Mcells/s | Isotropic Radius 4 | yes | 11,318 | 38,091 | 75,978 | 152,022 | 303,986 | 46,037 | 91,790 | 183,515 | 367,252 |
    | RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 3x | 7x | 13x | 27x | 4x | 8x | 16x | 32x |
    | RTM [TTI Radius 8 1-pass] | Mcells/s | TTI Radius 8 1-pass | yes | 3,773 | 8,538 | 16,885 | 33,070 | 65,732 | 9,276 | 18,304 | 36,393 | 72,591 |
    | RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 2x | 4x | 9x | 17x | 2x | 5x | 10x | 19x |
    | RTM [TTI RX 2Pass mgpu] | Mcells/s | TTI RX 2Pass mgpu | yes | 3,773 | 7,165 | 14,203 | 28,177 | 56,235 | 8,491 | 16,871 | 33,547 | 66,849 |
    | RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 2x | 4x | 7x | 15x | 2x | 4x | 9x | 18x |
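
    For a "bigger is better" metric such as Mcells/s, the comparison runs the other way around: the GPU value is divided by the CPU value, and dividing by the GPU count gives a per-GPU throughput. The following is a minimal sketch using the Isotropic Radius 4 figures above; the helper names are hypothetical and the result is a plain ratio, not the published NRF.

```python
# Throughput in Mcells/s for the RTM "Isotropic Radius 4" case, copied from the table above.
cpu_mcells = 11318.0
v100_sxm2 = {1: 38091.0, 2: 75978.0, 4: 152022.0, 8: 303986.0}
v100s_pcie = {1: 46037.0, 2: 91790.0, 4: 183515.0, 8: 367252.0}


def throughput_ratio(gpu_value, cpu_value):
    """Plain throughput ratio for a 'bigger is better' metric."""
    return gpu_value / cpu_value


def per_gpu_throughput(values_by_gpu_count):
    """Throughput divided by the number of GPUs that produced it."""
    return {n: v / n for n, v in values_by_gpu_count.items()}


for label, values in (("V100 SXM2", v100_sxm2), ("V100S PCIe", v100s_pcie)):
    per_gpu = per_gpu_throughput(values)
    for n in sorted(values):
        ratio = throughput_ratio(values[n], cpu_mcells)
        print(f"{n}x {label}: {ratio:.1f}x the CPU node, {per_gpu[n]:,.0f} Mcells/s per GPU")
```

    A per-GPU throughput that stays roughly constant as the GPU count grows indicates near-linear scaling for this benchmark.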

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    devel_fef2ace9

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

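    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 1x V100 SXM2 32GB | 2x V100 SXM2 32GB | 4x V100 SXM2 32GB | 8x V100 SXM2 32GB | 1x V100S PCIe 32GB | 2x V100S PCIe 32GB | 4x V100S PCIe 32GB | 8x V100S PCIe 32GB |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 1,268 | 159 | 82 | 44 | 25 | 131 | 68 | 37 | 23 |
    | SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 9x | 18x | 33x | 58x | 11x | 21x | 39x | 63x |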

    Geoscience

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Xeon Gold 6240@2.60GHz with 4x NVIDIA T4 PCIe | SPECFEM3D Benchmark: four_material_simple_model, CUDA Version: 11.8

    Microscopy and Molecular Dynamics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Xeon Gold 6240@2.60GHz with 4x NVIDIA T4 PCIe | AMBER Benchmark: DC-STMV_NPT, CUDA Version: 11.8 | GROMACS Benchmark: STMV, CUDA Version: 11.8 | NAMD Benchmark: apoa1_nve_cuda, CUDA Version: 11.8 | Relion Benchmark: Plasmodium Ribosome (2D), CUDA Version: 11.4.2

    Physics

    CPU Server: Dual Xeon Gold 6240@2.60GHz | GPU Server: Dual Xeon Gold 6240@2.60GHz with 4x NVIDIA T4 PCIe | Chroma Benchmark: szscl21_24_128, CUDA Version: 11.3.1 | GTC Benchmark: moi#proc.in, CUDA Version: 11.8 | MILC Benchmark: Apex Medium, CUDA Version: 11.8


    Detailed T4 application performance data is located below in alphabetical order.


    AMBER

    Molecular Dynamics

    Suite of programs to simulate molecular dynamics on biomolecules

    VERSION

    22.0-AT_22.3

    ACCELERATED FEATURES

    • PMEMD Explicit Solvent and GB Implicit Solvent

    SCALABILITY

    Multi-GPU and Single Node

    MORE INFORMATION

    http://ambermd.org/GPUSupport.php

    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x T4 PCIe | 4x T4 PCIe | 8x T4 PCIe |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 4.13 | 61 | 121 | 245 |
    | AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 15x | 29x | 59x |
    | AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 4.12 | 62 | 123 | 248 |
    | AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 15x | 30x | 60x |
    | AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 20.71 | 285 | 603 | 1,213 |
    | AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 14x | 29x | 59x |
    | AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 20.95 | 292 | 616 | 1,202 |
    | AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 14x | 29x | 57x |
    | AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 84.61 | 1,245 | 2,365 | 4,491 |
    | AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 15x | 28x | 53x |
    | AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 85.16 | 1,259 | 2,504 | 4,979 |
    | AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 15x | 29x | 58x |
    | AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 1.38 | 21 | 42 | 83 |
    | AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 15x | 30x | 60x |
    | AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 9.89 | 107 | 213 | 427 |
    | AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 11x | 22x | 43x |

    Chroma

    Physics

    Lattice Quantum Chromodynamics (LQCD)

    VERSION

    V 2021.08

    ACCELERATED FEATURES

    • Wilson-clover fermions, Krylov solvers, Domain-decomposition
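
    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x T4 PCIe | 4x T4 PCIe | 8x T4 PCIe |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | Chroma | Total Time (Sec) | szscl21_24_128 | no | 1,115 | 117 | 40 | 26 |
    | Chroma | NRF | szscl21_24_128 | yes | 1x | 10x | 28x | 44x |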

    GROMACS

    Molecular Dynamics

    Simulation of biochemical molecules with complicated bond interactions

    VERSION

    2022.3

    ACCELERATED FEATURES

    • Implicit (5x), Explicit (2x) Solvent
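
    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x T4 PCIe | 4x T4 PCIe |
    | --- | --- | --- | --- | --- | --- | --- |
    | GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 67 | 163 | 238 |
    | GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 3x | 5x |
    | GROMACS [STMV] | ns/day | STMV | yes | 4 | - | 20 |
    | GROMACS [STMV] | NRF | STMV | yes | 1x | - | 5x |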

    GTC

    Physics

    GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

    VERSION

    V 4.5 Updated

    ACCELERATED FEATURES

    • Push, shift, and collision

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

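    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x T4 PCIe | 4x T4 PCIe | 8x T4 PCIe |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | GTC | Mpush/Sec | moi#proc.in | yes | 35 | 236 | 466 | 893 |
    | GTC | NRF | moi#proc.in | yes | 1x | 7x | 14x | 26x |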

    MILC

    Physics

    Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons.

    VERSION

    feature/gauge-action-quda_16a2d47119

    ACCELERATED FEATURES

    • Staggered fermions, Krylov solvers, Gauge-link fattening

    SCALABILITY

    Multi-GPU and Multi-Node

    MORE INFORMATION

    https://ngc.nvidia.com/catalog/containers/hpc:milc

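    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x T4 PCIe | 4x T4 PCIe | 8x T4 PCIe |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | MILC | Total Time (Sec) | Apex Medium | no | 71,595 | 7,563 | 3,898 | 2,135 |
    | MILC | NRF | Apex Medium | yes | 1x | 10x | 20x | 37x |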

    NAMD

    Molecular Dynamics

    Designed for high-performance simulation of large molecular systems

    VERSION

    GPU, AMD CPU V 3.0a13 ; Intel CPU V 2.15a AVX512

    ACCELERATED FEATURES

    • Full electrostatics with PME and most simulation features

    SCALABILITY

    Up to 100M atom capable, multi-GPU, single node

    MORE INFORMATION

    http://www.ks.uiuc.edu/Research/namd/

    https://ngc.nvidia.com/catalog/containers/hpc:namd

    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x T4 PCIe | 4x T4 PCIe | 8x T4 PCIe |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | NAMD [apoa1_npt_cuda] | Ave ns/day | apoa1_npt_cuda | yes | 19.15 | 57 | 113 | 229 |
    | NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 3x | 6x | 12x |
    | NAMD [apoa1_nptsr_cuda] | Ave ns/day | apoa1_nptsr_cuda | yes | 19.59 | 59 | 117 | 239 |
    | NAMD [apoa1_nptsr_cuda] | NRF | apoa1_nptsr_cuda | yes | 1x | 3x | 6x | 12x |
    | NAMD [apoa1_nve_cuda] | Ave ns/day | apoa1_nve_cuda | yes | 20.75 | 75 | 149 | 303 |
    | NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 7x | 15x |
    | NAMD [stmv_npt_cuda] | Ave ns/day | stmv_npt_cuda | yes | 1.87 | - | 9 | 17 |
    | NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | - | 5x | 9x |
    | NAMD [stmv_nptsr_cuda] | Ave ns/day | stmv_nptsr_cuda | yes | 1.81 | 5 | 9 | 17 |
    | NAMD [stmv_nptsr_cuda] | NRF | stmv_nptsr_cuda | yes | 1x | 3x | 5x | 10x |
    | NAMD [stmv_nve_cuda] | Ave ns/day | stmv_nve_cuda | yes | 1.94 | - | 10 | 20 |
    | NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | - | 5x | 10x |

    RELION

    Microscopy

    Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

    VERSION

    3.1.3

    ACCELERATED FEATURES

    • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation

    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x T4 PCIe | 4x T4 PCIe |
    | --- | --- | --- | --- | --- | --- | --- |
    | Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosome on Relion-3.0 | no | 12,742 | 3,586 | 2,549 |
    | Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosome on Relion-3.0 | yes | 1x | 4x | 5x |

    SPECFEM3D

    Geoscience

    Simulates seismic wave propagation

    VERSION

    devel_fef2ace9

    ACCELERATED FEATURES

    • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

    SCALABILITY

    Multi-GPU and Single-Node

    MORE INFORMATION

    https://geodynamics.org/cig/software/specfem3d/

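    | Application | Metric | Test Modules | Bigger is better | Dual Cascade Lake 6240 (CPU-Only) | 2x T4 PCIe | 4x T4 PCIe | 8x T4 PCIe |
    | --- | --- | --- | --- | --- | --- | --- | --- |
    | SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 1,268 | 239 | 122 | 64 |
    | SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 5x | 12x | 23x |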