    Roofline Performance Model for HPC and Deep-Learning Applications

    Charlene Yang, NERSC, Lawrence Berkeley National Laboratory | Samuel Williams, CRD, Lawrence Berkeley National Laboratory | Yunsong Wang, NERSC, LBNL

    GTC 2020

    Learn how to use the Roofline model to analyze the performance of GPU-accelerated applications. We'll cover the basics of the model, explain how to use tools such as nvprof and Nsight Systems/Compute to automate data collection, and demonstrate how to track optimization progress with Roofline for both HPC and deep-learning applications. We'll use examples such as GPP from materials science, high-performance geometric multigrid from adaptive mesh refinement, and two kernels from TensorFlow to show how characteristics such as arithmetic intensity, memory access pattern, and thread divergence/predication can all be captured by Roofline, offering useful insights for performance optimization.
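To make the terms in the abstract concrete, here is a minimal sketch of the basic Roofline bound: attainable performance is the smaller of the compute peak and arithmetic intensity times memory bandwidth. The function names and the machine numbers are illustrative assumptions and are not taken from the talk; real ceilings would come from measurements or vendor specifications.

```python
def arithmetic_intensity(flops, bytes_moved):
    """FLOPs performed per byte of memory traffic."""
    return flops / bytes_moved


def attainable_gflops(ai, peak_gflops, peak_bandwidth_gbs):
    """Basic Roofline bound: min(compute peak, AI * memory bandwidth)."""
    return min(peak_gflops, ai * peak_bandwidth_gbs)


# Hypothetical machine ceilings, chosen only for illustration.
PEAK_GFLOPS = 7000.0   # assumed compute peak, GFLOP/s
PEAK_BW_GBS = 900.0    # assumed memory bandwidth, GB/s

# Hypothetical kernel: 2e9 FLOPs over 4e9 bytes of memory traffic.
ai = arithmetic_intensity(2e9, 4e9)                     # 0.5 FLOP/byte
bound = attainable_gflops(ai, PEAK_GFLOPS, PEAK_BW_GBS)

# Below the ridge point a kernel is bandwidth-bound; above it, compute-bound.
ridge_point = PEAK_GFLOPS / PEAK_BW_GBS
regime = "bandwidth-bound" if ai < ridge_point else "compute-bound"
print(f"AI = {ai} FLOP/byte, bound = {bound} GFLOP/s, {regime}")
```

In practice the FLOP and byte counts would be collected with a profiler such as nvprof or Nsight Compute rather than entered by hand, and each optimization step can be plotted against the same ceilings to track progress.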
