• <xmp id="om0om">

    Real-Time GPU-Accelerated Gaussian Splatting with NVIDIA DesignWorks Sample vk_gaussian_splatting

    Gaussian splatting is a novel approach to rendering complex 3D scenes by representing them as a collection of anisotropic Gaussians in 3D space. This technique enables real-time rendering of photorealistic scenes learned from small sets of images, making it ideal for applications in gaming, virtual reality, and real-time professional visualization.

    vk_gaussian_splatting is a new Vulkan-based sample that demonstrates real-time Gaussian splatting, a cutting-edge volume rendering technique that enables highly efficient representations of radiance fields. It is the latest addition to the NVIDIA DesignWorks Samples.

    The NVIDIA DevTech team envisions this sample as a testbed for exploring and comparing approaches to real-time visualization of 3D Gaussian splatting (3DGS). By evaluating different techniques and optimizations, the team aims to provide insight into the performance, quality, and implementation trade-offs involved when using the Vulkan API.

    The initial implementation is based on rasterization and demonstrates two approaches to rendering splats: one using mesh shaders and the other using vertex shaders.

    A diagram comparing Synchronous GPU sorting and Asynchronous CPU sorting in Gaussian Splatting Rasterization. The left side shows the GPU timeline for synchronous sorting, where 'Dist & Cull' and 'Radix Sort' steps are performed before 'Mesh' and 'Fragment' processing for each frame. The right side illustrates asynchronous CPU sorting, where a separate sorting thread computes 'Dist & Sort' without culling, swaps indices, and then copies them to VRAM before the GPU processes 'Mesh' and 'Fragment' stages.
    Figure 1. Comparison of sorting methods, illustrated for the mesh shader pipeline
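
In the asynchronous path of Figure 1, the sorting thread produces an index buffer that the render thread later swaps in and copies to VRAM. The following is a minimal sketch of such a handoff, assuming exactly one sorting thread and one render thread. The class `SortedIndexExchange` and its methods are hypothetical names chosen for illustration and are not taken from the sample.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper, not taken from the sample: passes sorted splat indices
// from one sorting thread to one render thread without a lock.
class SortedIndexExchange {
public:
  // Sorting thread: store a finished result. Returns false when the previous
  // result has not been picked up yet, so the caller can retry later.
  bool publish(const std::vector<uint32_t>& sorted) {
    if (ready_.load(std::memory_order_acquire)) return false;
    back_ = sorted;
    ready_.store(true, std::memory_order_release);
    return true;
  }

  // Render thread, once per frame: swap in the newest result if there is one.
  // A true return value means front() changed and should be copied to VRAM.
  bool acquireLatest() {
    if (!ready_.load(std::memory_order_acquire)) return false;
    front_.swap(back_);
    ready_.store(false, std::memory_order_release);
    return true;
  }

  // Indices the render thread currently draws with.
  const std::vector<uint32_t>& front() const { return front_; }

private:
  std::vector<uint32_t> front_;     // touched only by the render thread
  std::vector<uint32_t> back_;      // written by the sorter while ready_ is false
  std::atomic<bool> ready_{false};  // true when back_ holds an unconsumed result
};
```

With this kind of handoff, the render thread never waits on the sort. It keeps drawing with the previous ordering until a newer one is available, which means the ordering on screen can lag the camera by a frame or more.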

    Because Gaussian splats require back-to-front sorting for correct alpha compositing, two alternative sorting methods are provided: 

    • A GPU-based Radix Sort implemented in a compute pipeline
    • A CPU-based asynchronous sorting strategy that uses the multithreaded sort function from the C++ STL
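
As a rough illustration of the CPU path, the sketch below computes a view-space depth for each splat center and sorts the splat indices from farthest to nearest. The `Vec3` type and the function names are assumptions made for this example, not the sample's own code. The plain `std::sort` call is used so the snippet builds without extra dependencies. A multithreaded variant could pass `std::execution::par` from `<execution>` as the first argument, which on some toolchains requires linking an additional threading library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

struct Vec3 { float x, y, z; };

// Distance of a point from the camera, measured along the view direction.
// camForward is assumed to be normalized.
inline float viewDepth(const Vec3& p, const Vec3& camPos, const Vec3& camForward) {
  return (p.x - camPos.x) * camForward.x +
         (p.y - camPos.y) * camForward.y +
         (p.z - camPos.z) * camForward.z;
}

// Returns splat indices ordered from farthest to nearest.
std::vector<uint32_t> sortBackToFront(const std::vector<Vec3>& centers,
                                      const Vec3& camPos, const Vec3& camForward) {
  std::vector<float> depth(centers.size());
  for (std::size_t i = 0; i < centers.size(); ++i)
    depth[i] = viewDepth(centers[i], camPos, camForward);

  std::vector<uint32_t> order(centers.size());
  std::iota(order.begin(), order.end(), 0u);

  // Larger depth first, so nearer splats are blended over farther ones.
  std::sort(order.begin(), order.end(),
            [&depth](uint32_t a, uint32_t b) { return depth[a] > depth[b]; });
  return order;
}
```

Sorting indices instead of the splat data itself keeps the per-frame work small: only one 32-bit index per splat has to be rewritten and uploaded, while positions, covariances, and colors stay where they are.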
    A screenshot of the vk_gaussian_splatting sample application displaying a rendered 3D Gaussian splatting model of a bicycle near a park bench, with trees and a path in the background. The user interface includes various settings and statistics panels. On the right, options for data storage, rendering, and sorting methods are visible, with settings for V-Sync, frustum culling, splat scale, and mesh shaders. At the bottom, memory statistics and a profiler panel show GPU and CPU usage, including frame time, sorting, and rendering performance. The application runs at 510 FPS with a 1.961 ms frame time.
    Figure 2. The vk_gaussian_splatting user interface provides several profiling feedback elements, such as memory usage in both RAM and VRAM, along with performance timers that measure the different stages of the pipeline

    The sample allows you to explore and experiment with multiple aspects of this rendering technique, including:

    • Several visualization modes to inspect the different aspects of Gaussian splats (spherical harmonics, splats, point density, and more)
    • A complete benchmarking system that enables real-time profiling
    • Detailed RAM and VRAM consumption figures, to help understand the data streamed for rendering
    • GPU timings for each stage of the techniques investigated, to reveal the workload and potential bottlenecks
    • Graphical reports generated from all of these numbers
    A bar chart titled 'Pipeline Performance Comparison - SH storage formats in float 32, float 16 and uint 8' showing performance benchmarks from a Vulkan Gaussian Splatting sample. The chart compares processing times in microseconds across different test scenes, with stacked bars representing three pipeline stages: GPU Distribution (black), GPU Sort (dark green), and Rendering (light green). Various scenes are listed along the x-axis including bicycle, bonnet, counter, dining room, flowers, garden, kitchen, playroom, room, stump, train, treehill, and truck - each with their splat count and format details in parentheses. Most scenes show total processing times between 500-1500 microseconds, with rendering typically being the most time-consuming stage. The garden scene shows the highest total processing time at nearly 3000 microseconds. The chart demonstrates that smaller spherical harmonics (SH) storage formats (uint8 vs float16 vs float32) consistently result in faster rendering performance across all test scenes.
    Figure 3. Example of a report comparing rendering performance across different data storage formats for a complete dataset
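
To give a feel for why the storage format matters, the sketch below estimates the spherical harmonics (SH) footprint per splat and shows one possible 8-bit quantization. It assumes three color channels and a linear mapping of values clamped to [-1, 1]. The function names and the encoding are illustrative assumptions and may differ from what the sample actually does.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Number of SH basis functions per color channel for a given degree: (degree + 1)^2.
constexpr std::size_t shCoefficientCount(std::size_t degree) {
  return (degree + 1) * (degree + 1);
}

// Bytes of SH data per splat, assuming three color channels.
constexpr std::size_t shBytesPerSplat(std::size_t degree, std::size_t bytesPerValue) {
  return shCoefficientCount(degree) * 3 * bytesPerValue;
}

// Assumed encoding: clamp to [-1, 1], then map linearly to 0..255.
inline std::uint8_t quantizeSh(float v) {
  const float clamped = std::min(1.0f, std::max(-1.0f, v));
  return static_cast<std::uint8_t>(std::lround((clamped * 0.5f + 0.5f) * 255.0f));
}

// Inverse of quantizeSh, up to rounding error.
inline float dequantizeSh(std::uint8_t q) {
  return (static_cast<float>(q) / 255.0f) * 2.0f - 1.0f;
}
```

Under these assumptions, degree-3 SH takes 192 bytes per splat in float 32, 96 bytes in float 16, and 48 bytes in uint 8. For in-range values, the 8-bit round trip introduces an error of at most 1/255.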

    This sample provides a starting point for developers looking to experiment with Gaussian splatting rendering techniques and Vulkan-based optimizations.

    To start exploring real-time rendering of neural radiance fields, check out the nvpro-samples/vk_gaussian_splatting GitHub repo. 
