NVIDIA Data Loading Library
The NVIDIA Data Loading Library (DALI) is a portable, open source library for decoding and augmenting images, videos and speech to accelerate deep learning applications. DALI reduces latency and training time by overlapping training and pre-processing, which mitigates data pipeline bottlenecks. It provides a drop-in replacement for built-in data loaders and data iterators in popular deep learning frameworks, so pipelines are easy to integrate or to retarget to a different framework.
Training neural networks with images requires developers to first normalize those images. Moreover, images are often compressed to save storage. Developers have therefore built multi-stage data processing pipelines that include loading, decoding, cropping, resizing, and many other augmentation operators. These data processing pipelines, which are currently executed on the CPU, have become a bottleneck that limits overall throughput.
DALI is a high-performance alternative to built-in data loaders and data iterators. Developers can now run their data processing pipelines on the GPU, reducing the total time it takes to train a neural network. Data processing pipelines implemented using DALI are portable because they can easily be retargeted to TensorFlow, PyTorch and MXNet.
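The idea of overlapping pre-processing with training can be illustrated with a small, framework-free sketch. This is not DALI's implementation or API: it uses only the Python standard library, and the names `preprocess`, `prefetching_loader`, `train_step` and `run` are hypothetical stand-ins chosen for illustration. A background thread prepares the next samples while the main thread "trains" on the current one, which is the general prefetching pattern that hides data preparation time behind compute.

```python
import queue
import threading
import time


def preprocess(sample):
    """Hypothetical stand-in for decode/crop/resize/normalize: scales values to [0, 1]."""
    time.sleep(0.01)  # simulate the cost of pre-processing
    return [value / 255.0 for value in sample]


def prefetching_loader(samples, depth=2):
    """Yield pre-processed samples while a worker thread prepares the next ones."""
    buffer = queue.Queue(maxsize=depth)  # bounded, so the worker stays at most `depth` ahead
    done = object()  # sentinel marking the end of the data

    def worker():
        for sample in samples:
            buffer.put(preprocess(sample))
        buffer.put(done)

    threading.Thread(target=worker, daemon=True).start()
    while True:
        item = buffer.get()
        if item is done:
            return
        yield item


def train_step(batch):
    """Hypothetical stand-in for a training step: returns the batch mean as a fake loss."""
    time.sleep(0.01)  # simulate the cost of a training step
    return sum(batch) / len(batch)


def run(samples):
    """Consume the prefetching loader; pre-processing of later samples overlaps with training."""
    return [train_step(batch) for batch in prefetching_loader(samples)]
```

In this sketch the bounded queue depth plays the role of a prefetch buffer: a larger value lets the worker run further ahead at the cost of more memory. A real pipeline would additionally move the pre-processing work itself onto the GPU, which this CPU-only example does not attempt to show.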
ResNet50 training on ImageNet | NVIDIA DGX-2 | 20.01 NGC container | DALI version 0.18 | 16 × V100 GPUs | batch size: 256
ResNet50 training on ImageNet | NVIDIA DGX-1 | 20.01 NGC container | DALI version 0.18 | 8 × V100 GPUs | batch size: 256
ResNet50 training on ImageNet | AWS P3 | 20.01 NGC container | DALI version 0.18 | 8 × V100 GPUs | batch size: 192
ResNet50 training on ImageNet | Google Cloud Platform | 20.01 NGC container | DALI version 0.18 | 8 × V100 GPUs | batch size: 192
Key Features of DALI
- Easy-to-use Python API
- Transparently scales across multiple GPUs
- Accelerates image classification (ResNet-50) and object detection (SSD) workloads, as well as speech recognition models such as Jasper and RNN-T
- Flexible graphs let developers create custom pipelines
- Supports multiple data formats: LMDB, RecordIO, TFRecord, COCO, JPEG, WAV, FLAC, OGG, H.264 and HEVC
- Developers can add custom audio, image and video processing operators
Tutorials and Blogs
- How to set up a pipeline to load and decode audio data using DALI (Jupyter Notebook)
- How to set up an audio processing pipeline and calculate a spectrogram using DALI (Jupyter Notebook)
- Fast AI Data Preprocessing with NVIDIA DALI
- Optimizations in DALI to reduce training time for image classification and object detection models in MLPerf benchmarks
NVIDIA GTC sessions
- GPU Technology Conference 2018; Fast data pipeline for deep learning training; T. Gale, S. Layton and P. Trędak: slides, recording
- GPU Technology Conference 2019; Fast AI data pre-processing with DALI; Janusz Lisiecki, Michał Zientkiewicz: slides, recording
- GPU Technology Conference 2019; Integration of DALI with TensorRT on Xavier; Josh Park and Anurag Dixit: slides, recording