Utkarsh Uppal

Utkarsh Uppal is a senior applied deep learning solutions architect at NVIDIA, where he specializes in building high-performance deep learning pipelines across domains such as language and speech. His primary focus is developing end-to-end conversational AI systems, including training LLMs from scratch, particularly for Indic languages, and building domain-specific models with enterprises. He also has deep expertise in designing and optimizing inference architectures for production, especially low-precision formats (FP4, FP8), decoding strategies, and KV-cache optimizations.

Posts by Utkarsh Uppal

Generative AI

Per-Tensor and Per-Block Scaling Strategies for Effective FP8 Training

In this blog post, we’ll break down the main FP8 scaling strategies—per-tensor scaling, delayed and current scaling, and per-block scaling (including the... 10 MIN READ
Generative AI

Floating-Point 8: An Introduction to Efficient, Lower-Precision AI Training

With the growth of large language models (LLMs), deep learning is advancing both model architecture design and computational efficiency. Mixed precision... 11 MIN READ