Chenhan Yu

Chenhan Yu is an engineering manager at NVIDIA, working on inference and deployment system software optimization for generative AI and autonomous driving. He received his Ph.D. in computer science from the University of Texas at Austin.

Posts by Chenhan Yu

Generative AI

Post-Training Quantization of LLMs with NVIDIA NeMo and NVIDIA TensorRT Model Optimizer

As large language models (LLMs) are becoming even bigger, it is increasingly important to provide easy-to-use and efficient deployment paths because the cost of...

10 min read