Zijie Yan

Zijie Yan is a senior DevTech engineer at NVIDIA, where he joined the DevTech team in 2021. He specializes in improving the efficiency and scalability of large language model (LLM) training systems. Zijie currently drives the engineering effort for Mixture of Experts (MoE) support in Megatron-Core, working closely with the team on the development and performance optimization of the MoE training system. Before joining NVIDIA, he researched communication optimization for distributed deep learning during his master's studies at Sun Yat-sen University.

Posts by Zijie Yan

Conversational AI

Train Generative AI Models More Efficiently with New NVIDIA Megatron-Core Functionalities

First introduced in 2019, NVIDIA Megatron-LM sparked a wave of innovation in the AI community, enabling researchers and developers to use the underpinnings of...