Shizhe Diao

Shizhe Diao is a research scientist at NVIDIA Research, passionate about research on efficient training and alignment of foundation models. Shizhe completed a PhD at the Hong Kong University of Science and Technology, advised by Professor Tong Zhang.

Posts by Shizhe Diao

Generative AI

Hymba Hybrid-Head Architecture Boosts Small Language Model Performance

Transformers, with their attention-based architecture, have become the dominant choice for language models (LMs) due to their strong performance,...

12 min read