Robin Kobus

Robin Kobus is a senior AI developer technology engineer at NVIDIA. His work focuses on optimizing large language model inference in TensorRT-LLM. Robin studied math and computer science at the Johannes Gutenberg University in Mainz, Germany. In his PhD thesis, he investigated the acceleration of bioinformatics algorithms on multi-GPU systems.

Posts by Robin Kobus

Generative AI

NVIDIA TensorRT-LLM Now Supports Recurrent Drafting for Optimizing LLM Inference

Recurrent drafting (referred to as ReDrafter) is a novel speculative decoding technique developed and open-sourced by Apple for large language model (LLM)...