A typical recipe for improving LLMs involves multiple stages: synthetic data generation (SDG), model training through supervised fine-tuning (SFT) or reinforcement learning (RL), and model evaluation. Each stage requires using different libraries, which are often challenging to set up and difficult to use together. For example, you might use NVIDIA TensorRT-LLM or vLLM for SDG and NVIDIA…
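To make the SDG stage concrete, below is a minimal sketch using vLLM's offline inference API, one of the libraries named above. The teacher model name, seed prompts, and sampling settings are illustrative assumptions, not choices made in this post.

```python
# Minimal SDG sketch with vLLM's offline API. The model name, prompts, and
# sampling settings are placeholders for illustration only.
from vllm import LLM, SamplingParams

# Seed instructions to synthesize responses for (hypothetical examples).
seed_prompts = [
    "Explain supervised fine-tuning in two sentences.",
    "Write a Python function that reverses a linked list.",
]

sampling = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=512)
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")  # placeholder teacher model

# Each RequestOutput carries the original prompt and its generated completions.
outputs = llm.generate(seed_prompts, sampling)
dataset = [{"prompt": o.prompt, "response": o.outputs[0].text} for o in outputs]
```

The resulting (prompt, response) pairs can then be handed off to the training stage, which typically lives in an entirely different library, illustrating the integration burden described above.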
Synthetic data has become a standard part of large language model (LLM) post-training. Using a large number of synthetically generated examples from either a single open-source, commercially permissive LLM or a cohort of them, a base LLM is fine-tuned with supervised fine-tuning or RLHF to gain instruction-following and reasoning skills. This process can be seen as a knowledge…
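As a sketch of that fine-tuning step, the example below trains a base model on synthetic pairs with Hugging Face TRL's `SFTTrainer`. TRL is used here purely as a stand-in (the post has not named a training library at this point), and the model name and data formatting are assumptions.

```python
# Illustrative SFT-on-synthetic-data sketch using Hugging Face TRL
# (a stand-in library; the model name and prompt template are assumptions).
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

# Synthetic (prompt, response) pairs, e.g. produced by the SDG step above.
pairs = [("Explain SFT in two sentences.", "SFT trains a base model on ...")]
records = [
    {"text": f"### Instruction:\n{p}\n\n### Response:\n{r}"} for p, r in pairs
]

trainer = SFTTrainer(
    model="meta-llama/Llama-3.2-1B",       # placeholder base (student) model
    train_dataset=Dataset.from_list(records),
    args=SFTConfig(output_dir="sft-synthetic"),
)
trainer.train()
```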