Build Custom Reasoning Models with Advanced, Open Post-Training Datasets – NVIDIA Technical Blog

Build Custom Reasoning Models with Advanced, Open Post-Training Datasets – NVIDIA Technical Blog News and tutorials for developers, data scientists, and IT admins 2025-07-03T22:20:47Z http://www.open-lab.net/blog/feed/ Vinh Nguyen <![CDATA[Build Custom Reasoning Models with Advanced, Open Post-Training Datasets]]> http://www.open-lab.net/blog/?p=98680 2025-05-29T19:05:03Z 2025-05-14T16:33:26Z

Synthetic data has become a standard part of large language model (LLM) post-training procedures. Using a large number of synthetically generated examples from...]]>

Synthetic data has become a standard part of large language model (LLM) post-training procedures. Using a large number of synthetically generated examples from... How the Llama-Nemotron 30M Post Training Dataset was created

How the Llama-Nemotron 30M Post Training Dataset was created

Synthetic data has become a standard part of large language model (LLM) post-training procedures. Using a large number of synthetically generated examples from either a single or cohort of open-source, commercially permissible LLMs, a base LLM is finetuned either with supervised finetuning or RLHF to gain instruction-following and reasoning skills. This process can be seen as a knowledge��

]]> 0 ��˳��97caoporen��