The SFT datasets for KORMo-10B were collected from diverse, publicly available sources.
Note: SFT datasets

English
- nvidia/Nemotron-Post-Training-Dataset-v1 (~2.8B, sampling)
- HuggingFaceTB/smoltalk2 (~259.5M, sampling)
- KORMo-Team/IF-bilingual-sft (~1.08B)

Korean
- KORMo-Team/NemoPost-ko-synth-sft (3.37B)
- KORMo-Team/IF-bilingual-sft (~0.45B)
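
To get a feel for the language balance of this mix, the sizes listed above can be summed per language. The following is only a rough sketch: the figures are copied from the list, the source does not state their unit (tokens are a plausible guess, but that is an assumption), and the names `SFT_MIX`, `language_totals`, and `language_shares` are hypothetical helpers introduced for illustration.

```python
# Approximate sizes in billions, copied from the dataset list above.
# The unit is not stated in the source; treating them as comparable
# quantities (e.g. tokens) is an assumption.
SFT_MIX = {
    "English": {
        "nvidia/Nemotron-Post-Training-Dataset-v1": 2.8,
        "HuggingFaceTB/smoltalk2": 0.2595,
        "KORMo-Team/IF-bilingual-sft": 1.08,
    },
    "Korean": {
        "KORMo-Team/NemoPost-ko-synth-sft": 3.37,
        "KORMo-Team/IF-bilingual-sft": 0.45,
    },
}


def language_totals(mix):
    """Sum the listed sizes for each language group."""
    return {lang: sum(sources.values()) for lang, sources in mix.items()}


def language_shares(mix):
    """Return each language's fraction of the combined listed size."""
    totals = language_totals(mix)
    grand_total = sum(totals.values())
    return {lang: total / grand_total for lang, total in totals.items()}


if __name__ == "__main__":
    totals = language_totals(SFT_MIX)
    shares = language_shares(SFT_MIX)
    for lang in SFT_MIX:
        print(f"{lang}: ~{totals[lang]:.2f}B ({shares[lang]:.1%})")
```

Under these assumptions the listed sizes come to roughly 4.14B for English and 3.82B for Korean, so the mix is close to balanced between the two languages.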