---
license: apache-2.0
datasets:
- nbeerbower/GreatFirewall-DPO
- nbeerbower/Schule-DPO
- nbeerbower/Purpura-DPO
- nbeerbower/Arkhaios-DPO
- jondurbin/truthy-dpo-v0.1
- antiven0m/physical-reasoning-dpo
- flammenai/Date-DPO-NoAsterisks
- flammenai/Prude-Phi3-DPO
- jondurbin/gutenberg-dpo-v0.1
- nbeerbower/gutenberg2-dpo
- nbeerbower/gutenberg-moderne-dpo
- sam-paech/gutenberg3-dpo-gemma3-12b
- nbeerbower/human-writing-dpo
- nbeerbower/synthetic-fiction-dpo
- Atsunori/HelpSteer2-DPO
- GeneralReasoning/GeneralThought-430K
base_model:
- lemon07r/Qwen3-R1-SLERP-Q3T-8B
---

# Wenyan-Qwen3-8B
An attempt to build a Xiaolong-like tune with more Gutenberg data on top of lemon07r/Qwen3-R1-SLERP-Q3T-8B.
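## Usage

A minimal loading sketch with transformers. The repo id below is an assumption (adjust it to wherever this model is actually published), and the generation settings are illustrative only.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the model is published under this repo id.
model_id = "nbeerbower/Wenyan-Qwen3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Write a short opening paragraph in a classic Gutenberg style."}
]

# Build inputs with the model's chat template and generate a response.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```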
## Results
I haven't done much testing, but the model will sometimes skip thinking. The second epoch may have overcooked it.
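A quick way to spot the skipped-thinking behavior is to check whether a response opens with a `<think>` block. This sketch reuses the model, tokenizer, and messages from the usage example above and assumes the upstream Qwen3 chat template (including its `enable_thinking` switch) was inherited unchanged.

```python
# Reuses model, tokenizer, and messages from the usage sketch above.
text = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=False,
    enable_thinking=True,  # upstream Qwen3 template option; assumed to still apply here
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])

# If the model skipped thinking, no <think> block appears in the response.
print("thinking block present:", "<think>" in response)
```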
## Data
The condensed and formatted data is available here.