internlm2.5_7b_distill_orpo

Architecture diagram

Base model

https://huggingface.co/Slipstream-Max/Emollm-InternLM2.5-7B-chat-GGUF-fp16

Dataset

Dataset composition

The training data is derived from PKU-SafeRLHF (https://huggingface.co/datasets/PKU-Alignment/PKU-SafeRLHF-single-dimension). After processing, the final dataset is published at https://huggingface.co/datasets/juneup/PKU-SafeRLHF-orpo.
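The model card does not describe the processing step. As a rough illustration only, the sketch below shows one way a pairwise preference record could be mapped to the prompt/chosen/rejected triple that ORPO training typically consumes. The field names (`prompt`, `response_0`, `response_1`, `better_response_id`) and the function name `to_orpo_example` are assumptions and may not match either dataset's actual schema or the author's script.

```python
def to_orpo_example(record: dict) -> dict:
    """Map one pairwise preference record to a prompt/chosen/rejected triple.

    Assumed (hypothetical) input fields: "prompt", "response_0",
    "response_1", and "better_response_id" (0 or 1).
    """
    better = record["better_response_id"]
    if better not in (0, 1):
        raise ValueError(f"unexpected better_response_id: {better!r}")
    return {
        "prompt": record["prompt"],
        # The preferred response becomes "chosen", the other "rejected".
        "chosen": record[f"response_{better}"],
        "rejected": record[f"response_{1 - better}"],
    }


# Toy record for illustration; not taken from the real dataset.
example = {
    "prompt": "How do I stay safe online?",
    "response_0": "Share your password with friends.",
    "response_1": "Use strong, unique passwords.",
    "better_response_id": 1,
}
triple = to_orpo_example(example)
```

With the toy record above, `triple["chosen"]` is the second response and `triple["rejected"]` is the first.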

Training method

ORPO with λ = 0.2 and learning rate lr = 5e-6.
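To make the role of λ concrete, here is a minimal sketch of the ORPO objective for a single preference pair, assuming the standard formulation (supervised loss on the chosen response plus λ times an odds-ratio penalty). The training code for this model is not part of the card, so the function names and the use of length-averaged log-probabilities as inputs are assumptions for illustration.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp), where avg_logp < 0."""
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(chosen_logp: float, rejected_logp: float, lam: float = 0.2) -> float:
    """ORPO loss for one pair, given length-averaged log-probabilities.

    loss = L_SFT + lam * L_OR
    L_SFT = -chosen_logp
    L_OR  = -log(sigmoid(log_odds(chosen) - log_odds(rejected)))
    """
    sft_term = -chosen_logp
    log_odds_ratio = log_odds(chosen_logp) - log_odds(rejected_logp)
    # -log(sigmoid(x)) equals log(1 + exp(-x)).
    or_term = math.log1p(math.exp(-log_odds_ratio))
    return sft_term + lam * or_term


# The loss is lower when the chosen response is the more likely one.
good = orpo_loss(math.log(0.8), math.log(0.2))
bad = orpo_loss(math.log(0.2), math.log(0.8))
```

A larger λ puts more weight on separating chosen from rejected responses relative to plain supervised fine-tuning; the default `lam=0.2` mirrors the value stated above.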

Download the model

git lfs install
git clone https://huggingface.co/juneup/internlm2.5_7b_distill_orpo

To clone without downloading the large files:

GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/juneup/internlm2.5_7b_distill_orpo

Download with Ollama

ollama run Juneup/internlm2.5_7b_distill:orpo_q4_k_m