---
license: apache-2.0
datasets:
- FreedomIntelligence/medical-o1-reasoning-SFT
language:
- zh
base_model:
- deepseek-ai/DeepSeek-R1-Distill-Qwen-14B
---

# DeepSeek-R1-Distill-Qwen-14B LoRA Adapter

## 📌 Model Overview

This LoRA adapter is fine-tuned from **DeepSeek-R1-Distill-Qwen-14B**, with a focus on improving question answering and reasoning in the medical domain.

- 🔹 **Base model**: [deepseek-ai/DeepSeek-R1-Distill-Qwen-14B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B)
- 🔹 **Fine-tuning method**: LoRA (optimized with [Unsloth](https://github.com/unslothai/unsloth))
- 🔹 **Intended use**: medical question answering, medical knowledge augmentation

---

## 📂 Usage

### 🔄 Loading the LoRA Adapter

To use this adapter, first load the original DeepSeek-R1-Distill-Qwen-14B model, then apply the LoRA weights on top of it:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B"
lora_model = "your-huggingface-username/DeepSeek-R1-Distill-Qwen-14B-lora-med"

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype="auto",
    device_map="auto",
)
model = PeftModel.from_pretrained(model, lora_model)
```

### 🚀 Inference Example

```python
# "What are the main indications for aspirin?"
input_text = "请问阿司匹林的主要适应症是什么?"

# Place inputs on the same device the model was dispatched to.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# max_new_tokens bounds only the generated continuation,
# unlike max_length, which also counts the prompt tokens.
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

---

## 🏗️ Training Details

- **Training environment**: RTX 4090, CUDA 12.6, WSL Ubuntu
- **Frameworks**: `transformers` + `peft` + `unsloth`
- **Hyperparameters**:
  - LoRA rank: 16
  - Alpha: 32
  - Dropout: 0.05
  - Max sequence length: 4096

---

## 📜 License

This LoRA adapter is built on **DeepSeek-R1-Distill-Qwen-14B**; please comply with its [official license](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-14B).

---

## 📞 Contact

If you have any questions or suggestions, leave a message in the discussion board or contact me directly!
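---

## 🔍 Appendix: What the LoRA Hyperparameters Mean

As a rough sketch of how the rank and alpha values above shape the adapter, LoRA replaces a full weight update with a low-rank one, `W_adapted = W + (alpha / r) * B @ A`, where only `A` and `B` are trained. The snippet below illustrates this with NumPy; the matrix shapes are illustrative only and do not come from the actual 14B model.

```python
import numpy as np

# Illustrative shapes only; real attention layers in a 14B model are far larger.
d_out, d_in = 64, 64
r, alpha = 16, 32                        # rank and alpha from the table above

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))   # frozen base weight (not trained)
A = rng.standard_normal((r, d_in))       # trainable down-projection
B = np.zeros((d_out, r))                 # trainable up-projection, zero-initialized

# Effective weight seen at inference time.
scaling = alpha / r                      # 32 / 16 = 2.0
W_adapted = W + scaling * (B @ A)

# Because B starts at zero, the adapter initially leaves the model unchanged.
assert np.allclose(W_adapted, W)

# After training, the update delta = scaling * B @ A has rank at most r,
# so the adapter stores ~r * (d_in + d_out) numbers instead of d_in * d_out.
B = rng.standard_normal((d_out, r))
delta = scaling * (B @ A)
assert np.linalg.matrix_rank(delta) <= r
```

This is why the adapter repository is tiny compared to the base checkpoint: only the low-rank `A`/`B` pairs are stored, and `PeftModel.from_pretrained` adds them back onto the frozen base weights at load time.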