Qwen 2.5 72B CPT + SFT
Model type: Causal Language Model
Base model: ubitech-edg/qwen2.5-72b-cpt (which builds on Qwen/Qwen2.5-72B)
License: Apache 2.0
Framework: Axolotl + DeepSpeed ZeRO-1
Overview
qwen2.5-72b-cpt-sft is a two-stage trained version of Qwen 2.5-72B that combines continual pretraining (CPT) and supervised fine-tuning (SFT), using LoRA adapters over a 4-bit NF4-quantized base for efficient adaptation. This release contains only the SFT-stage LoRA adapters and the training configuration; users load them on top of the CPT adapters, which in turn load on the official Qwen 2.5-72B base model. The CPT stage enhances domain knowledge, while the SFT stage refines question-answering and conversational skills using synthetic QA data.
Training was performed on the Leonardo EuroHPC supercomputer with Axolotl 0.6 and DeepSpeed ZeRO-1, using bfloat16 computation.
Training Setup
Stage 1 (CPT): Domain-adaptive continual pretraining
Stage 2 (SFT): Instruction fine-tuning
Adapter type: LoRA
Quantization: 4-bit NF4 (bnb)
Precision: bfloat16
Hardware: 8 nodes × 2 × NVIDIA A100 64GB GPUs
Framework: DeepSpeed ZeRO-1, Axolotl, PyTorch 2.5.1+cu121
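For reference, the 4-bit NF4 quantization with bfloat16 compute listed above corresponds to a standard bitsandbytes setup. The snippet below is a minimal sketch of how the quantized base could be prepared for QLoRA-style adaptation, not the exact training script used for this release.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Sketch: 4-bit NF4 quantization, bfloat16 compute, double quantization,
# matching the settings reported in the hyperparameter table below.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-72B",
    quantization_config=bnb_config,
    device_map="auto",
)
```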
Datasets
CPT Stage:
- arxiv.jsonl
- gov.jsonl
- news.jsonl
- wiki.jsonl

SFT Stage:
- axolotl_deduplicated_synthetic_qa.jsonl
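The files above are plain JSONL. As a hedged illustration only (the field names are an assumption, e.g. a `text` field for the CPT corpora; the actual schema may differ), they can be inspected with the `datasets` library:

```python
from datasets import load_dataset

# Assumption: each CPT file is newline-delimited JSON with one record per line,
# e.g. {"text": "..."}.
cpt_data = load_dataset(
    "json",
    data_files={"train": ["arxiv.jsonl", "gov.jsonl", "news.jsonl", "wiki.jsonl"]},
)
print(cpt_data["train"][0])
```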
Hyperparameters
| Parameter | Value |
|---|---|
| Sequence length | 2048 |
| Micro batch size | 1 |
| Gradient accumulation | 4 |
| Epochs | 1 |
| Learning rate | 0.0001 |
| LR scheduler | cosine |
| Optimizer | AdamW (8-bit) |
| Warmup steps | 20 |
| Weight decay | 0.0 |
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Gradient checkpointing | true |
| Flash attention | true |
| Auto resume | true |
| bnb 4-bit compute dtype | bfloat16 |
| bnb 4-bit quant type | nf4 |
| bnb double quant | true |
| Validation set size | 0.3 |
| Evals per epoch | 10 |
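The LoRA rows in the table above map directly onto a `peft` `LoraConfig`. The sketch below mirrors those values; the `task_type` is an assumption based on the causal-LM setup, not taken from the training config itself.

```python
from peft import LoraConfig

# LoRA settings as listed in the hyperparameter table
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",  # assumption: causal language modeling
)
```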
Tokenizer
Tokenizer type: AutoTokenizer
Special token: <|end_of_text|> as pad_token
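Since the repository ships the tokenizer files, the configured pad token can be checked directly; a small sketch:

```python
from transformers import AutoTokenizer

# The adapter repo includes tokenizer_config.json / special_tokens_map.json
tokenizer = AutoTokenizer.from_pretrained("ubitech-edg/qwen2.5-72b-cpt-sft")
print(tokenizer.pad_token)  # expected per this card: <|end_of_text|>
```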
Files Included
This repository hosts LoRA adapters and Axolotl metadata only.
Contents:
- adapter_config.json
- adapter_model.safetensors
- config.json
- special_tokens_map.json
- tokenizer_config.json
- tokenizer.json
- README.md
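To fetch these files without instantiating the model, a standard `huggingface_hub` download works; the local directory name below is arbitrary.

```python
from huggingface_hub import snapshot_download

# Download the SFT-stage adapter files and tokenizer metadata locally
local_dir = snapshot_download(
    repo_id="ubitech-edg/qwen2.5-72b-cpt-sft",
    local_dir="qwen2.5-72b-cpt-sft",  # arbitrary local path
)
print(local_dir)
```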
Usage: Load and Apply the Adapters
To use this CPT + SFT variant in Python (chain CPT then SFT adapters):
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "Qwen/Qwen2.5-72B"
cpt_adapter = "ubitech-edg/qwen2.5-72b-cpt"
sft_adapter = "ubitech-edg/qwen2.5-72b-cpt-sft"

# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, device_map="auto", torch_dtype="bfloat16"
)

# Load CPT LoRA adapters, then stack the SFT LoRA adapters on top
model = PeftModel.from_pretrained(model, cpt_adapter)
model = PeftModel.from_pretrained(model, sft_adapter)
model.eval()

prompt = "What is the role of AI in renewable energy optimization?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
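As an alternative to stacking two PeftModel wrappers, the CPT adapter can be merged into the base weights before applying the SFT adapter. This is a hedged sketch rather than the card's canonical recipe; adapter-stacking behaviour can vary across peft versions.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-72B", device_map="auto", torch_dtype=torch.bfloat16
)

# Fold the CPT LoRA deltas into the base weights, then apply the SFT adapter
model = PeftModel.from_pretrained(model, "ubitech-edg/qwen2.5-72b-cpt")
model = model.merge_and_unload()
model = PeftModel.from_pretrained(model, "ubitech-edg/qwen2.5-72b-cpt-sft")
model.eval()
```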