# Kumru-2B LoRA Adapter
This repository provides a LoRA adapter extracted from the VNGRS Kumru-2B model (vngrs-ai/Kumru-2B, the SFT/chat variant) to be applied on top of the base model vngrs-ai/Kumru-2B-Base. The goal is to transfer Kumru's chat/instruction behavior to Kumru-2B-Base deployments with a lightweight file footprint.
## Model Summary
- Base model: vngrs-ai/Kumru-2B-Base
- Source (target behavior) model: vngrs-ai/Kumru-2B (SFT/chat)
- Technique: Low-Rank Adaptation (LoRA)
- LoRA rank / alpha: 768 / 1024 (update these if you produce a different build)
- Layer coverage: all self-attention and MLP projections (see the config sketch after this list)
- Output artifacts: PEFT-compatible adapter_config.json + adapter_model.safetensors
- License: Apache 2.0 (aligned with VNGRS Kumru model licensing)
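For orientation, the shape of a matching PEFT configuration is sketched below. This is an illustration, not the shipped config: the target_modules names assume Llama-style projection layers and should be checked against the adapter_config.json in this repository.

```python
from peft import LoraConfig

# Sketch of a LoRA config matching the summary above. The target_modules
# names are assumed (Llama-style) and may differ from the shipped
# adapter_config.json.
config = LoraConfig(
    r=768,
    lora_alpha=1024,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # self-attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```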
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "vngrs-ai/Kumru-2B-Base"
adapter = "ceofast/kumru-2b-lora"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(model, adapter, device_map="auto")

messages = [
    # "Your name is Kumru; you are a helpful Turkish-speaking model."
    {"role": "system", "content": "Adın Kumru, Türkçe konuşan yardımcı bir modelsin."},
    # "Can you give me brief information about the conquest of Istanbul?"
    {"role": "user", "content": "İstanbul'un fethi hakkında kısa bilgi verir misin?"},
]

inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(
    inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
Note: this adapter must be applied on top of vngrs-ai/Kumru-2B-Base; it is not a standalone model.
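If you prefer a single standalone checkpoint for deployment, the adapter can be folded into the base weights with PEFT's merge_and_unload(). A minimal sketch follows; the output directory name is illustrative.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "vngrs-ai/Kumru-2B-Base"
adapter = "ceofast/kumru-2b-lora"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto")
model = PeftModel.from_pretrained(model, adapter)

# Fold the LoRA deltas into the base weights and drop the PEFT wrappers.
merged = model.merge_and_unload()

# "kumru-2b-merged" is an illustrative output path.
merged.save_pretrained("kumru-2b-merged")
tokenizer.save_pretrained("kumru-2b-merged")
```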
## Extraction Process
The adapter is obtained by computing the per-layer weight delta between the base and SFT checkpoints and factorizing each delta with truncated SVD into low-rank components (sketched below). In this release, the measured reconstruction error is approximately 0.409. To better preserve quality, you may increase rank/alpha and export a new version (e.g., rank 1024 / alpha 2048). A lower-error build will be added as soon as possible.
- Script: export_kumru.py
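As a rough illustration of the idea (not the actual export_kumru.py), the core factorization might look like the following; the function name, the error metric, and the mapping onto PEFT's lora_A/lora_B tensors are assumptions.

```python
import torch

def extract_lora_from_delta(w_base: torch.Tensor, w_sft: torch.Tensor, rank: int = 768):
    """Factorize the weight delta (w_sft - w_base) into a low-rank LoRA pair.

    Returns (lora_A, lora_B, rel_err) with lora_B @ lora_A ~= delta.
    Note: PEFT scales the update by alpha/rank at load time, so the factors
    would need to be rescaled accordingly before export.
    """
    delta = (w_sft - w_base).float()                # [out_features, in_features]
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    U, S, Vh = U[:, :rank], S[:rank], Vh[:rank, :]  # keep the top-`rank` components

    # Split the singular values evenly between the two factors.
    sqrt_S = S.sqrt()
    lora_B = U * sqrt_S                             # [out_features, rank]
    lora_A = sqrt_S[:, None] * Vh                   # [rank, in_features]

    # Relative reconstruction error of the truncated factorization.
    rel_err = (torch.linalg.norm(delta - lora_B @ lora_A) / torch.linalg.norm(delta)).item()
    return lora_A, lora_B, rel_err
```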
## Known Limitations
- Kumru-2B is still a ~2B-parameter model; it may struggle with very long context, rare technical terms, and complex math.
- With low ranks, SVD-based LoRA can be less stable than the original SFT checkpoint.
- Training data is based on VNGRS’s public Turkish corpus cleaning pipeline; truthfulness/hallucination issues may still occur.
## Framework versions
- PEFT 0.11.1