Kumru-2B LoRA Adapter

This repository provides a LoRA adapter extracted from the VNGRS Kumru-2B model (vngrs-ai/Kumru-2B, the SFT/chat variant), meant to be applied on top of the base model vngrs-ai/Kumru-2B-Base. The goal is to transfer Kumru's chat/instruction behavior to Kumru-2B-Base deployments with a lightweight file footprint.

Model Summary

  • Base model: vngrs-ai/Kumru-2B-Base
  • Source (target behavior) model: vngrs-ai/Kumru-2B (SFT/chat)
  • Technique: Low-Rank Adaptation (LoRA)
  • LoRA rank / alpha: 768 / 1024 (update these if you produce a different build)
  • Layer coverage: all self-attention and MLP projections (illustrated in the sketch below)
  • Output artifacts: PEFT-compatible adapter_config.json + adapter_model.safetensors
  • License: Apache 2.0 (aligned with VNGRS Kumru model licensing)
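
For orientation, the layer coverage above corresponds roughly to the PEFT configuration below. This is an illustrative sketch only: the authoritative values live in the shipped adapter_config.json, and the target_modules names assume a Llama-style layer layout, which this card does not confirm.

from peft import LoraConfig

# Illustrative sketch only -- consult adapter_config.json for the real values.
# Module names assume a Llama-style architecture (an assumption, not confirmed here).
lora_config = LoraConfig(
    r=768,
    lora_alpha=1024,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # self-attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)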

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "vngrs-ai/Kumru-2B-Base"
adapter = "ceofast/kumru-2b-lora"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter, device_map="auto")

# Turkish example prompt. The system message says "Your name is Kumru, you are a
# helpful Turkish-speaking model"; the user asks for a short summary of the
# conquest of Istanbul.
messages = [
    {"role": "system", "content": "Adın Kumru, Türkçe konuşan yardımcı bir modelsin."},
    {"role": "user", "content": "İstanbul'un fethi hakkında kısa bilgi verir misin?"}
]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
# do_sample=True is required for temperature/top_p to take effect.
outputs = model.generate(inputs, max_new_tokens=512, do_sample=True, temperature=0.7, top_p=0.9)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))

Note: This adapter must be used together with vngrs-ai/Kumru-2B-Base.
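
If a standalone checkpoint is more convenient for deployment, the adapter can also be folded into the base weights with PEFT's merge_and_unload. A minimal sketch (the output directory is hypothetical):

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load base + adapter, then fold the LoRA weights into the base model.
base = AutoModelForCausalLM.from_pretrained("vngrs-ai/Kumru-2B-Base", torch_dtype="auto")
model = PeftModel.from_pretrained(base, "ceofast/kumru-2b-lora")
merged = model.merge_and_unload()  # plain transformers model, no PEFT wrapper

# Hypothetical output path -- adjust to your setup.
merged.save_pretrained("./kumru-2b-merged")
AutoTokenizer.from_pretrained("vngrs-ai/Kumru-2B-Base").save_pretrained("./kumru-2b-merged")

The merged model then loads with AutoModelForCausalLM alone, at the cost of storing full-size weights instead of the lightweight adapter.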

Extraction Process

The adapter is obtained by computing the delta between the base and the SFT checkpoints and factorizing it with SVD into low-rank components. In this release, the measured reconstruction error is approximately 0.409. To better preserve quality, you may increase rank/alpha and export a new version (e.g., rank 1024 / alpha 2048). A lower-error build will be added as soon as possible.

  • Script: export_kumru.py
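
For illustration, the core of this kind of extraction looks roughly like the sketch below. It is a simplified stand-in, not the actual export_kumru.py: the function name and layer handling are made up, and PEFT's alpha/r scaling of the update is omitted.

import torch

def extract_lora(w_base: torch.Tensor, w_sft: torch.Tensor, rank: int = 768):
    """Factorize a per-layer weight delta into LoRA factors via truncated SVD."""
    delta = (w_sft - w_base).float()
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    # Keep the top-`rank` singular directions, splitting sqrt(S) across both factors.
    sqrt_s = torch.sqrt(S[:rank])
    lora_B = U[:, :rank] * sqrt_s          # (out_features, rank)
    lora_A = sqrt_s[:, None] * Vh[:rank]   # (rank, in_features)
    # Relative Frobenius reconstruction error for this layer.
    err = torch.linalg.norm(delta - lora_B @ lora_A) / torch.linalg.norm(delta)
    return lora_A, lora_B, err.item()

Because PEFT scales the applied update by alpha/r, the exported factors must be rescaled to compensate (or alpha chosen equal to r) for the merged result to match the original delta.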

Known Limitations

  • Kumru-2B is still a ~2B-parameter model; it may struggle with very long context, rare technical terms, and complex math.
  • With low ranks, SVD-based LoRA can be less stable than the original SFT checkpoint.
  • The source model's training data comes from VNGRS's public Turkish corpus-cleaning pipeline; truthfulness and hallucination issues may still occur.

Framework versions

  • PEFT 0.11.1