# Qwen2.5-VL-7B CLTs (circuit-tracer format)

This is the circuit-tracer-compatible release of Cross-Layer Transcoders (CLTs) trained on Qwen2.5-VL-7B-Instruct.
## Usage with circuit-tracer

```python
from circuit_tracer import ReplacementModel

model = ReplacementModel.from_pretrained(
    model_name="Qwen/Qwen2.5-VL-7B-Instruct",
    transcoder_set="KokosDev/qwen2p5vl-7b-clt",
)
```
Or use the convenience shortcut:

```python
import torch

from circuit_tracer.vlm import VLModelWrapper

model = VLModelWrapper.from_pretrained(
    "Qwen/Qwen2.5-VL-7B-Instruct",
    transcoder_set="qwen",  # shortcut for this repo
    dtype=torch.bfloat16,
)
```
## Model Details
- Architecture: Cross-Layer Transcoders (CLTs)
- Base Model: Qwen/Qwen2.5-VL-7B-Instruct
- Hidden Dimension: 3584
- Feature Dimension: 8192
- Layers: 27 (layers 0-26)
- Sparsity: ~12% L0
- Training Steps: 5000
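As a back-of-envelope sanity check on the dimensions above, here is a sketch of the per-layer parameter count. It assumes each layer holds one `d_model × d_feat` encoder and one `d_feat × d_model` decoder matrix, ignoring biases and any cross-layer decoder weights, so treat it as a lower bound rather than the exact checkpoint size:

```python
# Rough parameter count from the stated dimensions.
# Assumption: one encoder and one decoder matrix per layer;
# biases and cross-layer decoder matrices are not counted.
d_model = 3584   # hidden dimension of Qwen2.5-VL-7B
d_feat = 8192    # CLT feature dimension
n_layers = 27    # layers 0-26

params_per_layer = 2 * d_model * d_feat
total_params = n_layers * params_per_layer
print(f"{params_per_layer:,} params/layer, {total_params:,} total")

# With ~12% L0 sparsity, roughly this many features fire per token:
active_features = int(0.12 * d_feat)
print(f"~{active_features} active features per token")
```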
## Format

- `layer_*.safetensors`: transcoder weights for each layer
- `config.yaml`: configuration for circuit-tracer
- Uses the safetensors format for fast loading
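If you want to inspect a layer file without loading any weights, the safetensors format makes this cheap: each file starts with an 8-byte little-endian length prefix followed by a JSON header describing every tensor. A minimal stdlib-only sketch (the tensor names inside the header are whatever this repo's files use, not something assumed here):

```python
import json
import struct

def read_safetensors_header(path):
    """Read only the JSON header of a .safetensors file.

    The format is: 8-byte little-endian u64 header length,
    then that many bytes of JSON, then the raw tensor data.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional metadata entry, not a tensor
    return header  # maps tensor name -> {"dtype", "shape", "data_offsets"}

# Example: list tensor shapes in one layer file
# for name, info in read_safetensors_header("layer_0.safetensors").items():
#     print(name, info["dtype"], info["shape"])
```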
## Training Details
- Optimizer: AdamW
- Learning Rate: 3e-4
- Scheduler: Cosine
- Target L0: 0.12
- Validation Loss: 10.3–19.1
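The cosine schedule above can be sketched as a plain decay from the stated peak learning rate over the 5000 training steps. This is a sketch only: it assumes decay to zero with no warmup, neither of which is stated in the card:

```python
import math

def cosine_lr(step, max_lr=3e-4, total_steps=5000):
    """Cosine decay from max_lr at step 0 to 0 at total_steps.

    Assumption: no warmup phase and a final LR of zero.
    """
    return max_lr * 0.5 * (1.0 + math.cos(math.pi * step / total_steps))

print(cosine_lr(0))      # peak: 3e-4
print(cosine_lr(2500))   # midpoint: half the peak
print(cosine_lr(5000))   # end: ~0
```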
## Citation

If you use these transcoders in your research, please cite:
```bibtex
@misc{qwen2p5vl7b-clt,
  author    = {KokosDev},
  title     = {Qwen2.5-VL-7B Cross-Layer Transcoders},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/KokosDev/qwen2p5vl-7b-clt}
}
```