Qwen2.5-VL-7B CLTs (circuit-tracer format)

This is the circuit-tracer-compatible release of Cross-Layer Transcoders (CLTs) trained on Qwen2.5-VL-7B-Instruct.

Usage with circuit-tracer

from circuit_tracer import ReplacementModel

model = ReplacementModel.from_pretrained(
    model_name="Qwen/Qwen2.5-VL-7B-Instruct",
    transcoder_set="KokosDev/qwen2p5vl-7b-clt",
)

Or use the convenience shortcut:

import torch
from circuit_tracer.vlm import VLModelWrapper

model = VLModelWrapper.from_pretrained(
    'Qwen/Qwen2.5-VL-7B-Instruct',
    transcoder_set='qwen',  # Shortcut for this repo
    dtype=torch.bfloat16,
)

Model Details

  • Architecture: Cross-Layer Transcoders (CLTs)
  • Base Model: Qwen/Qwen2.5-VL-7B-Instruct
  • Hidden Dimension: 3584
  • Feature Dimension: 8192
  • Layers: 27 (layers 0-26)
  • Sparsity: ~12% L0
  • Training Steps: 5000
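As a back-of-envelope check on the numbers above (an illustration, not part of the released code): a ~12% L0 over an 8192-wide feature dictionary means roughly 983 features fire per token at each layer, and the dictionary is about 2.3x wider than the 3584-dim residual stream.

```python
# Derived purely from the figures listed on this card.
d_hidden = 3584      # Qwen2.5-VL-7B hidden dimension
d_features = 8192    # CLT feature dimension per layer
l0_fraction = 0.12   # target fraction of active features (~12% L0)

active_per_token = round(l0_fraction * d_features)
expansion = d_features / d_hidden

print(active_per_token)        # 983 features active per token per layer
print(round(expansion, 2))     # 2.29x expansion over the hidden dim
```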

Format

  • layer_*.safetensors: Transcoder weights for each layer
  • config.yaml: Configuration for circuit-tracer
  • Uses safetensors format for fast loading
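circuit-tracer handles loading for you; purely for illustration, here is a minimal sketch of why the safetensors format loads fast: each file starts with an unsigned 64-bit little-endian header length followed by a JSON header describing every tensor's dtype, shape, and byte offsets, so metadata can be read without touching the tensor data. The tensor name `W_enc` below is a made-up example, not necessarily a key in these files.

```python
import json
import struct

def read_safetensors_header(raw: bytes) -> dict:
    """Parse just the JSON header; tensor data is never touched."""
    (header_len,) = struct.unpack("<Q", raw[:8])
    return json.loads(raw[8 : 8 + header_len])

# Build a tiny in-memory example file with one fake 2x2 float32 tensor.
header = {"W_enc": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, 16]}}
header_bytes = json.dumps(header).encode("utf-8")
fake_file = struct.pack("<Q", len(header_bytes)) + header_bytes + b"\x00" * 16

meta = read_safetensors_header(fake_file)
print(meta["W_enc"]["shape"])  # [2, 2]
```

In practice you would use `safetensors.torch.load_file` on the real `layer_*.safetensors` files instead of parsing bytes by hand.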

Training Details

  • Optimizer: AdamW
  • Learning Rate: 3e-4
  • Scheduler: Cosine
  • Target L0: 0.12
  • Validation Loss: 10.3 - 19.1
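The schedule above can be sketched as a standard cosine decay from the peak learning rate of 3e-4 over the 5000 training steps. This is a hedged illustration using the card's hyperparameters; any warmup phase or minimum-LR floor in the actual training code is unknown and omitted here.

```python
import math

PEAK_LR = 3e-4       # learning rate from this card
TOTAL_STEPS = 5000   # training steps from this card

def cosine_lr(step: int) -> float:
    """Decay from PEAK_LR to 0 over half a cosine period."""
    progress = min(step, TOTAL_STEPS) / TOTAL_STEPS
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(cosine_lr(0))     # peak LR (3e-4) at the start
print(cosine_lr(2500))  # half the peak at the midpoint
print(cosine_lr(5000))  # ~0 at the final step
```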

Citation

If you use these transcoders in your research, please cite:

@misc{qwen2p5vl7b-clt,
  author = {KokosDev},
  title = {Qwen2.5-VL-7B Cross-Layer Transcoders},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/KokosDev/qwen2p5vl-7b-clt}
}