---
license: mit
pipeline_tag: text-generation
library_name: peft
---

# CDLM-LLaDA LoRA adapter for LLaDA-8B-Instruct

This repository hosts the LoRA adapter for the LLaDA-8B-Instruct diffusion LLM (dLLM), produced with the CDLM (Consistency Diffusion Language Models) method. CDLM combines consistency modeling with a block-wise causal attention mask, making the student model fully KV-cache compatible while preserving strong bidirectional modeling within each block. In practice, the adapter enables significantly faster inference at competitive quality.
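For intuition only (this is not the CDLM implementation), the block-wise causal attention pattern described above can be sketched as a boolean mask in which tokens attend bidirectionally within their own block and causally to all earlier blocks. The function name `block_causal_mask` and the `block_size` parameter below are illustrative, not part of this repository.

```python
import torch

def block_causal_mask(seq_len: int, block_size: int) -> torch.Tensor:
    """Boolean mask (True = attention allowed): bidirectional within a block,
    causal across blocks (a query attends to keys in its own or any earlier block)."""
    block_ids = torch.arange(seq_len) // block_size           # block index of each position
    return block_ids.unsqueeze(1) >= block_ids.unsqueeze(0)   # shape (seq_len, seq_len)

# Example: 8 tokens, block size 4 -> two all-True 4x4 diagonal blocks (bidirectional
# within a block) plus the lower-left 4x4 block (later block sees the earlier one).
mask = block_causal_mask(seq_len=8, block_size=4)
print(mask.int())
```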

## Model details

- **Base model:** GSAI-ML/LLaDA-8B-Instruct
- **Method:** CDLM (consistency distillation + block-wise causal masking for KV-cache compatibility)
- **Format:** PEFT LoRA adapter (`adapter_model.safetensors`, `adapter_config.json`)
- **Intended use:** attach this adapter to the base LLaDA-8B-Instruct model for accelerated inference via the CDLM decoding path

## How to use

This is a LoRA adapter, not a full model. You must load the base model and then attach this adapter. For best speedups, use the CDLM inference path in the accompanying codebase.
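A minimal loading sketch is shown below, assuming the standard `transformers` + `peft` APIs; `"<this-repo-id>"` is a placeholder for this repository's Hugging Face id. Note that plain PEFT loading only attaches the LoRA weights; the accelerated block-wise, KV-cached decoding comes from the CDLM inference code in the accompanying repository.

```python
import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

# Load the base diffusion LLM (LLaDA ships custom modeling code, hence trust_remote_code).
base_id = "GSAI-ML/LLaDA-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModel.from_pretrained(
    base_id,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
)

# Attach the CDLM LoRA adapter ("<this-repo-id>" is a placeholder, not a real repo id).
model = PeftModel.from_pretrained(base_model, "<this-repo-id>")
model.eval()
```

If you prefer a standalone checkpoint, the adapter can be folded into the base weights with PEFT's `merge_and_unload()` before saving; the CDLM decoding path is still required to realize the advertised speedups.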

## License

This adapter is released under the MIT License. The base model is governed by its own license; please ensure compliance with the base model’s terms.

## Citation

```bibtex
@article{kim2025cdlm,
  title   = {CDLM: Consistency Diffusion Language Models for Faster Sampling},
  author  = {Kim, Minseo and Xu, Chenfeng and Hooper, Coleman and Singh, Harman and Athiwaratkun, Ben and Zhang, Ce and Keutzer, Kurt and Gholami, Amir},
  journal = {arXiv preprint arXiv:2511.19269},
  year    = {2025},
  url     = {https://arxiv.org/abs/2511.19269}
}
```