# CDLM-LLaDA LoRA adapter for LLaDA-8B-Instruct
This repository hosts the LoRA adapter for the LLaDA-8B-Instruct diffusion LLM (dLLM), produced with the CDLM (Consistency Diffusion Language Models) method. CDLM integrates consistency modeling and a block-wise causal attention mask so the student model becomes fully KV-cache compatible while retaining the strong local bidirectional modeling within each block. In practice, the adapter enables significantly faster inference with competitive quality.
- GitHub: https://github.com/SqueezeAILab/CDLM
- Paper: CDLM: Consistency Diffusion Language Models for Faster Sampling (https://arxiv.org/abs/2511.19269)
## Model details
- Base model: GSAI-ML/LLaDA-8B-Instruct
- Method: CDLM (consistency distillation + block-wise causal masking for KV-cache compatibility)
- Format: PEFT LoRA adapter (`adapter_model.safetensors`, `adapter_config.json`)
- Intended use: attach this adapter to the base LLaDA-8B-Instruct model for accelerated inference via the CDLM decoding path
## How to use
This is a LoRA adapter, not a full model. You must load the base model and then attach this adapter. For best speedups, use the CDLM inference path in the accompanying codebase.
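Below is a minimal sketch of attaching the adapter with Hugging Face Transformers and PEFT. The adapter repo id is a placeholder (substitute this repository's actual id), and this only loads the weights; the fast block-wise, KV-cached sampling loop is provided by the CDLM codebase linked above, not by a standard `generate` call.

```python
# Minimal sketch: load the base LLaDA-8B-Instruct model and attach the CDLM LoRA adapter.
# NOTE: `adapter_id` is a placeholder, not a confirmed repo id.
import torch
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

base_id = "GSAI-ML/LLaDA-8B-Instruct"
adapter_id = "path/or/repo-id-of-this-adapter"  # placeholder; use this repository's id

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = AutoModel.from_pretrained(
    base_id,
    trust_remote_code=True,      # LLaDA ships custom modeling code
    torch_dtype=torch.bfloat16,
).eval()

# Attach the CDLM LoRA weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Optionally merge the adapter into the base weights for standalone deployment.
# model = model.merge_and_unload()
```

After loading, run sampling through the CDLM inference path in the GitHub repository to benefit from the KV-cache-compatible block-wise decoding; loading the adapter alone does not change the sampling procedure.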
## License
This adapter is released under the MIT License. The base model is governed by its own license; please ensure compliance with the base model’s terms.
## Citation
```bibtex
@article{kim2025cdlm,
  title   = {CDLM: Consistency Diffusion Language Models for Faster Sampling},
  author  = {Kim, Minseo and Xu, Chenfeng and Hooper, Coleman and Singh, Harman
             and Athiwaratkun, Ben and Zhang, Ce and Keutzer, Kurt and Gholami, Amir},
  journal = {arXiv preprint arXiv:2511.19269},
  year    = {2025},
  url     = {https://arxiv.org/abs/2511.19269}
}
```