# Qwen3-Next-80B-A3B-Instruct-CPT-LoRA-HyperSwitch
A LoRA adapter fine-tuned from Qwen/Qwen3-Next-80B-A3B-Instruct and specialized for the Hyperswitch Rust codebase. The model excels at understanding payment processing patterns, Hyperswitch architecture, and Rust development practices.
## 🎯 Model Description
This LoRA adapter was trained on 16,731 samples extracted from the Hyperswitch codebase to enhance code understanding, explanation, and generation within the payment processing domain; a loading example follows the summary below.
- Base Model: Qwen/Qwen3-Next-80B-A3B-Instruct
- Training Type: Causal Language Modeling (CLM) with LoRA
- Domain: Payment Processing, Rust Development
- Specialization: Hyperswitch codebase patterns and architecture
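
The adapter should work with the standard transformers + PEFT loading flow. Here is a minimal sketch under that assumption (the prompt is illustrative, and Qwen3-Next support may require a recent transformers release):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"
adapter_id = "AdityaNarayan/Qwen3-Next-80B-A3B-Instruct-CPT-LoRA-HyperSwitch"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # matches the training precision noted below
    device_map="auto",
)
# Attach the LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative prompt; the adapter targets Hyperswitch/Rust questions.
messages = [{"role": "user", "content": "Explain how connector integrations are structured in Hyperswitch."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```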
## 📊 Training Details
### Dataset Composition

- Total Samples: 16,731
  - File-level samples: 2,120 complete files
  - Granular samples: 14,611 extracted components
    - Functions: 4,121
    - Structs: 5,710
    - Traits: 223
    - Implementations: 4,296
    - Modules: 261
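
The card does not describe the extraction pipeline. Purely as a hypothetical illustration of what "granular" extraction means, a naive pass over the Rust sources could tally top-level items as sketched below; a real pipeline would use a proper parser (e.g. syn or tree-sitter), and the `hyperswitch` checkout path is an assumption.

```python
import re
from pathlib import Path

# Hypothetical, naive item counter; these regexes miss many Rust forms
# (pub(crate), const fn, etc.) and are for illustration only.
ITEM_PATTERNS = {
    "functions": re.compile(r"^\s*(?:pub\s+)?(?:async\s+)?fn\s+\w+", re.M),
    "structs": re.compile(r"^\s*(?:pub\s+)?struct\s+\w+", re.M),
    "traits": re.compile(r"^\s*(?:pub\s+)?trait\s+\w+", re.M),
    "impls": re.compile(r"^\s*impl\b", re.M),
    "modules": re.compile(r"^\s*(?:pub\s+)?mod\s+\w+", re.M),
}

counts = {kind: 0 for kind in ITEM_PATTERNS}
for path in Path("hyperswitch").rglob("*.rs"):  # assumed local checkout
    source = path.read_text(errors="ignore")
    for kind, pattern in ITEM_PATTERNS.items():
        counts[kind] += len(pattern.findall(source))
print(counts)
```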
### LoRA Configuration

```yaml
r: 16            # LoRA rank
alpha: 32        # LoRA alpha (2 * r)
dropout: 0.05    # LoRA dropout
target_modules:
  - "q_proj"
  - "k_proj"
  - "v_proj"
  - "o_proj"
exclude_modules:
  - "block_sparse_moe"
  - "w1"
  - "w2"
  - "w3"
  - "gate"
```
### Training Hyperparameters
- Epochs: 3
- Learning Rate: 2e-4 (cosine schedule)
- Hardware: 4 x NVIDIA H200
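
As a hedged sketch, the stated settings map onto `transformers.TrainingArguments` as follows; only the epochs, learning rate, and schedule come from this card, while batch size and accumulation are illustrative placeholders.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-next-hyperswitch-lora",  # placeholder path
    num_train_epochs=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    bf16=True,                       # matches the bfloat16 precision noted below
    per_device_train_batch_size=1,   # placeholder; not stated on the card
    gradient_accumulation_steps=8,   # placeholder; not stated on the card
)
```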
## 🛠️ Technical Specifications
- Context Window: 8,192 tokens
- Precision: bfloat16
- Inference Speed: Optimized with Flash Attention 2
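
Flash Attention 2 is selected at load time via the standard `attn_implementation` flag in transformers; this assumes the `flash-attn` package is installed and the GPU supports it.

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-Next-80B-A3B-Instruct",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn installed
    device_map="auto",
)
```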
## 🙏 Acknowledgments
- Qwen Team for the excellent Qwen base model
- Hyperswitch Team for the open-source payment processing platform
- Hugging Face for the transformers and PEFT libraries
## 📝 Citation
```bibtex
@misc{hyperswitch-qwen-lora-2024,
  title={Qwen3-Next-80B-A3B-Instruct-CPT-LoRA-HyperSwitch},
  author={Aditya Narayan},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/AdityaNarayan/Qwen3-Next-80B-A3B-Instruct-CPT-LoRA-HyperSwitch}
}
```