# Qwen3-Next-80B-A3B-Instruct-CPT-LoRA-HyperSwitch
A LoRA adapter fine-tuned from Qwen/Qwen3-Next-80B-A3B-Instruct and specialized for the Hyperswitch Rust codebase. The model excels at understanding payment processing patterns, Hyperswitch architecture, and Rust development practices.
## 🎯 Model Description
This LoRA adapter was trained on 16,731 samples extracted from the Hyperswitch codebase to enhance code understanding, explanation, and generation within the payment processing domain; a loading example follows the summary below.
- Base Model: Qwen/Qwen3-Next-80B-A3B-Instruct
- Training Type: Causal Language Modeling (CLM) with LoRA
- Domain: Payment Processing, Rust Development
- Specialization: Hyperswitch codebase patterns and architecture
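
The adapter should work with the standard transformers + PEFT loading flow. Here is a minimal sketch under that assumption (the prompt is illustrative, and Qwen3-Next support may require a recent transformers release):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"
adapter_id = "AdityaNarayan/Qwen3-Next-80B-A3B-Instruct-CPT-LoRA-HyperSwitch"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # matches the training precision noted below
    device_map="auto",
)
# Attach the LoRA adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, adapter_id)

# Illustrative prompt; the adapter targets Hyperswitch/Rust questions.
messages = [{"role": "user", "content": "Explain how connector integrations are structured in Hyperswitch."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```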
## 📊 Training Details
### Dataset Composition

- Total Samples: 16,731
  - File-level samples: 2,120 complete files
  - Granular samples: 14,611 extracted components
    - Functions: 4,121
    - Structs: 5,710
    - Traits: 223
    - Implementations: 4,296
    - Modules: 261
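
The card does not describe the extraction pipeline. Purely as a hypothetical illustration of what "granular" extraction means, a naive pass over the Rust sources could tally top-level items as sketched below; a real pipeline would use a proper parser (e.g. syn or tree-sitter), and the `hyperswitch` checkout path is an assumption.

```python
import re
from pathlib import Path

# Hypothetical, naive item counter; these regexes miss many Rust forms
# (pub(crate), const fn, etc.) and are for illustration only.
ITEM_PATTERNS = {
    "functions": re.compile(r"^\s*(?:pub\s+)?(?:async\s+)?fn\s+\w+", re.M),
    "structs": re.compile(r"^\s*(?:pub\s+)?struct\s+\w+", re.M),
    "traits": re.compile(r"^\s*(?:pub\s+)?trait\s+\w+", re.M),
    "impls": re.compile(r"^\s*impl\b", re.M),
    "modules": re.compile(r"^\s*(?:pub\s+)?mod\s+\w+", re.M),
}

counts = {kind: 0 for kind in ITEM_PATTERNS}
for path in Path("hyperswitch").rglob("*.rs"):  # assumed local checkout
    source = path.read_text(errors="ignore")
    for kind, pattern in ITEM_PATTERNS.items():
        counts[kind] += len(pattern.findall(source))
print(counts)
```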
### LoRA Configuration

```yaml
r: 16            # LoRA rank
alpha: 32        # LoRA alpha (2 * r)
dropout: 0.05    # LoRA dropout
target_modules:
  - "q_proj"
  - "k_proj"
  - "v_proj"
  - "o_proj"
exclude_modules:
  - "block_sparse_moe"
  - "w1"
  - "w2"
  - "w3"
  - "gate"
```
### Training Hyperparameters
- Epochs: 3
- Learning Rate: 2e-4 (cosine schedule)
- Hardware: 4 x NVIDIA H200
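
As a hedged sketch, the stated settings map onto `transformers.TrainingArguments` as follows; only the epochs, learning rate, and schedule come from this card, while batch size and accumulation are illustrative placeholders.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-next-hyperswitch-lora",  # placeholder path
    num_train_epochs=3,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    bf16=True,                       # matches the bfloat16 precision noted below
    per_device_train_batch_size=1,   # placeholder; not stated on the card
    gradient_accumulation_steps=8,   # placeholder; not stated on the card
)
```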
## 🛠️ Technical Specifications
- Context Window: 8,192 tokens
- Precision: bfloat16
- Inference Speed: Optimized with Flash Attention 2
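
Flash Attention 2 is selected at load time via the standard `attn_implementation` flag in transformers; this assumes the `flash-attn` package is installed and the GPU supports it.

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-Next-80B-A3B-Instruct",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # requires flash-attn installed
    device_map="auto",
)
```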
## 🙏 Acknowledgments
- Qwen Team for the excellent Qwen base model
- Hyperswitch Team for the open-source payment processing platform
- Hugging Face for the transformers and PEFT libraries
## 📝 Citation
```bibtex
@misc{hyperswitch-qwen-lora-2024,
  title={Qwen3-Next-80B-A3B-Instruct-CPT-LoRA-HyperSwitch},
  author={Aditya Narayan},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/AdityaNarayan/Qwen3-Next-80B-A3B-Instruct-CPT-LoRA-HyperSwitch}
}
```