Qwen2-0.5B Fine-tuned with PyTorch DDP

This model is a fine-tuned version of Qwen/Qwen2-0.5B-Instruct, trained with PyTorch's DistributedDataParallel (DDP) to compare multi-GPU training performance against a single-GPU baseline.
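As a rough illustration of the DDP setup (not the exact training script from the repository), the sketch below wraps a model in `DistributedDataParallel` and runs one training step. It uses a single CPU process with the `gloo` backend and a small `nn.Linear` as a stand-in for the Qwen model; the real 2-GPU run would be launched via `torchrun --nproc_per_node=2` with the `nccl` backend and the Hugging Face causal LM wrapped the same way.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process sketch: rank 0 of a world of size 1, CPU "gloo" backend.
# A real multi-GPU run starts one process per GPU (e.g. via torchrun)
# and uses backend="nccl".
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

# Stand-in for Qwen/Qwen2-0.5B-Instruct; the fine-tuning run would wrap
# the loaded Hugging Face model in DDP in exactly the same way.
model = torch.nn.Linear(16, 16)
ddp_model = DDP(model)

optimizer = torch.optim.AdamW(ddp_model.parameters(), lr=1e-4)
x = torch.randn(8, 16)
loss = ddp_model(x).pow(2).mean()
loss.backward()   # DDP all-reduces gradients across ranks here
optimizer.step()

dist.destroy_process_group()
```

Each process holds a full model replica and its own data shard; DDP synchronizes gradients during `backward()`, so every replica applies identical updates.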

This model was fine-tuned using the arxiv-abstract-dataset on 2 × T4 16GB GPUs, achieving a 1.34× speedup compared to single-GPU training.
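For context, a 1.34× speedup on 2 GPUs corresponds to a scaling efficiency of 67% (ideal linear scaling would be 2×); communication overhead from gradient all-reduce typically accounts for the gap. The arithmetic:

```python
# Scaling efficiency from the reported numbers (times are normalised
# to a single-GPU wall-clock of 1.0).
single_gpu_time = 1.0
speedup = 1.34          # reported 2-GPU speedup
num_gpus = 2

dual_gpu_time = single_gpu_time / speedup        # ≈ 0.746 of baseline
scaling_efficiency = speedup / num_gpus          # 0.67
print(f"{scaling_efficiency:.0%}")               # → 67%
```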

For detailed implementation, performance benchmarks, and MLflow experiment tracking, please check out the project repository.
