rewardfm/ant-rfm-qwen-4gpu-bs64-pref-prog-2frames-uniform-20251216-003917
Model Details
- Base Model: Qwen/Qwen3-VL-4B-Instruct
- Model Type: qwen3_vl
Training Run
- Wandb Run: ant_rfm_qwen_4gpu_bs64_pref_prog_2frames_uniform
- Wandb ID: 987at1tm
- Project: rfm
- Notes: progress-only training; uniform_sample strategy; 2 frames per sample, each labeled with absolute progress with respect to the total number of frames; trained on all data
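The notes above are terse, so here is a minimal sketch of what sampling 2 frames with absolute progress labels could look like. It is not the training code for this run. The function name `sample_frames_with_progress` is hypothetical, and it rests on two assumptions: that "uniform" means a uniform random draw of distinct frame indices, and that progress is defined as `(index + 1) / total_frames`.

```python
import random


def sample_frames_with_progress(num_frames, k=2, seed=None):
    """Draw k distinct frame indices uniformly at random and pair each
    with an absolute progress value in (0, 1].

    Hypothetical helper for illustration; not taken from the training repo.
    """
    if not 1 <= k <= num_frames:
        raise ValueError("k must be between 1 and num_frames")
    rng = random.Random(seed)
    # Assumption: "uniform" is a uniform random draw without replacement.
    indices = sorted(rng.sample(range(num_frames), k))
    # Assumption: progress is measured against the total frame count,
    # so the final frame of the video maps to 1.0.
    return [(i, (i + 1) / num_frames) for i in indices]


if __name__ == "__main__":
    for index, progress in sample_frames_with_progress(num_frames=40, k=2, seed=0):
        print(f"frame {index}: progress {progress:.3f}")
```

Measuring progress against the full video length, rather than against the sampled pair, keeps the label for a given frame the same regardless of which other frame is drawn with it.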
Citation
If you use this model, please cite: