---
base_model:
- Qwen/Qwen3-VL-8B-Instruct
datasets:
- xashru/sphinx
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
---

This model is released alongside the paper
[SPHINX: A Synthetic Environment for Visual Perception and Reasoning](https://arxiv.org/abs/2511.20814).
It is trained on the SPHINX training split using Verl with GRPO.
For code and more details, see the [GitHub repository](https://github.com/xashru/sphinx).
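
For readers unfamiliar with GRPO, the sketch below illustrates its core idea: several responses are sampled for the same prompt, and each response's reward is normalized against the group's mean and standard deviation to obtain an advantage. This is a minimal illustration, not the training code used for this model. The function name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the example rewards are all assumptions made for the example; see the GitHub repository for the actual setup.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt.

    Hypothetical helper for illustration only. `eps` (an assumed value)
    avoids division by zero when every response gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four sampled answers to one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above the group mean receive a positive advantage and those below receive a negative one, so no separate learned value model is needed to provide a baseline.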