---
base_model:
- Qwen/Qwen3-VL-8B-Instruct
datasets:
- xashru/sphinx
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
---

This model is released alongside the paper
[SPHINX: A Synthetic Environment for Visual Perception and Reasoning](https://arxiv.org/abs/2511.20814).
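It is fine-tuned from [Qwen/Qwen3-VL-8B-Instruct](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct) on the SPHINX training split using Verl with GRPO (Group Relative Policy Optimization).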
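
For readers unfamiliar with GRPO, the sketch below illustrates its core idea: several responses are sampled for the same prompt, and each response's reward is normalized against the mean and standard deviation of its own group, so no separate value model is needed. This is a minimal illustration, not the Verl implementation or the training code used for this model. The function name `group_relative_advantages`, the `eps` value, the use of the population standard deviation, and the example rewards are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses to the same prompt.

    Hypothetical helper for illustration only. `eps` guards against division
    by zero when every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # assumption: population std; implementations may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for four sampled answers to one prompt (1.0 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Responses that score above their group's mean get a positive advantage and are reinforced, while those below the mean get a negative one. If every response in a group receives the same reward, all advantages are zero and that group contributes no learning signal.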

For code and more details, see the [GitHub repository](https://github.com/xashru/sphinx).