--- base_model: - Qwen/Qwen3-VL-8B-Instruct datasets: - xashru/sphinx license: apache-2.0 pipeline_tag: image-text-to-text library_name: transformers --- This model is released alongside the paper [SPHINX: A Synthetic Environment for Visual Perception and Reasoning](https://arxiv.org/abs/2511.20814). It is trained on the SPHINX training split using Verl with GRPO. For code and more details, see the [GitHub repository](https://github.com/xashru/sphinx).