---
license: mit
datasets:
- cadene/droid_1.0.1
language:
- en
base_model:
- stabilityai/stable-video-diffusion-img2vid
pipeline_tag: robotics
tags:
- action_conditioned_video_model
---
# 👉 Ctrl-World: A Controllable Generative World Model for Robot Manipulation

[Yanjiang Guo*](https://robert-gyj.github.io), [Lucy Xiaoyang Shi*](https://lucys0.github.io), [Jianyu Chen](http://people.iiis.tsinghua.edu.cn/~jychen/), [Chelsea Finn](https://ai.stanford.edu/~cbfinn/)

\*Equal contribution; Stanford University, Tsinghua University

## TL;DR
[**Ctrl-World**](https://sites.google.com/view/ctrl-world) is an action-conditioned world model that is compatible with modern vision-language-action (VLA) policies. It enables policy-in-the-loop rollouts entirely in imagination, which can be used to evaluate and improve the **instruction-following** ability of VLA policies.
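
To make "policy-in-the-loop rollouts in imagination" concrete, here is a minimal, self-contained sketch of the control flow: the policy picks an action from the current (imagined) observation, and the world model predicts the next observation from that action. This is not the Ctrl-World API. The function `rollout_in_imagination`, the toy policy, and the toy world model are all hypothetical stand-ins, and observations are plain lists of floats rather than camera frames.

```python
from typing import Callable, List, Sequence

Observation = Sequence[float]  # stand-in for camera frames (assumption)
Action = Sequence[float]       # stand-in for a robot action (assumption)


def rollout_in_imagination(
    policy: Callable[[Observation, str], Action],
    world_model: Callable[[Observation, Action], Observation],
    first_obs: Observation,
    instruction: str,
    horizon: int,
) -> List[Observation]:
    """Alternate policy and world-model calls; no real robot is involved."""
    trajectory = [first_obs]
    obs = first_obs
    for _ in range(horizon):
        action = policy(obs, instruction)  # policy acts on the imagined observation
        obs = world_model(obs, action)     # world model predicts the next observation
        trajectory.append(obs)
    return trajectory


# Hypothetical toy stand-ins, for illustration only.
def toy_policy(obs: Observation, instruction: str) -> Action:
    # Ignores the instruction and always moves by +1 along every dimension.
    return [1.0 for _ in obs]


def toy_world_model(obs: Observation, action: Action) -> Observation:
    # Trivial "dynamics": the next observation is the current one plus the action.
    return [o + a for o, a in zip(obs, action)]


trajectory = rollout_in_imagination(
    toy_policy, toy_world_model, [0.0, 0.0], "pick up the cup", horizon=3
)
print(len(trajectory))
```

In a real setup, the two callables would be replaced by a VLA policy and the action-conditioned video model, and the resulting imagined trajectory would be scored against the instruction.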