---
license: cc-by-nc-sa-4.0
pipeline_tag: image-to-image
---

# 🪶 MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues

- **Paper:** [MagicQuillV2: Precise and Interactive Image Editing with Layered Visual Cues](https://huggingface.co/papers/2512.03046)
- **Project Page:** https://magicquill.art/v2/
- **Code Repository:** https://github.com/zliucz/MagicQuillV2
- **Hugging Face Spaces Demo:** https://huggingface.co/spaces/AI4Editing/MagicQuillV2

**TLDR:** MagicQuill V2 introduces a layered composition paradigm to generative image editing, disentangling creative intent into controllable visual cues (Content, Spatial, Structural, Color) for precise and intuitive control.

## Hardware Requirements

Our model is based on Flux Kontext, which is large and computationally intensive.

- **VRAM**: Approximately **40GB** of VRAM is required for inference.
- **Speed**: It takes about **30 seconds** to generate a single image.

> **Important**: This is a research project focused on pushing the boundaries of interactive image editing. If you do not have sufficient GPU memory, we recommend checking out our [**MagicQuill V1**](https://github.com/ant-research/MagicQuill) or trying the online demo on [**Hugging Face Spaces**](https://huggingface.co/spaces/AI4Editing/MagicQuillV2).

## Setup

1. **Clone the repository**

    ```bash
    git clone https://github.com/magic-quill/MagicQuillV2.git
    cd MagicQuillV2
    ```

2. **Create environment**

    ```bash
    conda create -n MagicQuillV2 python=3.10 -y
    conda activate MagicQuillV2
    ```

3. **Install dependencies**

    ```bash
    pip install -r requirements.txt
    ```

4. **Download models**

    Download the models from [Hugging Face](https://huggingface.co/LiuZichen/MagicQuillV2-models) and place them in the `models/` directory.

    ```bash
    huggingface-cli download LiuZichen/MagicQuillV2-models --local-dir models
    ```

5. **Run the demo**

    ```bash
    python app.py
    ```

## System Overview

The MagicQuill V2 interactive system is designed to unify our layered composition framework.

*Figure: MagicQuill V2 UI.*

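
One way to picture the layered composition idea is as a small data model in which a single edit carries any mix of the four cue types named in the TLDR. The sketch below is purely illustrative: `CueType`, `VisualCue`, `EditRequest`, and the file names are hypothetical and are not taken from the MagicQuill V2 codebase, and the short description attached to each cue type is an assumed reading of the names rather than the paper's definition.

```python
from dataclasses import dataclass, field
from enum import Enum


class CueType(Enum):
    """The four cue categories listed in the TLDR (descriptions are assumptions)."""
    CONTENT = "content"        # assumed: what to generate, e.g. a foreground prop
    SPATIAL = "spatial"        # assumed: where to edit, e.g. a brushed region
    STRUCTURAL = "structural"  # assumed: shape guidance, e.g. sketched edges
    COLOR = "color"            # assumed: color guidance, e.g. color strokes


@dataclass
class VisualCue:
    """Hypothetical container for one cue; `payload` stands in for image or mask data."""
    cue_type: CueType
    payload: str


@dataclass
class EditRequest:
    """Hypothetical bundle of layered cues applied to one source image."""
    image_path: str
    cues: list[VisualCue] = field(default_factory=list)

    def add(self, cue_type: CueType, payload: str) -> "EditRequest":
        # Append a cue and return self so calls can be chained.
        self.cues.append(VisualCue(cue_type, payload))
        return self

    def by_type(self, cue_type: CueType) -> list[VisualCue]:
        # Return only the cues of the requested category, in insertion order.
        return [cue for cue in self.cues if cue.cue_type is cue_type]


if __name__ == "__main__":
    request = (
        EditRequest("input.png")
        .add(CueType.CONTENT, "prop.png")
        .add(CueType.SPATIAL, "edit_mask.png")
        .add(CueType.COLOR, "color_strokes.png")
    )
    for cue in request.cues:
        print(cue.cue_type.value, cue.payload)
```

Keeping each cue as a separate record, rather than merging everything into one mask or prompt, mirrors the "disentangling" described in the TLDR: each layer can be added, removed, or inspected on its own.
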
### Key Upgrades from V1

1. **Toolbar (A)**: Features a new **Local Edit Brush** for defining the target editing area, along with tools for sketching edges and applying color.
2. **Visual Cue Manager (B)**: Holds all content layer visual cues (**foreground props**) that users can drag onto the canvas to define what to generate.
3. **Image Segmentation Panel (C)**: Accessed via the segment icon, this panel allows precise object extraction using SAM (Segment Anything Model) with positive/negative dots or bounding boxes.

## Citation

If you find MagicQuill V2 useful for your research, please cite our paper:

```bibtex
@article{liu2025magicquillv2,
  title={MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues},
  author={Zichen Liu and Yue Yu and Hao Ouyang and Qiuyu Wang and Shuailei Ma and Ka Leong Cheng and Wen Wang and Qingyan Bai and Yuxuan Zhang and Yanhong Zeng and Yixuan Li and Xing Zhu and Yujun Shen and Qifeng Chen},
  journal={arXiv:2512.03046},
  year={2025}
}
```