LiuZichen and nielsr (HF Staff) committed
Commit eeaa5e6 · verified · 1 Parent(s): 8eeb72d

Enhance model card with pipeline tag, paper, project, and code links (#1)


- Enhance model card with pipeline tag, paper, project, and code links (5c1476c7250111adf51f685c5f60a665371929ea)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
1. README.md (+87, -3)
README.md CHANGED
@@ -1,3 +1,87 @@
- ---
- license: cc-by-nc-sa-4.0
- ---
+ ---
+ license: cc-by-nc-sa-4.0
+ pipeline_tag: image-to-image
+ ---
+
+ # 🪶 MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues
+
+ - **Paper:** [MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues](https://huggingface.co/papers/2512.03046)
+ - **Project Page:** https://magicquill.art/v2/
+ - **Code Repository:** https://github.com/zliucz/MagicQuillV2
+ - **Hugging Face Spaces Demo:** https://huggingface.co/spaces/AI4Editing/MagicQuillV2
+
+ <br>
+
+ <div align="center">
+ <video src="https://github.com/user-attachments/assets/58079152-7729-48ed-9bb4-0ddfd1873dd0" width="100%" controls autoplay muted loop></video>
+ </div>
+
+ <br>
+
+ **TL;DR:** MagicQuill V2 introduces a layered composition paradigm to generative image editing, disentangling creative intent into controllable visual cues (Content, Spatial, Structural, Color) for precise and intuitive control.
+
+ ## Hardware Requirements
+
+ Our model is based on Flux Kontext, which is large and computationally intensive.
+ - **VRAM**: Approximately **40GB** of VRAM is required for inference.
+ - **Speed**: It takes about **30 seconds** to generate a single image.
+
+ > **Important**: This is a research project focused on pushing the boundaries of interactive image editing. If you do not have sufficient GPU memory, we recommend checking out our [**MagicQuill V1**](https://github.com/ant-research/MagicQuill) or trying the online demo on [**Hugging Face Spaces**](https://huggingface.co/spaces/AI4Editing/MagicQuillV2).
+
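+ To check whether your GPU meets this requirement before launching the demo, a quick PyTorch query (a minimal sketch, not part of the repository) is:
+
+ ```python
+ import torch
+
+ # Report the total memory of the first CUDA device and compare it against
+ # the ~40GB recommended above for MagicQuill V2 inference.
+ if torch.cuda.is_available():
+     total_gb = torch.cuda.get_device_properties(0).total_memory / 1024**3
+     print(f"GPU 0: {torch.cuda.get_device_name(0)}, {total_gb:.1f} GB VRAM")
+     if total_gb < 40:
+         print("Likely insufficient VRAM; consider MagicQuill V1 or the Spaces demo.")
+ else:
+     print("No CUDA device detected.")
+ ```
+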
+ ## Setup
+
+ 1. **Clone the repository**
+ ```bash
+ git clone https://github.com/magic-quill/MagicQuillV2.git
+ cd MagicQuillV2
+ ```
+
+ 2. **Create environment**
+ ```bash
+ conda create -n MagicQuillV2 python=3.10 -y
+ conda activate MagicQuillV2
+ ```
+
+ 3. **Install dependencies**
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 4. **Download models**
+ Download the models from [Hugging Face](https://huggingface.co/LiuZichen/MagicQuillV2-models) and place them in the `models/` directory (a Python alternative is sketched after this list).
+
+ ```bash
+ huggingface-cli download LiuZichen/MagicQuillV2-models --local-dir models
+ ```
+
+ 5. **Run the demo**
+ ```bash
+ python app.py
+ ```
+
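+ As an alternative to the CLI command in step 4, the same download can be done from Python with `huggingface_hub` (a minimal sketch; the CLI route above is the documented one):
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ # Fetch every file of the model repo into ./models, mirroring
+ # `huggingface-cli download LiuZichen/MagicQuillV2-models --local-dir models`.
+ snapshot_download(repo_id="LiuZichen/MagicQuillV2-models", local_dir="models")
+ ```
+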
+ ## System Overview
+
+ The MagicQuill V2 interactive system unifies our layered composition framework within a single editing interface.
+
+ <div align="center">
+ <img src="https://github.com/zliucz/MagicQuillV2/raw/main/assets/V2_UI.png" alt="MagicQuill V2 UI" width="100%">
+ </div>
+
+ ### Key Upgrades from V1
+
+ 1. **Toolbar (A)**: Features a new **Local Edit Brush** for defining the target editing area, along with tools for sketching edges and applying color.
+ 2. **Visual Cue Manager (B)**: Holds all content-layer visual cues (**foreground props**) that users can drag onto the canvas to define what to generate.
+ 3. **Image Segmentation Panel (C)**: Accessed via the segment icon, this panel allows precise object extraction using SAM (Segment Anything Model) with positive/negative dots or bounding boxes (illustrated in the sketch below).
+
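+ To illustrate how the positive/negative dots map onto SAM, the interaction corresponds roughly to a point-prompted call with the `segment-anything` package (a minimal sketch; the checkpoint path, input image, and click coordinates are placeholders, and the panel's actual integration may differ):
+
+ ```python
+ import numpy as np
+ from segment_anything import sam_model_registry, SamPredictor
+
+ # Load a SAM checkpoint (placeholder path) and wrap it in a predictor.
+ sam = sam_model_registry["vit_h"](checkpoint="models/sam_vit_h_4b8939.pth")
+ predictor = SamPredictor(sam)
+
+ image_rgb = np.zeros((512, 512, 3), dtype=np.uint8)  # stand-in for the edited canvas (HxWx3 uint8 RGB)
+ predictor.set_image(image_rgb)
+
+ # Each click becomes a point prompt: label 1 for a positive dot, 0 for a negative dot.
+ masks, scores, _ = predictor.predict(
+     point_coords=np.array([[420, 310], [120, 80]]),
+     point_labels=np.array([1, 0]),
+     multimask_output=False,
+ )
+ object_mask = masks[0]  # boolean HxW mask of the extracted object
+ ```
+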
+ ## Citation
+
+ If you find MagicQuill V2 useful for your research, please cite our paper:
+
+ ```bibtex
+ @article{liu2025magicquillv2,
+   title={MagicQuill V2: Precise and Interactive Image Editing with Layered Visual Cues},
+   author={Zichen Liu and Yue Yu and Hao Ouyang and Qiuyu Wang and Shuailei Ma and Ka Leong Cheng and Wen Wang and Qingyan Bai and Yuxuan Zhang and Yanhong Zeng and Yixuan Li and Xing Zhu and Yujun Shen and Qifeng Chen},
+   journal={arXiv preprint arXiv:2512.03046},
+   year={2025}
+ }
+ ```