Kranex7 committed on
Commit d9e3326 · verified · 1 Parent(s): 857660e

Update README.md

Files changed (1)
  1. README.md +1 -85
README.md CHANGED
@@ -1,85 +1 @@
- # Stable Diffusion 3 Inpainting Pipeline
-
- This is the implementation of `Stable Diffusion 3 Inpainting Pipeline`.
-
- | input image | input mask image | output |
- |:-------------------------:|:-------------------------:|:-------------------------:|
- |<img src="./overture-creations-5sI6fQgYIuo.png" width = "400" /> | <img src="./overture-creations-5sI6fQgYIuo_mask.png" width = "400" /> | <img src="./overture-creations-5sI6fQgYIuo_output.jpg" width = "400" /> |
- |<img src="./overture-creations-5sI6fQgYIuo.png" width = "400" /> | <img src="./overture-creations-5sI6fQgYIuo_mask.png" width = "400" /> | <img src="./overture-creations-5sI6fQgYIuo_tiger.jpg" width = "400" /> |
- |<img src="./overture-creations-5sI6fQgYIuo.png" width = "400" /> | <img src="./overture-creations-5sI6fQgYIuo_mask.png" width = "400" /> | <img src="./overture-creations-5sI6fQgYIuo_panda.jpg" width = "400" /> |
-
- **Please ensure that the version of diffusers >= 0.29.1**
-
- ## Model
-
- [Stable Diffusion 3 Medium](https://stability.ai/news/stable-diffusion-3-medium) is a Multimodal Diffusion Transformer (MMDiT) text-to-image model that features greatly improved performance in image quality, typography, complex prompt understanding, and resource-efficiency.
-
- For more technical details, please refer to the [Research paper](https://stability.ai/news/stable-diffusion-3-research-paper).
-
- Please note: this model is released under the Stability Non-Commercial Research Community License. For a Creator License or an Enterprise License visit Stability.ai or [contact us](https://stability.ai/license) for commercial licensing details.
-
-
- ### Model Description
-
- - **Developed by:** Stability AI
- - **Model type:** MMDiT text-to-image generative model
- - **Model Description:** This is a model that can be used to generate images based on text prompts. It is a Multimodal Diffusion Transformer
- (https://arxiv.org/abs/2403.03206) that uses three fixed, pretrained text encoders
- ([OpenCLIP-ViT/G](https://github.com/mlfoundations/open_clip), [CLIP-ViT/L](https://github.com/openai/CLIP/tree/main) and [T5-xxl](https://huggingface.co/google/t5-v1_1-xxl))
-
- ## Demo
-
- Make sure you upgrade to the latest version of diffusers: pip install -U diffusers. And then you can run:
-
- ```python
- import torch
- from torchvision import transforms
-
- from pipeline_stable_diffusion_3_inpaint import StableDiffusion3InpaintPipeline
- from diffusers.utils import load_image
-
- def preprocess_image(image):
-     image = image.convert("RGB")
-     image = transforms.CenterCrop((image.size[1] // 64 * 64, image.size[0] // 64 * 64))(image)
-     image = transforms.ToTensor()(image)
-     image = image.unsqueeze(0).to("cuda")
-     return image
-
- def preprocess_mask(mask):
-     mask = mask.convert("L")
-     mask = transforms.CenterCrop((mask.size[1] // 64 * 64, mask.size[0] // 64 * 64))(mask)
-     mask = transforms.ToTensor()(mask)
-     mask = mask.to("cuda")
-     return mask
-
- pipe = StableDiffusion3InpaintPipeline.from_pretrained(
-     "stabilityai/stable-diffusion-3-medium-diffusers",
-     torch_dtype=torch.float16,
- ).to("cuda")
-
- prompt = "Face of a yellow cat, high resolution, sitting on a park bench"
- source_image = load_image(
-     "./overture-creations-5sI6fQgYIuo.png"
- )
- source = preprocess_image(source_image)
- mask = preprocess_mask(
-     load_image(
-         "./overture-creations-5sI6fQgYIuo_mask.png"
-     )
- )
-
- image = pipe(
-     prompt=prompt,
-     image=source,
-     mask_image=mask,
-     height=1024,
-     width=1024,
-     num_inference_steps=50,
-     guidance_scale=7.0,
-     strength=0.6,
- ).images[0]
-
- image.save("overture-creations-5sI6fQgYIuo_output.jpg")
- ```
-
-
+ pip install diffusers transformers torch accelerate safetensors
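
The removed README asked for diffusers >= 0.29.1, while the new install line pins no version. A minimal sketch of how that minimum could be checked after installing, using only the standard library; the helper names `parse_release`, `meets_minimum`, and `installed_diffusers_ok` are hypothetical and not part of this repository or of diffusers:

```python
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Turn '0.29.1' or '0.30.0.dev0' into a comparable tuple of ints.

    Simplification: pre-release and dev suffixes are ignored, so
    '0.29.1rc1' compares equal to '0.29.1'.
    """
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric segment, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = "0.29.1") -> bool:
    """Return True if the installed version is at least the minimum."""
    return parse_release(installed) >= parse_release(minimum)


def installed_diffusers_ok(minimum: str = "0.29.1") -> bool:
    """Check the locally installed diffusers; False if it is missing."""
    try:
        installed = metadata.version("diffusers")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed, minimum)


if __name__ == "__main__":
    print("diffusers >= 0.29.1:", installed_diffusers_ok())
```

If the check fails, `pip install -U diffusers` (the command from the removed README) upgrades the package.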