+---
+license: openrail++
+language:
+- en
+pipeline_tag: text-to-image
+tags:
+- stable-diffusion
+- stable-diffusion-diffusers
+inference: true
+widget:
+- text: >-
+    masterpiece, best quality, 1girl, brown hair, green eyes, colorful, autumn,
+    cumulonimbus clouds, lighting, blue sky, falling leaves, garden
+  example_title: example 1girl
+- text: >-
+    masterpiece, best quality, 1boy, medium hair, blonde hair, blue eyes,
+    bishounen, colorful, autumn, cumulonimbus clouds, lighting, blue sky,
+    falling leaves, garden
+  example_title: example 1boy
+library_name: diffusers
+---
+<style>
+  .title {
+    font-size: 2.5em;
+    text-align: center;
+    color: #333;
+    font-family: Arial, sans-serif;
+    text-transform: uppercase;
+    letter-spacing: 0.05em;
+    padding: 0.5em 0;
+    background: transparent;
+    box-shadow: 0px 0px 20px 0px rgba(0,0,0,0.15);
+    margin-bottom: 2em;
+    display: inline-block;
+    width: auto;
+  }
+  .title span {
+    background: -webkit-linear-gradient(45deg, #fe6b8b 30%, #ff8e53 90%);
+    -webkit-background-clip: text;
+    -webkit-text-fill-color: transparent;
+  }
+  .image-grid {
+    display: grid;
+    grid-template-columns: repeat(3, 1fr);
+    gap: 0.5em;
+  }
+  .image-item {
+    box-shadow: 0px 0px 10px 0px rgba(0,0,0,0.15);
+    padding: 10px;
+  }
+  .image-item img {
+    width: 100%;
+    height: 100%;
+    object-fit: cover;
+    border-radius: 10px;
+    transition: transform .2s;
+  }
+  .image-item img:hover {
+    transform: scale(1.1);
+  }
+  .custom-table {
+    table-layout: fixed;
+    width: 100%;
+    border-collapse: collapse;
+  }
+  .custom-table td {
+    width: 50%;
+    vertical-align: top;
+    padding: 10px;
+    box-shadow: 0px 0px 10px 0px rgba(0,0,0,0.15);
+  }
+  .custom-image {
+    width: 100%;
+    height: 100%;
+    object-fit: cover;
+    border-radius: 10px;
+    transition: transform .2s;
+  }
+  .custom-image:hover {
+    transform: scale(1.1);
+  }
+</style>
+<h1 class="title"><span>Hermitage XL</span></h1>
+<div class="image-grid">
+  <div class="image-item">
+    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample1.png">
+      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample1.png">
+    </a>
+  </div>
+  <div class="image-item">
+    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample2.png">
+      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample2.png">
+    </a>
+  </div>
+  <div class="image-item">
+    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample3.png">
+      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample3.png">
+    </a>
+  </div>
+  <div class="image-item">
+    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample4.png">
+      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample4.png">
+    </a>
+  </div>
+  <div class="image-item">
+    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample5.png">
+      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample5.png">
+    </a>
+  </div>
+  <div class="image-item">
+    <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/sample6.png">
+      <img src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/sample6.png">
+    </a>
+  </div>
+</div>
+## Overview
+Hermitage XL is a high-resolution latent text-to-image diffusion model. It was fine-tuned from Stable Diffusion XL 1.0 on a curated dataset of high-quality anime-style images, using a learning rate of 4e-7 for 5000 steps with a batch size of 16.
+Example prompt: **_1girl, white hair, golden eyes, beautiful eyes, detail, flower meadow, cumulonimbus clouds, lighting, detailed sky, garden_**
+- Use it with the [`Stable Diffusion Webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui)
+- Use it with 🧨 [`diffusers`](https://huggingface.co/docs/diffusers/index)
+- Use it with the [`ComfyUI`](https://github.com/comfyanonymous/ComfyUI)
+## Features
+1. High-Resolution Images: The model was trained at a base resolution of 1024x1024. Training used the [NovelAI Aspect Ratio Bucketing Tool](https://github.com/NovelAI/novelai-aspect-ratio-bucketing), so non-square resolutions were also covered.
+2. Anime-styled Generation: Given a text prompt, the model can create high-quality anime-styled images.
+3. Fine-Tuned Diffusion Process: The model uses a fine-tuned diffusion process aimed at high-quality and unique image output.
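
The aspect ratio bucketing mentioned in the first feature can be sketched roughly as below. This is an illustration of the general idea only, not the code of the NovelAI tool: the helper names `make_buckets` and `nearest_bucket`, the 64-pixel step, and the 512 to 2048 side limits are all assumptions made for this example.

```python
def make_buckets(max_area=1024 * 1024, step=64, min_side=512, max_side=2048):
    """Enumerate (width, height) buckets whose pixel area stays within max_area.

    All default values are assumptions for illustration, not the settings
    used to train this model.
    """
    buckets = set()
    width = min_side
    while width <= max_side:
        # Largest height (a multiple of `step`) that keeps the area in budget.
        height = min(max_side, (max_area // width) // step * step)
        if height >= min_side:
            buckets.add((width, height))
            buckets.add((height, width))  # mirrored orientation
        width += step
    return sorted(buckets)


def nearest_bucket(width, height, buckets):
    """Return the bucket whose aspect ratio is closest to the image's."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


buckets = make_buckets()
print(nearest_bucket(1920, 1080, buckets))
```

Under these assumed settings, a 1920x1080 image lands in the 1344x768 bucket, so it would be resized and cropped to that shape rather than squashed to a square.
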
+## Model Details
+- **Developed by:** [Linaqruf](https://github.com/Linaqruf)
+- **Model type:** Diffusion-based text-to-image generative model
+- **Model Description:** This is a model that can be used to generate and modify anime-themed images based on text prompts.
+- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/LICENSE-MODEL)
+- **Finetuned from model:** [Stable Diffusion XL 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
+## How to Use:
+- Download `Hermitage XL` [here](https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/hermitage-xl.safetensors). The model is in `.safetensors` format.
+- Write prompts as Danbooru-style tags rather than natural language; otherwise the results tend to look realistic instead of anime-styled.
+- You can use any generic negative prompt, or the following suggested negative prompt, to guide the model towards more aesthetic generations:
+```
+lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry
+```
+- Prepend the following tags to your prompt for more aesthetic results:
+```
+masterpiece, best quality, illustration, beautiful detailed, finely detailed, dramatic light, intricate details
+```
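
The prompting advice above can be wrapped in a small helper. This is a minimal sketch: the function name `build_prompt` and the de-duplication step are assumptions for illustration, while the two tag strings are copied from the lists above.

```python
# Tag strings copied from the suggestions above.
QUALITY_TAGS = (
    "masterpiece, best quality, illustration, beautiful detailed, "
    "finely detailed, dramatic light, intricate details"
)
NEGATIVE_PROMPT = (
    "lowres, bad anatomy, bad hands, text, error, missing fingers, "
    "extra digit, fewer digits, cropped, worst quality, low quality, "
    "normal quality, jpeg artifacts, signature, watermark, username, blurry"
)


def build_prompt(subject_tags, quality_tags=QUALITY_TAGS):
    """Prepend the quality tags to a list of Danbooru-style subject tags."""
    tags = [t.strip() for t in quality_tags.split(",")]
    for tag in subject_tags:
        tag = tag.strip()
        if tag and tag not in tags:  # skip blanks and duplicates
            tags.append(tag)
    return ", ".join(tags)


prompt = build_prompt(["1girl", "white hair", "golden eyes", "flower meadow"])
print(prompt)
```

The resulting `prompt` and `NEGATIVE_PROMPT` strings can be passed to whichever front end you use as the positive and negative prompt.
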
+## 🧨 Diffusers
+Make sure to upgrade diffusers to >= 0.18.2:
+```
+pip install diffusers --upgrade
+```
+In addition, make sure to install `transformers`, `safetensors`, and `accelerate`, as well as `invisible_watermark`:
+```
+pip install invisible_watermark transformers accelerate safetensors
+```
+Running the pipeline: without a scheduler swap it runs with the default **EulerDiscreteScheduler**; this example swaps it for **EulerAncestralDiscreteScheduler**:
+```py
+import torch
+from diffusers.models import AutoencoderKL
+from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler
+
+model = "Linaqruf/hermitage-xl"
+
+# Load the SDXL VAE separately and pass it to the pipeline.
+vae = AutoencoderKL.from_pretrained("stabilityai/sdxl-vae")
+pipe = StableDiffusionXLPipeline.from_pretrained(
+    model,
+    torch_dtype=torch.float16,
+    use_safetensors=True,
+    variant="fp16",
+    vae=vae
+    )
+
+# Replace the default scheduler, reusing its configuration.
+pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
+pipe.to('cuda')
+
+prompt = "masterpiece, best quality, 1girl, green hair, sweater, looking at viewer, upper body, beanie, outdoors, watercolor, night, turtleneck"
+negative_prompt = "lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry"
+
+# Generate one 1024x1024 image and take the first result.
+image = pipe(
+    prompt,
+    negative_prompt=negative_prompt,
+    width=1024,
+    height=1024,
+    guidance_scale=12,
+    target_size=(1024,1024),
+    original_size=(4096,4096),
+    num_inference_steps=50
+    ).images[0]
+image.save("anime_girl.png")
+```
+## Limitation
+1. This model inherits the [limitations](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0#limitations) of Stable Diffusion XL 1.0.
+2. This model is overfitted and does not follow prompts well, because it was fine-tuned for only 5000 steps on a small dataset.
+3. It is only a preview model, intended to find good hyperparameters and a training configuration for Stable Diffusion XL 1.0.
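
To put the second limitation in numbers: 5000 steps at a batch size of 16 means the model saw 80,000 training samples in total. The sketch below shows how many passes over the data that implies. The dataset size is not stated in this card, so the sizes used here are purely hypothetical, and the calculation assumes no gradient accumulation.

```python
def epochs_seen(steps, batch_size, dataset_size):
    """Approximate number of passes over the dataset during training."""
    return steps * batch_size / dataset_size


STEPS = 5000      # from the Overview
BATCH_SIZE = 16   # from the Overview

# Hypothetical dataset sizes; the real size is not given in this card.
for dataset_size in (1_000, 10_000, 100_000):
    print(dataset_size, epochs_seen(STEPS, BATCH_SIZE, dataset_size))
```

The smaller the dataset, the more often each image is repeated during training, which is one common route to overfitting.
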
+## Example
+Here are some cherry-picked samples and a comparison between available models:
+<table class="custom-table">
+  <tr>
+    <td>
+      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image1.png">
+        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image1.png" alt="sample1">
+      </a>
+      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image3.png">
+        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image3.png" alt="sample3">
+      </a>
+    </td>
+    <td>
+      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image2.png">
+        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image2.png" alt="sample2">
+      </a>
+      <a href="https://huggingface.co/Linaqruf/hermitage-xl/blob/main/sample_images/image4.png">
+        <img class="custom-image" src="https://huggingface.co/Linaqruf/hermitage-xl/resolve/main/sample_images/image4.png" alt="sample4">
+      </a>
+    </td>
+  </tr>
+</table>
+<hr>