---
license: apache-2.0
pipeline_tag: text-to-image
---
# Chroma1-Base

Chroma1-Base is an **8.9B** parameter text-to-image foundational model based on **FLUX.1-schnell**. It is fully **Apache 2.0 licensed**, ensuring that anyone can use, modify, and build upon it.

As a **base model**, Chroma1 is intentionally designed to be an excellent starting point for **finetuning**. It provides a strong, neutral foundation for developers, researchers, and artists to create specialized models.

For the fast, CFG-"baked" version, see [Chroma1-Flash](https://huggingface.co/lodestones/Chroma1-Flash).

### Key Features
* **High-Performance Base:** 8.9B parameters, built on the powerful FLUX.1 architecture.
* **Easily Finetunable:** Designed as an ideal checkpoint for creating custom, specialized models.
* **Community-Driven & Open-Source:** Fully transparent, with an Apache 2.0 license and a public training history.
* **Flexible by Design:** Provides a flexible foundation for a wide range of generative tasks.

## Special Thanks
A massive thank you to our supporters who make this project possible.
* **Anonymous donor** whose incredible generosity funded the pretraining run and data collection. Your support has been transformative for open-source AI.
* **Fictional.ai** for their fantastic support and for helping push the boundaries of open-source AI. You can try Chroma on their platform:

[](https://fictional.ai/?ref=chroma_hf)

## How to Use

### `diffusers` Library

```python
import torch
from diffusers import ChromaPipeline

pipe = ChromaPipeline.from_pretrained("lodestones/Chroma1-Base", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()

prompt = [
    "A high-fashion close-up portrait of a blonde woman in clear sunglasses. The image uses a bold teal and red color split for dramatic lighting. The background is a simple teal-green. The photo is sharp and well-composed, and is designed for viewing with anaglyph 3D glasses for optimal effect. It looks professionally done."
]
negative_prompt = ["low quality, ugly, unfinished, out of focus, deformed, disfigured, blurry, smudged, restricted palette, flat colors"]

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    generator=torch.Generator("cpu").manual_seed(433),
    num_inference_steps=40,
    guidance_scale=3.0,
    num_images_per_prompt=1,
).images[0]
image.save("chroma.png")
```

### ComfyUI
For advanced users and customized workflows, you can use Chroma with ComfyUI.

**Requirements:**
* A working ComfyUI installation.
* [Chroma checkpoint](https://huggingface.co/lodestones/Chroma) (latest version).
* [T5 XXL text encoder](https://huggingface.co/comfyanonymous/flux_text_encoders/resolve/main/t5xxl_fp16.safetensors).
* [FLUX VAE](https://huggingface.co/lodestones/Chroma/resolve/main/ae.safetensors).
* [Chroma workflow JSON](https://huggingface.co/lodestones/Chroma/resolve/main/ChromaSimpleWorkflow20250507.json).

**Setup:**
1. Place the T5 XXL text encoder in your `ComfyUI/models/clip` folder.
2. Place the FLUX VAE in your `ComfyUI/models/vae` folder.
3. Place the Chroma checkpoint in your `ComfyUI/models/diffusion_models` folder.
4. Load the Chroma workflow file into ComfyUI and run.
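
The file placement in steps 1–3 can be sketched as a small script. The paths assume ComfyUI lives in the current directory, and the filenames are hypothetical (taken from the download links above); adjust both to your setup:

```python
import os
import shutil

# Hypothetical filenames: use whatever names your downloads actually have.
placements = {
    "t5xxl_fp16.safetensors": "ComfyUI/models/clip",
    "ae.safetensors": "ComfyUI/models/vae",
    "chroma.safetensors": "ComfyUI/models/diffusion_models",
}

for filename, folder in placements.items():
    os.makedirs(folder, exist_ok=True)        # create the target folder if missing
    if os.path.exists(filename):              # move only files that were downloaded
        shutil.move(filename, os.path.join(folder, filename))
```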

## Model Details
* **Architecture:** Based on the 8.9B parameter FLUX.1-schnell model.
* **Training Data:** Trained on a 5M sample dataset curated from a 20M pool, including artistic, photographic, and niche styles.
* **Technical Report:** A comprehensive technical paper detailing the architectural modifications and training process is forthcoming.

## Intended Use
Chroma is intended to be used as a **base model** for researchers and developers to build upon. It is ideal for:
* Finetuning on specific styles, concepts, or characters.
* Research into generative model behavior, alignment, and safety.
* Serving as a foundational component in larger AI systems.

## Limitations and Bias Statement
Chroma is trained on a broad, filtered dataset from the internet. As such, it may reflect the biases and stereotypes present in its training data. The model is released as-is and has not been aligned with a specific safety filter.

Users are responsible for their own use of this model. It has the potential to generate content that may be considered harmful, explicit, or offensive. I encourage developers to implement appropriate safeguards and ethical considerations in their downstream applications.

## Summary of Architectural Modifications
*(Full breakdown coming in the tech report, soon-ish.)*

* **12B → 8.9B Parameters:**
  * **TL;DR:** I replaced a 3.3B parameter timestep-encoding layer with a more efficient 250M parameter FFN, as the original was vastly oversized for its task.
* **MMDiT Masking:**
  * **TL;DR:** Masking T5 padding tokens enhanced fidelity and increased training stability by preventing the model from attending to irrelevant `<pad>` tokens.
* **Custom Timestep Distributions:**
  * **TL;DR:** I implemented a custom timestep sampling distribution (`-x^2`) to prevent loss spikes and ensure the model trains effectively on both high-noise and low-noise regions.
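
To make the last point concrete, here is a rough sketch of how a quadratic (`-x^2`-shaped) timestep sampler could be implemented via rejection sampling. The card does not give the exact density, so the parabola used below is an illustrative assumption, not the actual training code:

```python
import numpy as np

def sample_timesteps(n, rng=None):
    """Rejection-sample n timesteps in [0, 1] from a quadratic density.

    Hypothetical: the model card only names a "-x^2"-shaped distribution;
    the density used here, p(t) proportional to 1 - (2t - 1)**2, is an
    illustrative stand-in.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    samples = []
    while len(samples) < n:
        t = rng.uniform(0.0, 1.0, size=n)   # proposal: uniform on [0, 1]
        u = rng.uniform(0.0, 1.0, size=n)   # acceptance threshold
        # Accept each proposal with probability equal to the (<= 1) density.
        accepted = t[u < 1.0 - (2.0 * t - 1.0) ** 2]
        samples.extend(accepted.tolist())
    return np.array(samples[:n])
```

Reweighting which timesteps the loss sees is a common trick in diffusion training; the actual distribution used for Chroma is whatever the forthcoming tech report specifies.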

## P.S.
Chroma1-Base is Chroma-v.48.

## Citation
```bibtex
@misc{rock2025chroma,
  author       = {Lodestone Rock},
  title        = {Chroma1-Base},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/lodestones/Chroma1-Base}},
}
```