TL;DR:
BRIA 3.1 is our new text-to-image model that achieves high-quality generation while being trained exclusively on fully licensed data. We offer both API access and direct access to the model weights, making integration seamless for developers. The model is designed for both high visual fidelity and strong prompt alignment, while remaining relatively lightweight at 4 billion parameters.
Note: Soon available as open source for non-commercial use
Key Features
- Improved Aesthetics: Generates highly appealing images across photorealistic, illustrative, and graphic styles.
- High Prompt Alignment: Ensures precise adherence to user-provided textual descriptions for more accurate and meaningful outputs.
- Legally Compliant: Offers full legal liability coverage for copyright and privacy infringement. Thanks to training on 100% licensed data from leading data partners, we ensure the ethical use of content.
- Attribution Engine: Our proprietary, patented attribution engine compensates our data partners based on the generated images themselves.
- Customizable Technology: Provides access to source code and weights for extensive customization.
Training Data and Attribution
BRIA 3.1 was trained exclusively on 100% licensed data from leading data partners. Our dataset does not include copyrighted materials such as fictional characters, logos, trademarks, public figures, harmful content, or privacy-infringing content.
To enable this unprecedented dataset, we utilize our Patented Attribution Engine, which fairly compensates our data partners based on the generated images. This ensures full legal liability coverage for copyright and privacy compliance.
Get Access
Bria 3.1 is available everywhere you build: as source code and weights, ComfyUI nodes, or API endpoints.
- API Endpoint: Bria.ai
- ComfyUI: Use it in workflows
For more information, please visit our website.
Join our Discord community for more information, tutorials, tools, and to connect with other users!
For Commercial Use
- Purchase: for a commercial license, simply click here.
Code example using Diffusers
pip install diffusers huggingface_hub
from huggingface_hub import hf_hub_download
import os
# Resolve a local directory for the pipeline files; fall back to the current
# directory when __file__ is not defined (e.g. in a notebook).
try:
    local_dir = os.path.dirname(__file__)
except NameError:
    local_dir = '.'
hf_hub_download(repo_id="briaai/BRIA-3.1", filename='pipeline_bria.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-3.1", filename='transformer_bria.py', local_dir=local_dir)
hf_hub_download(repo_id="briaai/BRIA-3.1", filename='bria_utils.py', local_dir=local_dir)
import torch
from pipeline_bria import BriaPipeline
# trust_remote_code=True allows loading the custom BRIA transformer (transformer/bria_transformer.py), which is not part of the transformers library.
pipe = BriaPipeline.from_pretrained("briaai/BRIA-3.1", torch_dtype=torch.bfloat16, trust_remote_code=True)
pipe.to(device="cuda")
prompt = "A portrait of a Beautiful and playful ethereal singer, golden designs, highly detailed, blurry background"
negative_prompt = "Logo,Watermark,Ugly,Morbid,Extra fingers,Poorly drawn hands,Mutation,Blurry,Extra limbs,Gross proportions,Missing arms,Mutated hands,Long neck,Duplicate,Mutilated,Mutilated hands,Poorly drawn face,Deformed,Bad anatomy,Cloned face,Malformed limbs,Missing legs,Too many fingers"
image = pipe(prompt=prompt, negative_prompt=negative_prompt, height=1024, width=1024).images[0]
Some tips for using our text-to-image model at inference (a short example applying them follows this list):
- Using a negative prompt is recommended.
- For fine-tuning, use zeros instead of the null text embedding.
- Multiple aspect ratios are supported, but the overall resolution should be approximately 1024*1024 = 1 megapixel, for example: (1024, 1024), (1280, 768), (1344, 768), (832, 1216), (1152, 832), (1216, 832), (960, 1088).
- Use 30-50 steps (higher is better).
- Use a guidance_scale of 5.0.
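For illustration, here is a minimal sketch applying these recommendations with the pipeline loaded in the Diffusers example above; it assumes the standard diffusers call signature, and the width/height assignment for the non-square size is our assumption about the tuple ordering.

# Reuses `pipe`, `prompt`, and `negative_prompt` from the Diffusers example above.
image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,  # a negative prompt is recommended
    width=1280,
    height=768,                       # one of the listed ~1-megapixel sizes
    num_inference_steps=50,           # 30-50 steps; higher is better
    guidance_scale=5.0,               # recommended guidance scale
).images[0]
image.save("bria_sample.png")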
Technical Details
These advancements were made possible through several key technical upgrades:
First, we augmented our large dataset with synthetic captions generated by cutting-edge vision-language models. Then, we improved our architecture by integrating state-of-the-art transformers, specifically MMDiT and DiT layers, while training with a rectified flow objective. This approach is similar to other open models, such as AuraFlow, Flux, and SD3. BRIA 3.1 also employs 2D RoPE for positional embeddings, KQ normalization for enhanced training stability, and noise shifting for high-resolution training.
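For intuition, here is a minimal PyTorch sketch of a rectified flow training step. This is illustrative only, not BRIA's actual training code; the model signature and the uniform sampling of t are assumptions.

import torch
import torch.nn.functional as F

def rectified_flow_loss(model, latents, text_emb):
    # Sample a timestep t in [0, 1] per example (uniform here for simplicity).
    t = torch.rand(latents.shape[0], device=latents.device)
    t_ = t.view(-1, 1, 1, 1)
    noise = torch.randn_like(latents)
    # Linearly interpolate between clean latents and noise (the straight "rectified" path).
    x_t = (1.0 - t_) * latents + t_ * noise
    # The model is trained to predict the constant velocity along that path.
    target = noise - latents
    pred = model(x_t, t, text_emb)
    return F.mse_loss(pred, target)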
To ensure affordable inference and fine-tuning, BRIA 3.1 is designed to be compact, consisting of 28 MMDiT layers and 8 DiT layers, totaling 4 billion parameters. We exclusively use the T5 text encoder, avoiding CLIP to minimize unwanted biases. For spatial compression, we employ an open-source f8 VAE, after confirming that the VAE does not introduce bias into the model.
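As a quick sanity check of the parameter count (assuming the pipeline loaded in the Diffusers example above and standard diffusers attribute names):

# Count the transformer parameters of the loaded pipeline.
num_params = sum(p.numel() for p in pipe.transformer.parameters())
print(f"Transformer parameters: {num_params / 1e9:.2f}B")  # expected to be roughly 4B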
Our base model is not distilled and natively supports classifier-free guidance, offering full flexibility for fine-tuning.
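Because the model is not guidance-distilled, inference uses the standard classifier-free guidance combination at each denoising step; a schematic sketch (not BRIA-specific code):

def apply_cfg(cond_pred, uncond_pred, guidance_scale=5.0):
    # Classifier-free guidance: move the prediction away from the unconditional
    # output and toward the prompt-conditioned output.
    return uncond_pred + guidance_scale * (cond_pred - uncond_pred)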
Additionally, BRIA 3.1 is trained on multiple aspect ratios and resolutions, allowing it to natively produce 1-megapixel images both horizontally and vertically.
Finally, we also provide full support for the diffusers library and ComfyUI, enabling fast experimentation and deployment.
Fine-tuning code will be provided soon.