LoRA Adapters: TinyLlama-1.1B Quote Generator
This repository contains the LoRA (Low-Rank Adaptation) adapter weights for a version of TinyLlama/TinyLlama-1.1B-Chat-v1.0 fine-tuned to generate motivational quotes.
These are only the adapter weights, not the full model. You must load these adapters onto the base TinyLlama model to use them.
This model was trained in Google Colab on a T4 GPU using QLoRA. The training process specialized the model, resulting in a 2.4x inference speedup on the same GPU compared to the base model.
- Base Model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Dataset: Abirate/english_quotes
⚡ Quick Start (How to use)
The example below loads the 4-bit quantized base model and attaches these adapters to it for fast inference on a GPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline, BitsAndBytesConfig
from peft import PeftModel

base_model_name = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_repo_name = "bkqz/tinyllama-quotes-adapters"  # This repo

# 1. Load the 4-bit base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# 2. Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

# 3. Load the LoRA adapters from this repo
finetuned_model = PeftModel.from_pretrained(base_model, adapter_repo_name)
print("Base model and LoRA adapters loaded.")

# 4. Cast adapters to float16 to fix data type mismatch
finetuned_model.to(torch.float16)

# 5. Set up the generation pipeline
pipe = pipeline(
    "text-generation",
    model=finetuned_model,
    tokenizer=tokenizer,
    device_map="auto",
)

# 6. Generate a quote
prompt = "Keyword: life\nQuote:"
result = pipe(
    prompt,
    max_new_tokens=80,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    eos_token_id=tokenizer.eos_token_id,
)
print(result[0]["generated_text"])
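Speed figures such as the 2.4x quoted earlier depend on hardware, prompt, and sampling settings, so it can be worth measuring throughput on your own setup. The following is a minimal sketch, not the benchmark used for this model card: `tokens_per_second` and `generate_fn` are hypothetical names, and `generate_fn` is assumed to be a wrapper you write around the pipeline that returns how many new tokens one call produced.

```python
import time


def tokens_per_second(generate_fn, prompt, n_runs=3):
    """Average generated tokens per second over n_runs calls.

    generate_fn(prompt) is a hypothetical callable that runs one generation
    and returns the number of newly generated tokens.
    """
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(n_runs):
        total_tokens += generate_fn(prompt)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed
```

Running the same helper against the base model and against the adapter-loaded model, with identical prompts and generation settings, gives a like-for-like comparison. Because sampling is enabled in the Quick Start, output lengths vary between runs, so averaging over several runs is advisable.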
💬 Prompt Format
This model was trained on one specific format. For best results, end your prompt with \nQuote: (a newline followed by Quote:).
Keyword: [YOUR_KEYWORD]\nQuote:
The model generates a single quote and then appends the attribution "- Unknown".
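As an illustration of the format above, the sketch below builds a prompt from a keyword and trims the generated text back down to the quote alone. The helper names `build_prompt` and `extract_quote` are hypothetical and not part of this repository; the trimming assumes the text-generation pipeline echoes the prompt in its output (its default behavior) and that the model emits the "- Unknown" suffix as trained.

```python
def build_prompt(keyword: str) -> str:
    """Return a prompt in the trained format, ending with a newline and 'Quote:'."""
    return f"Keyword: {keyword.strip()}\nQuote:"


def extract_quote(generated_text: str, prompt: str) -> str:
    """Drop the echoed prompt and everything from the '- Unknown' suffix onward."""
    # Remove the prompt if the pipeline returned it in front of the completion.
    if generated_text.startswith(prompt):
        completion = generated_text[len(prompt):]
    else:
        completion = generated_text
    # Cut at the attribution marker; if it is absent, the whole completion is kept.
    quote, _, _ = completion.partition("- Unknown")
    return quote.strip()
```

With the Quick Start pipeline, this could be used as `extract_quote(pipe(p)[0]["generated_text"], p)` where `p = build_prompt("life")`.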
🛠️ Training Procedure
This model was fine-tuned using trl.SFTTrainer with QLoRA.
- Dataset: The Abirate/english_quotes dataset was "exploded" so that each (quote, tag) pair became a unique training example.
- Format: The training text was formatted as "Keyword: [tag]\nQuote: [quote] - Unknown". This was done to overwrite the base model's habit of adding real authors.
- Evaluation: The model was trained with a 10% evaluation split and early_stopping_patience=3 to prevent overfitting.
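The data preparation described in the list above can be sketched in plain Python. This is an illustrative reconstruction, not the original training script: the function names, the shuffle seed, and the in-memory row layout (a dict with a "quote" string and a "tags" list, which is assumed to mirror the dataset's columns) are all assumptions.

```python
import random


def explode(rows):
    """Yield one (quote, tag) pair for every tag attached to each row."""
    for row in rows:
        for tag in row["tags"]:
            yield row["quote"], tag


def format_example(quote, tag):
    """Render one training text in the Keyword/Quote format with the fixed suffix."""
    return f"Keyword: {tag}\nQuote: {quote} - Unknown"


def train_eval_split(examples, eval_fraction=0.1, seed=42):
    """Shuffle deterministically and hold out eval_fraction of the examples."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # seed value is an arbitrary assumption
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]
```

A quote carrying three tags therefore contributes three training examples, one per keyword, which is what lets a single keyword prompt steer generation at inference time.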
Framework Versions
- TRL: 0.25.0
- Transformers: 4.57.1
- PyTorch: 2.8.0+cu126
- Datasets: 4.0.0
- Tokenizers: 0.22.1