---
base_model: unsloth/llama-3-8B
library_name: peft
pipeline_tag: text-generation
tags:
- text-to-sql
- dpo
- lora
- transformers
- trl
- sql-generation
- database
---
# Text-to-SQL DPO Model
A LLaMA-3-8B model fine-tuned with Direct Preference Optimization (DPO) for text-to-SQL generation. Training used LoRA (Low-Rank Adaptation) adapters for parameter-efficient fine-tuning.
## Model Details
### Model Description
This model is a fine-tuned version of LLaMA-3-8B, trained with Direct Preference Optimization (DPO) on preference pairs so that it produces accurate SQL queries from natural-language questions.
- **Developed by:** faizack
- **Model type:** Causal Language Model with LoRA adapter
- **Language(s) (NLP):** English
- **License:** Apache 2.0 (inherited from base model)
- **Finetuned from model:** unsloth/llama-3-8B
### Model Sources
- **Repository:** [Text-to-SQL DPO Repository](https://github.com/IDEAS-Incubator/text-to-sql_DPO)
- **Base Model:** [unsloth/llama-3-8B](https://huggingface.co/unsloth/llama-3-8B)
## Uses
### Direct Use
This model is designed for generating SQL queries from natural language descriptions. It can be used for:
- Converting natural language questions to SQL queries
- Database query generation
- Text-to-SQL applications
- Database interaction interfaces
### Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and tokenizer
base_model = "unsloth/llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
model.eval()

# Generate a SQL query
prompt = "Show me all users from the customers table"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
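Note that a bare question like the one above gives the model no schema to work with. Text-to-SQL prompts generally include the relevant table definitions; the exact prompt format used during DPO training is not documented here, so the template below is an assumption to adapt to your setup:

```python
# Hypothetical prompt template: the format used during training is not
# documented, so treat this structure as a starting point.
schema = """CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT,
    signup_date DATE
);"""

question = "Show me all users from the customers table"
prompt = (
    f"### Schema:\n{schema}\n\n"
    f"### Question:\n{question}\n\n"
    f"### SQL:\n"
)
```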
### Out-of-Scope Use
This model should not be used for:
- General-purpose text generation beyond SQL queries
- Generating malicious or harmful SQL queries
- Database operations without proper validation
- Production use without proper testing and validation
## Bias, Risks, and Limitations
### Limitations
- The model is specialized for SQL generation and may not perform well on other tasks
- Generated SQL queries should be validated before execution
- Performance may vary depending on database schema complexity
- The model may generate queries that are syntactically correct but logically incorrect
### Recommendations
- Always validate generated SQL queries before execution (a minimal validation sketch follows this list)
- Test the model on your specific database schema
- Use appropriate safety measures when executing generated queries
- Consider the model's limitations when integrating into production systems
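As a concrete example of the first recommendation, a generated query can be dry-run against an in-memory SQLite database before it ever touches real data. This is a minimal sketch, assuming a SQLite target (matching the training dataset) and read-only SELECT queries:

```python
import sqlite3

def is_valid_select(query: str, schema_ddl: str) -> bool:
    """Dry-run a generated query against an in-memory copy of the schema."""
    if not query.lstrip().upper().startswith("SELECT"):
        return False  # reject anything that is not a read-only query
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)    # recreate the schema only, no data
        conn.execute(f"EXPLAIN {query}")  # parses and plans, never runs the query
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()
```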
## How to Get Started with the Model
### Installation
```bash
pip install transformers peft torch
```
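If GPU memory is tight, the 8B base model can optionally be loaded in 4-bit before attaching the adapter. This is a convenience for inference only, not how the adapter was trained, and it additionally requires the `bitsandbytes` and `accelerate` packages:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Optional: quantize the base model to 4-bit so it fits on smaller GPUs
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/llama-3-8B",
    quantization_config=bnb_config,
    device_map="auto",  # requires accelerate
)
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
```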
### Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and attach the adapter
base_model = "unsloth/llama-3-8B"
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Generate SQL; do_sample=True is needed for temperature to take effect
prompt = "Find all orders placed in the last 30 days"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.1)
sql_query = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(sql_query)
```
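For deployment, the LoRA weights can be merged into the base model so inference no longer needs `peft` at all; the output path below is just an example:

```python
# Merge the adapter into the base weights and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("text-to-sql-dpo-merged")  # example path
tokenizer.save_pretrained("text-to-sql-dpo-merged")
```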
## Training Details
### Training Data
The model was trained on the `zerolink/zsql-sqlite-dpo` dataset, which contains preference pairs for text-to-SQL tasks.
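The dataset can be inspected with the `datasets` library; the exact column names are not documented here, so print them before building any preprocessing around this sketch:

```python
from datasets import load_dataset

# Load the preference pairs used for DPO training
dataset = load_dataset("zerolink/zsql-sqlite-dpo", split="train")
print(dataset.column_names)  # check the prompt/chosen/rejected column names
print(dataset[0])
```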
### Training Procedure
#### Training Hyperparameters
- **Training regime:** DPO (Direct Preference Optimization)
- **Epochs:** 6
- **Batch size:** 2
- **Gradient accumulation:** 32
- **Learning rate:** 5e-5
- **LoRA rank:** 16
- **LoRA alpha:** 16
- **LoRA dropout:** 0.05
- **Target modules:** q_proj, v_proj
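For reference, these hyperparameters map onto `peft` and `trl` configuration objects roughly as follows. This is a reconstruction from the values above, not the author's training script; in particular the DPO `beta` is not documented, so the trl default of 0.1 is assumed:

```python
from peft import LoraConfig
from trl import DPOConfig

# LoRA adapter configuration matching the values listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# DPO training arguments (beta is assumed; it is not documented above)
dpo_args = DPOConfig(
    output_dir="text-to-sql-dpo",    # example path
    num_train_epochs=6,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=32,  # effective batch size = 2 * 32 = 64
    learning_rate=5e-5,
    beta=0.1,
)
```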
#### Training Infrastructure
- **Base model:** unsloth/llama-3-8B
- **Framework:** PEFT (Parameter-Efficient Fine-Tuning)
- **Training method:** LoRA (Low-Rank Adaptation)
- **Total steps:** 120
- **Steps per epoch:** 3660
## Technical Specifications
### Model Architecture
- **Base architecture:** LLaMA-3-8B
- **Adapter type:** LoRA
- **Trainable parameters:** ~16M (LoRA adapter only)
- **Total parameters:** ~8B (base model + adapter)
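The trainable/total split listed above can be checked directly on a loaded adapter with PEFT's built-in helper:

```python
# With the PeftModel loaded as in the examples above
model.print_trainable_parameters()
# Prints something like: trainable params: ... || all params: ... || trainable%: ...
```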
### Compute Infrastructure
- **Hardware:** GPU-based training
- **Framework versions:**
- PEFT: 0.17.1
- Transformers: 4.56.2
- PyTorch: Compatible with CUDA
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{text-to-sql-dpo-2024,
  title={Text-to-SQL DPO Model},
  author={faizack},
  year={2024},
  url={https://huggingface.co/faizack/text-to-sql-dpo}
}
```
## Model Card Contact
For questions or issues related to this model, please contact the model author or open an issue in the repository.