---
base_model: unsloth/llama-3-8B
library_name: peft
pipeline_tag: text-generation
tags:
- text-to-sql
- dpo
- lora
- transformers
- trl
- sql-generation
- database
---
# Text-to-SQL DPO Model
A LLaMA-3-8B model fine-tuned with Direct Preference Optimization (DPO) for text-to-SQL generation. Training used LoRA (Low-Rank Adaptation) adapters for parameter-efficient fine-tuning.
## Model Details
### Model Description
This model is a fine-tuned version of LLaMA-3-8B, trained with Direct Preference Optimization (DPO) on preference pairs so that it produces accurate SQL queries from natural-language questions.
- **Developed by:** faizack
- **Model type:** Causal Language Model with LoRA adapter
- **Language(s) (NLP):** English
- **License:** Apache 2.0 (inherited from base model)
- **Finetuned from model:** unsloth/llama-3-8B
### Model Sources
- **Repository:** [Text-to-SQL DPO Repository](https://github.com/IDEAS-Incubator/text-to-sql_DPO)
- **Base Model:** [unsloth/llama-3-8B](https://huggingface.co/unsloth/llama-3-8B)
## Uses
### Direct Use
This model is designed for generating SQL queries from natural language descriptions. It can be used for:
- Converting natural language questions to SQL queries
- Database query generation
- Text-to-SQL applications
- Database interaction interfaces
### Example Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and tokenizer
base_model = "unsloth/llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)

# Attach the LoRA adapter
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
model.eval()

# Generate a SQL query
prompt = "Show me all users from the customers table"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=100)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
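Note that a bare question like the one above gives the model no schema to work with. Text-to-SQL prompts generally include the relevant table definitions; the exact prompt format used during DPO training is not documented here, so the template below is an assumption to adapt to your setup:

```python
# Hypothetical prompt template: the format used during training is not
# documented, so treat this structure as a starting point.
schema = """CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT,
    signup_date DATE
);"""

question = "Show me all users from the customers table"
prompt = (
    f"### Schema:\n{schema}\n\n"
    f"### Question:\n{question}\n\n"
    f"### SQL:\n"
)
```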
### Out-of-Scope Use
This model should not be used for:
- General-purpose text generation beyond SQL queries
- Generating malicious or harmful SQL queries
- Database operations without proper validation
- Production use without proper testing and validation
## Bias, Risks, and Limitations
### Limitations
- The model is specialized for SQL generation and may not perform well on other tasks
- Generated SQL queries should be validated before execution
- Performance may vary depending on database schema complexity
- The model may generate queries that are syntactically correct but logically incorrect
### Recommendations
- Always validate generated SQL queries before execution (a minimal validation sketch follows this list)
- Test the model on your specific database schema
- Use appropriate safety measures when executing generated queries
- Consider the model's limitations when integrating into production systems
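As a concrete example of the first recommendation, a generated query can be dry-run against an in-memory SQLite database before it ever touches real data. This is a minimal sketch, assuming a SQLite target (matching the training dataset) and read-only SELECT queries:

```python
import sqlite3

def is_valid_select(query: str, schema_ddl: str) -> bool:
    """Dry-run a generated query against an in-memory copy of the schema."""
    if not query.lstrip().upper().startswith("SELECT"):
        return False  # reject anything that is not a read-only query
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)    # recreate the schema only, no data
        conn.execute(f"EXPLAIN {query}")  # parses and plans, never runs the query
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()
```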
## How to Get Started with the Model
### Installation
```bash
pip install transformers peft torch
```
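If GPU memory is tight, the 8B base model can optionally be loaded in 4-bit before attaching the adapter. This is a convenience for inference only, not how the adapter was trained, and it additionally requires the `bitsandbytes` and `accelerate` packages:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

# Optional: quantize the base model to 4-bit so it fits on smaller GPUs
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "unsloth/llama-3-8B",
    quantization_config=bnb_config,
    device_map="auto",  # requires accelerate
)
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
```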
### Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and attach the adapter
base_model = "unsloth/llama-3-8B"
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Generate SQL; do_sample=True is needed for temperature to take effect
prompt = "Find all orders placed in the last 30 days"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.1)
sql_query = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(sql_query)
```
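For deployment, the LoRA weights can be merged into the base model so inference no longer needs `peft` at all; the output path below is just an example:

```python
# Merge the adapter into the base weights and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("text-to-sql-dpo-merged")  # example path
tokenizer.save_pretrained("text-to-sql-dpo-merged")
```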
## Training Details
### Training Data
The model was trained on the `zerolink/zsql-sqlite-dpo` dataset, which contains preference pairs for text-to-SQL tasks.
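The dataset can be inspected with the `datasets` library; the exact column names are not documented here, so print them before building any preprocessing around this sketch:

```python
from datasets import load_dataset

# Load the preference pairs used for DPO training
dataset = load_dataset("zerolink/zsql-sqlite-dpo", split="train")
print(dataset.column_names)  # check the prompt/chosen/rejected column names
print(dataset[0])
```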
### Training Procedure
#### Training Hyperparameters
- **Training regime:** DPO (Direct Preference Optimization)
- **Epochs:** 6
- **Batch size:** 2
- **Gradient accumulation:** 32
- **Learning rate:** 5e-5
- **LoRA rank:** 16
- **LoRA alpha:** 16
- **LoRA dropout:** 0.05
- **Target modules:** q_proj, v_proj
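For reference, these hyperparameters map onto `peft` and `trl` configuration objects roughly as follows. This is a reconstruction from the values above, not the author's training script; in particular the DPO `beta` is not documented, so the trl default of 0.1 is assumed:

```python
from peft import LoraConfig
from trl import DPOConfig

# LoRA adapter configuration matching the values listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)

# DPO training arguments (beta is assumed; it is not documented above)
dpo_args = DPOConfig(
    output_dir="text-to-sql-dpo",    # example path
    num_train_epochs=6,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=32,  # effective batch size = 2 * 32 = 64
    learning_rate=5e-5,
    beta=0.1,
)
```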
#### Training Infrastructure
- **Base model:** unsloth/llama-3-8B
- **Framework:** PEFT (Parameter-Efficient Fine-Tuning)
- **Training method:** LoRA (Low-Rank Adaptation)
- **Total steps:** 120
- **Steps per epoch:** 3660
## Technical Specifications
### Model Architecture
- **Base architecture:** LLaMA-3-8B
- **Adapter type:** LoRA
- **Trainable parameters:** ~16M (LoRA adapter only)
- **Total parameters:** ~8B (base model + adapter)
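The trainable/total split listed above can be checked directly on a loaded adapter with PEFT's built-in helper:

```python
# With the PeftModel loaded as in the examples above
model.print_trainable_parameters()
# Prints something like: trainable params: ... || all params: ... || trainable%: ...
```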
### Compute Infrastructure
- **Hardware:** GPU-based training
- **Framework versions:**
- PEFT: 0.17.1
- Transformers: 4.56.2
- PyTorch: Compatible with CUDA
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{text-to-sql-dpo-2024,
  title={Text-to-SQL DPO Model},
  author={faizack},
  year={2024},
  url={https://huggingface.co/faizack/text-to-sql-dpo}
}
```
## Model Card Contact
For questions or issues related to this model, please contact the model author or open an issue in the repository.