---
base_model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
library_name: transformers
pipeline_tag: text-generation
tags:
- gguf
- fine-tuned
- lima
language:
- en
license: apache-2.0
---

# Llama-3.2-1B-Instruct-bnb-4bit-lima - GGUF Format

GGUF-format quantizations of this fine-tuned model, for use with llama.cpp and Ollama.

## Model Details

- **Base Model**: [unsloth/Llama-3.2-1B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit)
- **Format**: GGUF
- **Dataset**: [GAIR/lima](https://huggingface.co/datasets/GAIR/lima)
- **Size**: 0.75 GB - 2.31 GB, depending on quantization
- **Usage**: llama.cpp / Ollama
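
For a quick local test, the sketch below loads the Q4_K_M file with the `llama-cpp-python` bindings (one of several ways to run a GGUF; llama.cpp's own `llama-cli` works too). The local path is a placeholder for wherever you saved the file.

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# The model path is a placeholder -- point it at whichever quantization you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf",
    n_ctx=4096,  # context window; the GGUF metadata carries the chat template
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF is in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```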
## Related Models

- **LoRA Adapters**: [fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima-lora](https://huggingface.co/fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima-lora) - smaller, LoRA-only adapters
- **Merged FP16 Model**: [fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima](https://huggingface.co/fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima) - the original unquantized model in FP16
## Prompt Format

This model uses the **Llama 3.2** chat template.

### Ollama Template Format

```
{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}

{{ .System }}
{{- end }}
{{- if .Tools }}

You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{{ $.Tools }}
{{- end }}

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}

{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}

{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
```
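
To serve the model with Ollama, the template above is what belongs in a Modelfile's `TEMPLATE` directive. A minimal sketch, assuming the Q4_K_M file sits next to the Modelfile (the filename is a placeholder):

```
# Modelfile -- minimal sketch; the GGUF filename is a placeholder.
FROM ./Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf

# GGUF exports usually embed the chat template already; if yours does not,
# set it explicitly: TEMPLATE """<the Llama 3.2 template shown above>"""
```

Then register and run it with `ollama create llama32-lima -f Modelfile` followed by `ollama run llama32-lima` (the model name is arbitrary).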
## Training Details

- **LoRA Rank**: 16
- **Training Steps**: 129
- **Training Loss**: 2.3025
- **Max Seq Length**: 4086
- **Training Scope**: 1,030 samples (1 epoch, full dataset)

For the complete training configuration, see the LoRA adapters repository linked above.
## Available Quantizations

| Quantization | File | Size | Quality |
|--------------|------|------|---------|
| **F16** | [Llama-3.2-1B-Instruct-bnb-4bit-lima-F16.gguf](Llama-3.2-1B-Instruct-bnb-4bit-lima-F16.gguf) | 2.31 GB | Full precision (largest) |
| **Q4_K_M** | [Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf](Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf) | 0.75 GB | Good balance (recommended) |
| **Q6_K** | [Llama-3.2-1B-Instruct-bnb-4bit-lima-Q6_K.gguf](Llama-3.2-1B-Instruct-bnb-4bit-lima-Q6_K.gguf) | 0.95 GB | High quality |
| **Q8_0** | [Llama-3.2-1B-Instruct-bnb-4bit-lima-Q8_0.gguf](Llama-3.2-1B-Instruct-bnb-4bit-lima-Q8_0.gguf) | 1.23 GB | Very high quality, near original |

**Usage:** Use the quantization dropdown at the top of this page to select a file, then follow the instructions Hugging Face provides for it.
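
If you prefer to script the download rather than use the web UI, here is a small sketch with `huggingface_hub`; the repo id below is an assumption based on this card's naming, so substitute the actual repository.

```python
# Sketch: fetch a single quantization from the Hub (pip install huggingface_hub).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima-gguf",  # assumed repo id
    filename="Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf",
)
print(path)  # local cache path, ready for llama.cpp or an Ollama Modelfile
```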
## License

This model is based on unsloth/Llama-3.2-1B-Instruct-bnb-4bit and fine-tuned on GAIR/lima.
Please refer to the original model and dataset licenses.
## Credits

**Trained by:** Farhan Syah

**Training pipeline:**
- [unsloth-finetuning](https://github.com/farhan-syah/unsloth-finetuning) by [@farhan-syah](https://github.com/farhan-syah)
- [Unsloth](https://github.com/unslothai/unsloth) - 2x faster LLM fine-tuning

**Base components:**
- Base model: [unsloth/Llama-3.2-1B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit)
- Training dataset: [GAIR/lima](https://huggingface.co/datasets/GAIR/lima) by GAIR
|