---
base_model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
library_name: transformers
pipeline_tag: text-generation
tags:
- gguf
- fine-tuned
- gsm8k
language:
- en
license: apache-2.0
---

# Llama-3.2-1B-Instruct-bnb-4bit-gsm8k - GGUF Format

GGUF-format quantizations of the GSM8K fine-tune, for use with llama.cpp and Ollama.

## Model Details

- **Base Model**: [unsloth/Llama-3.2-1B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit)
- **Format**: GGUF
- **Dataset**: [openai/gsm8k](https://huggingface.co/datasets/openai/gsm8k)
- **Size**: 0.75 GB to 2.31 GB, depending on quantization
- **Usage**: llama.cpp / Ollama

## Related Models

- **LoRA Adapters**: [fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-lora](https://huggingface.co/fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-lora) - smaller, LoRA-only adapters
- **Merged FP16 Model**: [fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k](https://huggingface.co/fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k) - the original unquantized model in FP16

## Prompt Format

This model uses the **Llama 3.2** chat template.

### Ollama Template Format

```
{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}

{{ .System }}
{{- end }}
{{- if .Tools }}

You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{{ $.Tools }}
{{- end }}

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}

{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}

{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
```
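
To use this template with Ollama directly, it can be embedded in a Modelfile. A minimal sketch, assuming the recommended Q4_K_M file has been downloaded locally (the model name and stop token follow the Llama 3 convention; paste the full template from above into the `TEMPLATE` block):

```
# Sketch of a Modelfile for this model's GGUF file
FROM ./Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf
TEMPLATE """<paste the template above here>"""
PARAMETER stop "<|eot_id|>"
```

Then build and run it with `ollama create llama32-gsm8k -f Modelfile` followed by `ollama run llama32-gsm8k` (the model name `llama32-gsm8k` is arbitrary).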

## Training Details

- **LoRA Rank**: 32
- **Training Steps**: 1870
- **Training Loss**: 0.7500
- **Max Seq Length**: 2048
- **Training Scope**: 7,473 samples (2 epochs, full dataset)

For the complete training configuration, see the LoRA adapters repository linked above.

## Benchmark Results

*Benchmarked on the merged FP16 safetensors model*

*Evaluated: 2025-11-24 14:29*

| Model | Type | gsm8k |
|-------|------|-------|
| unsloth/Llama-3.2-1B-Instruct-bnb-4bit | Base | 0.1463 |
| Llama-3.2-1B-Instruct-bnb-4bit-gsm8k | Fine-tuned | 0.3230 |

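The card does not state which harness produced these scores. For reference, a typical way to score gsm8k is EleutherAI's lm-evaluation-harness; a sketch, assuming the merged FP16 repository linked under Related Models is the evaluated model (flags may need adjusting for your hardware):

```shell
pip install lm-eval

# Evaluate the merged FP16 model (repo id taken from the Related Models section)
lm_eval --model hf \
  --model_args pretrained=fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k \
  --tasks gsm8k \
  --batch_size 8
```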
## Available Quantizations

| Quantization | File | Size | Quality |
|--------------|------|------|---------|
| **F16** | [Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-F16.gguf](Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-F16.gguf) | 2.31 GB | Full precision (largest) |
| **Q4_K_M** | [Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf](Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf) | 0.75 GB | Good balance (recommended) |
| **Q6_K** | [Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q6_K.gguf](Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q6_K.gguf) | 0.95 GB | High quality |
| **Q8_0** | [Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q8_0.gguf](Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q8_0.gguf) | 1.23 GB | Very high quality, near original |

**Usage:** Use the dropdown menu above to select a quantization, then follow the usage instructions Hugging Face provides.

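As a concrete example, the recommended Q4_K_M file can be fetched and run locally with llama.cpp's CLI. A sketch, assuming `huggingface-cli` and `llama-cli` are installed; the repository id below is inferred from the model name and may differ, so check this page's URL:

```shell
# Download the recommended quantization (repo id is an assumption)
huggingface-cli download fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-gguf \
  Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf --local-dir .

# Chat interactively; -cnv applies the model's built-in chat template
llama-cli -m Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf -cnv \
  -p "You are a helpful math tutor."
```

With Ollama, the same file can instead be referenced from a Modelfile's `FROM` line.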
## License

Based on unsloth/Llama-3.2-1B-Instruct-bnb-4bit and trained on openai/gsm8k.
Please refer to the original model and dataset licenses.

## Credits

**Trained by:** Your Name

**Training pipeline:**
- [unsloth-finetuning](https://github.com/farhan-syah/unsloth-finetuning) by [@farhan-syah](https://github.com/farhan-syah)
- [Unsloth](https://github.com/unslothai/unsloth) - 2x faster LLM fine-tuning

**Base components:**
- Base model: [unsloth/Llama-3.2-1B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit)
- Training dataset: [openai/gsm8k](https://huggingface.co/datasets/openai/gsm8k) by OpenAI