# Llama-3.2-1B-Instruct-bnb-4bit-gsm8k - GGUF Format
GGUF-format quantizations for use with llama.cpp and Ollama.
## Model Details
- Base Model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
- Format: gguf
- Dataset: openai/gsm8k
- Size: 0.75 GB - 2.31 GB
- Usage: llama.cpp / Ollama
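For a quick local test, one of the quantized files can be pulled from the Hub and loaded with llama-cpp-python. This is a minimal sketch, not a prescribed workflow: the repo id and filename match the quantization table below, while the sampling parameters and prompt are purely illustrative.

```python
# Minimal sketch: download the Q4_K_M quant and chat with it via llama-cpp-python.
# Assumes `pip install llama-cpp-python huggingface_hub`.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-GGUF",
    filename="Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf",
)

# n_ctx=2048 matches the training max sequence length listed below.
llm = Llama(model_path=model_path, n_ctx=2048)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Natalia sold clips to 48 friends and each bought 2. How many clips did she sell?"}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```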
## Related Models
- LoRA Adapters: fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-lora - Smaller LoRA-only adapters
- Merged FP16 Model: fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k - Original unquantized model in FP16
## Prompt Format
This model uses the Llama 3.2 chat template.
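For reference, this is a minimal sketch of how a single system + user turn is laid out with the Llama 3.2 special tokens. Runtimes such as llama.cpp's chat handler and Ollama assemble this string for you from the template below, so you only need it when driving raw completions by hand.

```python
# Minimal sketch of the Llama 3.2 chat layout for one system + user turn.
# Shown for reference only; chat-aware runtimes build this automatically.
def build_prompt(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

print(build_prompt("You are a helpful math tutor.", "What is 12 * 7?"))
```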
### Ollama Template Format

```
{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}
Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
{{ $.Tools }}
{{- end }}
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}
{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
```
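Once the GGUF has been registered with Ollama (e.g. `ollama create llama3.2-gsm8k -f Modelfile`, with the Modelfile's `FROM` line pointing at the downloaded file and its `TEMPLATE` set to the template above), Ollama applies the template automatically. A minimal sketch with the official Python client follows; the tag `llama3.2-gsm8k` is a hypothetical local name, not something this repository ships.

```python
# Minimal sketch using the `ollama` Python client (pip install ollama).
# "llama3.2-gsm8k" is a hypothetical local tag created beforehand with `ollama create`.
import ollama

response = ollama.chat(
    model="llama3.2-gsm8k",
    messages=[{"role": "user", "content": "A baker makes 24 muffins per tray. How many muffins are on 5 trays?"}],
)
print(response["message"]["content"])
```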
## Training Details
- LoRA Rank: 32
- Training Steps: 1870
- Training Loss: 0.7500
- Max Seq Length: 2048
- Training Scope: 7,473 samples (2 epochs, full dataset)
For the complete training configuration, see the LoRA adapters repository linked above.
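The numbers above also imply the effective batch size: 7,473 samples × 2 epochs ≈ 14,946 examples seen, and 14,946 / 1,870 steps ≈ 8 examples per optimizer step. For readers who want to approximate the setup, here is a hedged sketch of an Unsloth LoRA configuration; only the base model, rank, and max sequence length come from this card, and the remaining hyperparameters are typical defaults rather than confirmed settings.

```python
# Hedged sketch of the LoRA setup. Only model_name, max_seq_length=2048 and r=32
# come from this card; lora_alpha and target_modules are common defaults (assumptions).
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=32,            # LoRA rank from this card
    lora_alpha=32,   # assumption: alpha is not stated on this card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # common Unsloth default
)
```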
## Benchmark Results
Benchmarked on the merged 16-bit safetensors model; scores for the quantized GGUF files may differ slightly.
Evaluated: 2025-11-24 14:29
| Model | Type | gsm8k |
|---|---|---|
| unsloth/Llama-3.2-1B-Instruct-bnb-4bit | Base | 0.1463 |
| Llama-3.2-1B-Instruct-bnb-4bit-gsm8k | Fine-tuned | 0.3230 |
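The evaluation harness is not specified on this card. Assuming the standard GSM8K convention, where the reference answer follows a `####` marker, exact-match scoring reduces to comparing final numbers, roughly as in this sketch:

```python
# Minimal sketch of GSM8K exact-match scoring. Assumes the standard "#### <answer>"
# reference format and extracts the last number from the model's output.
import re

def extract_answer(text: str) -> str | None:
    if "####" in text:                        # reference format: "... #### 42"
        text = text.split("####")[-1]
    nums = re.findall(r"-?\d[\d,]*\.?\d*", text)
    return nums[-1].replace(",", "") if nums else None

def score(predictions: list[str], references: list[str]) -> float:
    hits = sum(extract_answer(p) == extract_answer(r)
               for p, r in zip(predictions, references))
    return hits / len(references)

# e.g. score(["... so she sold 96 clips."], ["... #### 96"]) -> 1.0
```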
## Available Quantizations
| Quantization | File | Size | Quality |
|---|---|---|---|
| F16 | Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-F16.gguf | 2.31 GB | Full precision (largest) |
| Q4_K_M | Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf | 0.75 GB | Good balance (recommended) |
| Q6_K | Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q6_K.gguf | 0.95 GB | High quality |
| Q8_0 | Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q8_0.gguf | 1.23 GB | Very high quality, near original |
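As a sanity check on these sizes, Llama 3.2 1B has roughly 1.24B parameters (a figure from the upstream model, not this card), so each file size implies an average bit width per weight:

```python
# Back-of-the-envelope bits-per-weight implied by the file sizes above.
# Assumes ~1.24e9 parameters (upstream Llama 3.2 1B) and sizes in GiB.
PARAMS = 1.24e9

for name, gib in [("F16", 2.31), ("Q8_0", 1.23), ("Q6_K", 0.95), ("Q4_K_M", 0.75)]:
    bits = gib * 2**30 * 8 / PARAMS
    print(f"{name}: ~{bits:.1f} bits/weight")
# F16 ~16.0, Q8_0 ~8.5, Q6_K ~6.6, Q4_K_M ~5.2
# (the extra bits over the nominal width cover embeddings and file metadata)
```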
Usage: Use the dropdown menu above to select a quantization, then follow Hugging Face's provided instructions.
## License
Based on unsloth/Llama-3.2-1B-Instruct-bnb-4bit and trained on openai/gsm8k. Please refer to the original model and dataset licenses.
## Credits
Trained by: Your Name
Training pipeline:
- unsloth-finetuning by @farhan-syah
- Unsloth - 2x faster LLM fine-tuning
Base components:
- Base model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
- Training dataset: openai/gsm8k by openai
## Model Tree
fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-GGUF descends from:
- meta-llama/Llama-3.2-1B-Instruct (base model)
- unsloth/Llama-3.2-1B-Instruct-bnb-4bit (4-bit quantization of the base)