---
base_model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
library_name: transformers
pipeline_tag: text-generation
tags:
- gguf
- fine-tuned
- gsm8k
language:
- en
license: apache-2.0
---

# Llama-3.2-1B-Instruct-bnb-4bit-gsm8k - GGUF Format

GGUF-format quantizations of the GSM8K fine-tune, for use with llama.cpp and Ollama.

## Model Details

- **Base Model**: [unsloth/Llama-3.2-1B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit)
- **Format**: GGUF
- **Dataset**: [openai/gsm8k](https://huggingface.co/datasets/openai/gsm8k)
- **Size**: 0.75 GB to 2.31 GB, depending on quantization
- **Usage**: llama.cpp / Ollama

## Related Models

- **LoRA Adapters**: [fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-lora](https://huggingface.co/fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-lora) - smaller, LoRA-only adapters
- **Merged FP16 Model**: [fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k](https://huggingface.co/fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k) - the original unquantized model in FP16

## Prompt Format

This model uses the **Llama 3.2** chat template.

### Ollama Template Format

```
{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}

{{ .System }}
{{- end }}
{{- if .Tools }}

You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{{ $.Tools }}
{{- end }}

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}

{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}

{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
```
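
To use this template with Ollama directly, it can be embedded in a Modelfile. A minimal sketch, assuming the recommended Q4_K_M file has been downloaded locally (the model name and stop token follow the Llama 3 convention; paste the full template from above into the `TEMPLATE` block):

```
# Sketch of a Modelfile for this model's GGUF file
FROM ./Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf
TEMPLATE """<paste the template above here>"""
PARAMETER stop "<|eot_id|>"
```

Then build and run it with `ollama create llama32-gsm8k -f Modelfile` followed by `ollama run llama32-gsm8k` (the model name `llama32-gsm8k` is arbitrary).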

## Training Details

- **LoRA Rank**: 32
- **Training Steps**: 1870
- **Training Loss**: 0.7500
- **Max Seq Length**: 2048
- **Training Scope**: 7,473 samples (2 epochs, full dataset)

For the complete training configuration, see the LoRA adapters repository linked above.

## Benchmark Results

*Benchmarked on the merged FP16 safetensors model*

*Evaluated: 2025-11-24 14:29*

| Model | Type | gsm8k |
|-------|------|-------|
| unsloth/Llama-3.2-1B-Instruct-bnb-4bit | Base | 0.1463 |
| Llama-3.2-1B-Instruct-bnb-4bit-gsm8k | Fine-tuned | 0.3230 |

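The card does not state which harness produced these scores. For reference, a typical way to score gsm8k is EleutherAI's lm-evaluation-harness; a sketch, assuming the merged FP16 repository linked under Related Models is the evaluated model (flags may need adjusting for your hardware):

```shell
pip install lm-eval

# Evaluate the merged FP16 model (repo id taken from the Related Models section)
lm_eval --model hf \
  --model_args pretrained=fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k \
  --tasks gsm8k \
  --batch_size 8
```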
## Available Quantizations

| Quantization | File | Size | Quality |
|--------------|------|------|---------|
| **F16** | [Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-F16.gguf](Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-F16.gguf) | 2.31 GB | Full precision (largest) |
| **Q4_K_M** | [Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf](Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf) | 0.75 GB | Good balance (recommended) |
| **Q6_K** | [Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q6_K.gguf](Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q6_K.gguf) | 0.95 GB | High quality |
| **Q8_0** | [Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q8_0.gguf](Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q8_0.gguf) | 1.23 GB | Very high quality, near original |

**Usage:** Use the dropdown menu above to select a quantization, then follow the usage instructions Hugging Face provides.

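As a concrete example, the recommended Q4_K_M file can be fetched and run locally with llama.cpp's CLI. A sketch, assuming `huggingface-cli` and `llama-cli` are installed; the repository id below is inferred from the model name and may differ, so check this page's URL:

```shell
# Download the recommended quantization (repo id is an assumption)
huggingface-cli download fs90/Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-gguf \
  Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf --local-dir .

# Chat interactively; -cnv applies the model's built-in chat template
llama-cli -m Llama-3.2-1B-Instruct-bnb-4bit-gsm8k-Q4_K_M.gguf -cnv \
  -p "You are a helpful math tutor."
```

With Ollama, the same file can instead be referenced from a Modelfile's `FROM` line.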
## License

Based on unsloth/Llama-3.2-1B-Instruct-bnb-4bit and trained on openai/gsm8k.
Please refer to the original model and dataset licenses.

## Credits

**Trained by:** Your Name

**Training pipeline:**
- [unsloth-finetuning](https://github.com/farhan-syah/unsloth-finetuning) by [@farhan-syah](https://github.com/farhan-syah)
- [Unsloth](https://github.com/unslothai/unsloth) - 2x faster LLM fine-tuning

**Base components:**
- Base model: [unsloth/Llama-3.2-1B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit)
- Training dataset: [openai/gsm8k](https://huggingface.co/datasets/openai/gsm8k) by OpenAI