---
base_model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
library_name: transformers
pipeline_tag: text-generation
tags:
- gguf
- fine-tuned
- lima
language:
- en
license: apache-2.0
---

# Llama-3.2-1B-Instruct-bnb-4bit-lima - GGUF Format

GGUF-format quantizations of this fine-tuned model, for use with llama.cpp and Ollama.

## Model Details

- **Base Model**: [unsloth/Llama-3.2-1B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit)
- **Format**: GGUF
- **Dataset**: [GAIR/lima](https://huggingface.co/datasets/GAIR/lima)
- **Size**: 0.75 GB - 2.31 GB, depending on quantization
- **Usage**: llama.cpp / Ollama
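
For a quick local test, the sketch below loads the Q4_K_M file with the `llama-cpp-python` bindings (one of several ways to run a GGUF; llama.cpp's own `llama-cli` works too). The local path is a placeholder for wherever you saved the file.

```python
# Minimal sketch using llama-cpp-python (pip install llama-cpp-python).
# The model path is a placeholder -- point it at whichever quantization you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf",
    n_ctx=4096,  # context window; the GGUF metadata carries the chat template
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what GGUF is in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```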
## Related Models

- **LoRA Adapters**: [fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima-lora](https://huggingface.co/fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima-lora) - smaller, LoRA-only adapters
- **Merged FP16 Model**: [fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima](https://huggingface.co/fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima) - the original unquantized model in FP16
## Prompt Format

This model uses the **Llama 3.2** chat template.

### Ollama Template Format

```
{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}

{{ .System }}
{{- end }}
{{- if .Tools }}

You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}

Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.

Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.

{{ $.Tools }}
{{- end }}

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}

{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}

{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>

{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
```
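
To serve the model with Ollama, the template above is what belongs in a Modelfile's `TEMPLATE` directive. A minimal sketch, assuming the Q4_K_M file sits next to the Modelfile (the filename is a placeholder):

```
# Modelfile -- minimal sketch; the GGUF filename is a placeholder.
FROM ./Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf

# GGUF exports usually embed the chat template already; if yours does not,
# set it explicitly: TEMPLATE """<the Llama 3.2 template shown above>"""
```

Then register and run it with `ollama create llama32-lima -f Modelfile` followed by `ollama run llama32-lima` (the model name is arbitrary).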
## Training Details

- **LoRA Rank**: 16
- **Training Steps**: 129
- **Training Loss**: 2.3025
- **Max Seq Length**: 4086
- **Training Scope**: 1,030 samples (1 epoch, full dataset)

For the complete training configuration, see the LoRA adapters repository linked above.
## Available Quantizations

| Quantization | File | Size | Quality |
|--------------|------|------|---------|
| **F16** | [Llama-3.2-1B-Instruct-bnb-4bit-lima-F16.gguf](Llama-3.2-1B-Instruct-bnb-4bit-lima-F16.gguf) | 2.31 GB | Full precision (largest) |
| **Q4_K_M** | [Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf](Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf) | 0.75 GB | Good balance (recommended) |
| **Q6_K** | [Llama-3.2-1B-Instruct-bnb-4bit-lima-Q6_K.gguf](Llama-3.2-1B-Instruct-bnb-4bit-lima-Q6_K.gguf) | 0.95 GB | High quality |
| **Q8_0** | [Llama-3.2-1B-Instruct-bnb-4bit-lima-Q8_0.gguf](Llama-3.2-1B-Instruct-bnb-4bit-lima-Q8_0.gguf) | 1.23 GB | Very high quality, near original |

**Usage:** Use the quantization dropdown at the top of this page to select a file, then follow the instructions Hugging Face provides for it.
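
If you prefer to script the download rather than use the web UI, here is a small sketch with `huggingface_hub`; the repo id below is an assumption based on this card's naming, so substitute the actual repository.

```python
# Sketch: fetch a single quantization from the Hub (pip install huggingface_hub).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="fs90/Llama-3.2-1B-Instruct-bnb-4bit-lima-gguf",  # assumed repo id
    filename="Llama-3.2-1B-Instruct-bnb-4bit-lima-Q4_K_M.gguf",
)
print(path)  # local cache path, ready for llama.cpp or an Ollama Modelfile
```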
## License

This model is based on unsloth/Llama-3.2-1B-Instruct-bnb-4bit and fine-tuned on GAIR/lima.
Please refer to the original model and dataset licenses.
## Credits

**Trained by:** Farhan Syah

**Training pipeline:**
- [unsloth-finetuning](https://github.com/farhan-syah/unsloth-finetuning) by [@farhan-syah](https://github.com/farhan-syah)
- [Unsloth](https://github.com/unslothai/unsloth) - 2x faster LLM fine-tuning

**Base components:**
- Base model: [unsloth/Llama-3.2-1B-Instruct-bnb-4bit](https://huggingface.co/unsloth/Llama-3.2-1B-Instruct-bnb-4bit)
- Training dataset: [GAIR/lima](https://huggingface.co/datasets/GAIR/lima) by GAIR
|