---
base_model: unsloth/llama-3.2-3b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
- tl
datasets:
- Linggowiktiks/AnoNa
---

# 🦙 Liyama-3B

**Liyama-3B** is a fine-tuned version of Meta's Llama 3.2 3B Instruct model, built to understand and respond fluently in **Tagalog**. It was trained on the **AnoNa** dataset for **3 epochs**, aiming for natural, context-aware instruction-following in Filipino.

---

## 🔤 Origin of the Name
The name **Liyama** is a Tagalified version of *llama*, reflecting both its LLaMA base and its Tagalog-focused language capabilities. It mirrors how Filipino often adapts foreign terms into familiar, phonetic forms—like *camera → kamera*, *lion → leon*, and now, *llama → liyama*.

---

## 🧠 Training Data: The AnoNa Dataset

Liyama-3B was trained solely on **response completions** from the **AnoNa** dataset — a self-instruct corpus generated using **Gemini 1.5** and **2.0**.

Inspired by **SimpleQnA**, the dataset contains short, helpful instruction-response pairs. But **AnoNa** introduces several improvements:

- **Less English, more Tagalog** prompts
- **Less IFEVAL-style formatting**
- **No overuse of modifiers** in instructions
- **Balanced task types** to avoid dominant categories
- **Complex tasks favored** (65% complex / 35% simple)
- **Reduced sycophancy** and generic praise
- **Improved follow-up handling**
- **AI self-intro appears only when relevant**
- **Implicit chain-of-thought reasoning**, not explicitly labeled
- **Extra task types** added to increase variety

This focus creates a model that's practical, straightforward, and tuned for **realistic conversational use in Filipino**, without excessive formatting or irrelevant disclaimers.

---

## 🗣️ Use Case

Liyama-3B is ideal for:
- Answering questions in Tagalog
- Writing essays, reflections, and letters in Filipino
- Following natural instructions, even when mixed with English
- Chat-based tasks where fluency and tone matter
- Educational or community apps centered around local language use
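
The base Llama 3.2 Instruct checkpoints use the Llama 3 chat format, so a fine-tune like this one would normally be prompted the same way. As a minimal sketch of that format (in practice, load the released tokenizer and use `tokenizer.apply_chat_template`, which is the authoritative source of the template), a Tagalog prompt could be assembled like this:

```python
# Sketch of the Llama 3.x chat format used by the base model.
# For real inference, prefer:
#   tokenizer.apply_chat_template(messages, add_generation_prompt=True)
# rather than building the string by hand.

def build_prompt(messages):
    """Assemble a Llama 3-style prompt from [{'role': ..., 'content': ...}] dicts."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Open the assistant turn so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_prompt([
    {"role": "user", "content": "Ano ang kabisera ng Pilipinas?"}
])
print(prompt)
```

Since training mixed Tagalog and English instructions, code-switched prompts should work in the same format.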

---

## 📦 Model Details

| Feature             | Value                       |
|---------------------|-----------------------------|
| Base Model          | Llama 3.2 3B Instruct       |
| Fine-tuning Dataset | AnoNa                       |
| Epochs              | 3                           |
| Language Focus      | Tagalog (with some English) |
| Training Target     | Response completions only   |

---

Liyama-3B is part of a broader effort to create open, practical Filipino-language models for real use—not just benchmarks. Expect follow-ups tuned for multi-turn chat, reasoning, and creative tasks.