AitanaTA Model Card
AitanaTA-2b-Instruct is a translation LLM instruction-tuned from SalamandraTA-2b-Instructed, a model that itself results from training Salamandra-2b-base on parallel data. AitanaTA-2b-Instruct has been specifically instruction-tuned for translation between Spanish and Valencian, with a focus on sentence-level translation.
Disclaimer
This model has been developed and instruction-tuned specifically for translation between Spanish and Valencian. Its use outside of translation tasks is not recommended, as performance and reliability have not been evaluated for other natural language processing applications. The authors are not responsible for potential errors, misinterpretations, or inappropriate use of the model beyond its intended purpose.
Model Details
Model Description
- Developed by: the Language Processing and Information Systems (GPLSI) group and the Centro de Inteligencia Digital de la Provincia de Alicante
- Language(s) (NLP): Spanish and Valencian
- License: This model is released under the Apache 2.0 license, a permissive open-source license.
- Finetuned from model: SalamandraTA-2B-Instructed. The model follows the same instruction pattern as SalamandraTA-2B-Instructed; the only adaptation was focusing the instructions on translation between Valencian and Spanish, in both directions.
Training Details
Training Data
The data comes from L'Associació de Mitjans d'Informació i Comunicació (AMIC), from which we built a parallel sentence-level dataset. This dataset was specifically created to align Spanish and Valencian sentences, ensuring high-quality parallel examples for training and evaluation.
Language pairs:
- Valencian -> Spanish: 738,777 sentence pairs (mean_src_len = 165.6, mean_tgt_len = 168.6)
Training Hyperparameters
Training regime:
- epochs: 1
- learning_rate: 1e-5
- beta1: 0.9
- beta2: 0.99
- weight_decay: 0
- global_batch_size: 64
- micro_batch_size: 2
- log_interval: 5
- save_interval: 5
- lr_warmup_steps: 100
- max_seq_length: 2048
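For reference, the sketch below shows how these hyperparameters could map onto a standard PyTorch AdamW optimizer with linear warm-up. It is illustrative only: the actual training code is not reproduced in this card, and the post-warm-up schedule is an assumption.

```python
import torch
from transformers import get_constant_schedule_with_warmup

# Placeholder module standing in for the 2B-parameter model.
model = torch.nn.Linear(8, 8)

# Optimizer settings taken from the hyperparameter list above.
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-5,
    betas=(0.9, 0.99),
    weight_decay=0.0,
)

# lr_warmup_steps = 100; keeping the learning rate constant afterwards is an assumption.
scheduler = get_constant_schedule_with_warmup(optimizer, num_warmup_steps=100)

# global_batch_size = 64 with micro_batch_size = 2 implies 32 gradient-accumulation
# steps per optimizer update on a single device (fewer when split across GPUs).
accumulation_steps = 64 // 2
```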
How to use
You can translate between Spanish and Valencian in both directions. The instruction-following model uses the commonly adopted ChatML template:
<|im_start|>system
{SYSTEM PROMPT}<|im_end|>
<|im_start|>user
{USER PROMPT}<|im_end|>
<|im_start|>assistant
{MODEL RESPONSE}<|im_end|>
<|im_start|>user
[...]
The easiest way to apply it is by using the tokenizer's built-in functions, as shown in the following snippet.
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "gplsi/Aitana-TA-2B-S"

# Translation direction and input sentence
source = 'Spanish'
target = 'Valencian'
sentence = "La inteligencia artificial está transformando el mundo."
text = f"Translate the following text from {source} into {target}.\n{source}: {sentence} \n{target}:"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype=torch.bfloat16
)

# Build the ChatML prompt and append the generation prompt for the assistant turn
message = [{"role": "user", "content": text}]
prompt = tokenizer.apply_chat_template(
    message,
    tokenize=False,
    add_generation_prompt=True,
)

# Tokenize and generate the translation with beam search
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
input_length = inputs.shape[1]
outputs = model.generate(
    input_ids=inputs.to(model.device),
    max_new_tokens=400,
    early_stopping=True,
    num_beams=5,
)

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0, input_length:], skip_special_tokens=True))
# La intel·ligència artificial està transformant el món.
Using this template, each turn is preceded by the <|im_start|> delimiter and the role of the entity (user for content supplied by the user, or assistant for the LLM's responses), and ends with the <|im_end|> token.
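Because the model was tuned in both directions, the same snippet can be used for Valencian to Spanish by swapping the source and target; the expected output shown in the comment is illustrative.

```python
# Reverse direction: Valencian -> Spanish (the rest of the pipeline is unchanged).
source = 'Valencian'
target = 'Spanish'
sentence = "La intel·ligència artificial està transformant el món."
text = f"Translate the following text from {source} into {target}.\n{source}: {sentence} \n{target}:"
# Expected translation (illustrative): La inteligencia artificial está transformando el mundo.
```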
Evaluation
Testing Data
Our objective was to evaluate translation between Spanish and Valencian. To test the model, we used the Phrases task from IberoBench.
Metrics
For evaluation, we relied on a set of standard machine-translation metrics: COMET, BLEU, ChrF, and TER (for TER, lower is better).
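As an illustration, BLEU, ChrF, and TER can be computed with the sacrebleu library; whether this exact implementation was used for the scores below is not stated in this card, and COMET (which requires Unbabel's separate comet package) is omitted here.

```python
import sacrebleu

# Toy example: one system output (hypothesis) scored against one reference translation.
hypotheses = ["La intel·ligència artificial està transformant el món."]
references = [["La intel·ligència artificial està transformant el món."]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)
ter = sacrebleu.corpus_ter(hypotheses, references)

print(f"BLEU={bleu.score:.2f}  ChrF={chrf.score:.2f}  TER={ter.score:.2f}")
```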
Results on Phrases tasks
| Task | Metric | Aitana-TA-2B-S | SalamandraTA-2B-instruct |
|---|---|---|---|
| phrases_es-va | BLEU | 63.35 | 62.40 |
| phrases_va-es | BLEU | 81.15 | 75.49 |
| phrases_va-ca | BLEU | 79.24 | 82.07 |
| phrases_ca-va | BLEU | 79.95 | 76.53 |
| phrases_es-va | ChrF | 83.32 | 82.18 |
| phrases_va-es | ChrF | 91.03 | 88.15 |
| phrases_va-ca | ChrF | 90.74 | 91.43 |
| phrases_ca-va | ChrF | 91.61 | 89.57 |
| phrases_es-va | COMET | 0.93 | 0.93 |
| phrases_va-es | COMET | 0.95 | 0.95 |
| phrases_va-ca | COMET | 0.95 | 0.96 |
| phrases_ca-va | COMET | 0.96 | 0.95 |
| phrases_es-va | TER | 22.45 | 23.26 |
| phrases_va-es | TER | 11.21 | 15.06 |
| phrases_va-ca | TER | 10.11 | 10.68 |
| phrases_ca-va | TER | 10.54 | 13.67 |
Technical Specifications
Model Architecture and Objective
The same as the base model.
Hardware and Software
For training, we used custom code developed on top of Lightning Fabric (part of the PyTorch Lightning framework), enabling model sharding across multiple GPUs through Fully Sharded Data Parallel (FSDP).
Compute Infrastructure
This model was trained on NVIDIA DGX systems equipped with A100 GPUs, which enabled efficient large-scale training. For this model, we used four A100 GPUs.
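A minimal sketch of what such a setup can look like with Lightning Fabric and FSDP is shown below; it is illustrative only, as the actual training code is not published in this card.

```python
import torch
import lightning as L
from lightning.fabric.strategies import FSDPStrategy

# Illustrative sketch: shard a model across 4 GPUs with Fully Sharded Data Parallel.
fabric = L.Fabric(
    accelerator="cuda",
    devices=4,                 # one process per A100 GPU
    precision="bf16-mixed",    # assumption, matching the bf16 dtype used at inference
    strategy=FSDPStrategy(),
)
fabric.launch()

model = torch.nn.Linear(8, 8)               # placeholder; the real model is the 2B-parameter LLM
model = fabric.setup_module(model)          # wraps the model with FSDP and moves it to the GPUs
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
optimizer = fabric.setup_optimizers(optimizer)
```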
Funding
This work is funded by the Ministerio para la Transformación Digital y de la Función Pública, co-financed by the EU – NextGenerationEU, within the framework of the project Desarrollo de Modelos ALIA.
Citation
If you use this model in your research or work, please cite it as follows:
@misc{gplsi-aitana-ta-2b-s,
author = {Sepúlveda-Torres, Robiert and Galeano, Santiago and Gutiérrez, Yoan},
title = {Aitana-TA-2B-S: Translation model from Spanish to Valencian},
year = {2025},
howpublished = {\url{https://huggingface.co/gplsi/Aitana-TA-2B-S}},
note = {Accessed: 2025-10-03}
}