datasets:
  - custom
library_name: onmt
model-index:
  - name: Malayalam to Hindi Translation
    results:
      - task:
          name: Translation
          type: translation
        dataset:
          name: Custom Hindi-Malayalam Parallel Corpus
          type: translation
        metrics:
          - name: BLEU
            type: bleu
            value: 35.5
          - name: COMET
            type: comet
            value: 0.582


🇮🇳 Malayalam to Hindi Translation Model (OpenNMT)

This is a Neural Machine Translation (NMT) model trained to translate Malayalam (ml) to Hindi (hi) using the OpenNMT framework. It was trained on a custom-curated, low-resource parallel corpus.

Model Architecture

- Framework: **OpenNMT (PyTorch)**
- Architecture: **Transformer**
- Type: **Sequence-to-sequence**
- Layers: 6 encoder / 6 decoder
- Embedding size: 512
- FFN size: 2048
- Attention heads: 8
- Positional encoding: sinusoidal
- Tokenizer: SentencePiece (trained jointly on hi-ml)
- Vocabulary size: 32,000 (joint BPE)
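
To make the "sinusoidal" positional encoding entry concrete, here is a minimal pure-Python sketch of the standard Transformer formulation, using the 512-dimensional embedding size listed above. The function name `sinusoidal_position` and the interleaved sin/cos layout are illustrative assumptions, not taken from this model's training code.

```python
import math

D_MODEL = 512  # embedding size from the architecture list


def sinusoidal_position(pos, d_model=D_MODEL):
    """Return the positional encoding vector for one position.

    Even indices hold sin(pos / 10000^(i / d_model)) and odd indices hold
    the matching cosine, where i is the even index of the pair.
    (Hypothetical helper for illustration only.)
    """
    vec = []
    for i in range(0, d_model, 2):
        angle = pos / (10000 ** (i / d_model))
        vec.append(math.sin(angle))
        vec.append(math.cos(angle))
    return vec


first = sinusoidal_position(0)
second = sinusoidal_position(1)
print(len(first), first[:2], second[:2])
```

Because the encoding is a fixed function of the position, it adds no trainable parameters to the model.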


Evaluation

The model was evaluated on a manually annotated Hindi-Malayalam test set consisting of 10,000 sentence pairs.

| Metric | Score |
|--------|-------|
| BLEU   | 35.5  |
| COMET  | 0.582 |
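
For readers unfamiliar with BLEU, the sketch below shows its core ingredient, clipped (modified) n-gram precision, on a toy sentence pair. It is not the full metric (there is no brevity penalty and no averaging over n-gram orders), and it is not the script used to produce the scores above; the helper names and the toy sentences are assumptions for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(hypothesis, reference, n):
    """Clipped n-gram precision of a hypothesis against one reference.

    Each hypothesis n-gram is credited at most as many times as it
    occurs in the reference. (Hypothetical helper for illustration.)
    """
    hyp_counts = Counter(ngrams(hypothesis, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(hyp_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
    return clipped / total


# Toy example: repeating "the" is not rewarded beyond its reference count.
hyp = "the the the cat".split()
ref = "the cat sat".split()
print(modified_precision(hyp, ref, 1))  # 2 clipped matches out of 4 unigrams
print(modified_precision(hyp, ref, 2))  # 1 clipped match out of 3 bigrams
```

Corpus-level BLEU combines these precisions for n = 1 to 4 with a brevity penalty; an established tool should be used when reproducing reported scores.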

Usage

In the CLI


onmt_translate \
    -model model.tm_best_checkpoint.pt \
    -src input.txt \
    -output output.txt \
    -replace_unk \
    -verbose \
    -gpu -1 \
    -min_length 1
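
The same invocation can be scripted from Python. The sketch below only assembles the flags shown above and passes them to the `onmt_translate` executable via `subprocess`; the helper names `build_translate_command` and `translate_file` are hypothetical, and it assumes OpenNMT-py is installed so that `onmt_translate` is on the PATH.

```python
import shutil
import subprocess


def build_translate_command(model_path, src_path, output_path, gpu=-1):
    """Return the onmt_translate argument list with the flags used above."""
    return [
        "onmt_translate",
        "-model", model_path,
        "-src", src_path,
        "-output", output_path,
        "-replace_unk",
        "-verbose",
        "-gpu", str(gpu),  # -1 runs on CPU, as in the command above
        "-min_length", "1",
    ]


def translate_file(model_path, src_path, output_path, gpu=-1):
    """Run the translation, failing early if the CLI is not installed."""
    if shutil.which("onmt_translate") is None:
        raise RuntimeError("onmt_translate not found on PATH")
    command = build_translate_command(model_path, src_path, output_path, gpu)
    subprocess.run(command, check=True)


# Inspect the command without running it.
print(" ".join(build_translate_command(
    "model.tm_best_checkpoint.pt", "input.txt", "output.txt")))
```

Passing the arguments as a list avoids shell quoting problems with file paths that contain spaces.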

Dataset

This model was trained on a custom dataset compiled from:

* The AI4Bharat IndicTrans repository (https://github.com/AI4Bharat/IndicTrans)
* Manually aligned Malayalam-Hindi sentences from news and educational data