Rgveda Embedding Model - Optimized for Deployment

This repository contains the rgveda-embedding-gemma model optimized for deployment.

It is based on Ganaraj/rgveda-embedding-gemma, a fine-tuned embedding model for Sanskrit/Devanagari text from the Rigveda.

📋 ONNX Format Available

This repository includes ONNX model files!

Due to limitations in exporting the Gemma3TextModel architecture, this repo uses a hybrid approach:

  • Base transformer: ONNX format (onnx/model.onnx + onnx/model.onnx_data) from onnx-community/embeddinggemma-300m-ONNX
  • Fine-tuning: Rigveda-specific dense layer weights (weights/dense1_weight.npy, weights/dense2_weight.npy)
  • Inference: Combines ONNX Runtime for the transformer with NumPy for the fine-tuned layers (see the loading sketch below)

This provides:

  • ✅ ONNX compatibility (uses ONNX Runtime)
  • ✅ Rigveda-specific fine-tuning (dense layer weights)
  • ✅ Production-ready deployment
  • ✅ Standard repository structure
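
For orientation, here is a minimal sketch of how the three pieces can be loaded. File paths follow the Repository Structure section; inference_onnx.py packages this logic, so the variable names here are illustrative only:

import numpy as np
import onnxruntime as ort
from transformers import AutoTokenizer

# Base transformer: ONNX graph plus external weights (model.onnx_data is
# picked up automatically from the same directory)
session = ort.InferenceSession("onnx/model.onnx")

# Fine-tuned dense layer weights, stored in (out_features, in_features) layout
dense1 = np.load("weights/dense1_weight.npy")  # (3072, 768)
dense2 = np.load("weights/dense2_weight.npy")  # (768, 3072)

# Tokenizer files live in the repo root
tokenizer = AutoTokenizer.from_pretrained(".")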

Model Information

  • Base Model: google/embeddinggemma-300m
  • Fine-tuned for: Rigveda text embedding and retrieval
  • Languages: Sanskrit (Devanagari script)
  • Embedding Dimension: 768
  • Max Sequence Length: 2048 tokens

Model Architecture

1. Transformer (Gemma3TextModel) - 300M parameters
2. Pooling (mean pooling with attention mask)
3. Dense Layer 1: 768 → 3072 (no bias)
4. Dense Layer 2: 3072 → 768 (no bias)  
5. L2 Normalization
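
A minimal NumPy sketch of steps 2–5, assuming the ONNX transformer has already produced per-token hidden states of shape (batch, seq_len, 768). The shipped implementation lives in inference_onnx.py; this is only an illustration of the math:

import numpy as np

def post_transformer(last_hidden_state, attention_mask, dense1, dense2):
    # Step 2: mean pooling over real (non-padding) tokens only
    mask = attention_mask[:, :, None].astype(last_hidden_state.dtype)   # (batch, seq, 1)
    pooled = (last_hidden_state * mask).sum(axis=1) / mask.sum(axis=1)  # (batch, 768)
    # Steps 3-4: fine-tuned dense layers, weights stored as (out, in), no bias
    x = pooled @ dense1.T                                               # (batch, 3072)
    x = x @ dense2.T                                                    # (batch, 768)
    # Step 5: L2-normalize so dot products are cosine similarities
    return x / np.linalg.norm(x, axis=1, keepdims=True)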

Installation

# For ONNX inference (recommended)
pip install onnxruntime transformers numpy

# torch is only needed for the PyTorch alternative (inference.py)
pip install torch

Usage

ONNX Inference (Recommended)

from inference_onnx import RgvedaEmbeddingONNXHybrid

# Initialize from this repository's root (ONNX model, dense weights, tokenizer)
model = RgvedaEmbeddingONNXHybrid(".")

# Task-specific prefixes (see Prompt Instructions below)
prefixes = {
    "query": "task: search result | query: ",
    "document": "title: none | text: ",
}

query = prefixes["query"] + "वृष्टि-विद्युत्-सदृशं दैविकं आगमनम्"
documents = [
    prefixes["document"] + "असामि हि प्रयज्यवः",
    prefixes["document"] + "उत द्वार उशतीर् वि श्रयन्ताम्",
]

# Get embeddings
query_emb = model.encode(query)
doc_embs = model.encode(documents)

# Compute similarity
similarities = query_emb @ doc_embs.T
print(similarities)
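
Because the embeddings are L2-normalized (step 5 of the architecture), the dot product above is exactly the cosine similarity; higher scores indicate closer semantic matches.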

Prompt Instructions

Use these prefixes for optimal performance:

Use Case              Prefix
--------------------  -----------------------------------------
Search Query          task: search result | query: {text}
Document/Passage      title: none | text: {text}
Question Answering    task: question answering | query: {text}
Classification        task: classification | query: {text}
Semantic Similarity   task: sentence similarity | query: {text}
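
If you want to switch between use cases programmatically, a small wrapper like the one below works. PREFIXES and encode_for are illustrative names, not part of the shipped scripts:

PREFIXES = {
    "search": "task: search result | query: ",
    "document": "title: none | text: ",
    "qa": "task: question answering | query: ",
    "classification": "task: classification | query: ",
    "similarity": "task: sentence similarity | query: ",
}

def encode_for(model, texts, use_case="document"):
    # Prepend the task-specific prefix from the table above before encoding
    return model.encode([PREFIXES[use_case] + t for t in texts])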

Repository Structure

.
├── onnx/
│   ├── model.onnx              # ONNX model graph (469 KB)
│   └── model.onnx_data         # ONNX model weights (1.1 GB)
├── weights/
│   ├── dense1_weight.npy       # Fine-tuned dense layer 1 (3072×768)
│   └── dense2_weight.npy       # Fine-tuned dense layer 2 (768×3072)
├── inference_onnx.py           # ONNX inference script (recommended)
├── inference.py                # PyTorch inference script (alternative)
├── tokenizer.json              # Tokenizer vocabulary
├── tokenizer_config.json       # Tokenizer settings
├── special_tokens_map.json     # Special tokens
└── README.md                   # This file

Performance

The model achieves a cosine accuracy of 0.9553 on the test set. It was trained on 51,368 samples and is optimized for Sanskrit/Rigveda text retrieval.

Citation

Original Model

@misc{ganaraj2024rgveda,
  author = {Ganaraj},
  title = {rgveda-embedding-gemma},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/Ganaraj/rgveda-embedding-gemma}
}

Base Model

@misc{embeddinggemma,
  title = {EmbeddingGemma},
  author = {Google DeepMind},
  year = {2024},
  publisher = {Hugging Face},
  url = {https://huggingface.co/google/embeddinggemma-300m}
}

License

This model inherits the Gemma license from the base model. Please refer to the Gemma Terms of Use.

Acknowledgments

  • Base model: google/embeddinggemma-300m
  • Fine-tuning: Ganaraj
  • Conversion: Packaged for deployment with ONNX Runtime and PyTorch compatibility