# Core implementation of Jina XLM-RoBERTa
This implementation is adapted from [XLM-RoBERTa](https://huggingface.co/docs/transformers/en/model_doc/xlm-roberta). In contrast to the original implementation, this model uses rotary position embeddings (RoPE) instead of absolute position embeddings and supports FlashAttention 2.
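Since the modeling code lives in the model repository rather than in `transformers` itself, loading it requires `trust_remote_code=True`. Below is a minimal sketch; the repo ID is a placeholder (see the model list below), and the flash-attention kernels assume a CUDA device with fp16/bf16 weights:

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder repo ID: substitute a model that uses this implementation.
model_id = "organization/model-name"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True,     # the modeling code ships with the model repo
    torch_dtype=torch.float16,  # flash-attention kernels run in fp16/bf16
).to("cuda")                    # FlashAttention 2 requires a CUDA device

inputs = tokenizer("Hello, world!", return_tensors="pt").to("cuda")
with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state
```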
### Models that use this implementation
To be added soon.
### Converting weights
Weights from an [original XLM-RoBERTa model](https://huggingface.co/FacebookAI/xlm-roberta-large) can be converted with the `convert_roberta_weights_to_flash.py` script in the model repository.
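The script's exact interface is repository-specific, but conceptually a conversion of this kind renames parameters and packs the checkpoint's separate query/key/value projections into the fused `Wqkv` layout that flash-attention blocks expect. The sketch below illustrates that remapping; the parameter names on both sides are illustrative assumptions, not the script's actual mapping:

```python
import torch

def fuse_qkv(state_dict: dict, num_layers: int) -> dict:
    """Illustrative remapping: pack per-layer q/k/v projections into one Wqkv.

    Key names are hypothetical; consult the conversion script for the real ones.
    """
    converted = {}
    for i in range(num_layers):
        src = f"roberta.encoder.layer.{i}.attention.self"
        dst = f"encoder.layers.{i}.mixer.Wqkv"
        # Concatenate the three (hidden, hidden) weight matrices into (3*hidden, hidden).
        converted[f"{dst}.weight"] = torch.cat(
            [state_dict.pop(f"{src}.{p}.weight") for p in ("query", "key", "value")],
            dim=0,
        )
        # Biases are packed the same way into a single (3*hidden,) vector.
        converted[f"{dst}.bias"] = torch.cat(
            [state_dict.pop(f"{src}.{p}.bias") for p in ("query", "key", "value")],
            dim=0,
        )
    return converted
```

Note that because this implementation uses RoPE, the original checkpoint's absolute position embeddings are simply not carried over; rotary encodings are computed on the fly and have no learned weights.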