SentenceTransformer based on cointegrated/LaBSE-en-ru
This is a sentence-transformers model finetuned from cointegrated/LaBSE-en-ru. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: cointegrated/LaBSE-en-ru
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 768, 'bias': True, 'activation_function': 'torch.nn.modules.activation.Tanh'})
  (3): Normalize()
)
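The stages after the BertModel encoder can be pictured with a small pure-Python sketch: take the CLS (first) token vector, pass it through a dense layer with a tanh activation, then scale the result to unit length. The toy 2-dimensional vectors and the identity weight matrix are illustrative assumptions, not the model's learned parameters, and the helper names are hypothetical.

```python
import math

def cls_pool(token_embeddings):
    """Pooling with pooling_mode_cls_token=True: keep the first token's vector."""
    return token_embeddings[0]

def dense_tanh(vector, weight, bias):
    """Dense layer with Tanh activation: tanh(W @ x + b), one output per weight row."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, vector)) + b)
        for row, b in zip(weight, bias)
    ]

def l2_normalize(vector):
    """Normalize(): scale the vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

# Toy stand-in for the encoder output: 3 tokens, 2 dimensions (the real model uses 768).
token_embeddings = [[0.5, -1.0], [0.1, 0.2], [0.3, 0.4]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical identity weights
bias = [0.0, 0.0]

embedding = l2_normalize(dense_tanh(cls_pool(token_embeddings), weight, bias))
print(embedding)
```

Because the last stage normalizes every embedding, cosine similarity between two outputs reduces to a plain dot product.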
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Solomennikova/labse_funetuned_for_categories")

# Run inference
sentences = [
    'набор мебель для прихожая',  # query, roughly "hallway furniture set"
    'Тумбы для обуви',            # category, "Shoe cabinets"
    'Туалетные столики',          # category, "Dressing tables"
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
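Since the training pairs map a product phrase to a category name, a likely use of the similarity scores is picking the closest category for a query. The sketch below shows that ranking step on its own, using hypothetical unit-length 2-dimensional vectors in place of real `model.encode(...)` output. The function names and the English category labels are illustrative assumptions.

```python
def dot(u, v):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(a * b for a, b in zip(u, v))

def rank_categories(query_vec, category_vecs):
    """Return (category, score) pairs, highest similarity first."""
    scores = {name: dot(query_vec, vec) for name, vec in category_vecs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical unit-length vectors standing in for model.encode(...) output.
query = [0.6, 0.8]
categories = {
    "shoe cabinets": [0.8, 0.6],
    "dressing tables": [1.0, 0.0],
}

ranking = rank_categories(query, categories)
print(ranking[0][0])
```

With real embeddings, the same ranking can be read off one row of the matrix returned by `model.similarity`.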
Training Details
Training Dataset
Unnamed Dataset
- Size: 164,318 training samples
- Columns: sentence_0 and sentence_1
- Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 |
|---|---|---|
| type | string | string |
| details | min: 3 tokens, mean: 5.77 tokens, max: 17 tokens | min: 3 tokens, mean: 6.82 tokens, max: 14 tokens |
- Samples:
| sentence_0 | sentence_1 |
|---|---|
| матрасный основание | Основания для кроватей |
| простыня сатина | Наволочки |
| costa | Тумбы |
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
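To make the loss settings concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective with `scale = 20.0` and cosine similarity. For each anchor, the paired sentence is the correct answer and every other second-column sentence in the batch acts as a negative; the loss is the cross-entropy over the scaled similarities. This is an illustration of the idea under those assumptions, not the library's implementation, and the toy vectors are made up.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch is treated as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # each positive points toward its own anchor
swapped = [[0.1, 0.9], [0.9, 0.1]]   # each positive points toward the other anchor

print(mnr_loss(anchors, aligned))
print(mnr_loss(anchors, swapped))
```

The aligned batch gives a loss near zero and the swapped batch a large one, which is the signal that pulls matching pairs together during fine-tuning.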
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- num_train_epochs: 50
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 64
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 50
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
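With `lr_scheduler_type: linear`, `learning_rate: 5e-05`, and `warmup_steps: 0`, the learning rate starts at its peak and decays linearly to zero over training. The sketch below writes that schedule out as a plain function. The total of 128,400 steps is an assumption derived from ceil(164,318 / 64) = 2,568 steps per epoch over 50 epochs on a single device, and `linear_lr` is a hypothetical helper name.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate under a linear schedule: ramp up during warmup, then decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

total_steps = 128_400  # assumed: 2,568 steps per epoch x 50 epochs

for step in (0, 64_200, 128_400):
    print(step, linear_lr(step, total_steps))
```

Under these assumptions the rate is half its initial value at the midpoint of training and reaches zero at the final step.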
Training Logs
| Epoch | Step | Training Loss |
|---|---|---|
| 0.1947 | 500 | 4.5321 |
| 0.3894 | 1000 | 3.8501 |
| 0.5841 | 1500 | 3.6982 |
| 0.7788 | 2000 | 3.5882 |
| 0.9735 | 2500 | 3.5182 |
| 1.1682 | 3000 | 3.4484 |
| 1.3629 | 3500 | 3.4114 |
| 1.5576 | 4000 | 3.3743 |
| 1.7523 | 4500 | 3.3427 |
| 1.9470 | 5000 | 3.3218 |
| 2.1417 | 5500 | 3.2676 |
| 2.3364 | 6000 | 3.2554 |
| 2.5312 | 6500 | 3.2181 |
| 2.7259 | 7000 | 3.2045 |
| 2.9206 | 7500 | 3.1862 |
| 3.1153 | 8000 | 3.1527 |
| 3.3100 | 8500 | 3.0955 |
| 3.5047 | 9000 | 3.0996 |
| 3.6994 | 9500 | 3.0927 |
| 3.8941 | 10000 | 3.0835 |
| 4.0888 | 10500 | 3.0255 |
| 4.2835 | 11000 | 2.9838 |
| 4.4782 | 11500 | 2.9863 |
| 4.6729 | 12000 | 2.9808 |
| 4.8676 | 12500 | 2.9844 |
| 5.0623 | 13000 | 2.9292 |
| 5.2570 | 13500 | 2.8713 |
| 5.4517 | 14000 | 2.8853 |
| 5.6464 | 14500 | 2.8795 |
| 5.8411 | 15000 | 2.8686 |
| 6.0358 | 15500 | 2.8367 |
| 6.2305 | 16000 | 2.7667 |
| 6.4252 | 16500 | 2.7688 |
| 6.6199 | 17000 | 2.7618 |
| 6.8146 | 17500 | 2.7709 |
| 7.0093 | 18000 | 2.7628 |
| 7.2040 | 18500 | 2.6531 |
| 7.3988 | 19000 | 2.6788 |
| 7.5935 | 19500 | 2.6685 |
| 7.7882 | 20000 | 2.7033 |
| 7.9829 | 20500 | 2.6823 |
| 8.1776 | 21000 | 2.5856 |
| 8.3723 | 21500 | 2.5718 |
| 8.5670 | 22000 | 2.616 |
| 8.7617 | 22500 | 2.6025 |
| 8.9564 | 23000 | 2.6056 |
| 9.1511 | 23500 | 2.5101 |
| 9.3458 | 24000 | 2.5131 |
| 9.5405 | 24500 | 2.5113 |
| 9.7352 | 25000 | 2.5394 |
| 9.9299 | 25500 | 2.5543 |
| 10.1246 | 26000 | 2.4642 |
| 10.3193 | 26500 | 2.4361 |
| 10.5140 | 27000 | 2.4512 |
| 10.7087 | 27500 | 2.4664 |
| 10.9034 | 28000 | 2.475 |
| 11.0981 | 28500 | 2.4116 |
| 11.2928 | 29000 | 2.367 |
| 11.4875 | 29500 | 2.3674 |
| 11.6822 | 30000 | 2.4078 |
| 11.8769 | 30500 | 2.4239 |
| 12.0717 | 31000 | 2.3825 |
| 12.2664 | 31500 | 2.3082 |
| 12.4611 | 32000 | 2.3397 |
| 12.6558 | 32500 | 2.3281 |
| 12.8505 | 33000 | 2.3602 |
| 13.0452 | 33500 | 2.3268 |
| 13.2399 | 34000 | 2.2552 |
| 13.4346 | 34500 | 2.2549 |
| 13.6293 | 35000 | 2.2813 |
| 13.8240 | 35500 | 2.3085 |
| 14.0187 | 36000 | 2.2883 |
| 14.2134 | 36500 | 2.2031 |
| 14.4081 | 37000 | 2.2178 |
| 14.6028 | 37500 | 2.2312 |
| 14.7975 | 38000 | 2.2357 |
| 14.9922 | 38500 | 2.2585 |
| 15.1869 | 39000 | 2.1408 |
| 15.3816 | 39500 | 2.1626 |
| 15.5763 | 40000 | 2.1845 |
| 15.7710 | 40500 | 2.2172 |
| 15.9657 | 41000 | 2.2133 |
| 16.1604 | 41500 | 2.1009 |
| 16.3551 | 42000 | 2.1331 |
| 16.5498 | 42500 | 2.1417 |
| 16.7445 | 43000 | 2.1469 |
| 16.9393 | 43500 | 2.1676 |
| 17.1340 | 44000 | 2.0622 |
| 17.3287 | 44500 | 2.0603 |
| 17.5234 | 45000 | 2.0909 |
| 17.7181 | 45500 | 2.1163 |
| 17.9128 | 46000 | 2.131 |
| 18.1075 | 46500 | 2.059 |
| 18.3022 | 47000 | 2.024 |
| 18.4969 | 47500 | 2.0563 |
| 18.6916 | 48000 | 2.0669 |
| 18.8863 | 48500 | 2.087 |
| 19.0810 | 49000 | 2.0452 |
| 19.2757 | 49500 | 1.9731 |
| 19.4704 | 50000 | 2.0031 |
| 19.6651 | 50500 | 2.0318 |
| 19.8598 | 51000 | 2.0514 |
| 20.0545 | 51500 | 2.0381 |
| 20.2492 | 52000 | 1.9449 |
| 20.4439 | 52500 | 1.9689 |
| 20.6386 | 53000 | 1.9848 |
| 20.8333 | 53500 | 2.0179 |
| 21.0280 | 54000 | 1.9892 |
| 21.2227 | 54500 | 1.8909 |
| 21.4174 | 55000 | 1.942 |
| 21.6121 | 55500 | 1.9603 |
| 21.8069 | 56000 | 1.9785 |
| 22.0016 | 56500 | 2.0078 |
| 22.1963 | 57000 | 1.882 |
| 22.3910 | 57500 | 1.9084 |
| 22.5857 | 58000 | 1.9256 |
| 22.7804 | 58500 | 1.9274 |
| 22.9751 | 59000 | 1.9576 |
| 23.1698 | 59500 | 1.8427 |
| 23.3645 | 60000 | 1.8742 |
| 23.5592 | 60500 | 1.894 |
| 23.7539 | 61000 | 1.9073 |
| 23.9486 | 61500 | 1.9407 |
| 24.1433 | 62000 | 1.8524 |
| 24.3380 | 62500 | 1.8412 |
| 24.5327 | 63000 | 1.8768 |
| 24.7274 | 63500 | 1.8663 |
| 24.9221 | 64000 | 1.9083 |
| 25.1168 | 64500 | 1.8294 |
| 25.3115 | 65000 | 1.8095 |
| 25.5062 | 65500 | 1.8445 |
| 25.7009 | 66000 | 1.8411 |
| 25.8956 | 66500 | 1.8734 |
| 26.0903 | 67000 | 1.8253 |
| 26.2850 | 67500 | 1.782 |
| 26.4798 | 68000 | 1.8062 |
| 26.6745 | 68500 | 1.8333 |
| 26.8692 | 69000 | 1.8488 |
| 27.0639 | 69500 | 1.8223 |
| 27.2586 | 70000 | 1.7619 |
| 27.4533 | 70500 | 1.7874 |
| 27.6480 | 71000 | 1.8049 |
| 27.8427 | 71500 | 1.8165 |
| 28.0374 | 72000 | 1.8073 |
| 28.2321 | 72500 | 1.735 |
| 28.4268 | 73000 | 1.7548 |
| 28.6215 | 73500 | 1.7831 |
| 28.8162 | 74000 | 1.7963 |
| 29.0109 | 74500 | 1.8057 |
| 29.2056 | 75000 | 1.7101 |
| 29.4003 | 75500 | 1.7343 |
| 29.5950 | 76000 | 1.7544 |
| 29.7897 | 76500 | 1.7583 |
| 29.9844 | 77000 | 1.8093 |
| 30.1791 | 77500 | 1.6939 |
| 30.3738 | 78000 | 1.7245 |
| 30.5685 | 78500 | 1.7235 |
| 30.7632 | 79000 | 1.7489 |
| 30.9579 | 79500 | 1.7696 |
| 31.1526 | 80000 | 1.7008 |
| 31.3474 | 80500 | 1.6873 |
| 31.5421 | 81000 | 1.7093 |
| 31.7368 | 81500 | 1.7317 |
| 31.9315 | 82000 | 1.7503 |
| 32.1262 | 82500 | 1.6979 |
| 32.3209 | 83000 | 1.6945 |
| 32.5156 | 83500 | 1.6963 |
| 32.7103 | 84000 | 1.7047 |
| 32.9050 | 84500 | 1.7119 |
| 33.0997 | 85000 | 1.6779 |
| 33.2944 | 85500 | 1.6628 |
| 33.4891 | 86000 | 1.6773 |
| 33.6838 | 86500 | 1.6851 |
| 33.8785 | 87000 | 1.7201 |
| 34.0732 | 87500 | 1.6765 |
| 34.2679 | 88000 | 1.6453 |
| 34.4626 | 88500 | 1.6501 |
| 34.6573 | 89000 | 1.665 |
| 34.8520 | 89500 | 1.7058 |
| 35.0467 | 90000 | 1.6666 |
| 35.2414 | 90500 | 1.6337 |
| 35.4361 | 91000 | 1.6371 |
| 35.6308 | 91500 | 1.6644 |
| 35.8255 | 92000 | 1.6585 |
| 36.0202 | 92500 | 1.6702 |
| 36.2150 | 93000 | 1.615 |
| 36.4097 | 93500 | 1.6217 |
| 36.6044 | 94000 | 1.6447 |
| 36.7991 | 94500 | 1.6542 |
| 36.9938 | 95000 | 1.6621 |
| 37.1885 | 95500 | 1.602 |
| 37.3832 | 96000 | 1.615 |
| 37.5779 | 96500 | 1.6211 |
| 37.7726 | 97000 | 1.6405 |
| 37.9673 | 97500 | 1.6465 |
| 38.1620 | 98000 | 1.596 |
| 38.3567 | 98500 | 1.5918 |
| 38.5514 | 99000 | 1.6215 |
| 38.7461 | 99500 | 1.6223 |
| 38.9408 | 100000 | 1.619 |
| 39.1355 | 100500 | 1.6038 |
| 39.3302 | 101000 | 1.5901 |
| 39.5249 | 101500 | 1.5883 |
| 39.7196 | 102000 | 1.6072 |
| 39.9143 | 102500 | 1.6249 |
| 40.1090 | 103000 | 1.5904 |
| 40.3037 | 103500 | 1.5753 |
| 40.4984 | 104000 | 1.5932 |
| 40.6931 | 104500 | 1.5997 |
| 40.8879 | 105000 | 1.5997 |
| 41.0826 | 105500 | 1.5821 |
| 41.2773 | 106000 | 1.5626 |
| 41.4720 | 106500 | 1.5698 |
| 41.6667 | 107000 | 1.5781 |
| 41.8614 | 107500 | 1.5862 |
| 42.0561 | 108000 | 1.5791 |
| 42.2508 | 108500 | 1.5514 |
| 42.4455 | 109000 | 1.565 |
| 42.6402 | 109500 | 1.5763 |
| 42.8349 | 110000 | 1.5841 |
| 43.0296 | 110500 | 1.574 |
| 43.2243 | 111000 | 1.5506 |
| 43.4190 | 111500 | 1.5595 |
| 43.6137 | 112000 | 1.5541 |
| 43.8084 | 112500 | 1.5591 |
| 44.0031 | 113000 | 1.5731 |
| 44.1978 | 113500 | 1.5371 |
| 44.3925 | 114000 | 1.5544 |
| 44.5872 | 114500 | 1.5474 |
| 44.7819 | 115000 | 1.5525 |
| 44.9766 | 115500 | 1.5662 |
| 45.1713 | 116000 | 1.5315 |
| 45.3660 | 116500 | 1.5306 |
| 45.5607 | 117000 | 1.5329 |
| 45.7555 | 117500 | 1.5393 |
| 45.9502 | 118000 | 1.5575 |
| 46.1449 | 118500 | 1.5309 |
| 46.3396 | 119000 | 1.5162 |
| 46.5343 | 119500 | 1.536 |
| 46.7290 | 120000 | 1.5325 |
| 46.9237 | 120500 | 1.5441 |
| 47.1184 | 121000 | 1.5203 |
| 47.3131 | 121500 | 1.5223 |
| 47.5078 | 122000 | 1.5289 |
| 47.7025 | 122500 | 1.5294 |
| 47.8972 | 123000 | 1.5277 |
| 48.0919 | 123500 | 1.5243 |
| 48.2866 | 124000 | 1.5164 |
| 48.4813 | 124500 | 1.5233 |
| 48.6760 | 125000 | 1.5129 |
| 48.8707 | 125500 | 1.5154 |
| 49.0654 | 126000 | 1.5148 |
| 49.2601 | 126500 | 1.5144 |
| 49.4548 | 127000 | 1.4996 |
| 49.6495 | 127500 | 1.51 |
| 49.8442 | 128000 | 1.5164 |
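The Epoch column in the table above follows from the dataset size and batch size. A short calculation, assuming a single training device (so the effective batch size is 64) and using `dataloader_drop_last: False` to round the last partial batch up, reproduces the logged values:

```python
import math

train_samples = 164_318
batch_size = 64          # per_device_train_batch_size, assuming one device
num_epochs = 50

# dataloader_drop_last is False, so the final partial batch still counts as a step.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * num_epochs

def epoch_at(step):
    """Fractional epoch reached after the given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)

print(steps_per_epoch, total_steps)
print(epoch_at(500), epoch_at(128_000))
```

This gives 2,568 steps per epoch and 128,400 steps in total, so the last logged row (step 128,000) falls 400 steps before the end of training.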
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 4.0.1
- Transformers: 4.48.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.2
- Datasets: 3.4.1
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}