dranreb1660 committed on
Commit
c812d84
·
1 Parent(s): 8a3e163

Add BGE Cross-Encoder reranker v1.1

README.md ADDED
@@ -0,0 +1,324 @@
1
+ ---
2
+ tags:
3
+ - sentence-transformers
4
+ - cross-encoder
5
+ - generated_from_trainer
6
+ - dataset_size:570914
7
+ - loss:BinaryCrossEntropyLoss
8
+ base_model: BAAI/bge-reranker-base
9
+ pipeline_tag: text-ranking
10
+ library_name: sentence-transformers
11
+ ---
12
+
13
+ # CrossEncoder based on BAAI/bge-reranker-base
14
+
15
+ This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model fine-tuned from [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) using the [sentence-transformers](https://www.SBERT.net) library. It computes a score for each pair of texts, which can be used for text reranking and semantic search.
16
+
17
+ ## Model Details
18
+
19
+ ### Model Description
20
+ - **Model Type:** Cross Encoder
21
+ - **Base model:** [BAAI/bge-reranker-base](https://huggingface.co/BAAI/bge-reranker-base) <!-- at revision 2cfc18c9415c912f9d8155881c133215df768a70 -->
22
+ - **Maximum Sequence Length:** 512 tokens
23
+ - **Number of Output Labels:** 1 label
24
+ <!-- - **Training Dataset:** Unknown -->
25
+ <!-- - **Language:** Unknown -->
26
+ <!-- - **License:** Unknown -->
27
+
28
+ ### Model Sources
29
+
30
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
31
+ - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
32
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
33
+ - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
34
+
35
+ ## Usage
36
+
37
+ ### Direct Usage (Sentence Transformers)
38
+
39
+ First install the Sentence Transformers library:
40
+
41
+ ```bash
42
+ pip install -U sentence-transformers
43
+ ```
44
+
45
+ Then you can load this model and run inference.
46
+ ```python
47
+ from sentence_transformers import CrossEncoder
48
+
49
+ # Download from the 🤗 Hub
50
+ model = CrossEncoder("cross_encoder_model_id")
51
+ # Get scores for pairs of texts
52
+ pairs = [
53
+ ['what is the life expectancy for babies with trisomy 18', "Most babies born with this condition die within the first few days or weeks of life, as they have so many medical complications. Just 5% to 10% make it past their first year. Like trisomy 18, no one knows why some babies get this condition. It's known that the chance increases with the mother's age, though women of any age can have a child with trisomy 13. About 80% of babies with trisomy 18 or 13 are born to mothers under 35. The condition can be diagnosed before birth with the same tests used to identify trisomy 18, or after birth by a physical examination. Trisomy 18 is a condition where you have three copies of each chromosome 18 in your body's cells instead of two. This can lead to serious physical and mental disabilities. There is no cure, though treatment can include surgeries, medicines, breathing tubes, and feeding tubes. Some parents opt just for comfort care. Life expectancy is usually a year or less. How old is the oldest living person with trisomy 18? The oldest people were reported to be in their early 40s a few years ago. But it's unclear if they are alive today. Are babies with trisomy 18 less active in the womb? Yes, they are often less active. Trisomy 18 is a condition caused by a problem in your chromosomes."],
54
+ ['what are some reasons doctors prescribe benzodiazepines', "But they can be habit-forming, especially if you take them regularly or for a long time. If you think you or a loved one may have a problem with benzodiazepine misuse, contact a doctor or a drug hotline. What is benzodiazepine abuse? Doctors define benzodiazepine abuse as using these drugs for non-medical reasons to get high, What is the incidence of benzodiazepine abuse? In a 12-month period between 2014 and 2015, more than 5 million people in the U.S. reported they had misused benzodiazepines. That's out of 30 million adults who used the drugs at all that year."],
55
+ ['i ve been experiencing frequent migraines that don t respond well to medication what devices could i try to help manage my pain', "You might try them if medications don't work well for you, or use them together with medication. The devices include: Single-pulse transcranial magnetic stimulation (eNeura). To use this prescription device, you place it on the back of your head. It sends a pulse of magnetic energy that affects electrical signaling in your brain, which may stop or reduce pain. It's used both to treat and prevent attacks. External trigeminal nerve stimulation (Cefaly). This nonprescription device targets the trigeminal nerve , which provides sensation to parts of your head and face. It delivers stimulation through electrodes you place on your forehead. You can use it to treat or prevent headaches. You might also hear this type of device called a transcutaneous electrical nerve stimulation unit. Noninvasive vagal nerve stimulator (Gammacore). This type works on your vagus nerve, a long nerve that's a key player in your nervous system. To use it, you hold it in your hand and place it against your neck. It's approved to both stop and prevent headaches and requires a prescription. Remote electrical neuromodulator (Nerivio). You also need a prescription for this device, which you apply to your upper arm and control with a phone app. It stimulates nerves in your arm that are part of a pathway for pain signals. It's used to get rid of migraine pain once it starts. Combined occiputal and trigeminal neurostimulation (Relivion). This headband-like device stimulates the trigeminal nerve as well as the occipital nerve that runs along the back of your head. This can help stop an attack. The device, which is available by prescription, is controlled via a phone app. Botox for migraine If you have chronic migraine, shots of botulinum toxin type A ( Botox ) can help reduce the frequency of your headaches."],
56
+ ['what is the survival rate for aml', "Treatment is usually with a high-dose combination chemotherapy regimen, which may also include targeted therapy. Your cancer care team will help you choose the right treatment based on your specific situation. Is AML cancer curable? Generally speaking, AML is not curable. The only potential cure for AML is an allogeneic stem cell transplant. That is when you have the stem cells in your bone marrow replaced with those from a donor. In many cases, your doctor will only recommend testing for an allogeneic stem cell transplant when you've been treated for AML and it has come back within 12 months. Unfortunately, not everyone is a candidate for stem cell transplant. It's a very intensive treatment that you must be healthy enough to undergo, and you must also have a compatible donor. What is the survival rate for AML? This is tough to answer because there are many different types of AML, and each type has differences in treatment options and survival rates. In general: People younger than 65 usually fare better than older people. People with certain genetic mutations tend to do better than others. People with lower white blood cell counts at diagnosis have better outcomes. People without leukemia cells detected in their brains and spinal cords tend to do better. About 50%-80% of adults with AML achieve complete remission after treatment. Remission can last months or years. Unfortunately, about 50% of those who achieve complete remission will have a recurrence. At that point, your doctor may recommend testing for a stem cell transplant, enrolling in a clinical trial, or additional chemotherapy. Your doctor will do a staging workup after your diagnosis, which will give them a better idea of your prognosis based on your specific medical situation."],
57
+ ['what are some risk factors for having an undescended testicle in a newborn', 'overview\nGuide In the last few months of a pregnancy , a baby goes through all kinds of changes. The eyes open wide, the bones fully form, and weight gain ramps up. For boys, it’s also when the testicles move from the lower belly to the scrotum, that pouch of skin below the penis . But sometimes, one or both testicles don’t fall into place. That’s called an undescended testicle. It can happen to any baby boy, but it’s more common for those born earlier than expected. More often than not, the testicle drops into the scrotum on its own by the time the baby is 6 months old. If it doesn’t, the child will likely need surgery. Doctors aren’t sure why it happens. They think it’s related to genes, the mother’s health, and outside influences that change how hormones and nerves normally work. Even though the cause isn’t clear, certain factors might make an undescended testicle more likely: An earlier-than-expected birth Family history of them or other problems with how genitals develop Health conditions, such as Down syndrome , that affect how a fetus grows Low birth weight Contact by the parents with certain chemicals (pesticides) that kill bugs -- these are often used on farms It may also be more likely if the mother: Has diabetes (type 1, type 2, or gestational) Is obese Smoked cigarettes or drank alcohol during pregnancy The main sign: You can’t see or feel the testicle in the scrotum. When both are undescended, the scrotum looks flat and smaller than you’d expect it to be. Some boys have what’s called a retractile testicle. It may move up into their groin when they are cold or scared but moves back down on its own. It’s generally not a problem. The difference is that an undescended testicle stays up -- it doesn’t move back and forth.'],
58
+ ]
59
+ scores = model.predict(pairs)
60
+ print(scores.shape)
61
+ # (5,)
62
+
63
+ # Or rank different texts based on similarity to a single text
64
+ ranks = model.rank(
65
+ 'what is the life expectancy for babies with trisomy 18',
66
+ [
67
+ "Most babies born with this condition die within the first few days or weeks of life, as they have so many medical complications. Just 5% to 10% make it past their first year. Like trisomy 18, no one knows why some babies get this condition. It's known that the chance increases with the mother's age, though women of any age can have a child with trisomy 13. About 80% of babies with trisomy 18 or 13 are born to mothers under 35. The condition can be diagnosed before birth with the same tests used to identify trisomy 18, or after birth by a physical examination. Trisomy 18 is a condition where you have three copies of each chromosome 18 in your body's cells instead of two. This can lead to serious physical and mental disabilities. There is no cure, though treatment can include surgeries, medicines, breathing tubes, and feeding tubes. Some parents opt just for comfort care. Life expectancy is usually a year or less. How old is the oldest living person with trisomy 18? The oldest people were reported to be in their early 40s a few years ago. But it's unclear if they are alive today. Are babies with trisomy 18 less active in the womb? Yes, they are often less active. Trisomy 18 is a condition caused by a problem in your chromosomes.",
68
+ "But they can be habit-forming, especially if you take them regularly or for a long time. If you think you or a loved one may have a problem with benzodiazepine misuse, contact a doctor or a drug hotline. What is benzodiazepine abuse? Doctors define benzodiazepine abuse as using these drugs for non-medical reasons to get high, What is the incidence of benzodiazepine abuse? In a 12-month period between 2014 and 2015, more than 5 million people in the U.S. reported they had misused benzodiazepines. That's out of 30 million adults who used the drugs at all that year.",
69
+ "You might try them if medications don't work well for you, or use them together with medication. The devices include: Single-pulse transcranial magnetic stimulation (eNeura). To use this prescription device, you place it on the back of your head. It sends a pulse of magnetic energy that affects electrical signaling in your brain, which may stop or reduce pain. It's used both to treat and prevent attacks. External trigeminal nerve stimulation (Cefaly). This nonprescription device targets the trigeminal nerve , which provides sensation to parts of your head and face. It delivers stimulation through electrodes you place on your forehead. You can use it to treat or prevent headaches. You might also hear this type of device called a transcutaneous electrical nerve stimulation unit. Noninvasive vagal nerve stimulator (Gammacore). This type works on your vagus nerve, a long nerve that's a key player in your nervous system. To use it, you hold it in your hand and place it against your neck. It's approved to both stop and prevent headaches and requires a prescription. Remote electrical neuromodulator (Nerivio). You also need a prescription for this device, which you apply to your upper arm and control with a phone app. It stimulates nerves in your arm that are part of a pathway for pain signals. It's used to get rid of migraine pain once it starts. Combined occiputal and trigeminal neurostimulation (Relivion). This headband-like device stimulates the trigeminal nerve as well as the occipital nerve that runs along the back of your head. This can help stop an attack. The device, which is available by prescription, is controlled via a phone app. Botox for migraine If you have chronic migraine, shots of botulinum toxin type A ( Botox ) can help reduce the frequency of your headaches.",
70
+ "Treatment is usually with a high-dose combination chemotherapy regimen, which may also include targeted therapy. Your cancer care team will help you choose the right treatment based on your specific situation. Is AML cancer curable? Generally speaking, AML is not curable. The only potential cure for AML is an allogeneic stem cell transplant. That is when you have the stem cells in your bone marrow replaced with those from a donor. In many cases, your doctor will only recommend testing for an allogeneic stem cell transplant when you've been treated for AML and it has come back within 12 months. Unfortunately, not everyone is a candidate for stem cell transplant. It's a very intensive treatment that you must be healthy enough to undergo, and you must also have a compatible donor. What is the survival rate for AML? This is tough to answer because there are many different types of AML, and each type has differences in treatment options and survival rates. In general: People younger than 65 usually fare better than older people. People with certain genetic mutations tend to do better than others. People with lower white blood cell counts at diagnosis have better outcomes. People without leukemia cells detected in their brains and spinal cords tend to do better. About 50%-80% of adults with AML achieve complete remission after treatment. Remission can last months or years. Unfortunately, about 50% of those who achieve complete remission will have a recurrence. At that point, your doctor may recommend testing for a stem cell transplant, enrolling in a clinical trial, or additional chemotherapy. Your doctor will do a staging workup after your diagnosis, which will give them a better idea of your prognosis based on your specific medical situation.",
71
+ 'overview\nGuide In the last few months of a pregnancy , a baby goes through all kinds of changes. The eyes open wide, the bones fully form, and weight gain ramps up. For boys, it’s also when the testicles move from the lower belly to the scrotum, that pouch of skin below the penis . But sometimes, one or both testicles don’t fall into place. That’s called an undescended testicle. It can happen to any baby boy, but it’s more common for those born earlier than expected. More often than not, the testicle drops into the scrotum on its own by the time the baby is 6 months old. If it doesn’t, the child will likely need surgery. Doctors aren’t sure why it happens. They think it’s related to genes, the mother’s health, and outside influences that change how hormones and nerves normally work. Even though the cause isn’t clear, certain factors might make an undescended testicle more likely: An earlier-than-expected birth Family history of them or other problems with how genitals develop Health conditions, such as Down syndrome , that affect how a fetus grows Low birth weight Contact by the parents with certain chemicals (pesticides) that kill bugs -- these are often used on farms It may also be more likely if the mother: Has diabetes (type 1, type 2, or gestational) Is obese Smoked cigarettes or drank alcohol during pregnancy The main sign: You can’t see or feel the testicle in the scrotum. When both are undescended, the scrotum looks flat and smaller than you’d expect it to be. Some boys have what’s called a retractile testicle. It may move up into their groin when they are cold or scared but moves back down on its own. It’s generally not a problem. The difference is that an undescended testicle stays up -- it doesn’t move back and forth.',
72
+ ]
73
+ )
74
+ # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
75
+ ```
76
+
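The list returned by `model.rank` refers to each passage by its position in the input list. A minimal sketch of mapping those entries back to text follows. The `passages` and `ranks` values are made-up placeholders shaped like the output above (not real model scores), and `top_k_passages` is a hypothetical helper, not part of Sentence Transformers.

```python
# Illustrative stand-ins: short placeholder passages and invented scores,
# shaped like the rank() output shown above.
passages = [
    "passage about trisomy 18",
    "passage about benzodiazepine misuse",
    "passage about migraine devices",
]
ranks = [
    {"corpus_id": 2, "score": 0.12},
    {"corpus_id": 0, "score": 0.97},
    {"corpus_id": 1, "score": 0.03},
]


def top_k_passages(ranks, passages, k=2):
    """Return (score, passage) tuples for the k highest-scoring entries."""
    ordered = sorted(ranks, key=lambda r: r["score"], reverse=True)
    return [(r["score"], passages[r["corpus_id"]]) for r in ordered[:k]]


top = top_k_passages(ranks, passages)
for score, text in top:
    print(f"{score:.2f}  {text}")
```

Sorting explicitly keeps the helper correct even if the entries arrive in arbitrary order.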
77
+ <!--
78
+ ### Direct Usage (Transformers)
79
+
80
+ <details><summary>Click to see the direct usage in Transformers</summary>
81
+
82
+ </details>
83
+ -->
84
+
85
+ <!--
86
+ ### Downstream Usage (Sentence Transformers)
87
+
88
+ You can finetune this model on your own dataset.
89
+
90
+ <details><summary>Click to expand</summary>
91
+
92
+ </details>
93
+ -->
94
+
95
+ <!--
96
+ ### Out-of-Scope Use
97
+
98
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
99
+ -->
100
+
101
+ <!--
102
+ ## Bias, Risks and Limitations
103
+
104
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
105
+ -->
106
+
107
+ <!--
108
+ ### Recommendations
109
+
110
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
111
+ -->
112
+
113
+ ## Training Details
114
+
115
+ ### Training Dataset
116
+
117
+ #### Unnamed Dataset
118
+
119
+ * Size: 570,914 training samples
120
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
121
+ * Approximate statistics based on the first 1000 samples:
122
+ | | sentence_0 | sentence_1 | label |
123
+ |:--------|:-----------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------|:------------------------------------------------|
124
+ | type | string | string | int |
125
+ | details | <ul><li>min: 17 characters</li><li>mean: 98.9 characters</li><li>max: 274 characters</li></ul> | <ul><li>min: 117 characters</li><li>mean: 1647.18 characters</li><li>max: 2339 characters</li></ul> | <ul><li>0: ~47.70%</li><li>1: ~52.30%</li></ul> |
126
+ * Samples:
127
+ | sentence_0 | sentence_1 | label |
128
+ |:---------------------------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:---------------|
129
+ | <code>what is the life expectancy for babies with trisomy 18</code> | <code>Most babies born with this condition die within the first few days or weeks of life, as they have so many medical complications. Just 5% to 10% make it past their first year. Like trisomy 18, no one knows why some babies get this condition. It's known that the chance increases with the mother's age, though women of any age can have a child with trisomy 13. About 80% of babies with trisomy 18 or 13 are born to mothers under 35. The condition can be diagnosed before birth with the same tests used to identify trisomy 18, or after birth by a physical examination. Trisomy 18 is a condition where you have three copies of each chromosome 18 in your body's cells instead of two. This can lead to serious physical and mental disabilities. There is no cure, though treatment can include surgeries, medicines, breathing tubes, and feeding tubes. Some parents opt just for comfort care. Life expectancy is usually a year or less. How old is the oldest living person with trisomy 18? The oldest people wer...</code> | <code>1</code> |
130
+ | <code>what are some reasons doctors prescribe benzodiazepines</code> | <code>But they can be habit-forming, especially if you take them regularly or for a long time. If you think you or a loved one may have a problem with benzodiazepine misuse, contact a doctor or a drug hotline. What is benzodiazepine abuse? Doctors define benzodiazepine abuse as using these drugs for non-medical reasons to get high, What is the incidence of benzodiazepine abuse? In a 12-month period between 2014 and 2015, more than 5 million people in the U.S. reported they had misused benzodiazepines. That's out of 30 million adults who used the drugs at all that year.</code> | <code>0</code> |
131
+ | <code>i ve been experiencing frequent migraines that don t respond well to medication what devices could i try to help manage my pain</code> | <code>You might try them if medications don't work well for you, or use them together with medication. The devices include: Single-pulse transcranial magnetic stimulation (eNeura). To use this prescription device, you place it on the back of your head. It sends a pulse of magnetic energy that affects electrical signaling in your brain, which may stop or reduce pain. It's used both to treat and prevent attacks. External trigeminal nerve stimulation (Cefaly). This nonprescription device targets the trigeminal nerve , which provides sensation to parts of your head and face. It delivers stimulation through electrodes you place on your forehead. You can use it to treat or prevent headaches. You might also hear this type of device called a transcutaneous electrical nerve stimulation unit. Noninvasive vagal nerve stimulator (Gammacore). This type works on your vagus nerve, a long nerve that's a key player in your nervous system. To use it, you hold it in your hand and place it against your neck. It...</code> | <code>1</code> |
132
+ * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
133
+ ```json
134
+ {
135
+ "activation_fn": "torch.nn.modules.linear.Identity",
136
+ "pos_weight": null
137
+ }
138
+ ```
139
+
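With an identity activation, the loss is applied to the raw logit the model emits for each pair. As a rough illustration in plain Python (not the library's implementation), binary cross-entropy on a logit `z` with label `y` can be written in a numerically stable form. `bce_with_logits` is a hypothetical name, and `pos_weight` is left out because it is null here.

```python
import math


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit.

    Equivalent to -[y*log(s) + (1-y)*log(1-s)] with s = sigmoid(logit),
    computed as max(z, 0) - z*y + log(1 + exp(-|z|)) to avoid overflow.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# A logit of 0 corresponds to a probability of 0.5, so the loss is ln(2).
print(bce_with_logits(0.0, 1.0))
# A confident correct prediction costs little; a confident wrong one costs a lot.
print(bce_with_logits(4.0, 1.0), bce_with_logits(4.0, 0.0))
```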
140
+ ### Training Hyperparameters
141
+ #### Non-Default Hyperparameters
142
+
143
+ - `per_device_train_batch_size`: 96
144
+ - `per_device_eval_batch_size`: 96
145
+ - `num_train_epochs`: 1
146
+
147
+ #### All Hyperparameters
148
+ <details><summary>Click to expand</summary>
149
+
150
+ - `overwrite_output_dir`: False
151
+ - `do_predict`: False
152
+ - `eval_strategy`: no
153
+ - `prediction_loss_only`: True
154
+ - `per_device_train_batch_size`: 96
155
+ - `per_device_eval_batch_size`: 96
156
+ - `per_gpu_train_batch_size`: None
157
+ - `per_gpu_eval_batch_size`: None
158
+ - `gradient_accumulation_steps`: 1
159
+ - `eval_accumulation_steps`: None
160
+ - `torch_empty_cache_steps`: None
161
+ - `learning_rate`: 5e-05
162
+ - `weight_decay`: 0.0
163
+ - `adam_beta1`: 0.9
164
+ - `adam_beta2`: 0.999
165
+ - `adam_epsilon`: 1e-08
166
+ - `max_grad_norm`: 1
167
+ - `num_train_epochs`: 1
168
+ - `max_steps`: -1
169
+ - `lr_scheduler_type`: linear
170
+ - `lr_scheduler_kwargs`: {}
171
+ - `warmup_ratio`: 0.0
172
+ - `warmup_steps`: 0
173
+ - `log_level`: passive
174
+ - `log_level_replica`: warning
175
+ - `log_on_each_node`: True
176
+ - `logging_nan_inf_filter`: True
177
+ - `save_safetensors`: True
178
+ - `save_on_each_node`: False
179
+ - `save_only_model`: False
180
+ - `restore_callback_states_from_checkpoint`: False
181
+ - `no_cuda`: False
182
+ - `use_cpu`: False
183
+ - `use_mps_device`: False
184
+ - `seed`: 42
185
+ - `data_seed`: None
186
+ - `jit_mode_eval`: False
187
+ - `use_ipex`: False
188
+ - `bf16`: False
189
+ - `fp16`: False
190
+ - `fp16_opt_level`: O1
191
+ - `half_precision_backend`: auto
192
+ - `bf16_full_eval`: False
193
+ - `fp16_full_eval`: False
194
+ - `tf32`: None
195
+ - `local_rank`: 0
196
+ - `ddp_backend`: None
197
+ - `tpu_num_cores`: None
198
+ - `tpu_metrics_debug`: False
199
+ - `debug`: []
200
+ - `dataloader_drop_last`: False
201
+ - `dataloader_num_workers`: 0
202
+ - `dataloader_prefetch_factor`: None
203
+ - `past_index`: -1
204
+ - `disable_tqdm`: False
205
+ - `remove_unused_columns`: True
206
+ - `label_names`: None
207
+ - `load_best_model_at_end`: False
208
+ - `ignore_data_skip`: False
209
+ - `fsdp`: []
210
+ - `fsdp_min_num_params`: 0
211
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
212
+ - `fsdp_transformer_layer_cls_to_wrap`: None
213
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
214
+ - `deepspeed`: None
215
+ - `label_smoothing_factor`: 0.0
216
+ - `optim`: adamw_torch
217
+ - `optim_args`: None
218
+ - `adafactor`: False
219
+ - `group_by_length`: False
220
+ - `length_column_name`: length
221
+ - `ddp_find_unused_parameters`: None
222
+ - `ddp_bucket_cap_mb`: None
223
+ - `ddp_broadcast_buffers`: False
224
+ - `dataloader_pin_memory`: True
225
+ - `dataloader_persistent_workers`: False
226
+ - `skip_memory_metrics`: True
227
+ - `use_legacy_prediction_loop`: False
228
+ - `push_to_hub`: False
229
+ - `resume_from_checkpoint`: None
230
+ - `hub_model_id`: None
231
+ - `hub_strategy`: every_save
232
+ - `hub_private_repo`: None
233
+ - `hub_always_push`: False
234
+ - `gradient_checkpointing`: False
235
+ - `gradient_checkpointing_kwargs`: None
236
+ - `include_inputs_for_metrics`: False
237
+ - `include_for_metrics`: []
238
+ - `eval_do_concat_batches`: True
239
+ - `fp16_backend`: auto
240
+ - `push_to_hub_model_id`: None
241
+ - `push_to_hub_organization`: None
242
+ - `mp_parameters`:
243
+ - `auto_find_batch_size`: False
244
+ - `full_determinism`: False
245
+ - `torchdynamo`: None
246
+ - `ray_scope`: last
247
+ - `ddp_timeout`: 1800
248
+ - `torch_compile`: False
249
+ - `torch_compile_backend`: None
250
+ - `torch_compile_mode`: None
251
+ - `include_tokens_per_second`: False
252
+ - `include_num_input_tokens_seen`: False
253
+ - `neftune_noise_alpha`: None
254
+ - `optim_target_modules`: None
255
+ - `batch_eval_metrics`: False
256
+ - `eval_on_start`: False
257
+ - `use_liger_kernel`: False
258
+ - `eval_use_gather_object`: False
259
+ - `average_tokens_across_devices`: False
260
+ - `prompts`: None
261
+ - `batch_sampler`: batch_sampler
262
+ - `multi_dataset_batch_sampler`: proportional
263
+
264
+ </details>
265
+
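Taken together, `learning_rate` 5e-05, `lr_scheduler_type` linear, and zero warmup imply a learning rate that decays in a straight line to zero over the run. The sketch below illustrates that shape under stated assumptions: the total step count is derived from the dataset size and batch size assuming a single device (the card does not state it), and `linear_lr` is a hypothetical helper rather than the trainer's own scheduler.

```python
import math

BASE_LR = 5e-05
# Assumption: one device, one epoch, gradient_accumulation_steps = 1.
TOTAL_STEPS = math.ceil(570_914 / 96)


def linear_lr(step: int, total_steps: int = TOTAL_STEPS, base_lr: float = BASE_LR) -> float:
    """Linear decay from base_lr at step 0 to 0 at total_steps, with no warmup."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


for step in (0, 1000, 3000, 5500):
    print(step, f"{linear_lr(step):.3e}")
```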
266
+ ### Training Logs
267
+ | Epoch | Step | Training Loss |
268
+ |:------:|:----:|:-------------:|
269
+ | 0.0841 | 500 | 0.5028 |
270
+ | 0.1681 | 1000 | 0.4081 |
271
+ | 0.2522 | 1500 | 0.3872 |
272
+ | 0.3362 | 2000 | 0.3738 |
273
+ | 0.4203 | 2500 | 0.3639 |
274
+ | 0.5044 | 3000 | 0.3551 |
275
+ | 0.5884 | 3500 | 0.3464 |
276
+ | 0.6725 | 4000 | 0.3380        |
277
+ | 0.7566 | 4500 | 0.3290        |
278
+ | 0.8406 | 5000 | 0.3297 |
279
+ | 0.9247 | 5500 | 0.3293 |
280
+
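The Epoch column appears consistent with the step number divided by the number of batches per epoch. A quick check, assuming a single device so that one epoch is ceil(570,914 / 96) batches (an inference from the figures in this card, not something it states):

```python
import math

# Assumed: single device, last partial batch kept (dataloader_drop_last is False).
steps_per_epoch = math.ceil(570_914 / 96)

logged_steps = [500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500]
epoch_fractions = [round(step / steps_per_epoch, 4) for step in logged_steps]

print(steps_per_epoch)
print(epoch_fractions)
```

Under this assumption the last logged row, step 5500, falls a few hundred steps before the end of the single training epoch.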
281
+
282
+ ### Framework Versions
283
+ - Python: 3.11.12
284
+ - Sentence Transformers: 4.1.0
285
+ - Transformers: 4.52.2
286
+ - PyTorch: 2.6.0+cu124
287
+ - Accelerate: 1.7.0
288
+ - Datasets: 2.14.4
289
+ - Tokenizers: 0.21.1
290
+
291
+ ## Citation
292
+
293
+ ### BibTeX
294
+
295
+ #### Sentence Transformers
296
+ ```bibtex
297
+ @inproceedings{reimers-2019-sentence-bert,
298
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
299
+ author = "Reimers, Nils and Gurevych, Iryna",
300
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
301
+ month = "11",
302
+ year = "2019",
303
+ publisher = "Association for Computational Linguistics",
304
+ url = "https://arxiv.org/abs/1908.10084",
305
+ }
306
+ ```
307
+
308
+ <!--
309
+ ## Glossary
310
+
311
+ *Clearly define terms in order to be accessible across audiences.*
312
+ -->
313
+
314
+ <!--
315
+ ## Model Card Authors
316
+
317
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
318
+ -->
319
+
320
+ <!--
321
+ ## Model Card Contact
322
+
323
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
324
+ -->
config.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f2af5a422902d88eff307b9bbdb309302ac80d1281f356280fa78d32e851b45c
3
+ size 877
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:addc159b1eaa1083eb27b2971375787617053292a88201f848fc990c40b10597
3
+ size 1112201932
sentencepiece.bpe.model ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:cfc8146abe2a0488e9e2a0c56de7952f7c11ab059eca145a0a727afce0db2865
3
+ size 5069051
special_tokens_map.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:66e573bfc6c3b062381e41274f7fd4143daaf01926888ffbd880c87aa6368443
3
+ size 963
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:4a8d0b7573869188be52cca17a27a84f3cfbc0a5536c28ee1eca82903e8c68c6
3
+ size 17083051
tokenizer_config.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8f0e65f245d8e127fb2796272e044e0764601b02237f2089b03b24ea317398c4
3
+ size 1201