CrossEncoder based on ncbi/MedCPT-Cross-Encoder
This is a Cross Encoder model fine-tuned from ncbi/MedCPT-Cross-Encoder using the sentence-transformers library. It computes a score for each pair of texts, and these scores can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: ncbi/MedCPT-Cross-Encoder
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
['Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Osmotic stabilization suppressed these morphological defects, indicating that cell wall weakness caused by impaired GPI anchor synthesis resulted in abnormal cytokinesis. Furthermore, calcineurin-deleted cells exhibited hypersensitivity to BE49385A, and FK506 exacerbated the cytokinesis defects of the its8-1 mutant. Thus, calcineurin and Its8 may share an essential function in cytokinesis and cell viability through the regulation of cell wall integrity.', 'G2P01414 - PIGN - 606097.0 - HGNC:8967 - MCD4; PIG-N - PIGN-related multiple congenital anomalies-hypotonia-seizures syndrome - 614080 - nan - biallelic_autosomal - nan - definitive - absent gene product; altered gene product structure - inframe_insertion; missense_variant; inframe_deletion - undetermined - inferred'],
['Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Osmotic stabilization suppressed these morphological defects, indicating that cell wall weakness caused by impaired GPI anchor synthesis resulted in abnormal cytokinesis. Furthermore, calcineurin-deleted cells exhibited hypersensitivity to BE49385A, and FK506 exacerbated the cytokinesis defects of the its8-1 mutant. Thus, calcineurin and Its8 may share an essential function in cytokinesis and cell viability through the regulation of cell wall integrity.', 'G2P02996 - MFF - 614785.0 - HGNC:24858 - C2ORF33; GL004 - MFF-related encephalopathy due to defective mitochondrial and peroxisomal fission - 617086 - nan - biallelic_autosomal - nan - definitive - absent gene product - nan - loss of function - inferred'],
['Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Osmotic stabilization suppressed these morphological defects, indicating that cell wall weakness caused by impaired GPI anchor synthesis resulted in abnormal cytokinesis. Furthermore, calcineurin-deleted cells exhibited hypersensitivity to BE49385A, and FK506 exacerbated the cytokinesis defects of the its8-1 mutant. Thus, calcineurin and Its8 may share an essential function in cytokinesis and cell viability through the regulation of cell wall integrity.', 'G2P01166 - CERT1 - 604677.0 - HGNC:2205 - CERT; COL4A3BP; GPBP; STARD11 - CERT1-related intellectual disability - 616351 - nan - monoallelic_autosomal - restricted mutation set - definitive - altered gene product structure - nan - gain of function - inferred'],
['Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Osmotic stabilization suppressed these morphological defects, indicating that cell wall weakness caused by impaired GPI anchor synthesis resulted in abnormal cytokinesis. Furthermore, calcineurin-deleted cells exhibited hypersensitivity to BE49385A, and FK506 exacerbated the cytokinesis defects of the its8-1 mutant. Thus, calcineurin and Its8 may share an essential function in cytokinesis and cell viability through the regulation of cell wall integrity.', 'G2P01646 - STAT2 - 600556.0 - HGNC:11363 - STAT113 - STAT2-related viral induced severe multiorgan dysfunction related with impaired mitochondrial fission - nan - nan - biallelic_autosomal - nan - limited - absent gene product - nan - loss of function - inferred'],
['Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Osmotic stabilization suppressed these morphological defects, indicating that cell wall weakness caused by impaired GPI anchor synthesis resulted in abnormal cytokinesis. Furthermore, calcineurin-deleted cells exhibited hypersensitivity to BE49385A, and FK506 exacerbated the cytokinesis defects of the its8-1 mutant. Thus, calcineurin and Its8 may share an essential function in cytokinesis and cell viability through the regulation of cell wall integrity.', 'G2P03036 - NDUFA8 - 603359.0 - HGNC:7692 - MGC793; PGIV - NDUFA8-related developmental disorder - nan - nan - biallelic_autosomal - nan - strong - absent gene product - nan - loss of function - inferred'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Osmotic stabilization suppressed these morphological defects, indicating that cell wall weakness caused by impaired GPI anchor synthesis resulted in abnormal cytokinesis. Furthermore, calcineurin-deleted cells exhibited hypersensitivity to BE49385A, and FK506 exacerbated the cytokinesis defects of the its8-1 mutant. Thus, calcineurin and Its8 may share an essential function in cytokinesis and cell viability through the regulation of cell wall integrity.',
[
'G2P01414 - PIGN - 606097.0 - HGNC:8967 - MCD4; PIG-N - PIGN-related multiple congenital anomalies-hypotonia-seizures syndrome - 614080 - nan - biallelic_autosomal - nan - definitive - absent gene product; altered gene product structure - inframe_insertion; missense_variant; inframe_deletion - undetermined - inferred',
'G2P02996 - MFF - 614785.0 - HGNC:24858 - C2ORF33; GL004 - MFF-related encephalopathy due to defective mitochondrial and peroxisomal fission - 617086 - nan - biallelic_autosomal - nan - definitive - absent gene product - nan - loss of function - inferred',
'G2P01166 - CERT1 - 604677.0 - HGNC:2205 - CERT; COL4A3BP; GPBP; STARD11 - CERT1-related intellectual disability - 616351 - nan - monoallelic_autosomal - restricted mutation set - definitive - altered gene product structure - nan - gain of function - inferred',
'G2P01646 - STAT2 - 600556.0 - HGNC:11363 - STAT113 - STAT2-related viral induced severe multiorgan dysfunction related with impaired mitochondrial fission - nan - nan - biallelic_autosomal - nan - limited - absent gene product - nan - loss of function - inferred',
'G2P03036 - NDUFA8 - 603359.0 - HGNC:7692 - MGC793; PGIV - NDUFA8-related developmental disorder - nan - nan - biallelic_autosomal - nan - strong - absent gene product - nan - loss of function - inferred',
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
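The `rank` output refers to each candidate by `corpus_id`, its position in the list that was passed in. As a minimal sketch, assuming every entry is a dict with `corpus_id` and `score` keys as in the comment above, the helper below pairs each entry with its text. `attach_texts` is a hypothetical name, not part of the library, and the scores and short candidate strings are made up for illustration.

```python
from typing import Dict, List, Union

RankEntry = Dict[str, Union[int, float]]


def attach_texts(
    ranks: List[RankEntry], candidates: List[str]
) -> List[Dict[str, Union[int, float, str]]]:
    """Sort entries by descending score and add the text each corpus_id points to."""
    ordered = sorted(ranks, key=lambda entry: entry["score"], reverse=True)
    return [
        {
            "corpus_id": int(entry["corpus_id"]),
            "score": float(entry["score"]),
            "text": candidates[int(entry["corpus_id"])],
        }
        for entry in ordered
    ]


# Illustrative inputs only: these are not real model outputs.
candidates = ["candidate A", "candidate B", "candidate C"]
illustrative_ranks = [
    {"corpus_id": 0, "score": 0.12},
    {"corpus_id": 1, "score": 0.98},
    {"corpus_id": 2, "score": 0.05},
]

ranked = attach_texts(illustrative_ranks, candidates)
for entry in ranked:
    print(f"{entry['score']:.2f}  {entry['text']}")
```

With real output, `candidates` would be the same list of texts that was passed to `model.rank`.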
Evaluation
Metrics
Cross Encoder Classification
- Dataset: anno_test
- Evaluated with CrossEncoderClassificationEvaluator
| Metric | Value |
|---|---|
| accuracy | 0.824 |
| accuracy_threshold | 0.9996 |
| f1 | 0.3842 |
| f1_threshold | 0.9992 |
| precision | 0.2698 |
| recall | 0.6667 |
| average_precision | 0.329 |
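The F1 row can be checked against the precision and recall rows, since F1 is their harmonic mean. The sketch below also shows how a score cut-off such as the f1_threshold value might be applied, assuming a pair counts as positive when its score is at or above the threshold. The helper names and the example scores are hypothetical.

```python
from typing import List


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def apply_threshold(scores: List[float], threshold: float) -> List[int]:
    """Label a pair 1 when its score is at or above the threshold, else 0."""
    return [1 if score >= threshold else 0 for score in scores]


# Precision and recall as reported in the table above.
f1 = f1_score(0.2698, 0.6667)
print(f"F1 from reported precision and recall: {f1:.4f}")

# Made-up scores, cut at the reported f1_threshold.
example_scores = [0.9999, 0.9993, 0.9990, 0.42]
predicted = apply_threshold(example_scores, threshold=0.9992)
print(predicted)
```

The recomputed F1 agrees with the reported 0.3842 to within the rounding of the table values.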
Training Details
Training Dataset
Unnamed Dataset
- Size: 35,474 training samples
- Columns: tiab, g2p_lgmde, and label
- Approximate statistics based on the first 1000 samples:
| | tiab | g2p_lgmde | label |
|---|---|---|---|
| type | string | string | int |
| details | min: 30 characters; mean: 1348.52 characters; max: 2802 characters | min: 177 characters; mean: 258.9 characters; max: 407 characters | 0: ~72.90%; 1: ~27.10% |
- Samples:
| tiab | g2p_lgmde | label |
|---|---|---|
| Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Os... | G2P01414 - PIGN - 606097.0 - HGNC:8967 - MCD4; PIG-N - PIGN-related multiple congenital anomalies-hypotonia-seizures syndrome - 614080 - nan - biallelic_autosomal - nan - definitive - absent gene product; altered gene product structure - inframe_insertion; missense_variant; inframe_deletion - undetermined - inferred | 1 |
| Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Os... | G2P02996 - MFF - 614785.0 - HGNC:24858 - C2ORF33; GL004 - MFF-related encephalopathy due to defective mitochondrial and peroxisomal fission - 617086 - nan - biallelic_autosomal - nan - definitive - absent gene product - nan - loss of function - inferred | 0 |
| Its8, a fission yeast homolog of Mcd4 and Pig-n, is involved in GPI anchor synthesis and shares an essential function with calcineurin in cytokinesis. In fission yeast, calcineurin is required for cytokinesis and ion homeostasis; however, most of its physiological roles remain obscure. To identify genes that share an essential function with calcineurin, we screened for mutations that confer sensitivity to the calcineurin inhibitor FK506 and high temperature and isolated the mutant its8-1. its8(+) encodes a homolog of the budding yeast MCD4 and human Pig-n that are involved in glycosylphosphatidylinositol (GPI) anchor synthesis. Consistently, reduced inositol labeling of proteins suggested impaired GPI anchor synthesis in its8-1 mutants. The temperature upshift induced a further decrease in inositol labeling and caused dramatic increases in the frequency of septation in its8-1 mutants. BE49385A, an inhibitor of MCD4 and Pig-n, also increased the septation index of the wild-type cell. Os... | G2P01166 - CERT1 - 604677.0 - HGNC:2205 - CERT; COL4A3BP; GPBP; STARD11 - CERT1-related intellectual disability - 616351 - nan - monoallelic_autosomal - restricted mutation set - definitive - altered gene product structure - nan - gain of function - inferred | 0 |
- Loss: BinaryCrossEntropyLoss with these parameters: { "activation_fn": "torch.nn.modules.linear.Identity", "pos_weight": null }
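As a rough illustration of this loss: assuming BinaryCrossEntropyLoss with an Identity activation and no pos_weight amounts to standard binary cross-entropy on the model's single raw output (the logit), the per-pair loss can be sketched in plain Python. The function names and example logits are hypothetical, and this is not the library's implementation.

```python
import math
from typing import List


def bce_with_logits(logit: float, label: int) -> float:
    """Binary cross-entropy on a raw logit, in a numerically stable form.

    Equivalent to -[y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))].
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_bce(logits: List[float], labels: List[int]) -> float:
    """Average the per-pair loss over a batch."""
    losses = [bce_with_logits(x, y) for x, y in zip(logits, labels)]
    return sum(losses) / len(losses)


# Made-up logits for three (tiab, g2p_lgmde) pairs with labels 1, 0, 0.
example_logits = [4.0, -3.0, 0.5]
example_labels = [1, 0, 0]
print(f"mean loss: {mean_bce(example_logits, example_labels):.4f}")
```

A confident, correct logit (large and positive for label 1, large and negative for label 0) contributes a loss near zero, while an undecided logit of 0 contributes log 2.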
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 3.879271032713091e-05
- num_train_epochs: 2
- warmup_ratio: 0.1
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 3.879271032713091e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 2
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
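To make the schedule-related settings concrete, here is a sketch of how the learning rate would evolve under lr_scheduler_type: linear with warmup_ratio: 0.1. It assumes a single device (so the effective batch size is 16), the 35,474 training samples listed above, and the common shape of linear warmup followed by linear decay to zero. The function name `lr_at` is hypothetical.

```python
import math

NUM_SAMPLES = 35_474          # training set size from the model card
BATCH_SIZE = 16               # per_device_train_batch_size, one device assumed
NUM_EPOCHS = 2                # num_train_epochs
PEAK_LR = 3.879271032713091e-05
WARMUP_RATIO = 0.1

steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Decay: fall linearly from the peak to 0 at the final step.
    remaining = max(0, total_steps - step)
    return PEAK_LR * remaining / max(1, total_steps - warmup_steps)


for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, f"{lr_at(step):.3e}")
```

Under these assumptions the peak learning rate is reached after roughly the first tenth of training and then declines for the remainder of the two epochs.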
Training Logs
| Epoch | Step | Training Loss | anno_test_average_precision |
|---|---|---|---|
| -1 | -1 | - | 0.6910 |
| 0.2254 | 500 | 0.0908 | - |
| 0.4509 | 1000 | 0.0266 | - |
| 0.6763 | 1500 | 0.0207 | - |
| 0.9017 | 2000 | 0.0126 | - |
| 1.1271 | 2500 | 0.0082 | - |
| 1.3526 | 3000 | 0.0098 | - |
| 1.5780 | 3500 | 0.0072 | - |
| 1.8034 | 4000 | 0.0098 | - |
| -1 | -1 | - | 0.3290 |
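The Epoch column can be related to the Step column with simple arithmetic. As a sketch, assuming one optimizer step per batch of 16 on a single device (consistent with gradient_accumulation_steps: 1), dividing each logged step by the number of steps per epoch reproduces the logged epoch values. The variable names are illustrative.

```python
import math

NUM_SAMPLES = 35_474   # training set size from the model card
BATCH_SIZE = 16        # per_device_train_batch_size, one device assumed

steps_per_epoch = math.ceil(NUM_SAMPLES / BATCH_SIZE)

# Step -> Epoch pairs copied from the training log table.
logged = {
    500: 0.2254,
    1000: 0.4509,
    1500: 0.6763,
    2000: 0.9017,
    2500: 1.1271,
    3000: 1.3526,
    3500: 1.5780,
    4000: 1.8034,
}

for step, logged_epoch in logged.items():
    computed_epoch = round(step / steps_per_epoch, 4)
    print(f"step {step}: computed {computed_epoch:.4f}, logged {logged_epoch:.4f}")
```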
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 5.1.0
- Transformers: 4.55.0
- PyTorch: 2.7.1+cu126
- Accelerate: 1.10.0
- Datasets: 4.0.0
- Tokenizers: 0.21.4
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Model tree for tmy100000001/LitDD_crossencoder
- Base model: ncbi/MedCPT-Cross-Encoder