SentenceTransformer based on nomic-ai/nomic-embed-text-v1

This is a sentence-transformers model finetuned from nomic-ai/nomic-embed-text-v1. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: nomic-ai/nomic-embed-text-v1
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False, 'architecture': 'NomicBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("kavish218/nomic_embeddings-htc-2")
# Run inference
sentences = [
    'Modernism is both a philosophical movement and an art movement that arose from broad transformations in Western society during the late 19th and early 20th centuries. The movement reflected a desire for the creation of new forms of art,  philosophy, and social organization which reflected the newly emerging industrial world, including features such as urbanization, new technologies, and war. Artists attempted to depart from traditional forms of art, which they considered outdated or obsolete. The poet Ezra Pound\'s 1934 injunction to "Make it New" was the touchstone of the movement\'s approach.  Modernist innovations included abstract art, the stream-of-consciousness novel, montage cinema, atonal and twelve-tone music, and divisionist painting. Modernism explicitly rejected the ideology of realism and made use of the works of the past by the employment of reprise, incorporation, rewriting, recapitulation, revision and parody. Modernism also rejected the certainty of Enlightenment thinking, and many modernists also rejected religious belief. A notable characteristic of modernism is self-consciousness concerning artistic and social traditions, which often led to experimentation with form, along with the use of techniques that drew attention to the processes and materials used in creating works of art.While some scholars see modernism continuing into the 21st century, others see it evolving into late modernism or high modernism. Postmodernism is a departure from modernism and rejects its basic assumptions.',
    'Postmodernism is a broad movement that developed in the mid-to-late 20th century across philosophy, the arts, architecture, and criticism, marking a departure from modernism. The term has been more generally applied to describe a historical era said to follow after modernity and the tendencies of this era.  Postmodernism is generally defined by an attitude of skepticism, irony, or rejection toward what it describes as the grand narratives and ideologies associated with modernism, often criticizing Enlightenment rationality and focusing on the role of ideology in maintaining political or economic power. Postmodern thinkers frequently describe knowledge claims and value systems as contingent or socially-conditioned, framing them as products of political, historical, or cultural discourses and hierarchies. Common targets of postmodern criticism include universalist ideas of objective reality, morality, truth, human nature, reason, science, language, and social progress. Accordingly, postmodern thought is broadly characterized by tendencies to self-consciousness, self-referentiality, epistemological and moral relativism, pluralism, and irreverence. Postmodern critical approaches gained popularity in the 1980s and 1990s, and have been adopted in a variety of academic and theoretical disciplines, including cultural studies, philosophy of science, economics, linguistics, architecture, feminist theory, and literary criticism, as well as art movements in fields such as literature, contemporary art, and music. Postmodernism is often associated with schools of thought such as deconstruction, post-structuralism, and institutional critique, as well as philosophers such as Jean-François Lyotard, Jacques Derrida, and Fredric Jameson. Criticisms of postmodernism are intellectually diverse and include arguments that postmodernism promotes obscurantism, is meaningless, and that it adds nothing to analytical or empirical knowledge.',
    'Yucca is a genus of perennial shrubs and trees in the family Asparagaceae, subfamily Agavoideae. Its 40–50 species are notable for their rosettes of evergreen, tough, sword-shaped leaves and large terminal panicles of white or whitish flowers. They are native to the hot and dry (arid) parts of the Americas and the Caribbean. Early reports of the species were confused with the cassava (Manihot esculenta). Consequently, Linnaeus mistakenly derived the generic name from the Taíno word for the latter, yuca.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8460, 0.0451],
#         [0.8460, 1.0000, 0.0430],
#         [0.0451, 0.0430, 1.0000]])

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,953 training samples
  • Columns: content_1 and content_2
  • Approximate statistics based on the first 1000 samples:
    content_1 content_2
    type string string
    details
    • min: 31 tokens
    • mean: 366.65 tokens
    • max: 1139 tokens
    • min: 34 tokens
    • mean: 360.87 tokens
    • max: 1202 tokens
  • Samples:
    content_1 content_2
    Sacral architecture (also known as sacred architecture or religious architecture) is a religious architectural practice concerned with the design and construction of places of worship or sacred or intentional space, such as churches, mosques, stupas, synagogues, and temples. Many cultures devoted considerable resources to their sacred architecture and places of worship. Religious and sacred spaces are amongst the most impressive and permanent monolithic buildings created by humanity. Conversely, sacred architecture as a locale for meta-intimacy may also be non-monolithic, ephemeral and intensely private, personal and non-public. Sacred, religious and holy structures often evolved over centuries and were the largest buildings in the world, prior to the modern skyscraper. While the various styles employed in sacred architecture sometimes reflected trends in other structures, these styles also remained unique from the contemporary architecture used in other structures. With the rise of C... Architecture (Latin architectura, from the Greek ἀρχιτέκτων arkhitekton "architect", from ἀρχι- "chief" and τέκτων "creator") is both the process and the product of planning, designing, and constructing buildings or other structures. Architectural works, in the material form of buildings, are often perceived as cultural symbols and as works of art. Historical civilizations are often identified with their surviving architectural achievements.The practice, which began in the prehistoric era, has been used as a way of expressing culture for civilizations on all seven continents. For this reason, architecture is considered to be a form of art. Texts on architecture have been written since ancient time. The earliest surviving text on architectural theory is the 1st century AD treatise De architectura by the Roman architect Vitruvius, according to whom a good building embodies firmitas, utilitas, and venustas (durability, utility, and beauty). Centuries later, Leon Battista Alberti developed...
    Proportion is a central principle of architectural theory and an important connection between mathematics and art. It is the visual effect of the relationships of the various objects and spaces that make up a structure to one another and to the whole. These relationships are often governed by multiples of a standard unit of length known as a "module".Proportion in architecture was discussed by Vitruvius, Leon Battista Alberti, Andrea Palladio, and Le Corbusier among others. Landscape architecture is the design of outdoor areas, landmarks, and structures to achieve environmental, social-behavioural, or aesthetic outcomes. It involves the systematic design and general engineering of various structures for construction and human use, investigation of existing social, ecological, and soil conditions and processes in the landscape, and the design of other interventions that will produce desired outcomes. The scope of the profession is broad and can be subdivided into several sub-categories including professional or licensed landscape architects who are regulated by governmental agencies and possess the expertise to design a wide range of structures and landforms for human use; landscape design which is not a licensed profession; site planning; stormwater management; erosion control; environmental restoration; parks, recreation and urban planning; visual resource management; green infrastructure planning and provision; and private estate and residence landscape...
    The Basílica de la Sagrada Família (Catalan: [bəˈzilikə ðə lə səˈɣɾaðə fəˈmiljə]; Spanish: Basílica de la Sagrada Familia; 'Basilica of the Holy Family'), also known as the Sagrada Família, is a large unfinished Roman Catholic minor basilica in the Eixample district of Barcelona, Catalonia, Spain. Designed by the Spanish architect Antoni Gaudí (1852–1926), his work on the building is part of a UNESCO World Heritage Site. On 7 November 2010, Pope Benedict XVI consecrated the church and proclaimed it a minor basilica.On 19 March 1882, construction of the Sagrada Família began under architect Francisco de Paula del Villar. In 1883, when Villar resigned, Gaudí took over as chief architect, transforming the project with his architectural and engineering style, combining Gothic and curvilinear Art Nouveau forms. Gaudí devoted the remainder of his life to the project, and he is buried in the crypt. At the time of his death in 1926, less than a quarter of the project was complete.Relying sole... The Colosseum ( KOL-ə-SEE-əm; Italian: Colosseo [kolosˈsɛːo]) is an oval amphitheatre in the centre of the city of Rome, Italy, just east of the Roman Forum. It is the largest ancient amphitheatre ever built, and is still the largest standing amphitheatre in the world today, despite its age. Construction began under the emperor Vespasian (r. 69–79 AD) in 72 and was completed in 80 AD under his successor and heir, Titus (r. 79–81). Further modifications were made during the reign of Domitian (r. 81–96). The three emperors that were patrons of the work are known as the Flavian dynasty, and the amphitheatre was named the Flavian Amphitheatre (Latin: Amphitheatrum Flavium; Italian: Anfiteatro Flavio [aɱfiteˈaːtro ˈflaːvjo]) by later classicists and archaeologists for its association with their family name (Flavius).The Colosseum is built of travertine limestone, tuff (volcanic rock), and brick-faced concrete. The Colosseum could hold an estimated 50,000 to 80,000 spectators at various poin...
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates

All Hyperparameters

Click to expand
  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.0270 10 0.4417
0.0541 20 0.2135
0.0811 30 0.0849
0.1081 40 0.2744
0.1351 50 0.2297
0.1622 60 0.2694
0.1892 70 0.1039
0.2162 80 0.144
0.2432 90 0.0802
0.2703 100 0.0886
0.2973 110 0.1841
0.3243 120 0.0515
0.3514 130 0.373
0.3784 140 0.0519
0.4054 150 0.0942
0.4324 160 0.1645
0.4595 170 0.1254
0.4865 180 0.1549
0.5135 190 0.1378
0.5405 200 0.1643
0.5676 210 0.116
0.5946 220 0.0724
0.6216 230 0.1589
0.6486 240 0.2252
0.6757 250 0.1201
0.7027 260 0.2506
0.7297 270 0.0639
0.7568 280 0.2527
0.7838 290 0.267
0.8108 300 0.0509
0.8378 310 0.2324
0.8649 320 0.2107
0.8919 330 0.1843
0.9189 340 0.0659
0.9459 350 0.1914
0.9730 360 0.0676
1.0 370 0.1129

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 5.1.0
  • Transformers: 4.53.3
  • PyTorch: 2.8.0+cu128
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
Downloads last month
2
Safetensors
Model size
0.1B params
Tensor type
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for kavish218/nomic_embeddings-htc-2

Finetuned
(22)
this model