---
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: Solicite um relatório financeiro trimestral via ERP conectado.
- text: If you save $200 monthly, how much money will you have saved after 18 months?
- text: Get the stock price history of Tesla for the last month.
- text: Given a historical archive of economic indicators, build a forecasting model that predicts recessions, incorporating leading, lagging, and coincident indicators with explainable outputs.
- text: Narrate the experience of a character born without the ability to dream.
metrics:
- accuracy
pipeline_tag: text-classification
library_name: setfit
inference: true
base_model: ibm-granite/granite-embedding-107m-multilingual
model-index:
- name: SetFit with ibm-granite/granite-embedding-107m-multilingual
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.9966555183946488
      name: Accuracy
---

# SetFit with ibm-granite/granite-embedding-107m-multilingual

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [ibm-granite/granite-embedding-107m-multilingual](https://huggingface.co/ibm-granite/granite-embedding-107m-multilingual) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
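Step 1 above depends on sentence pairs that are marked as sharing a class or not. The sketch below illustrates that idea only: it pairs every example with every other one and assigns a target of 1.0 to same-label pairs and 0.0 otherwise, the kind of target a cosine-similarity loss is trained against. The helper name `build_contrastive_pairs` and the exhaustive pairing are assumptions made for illustration; SetFit's own sampler (for instance the `oversampling` strategy listed under the training hyperparameters) selects pairs differently. The texts come from the Model Labels table in this card.

```python
from itertools import combinations


def build_contrastive_pairs(texts, labels):
    """Return (text_a, text_b, target) triples for every unordered pair.

    target is 1.0 when both texts share a label and 0.0 otherwise.
    Hypothetical helper for illustration; not part of the SetFit API.
    """
    pairs = []
    for (i, text_a), (j, text_b) in combinations(enumerate(texts), 2):
        target = 1.0 if labels[i] == labels[j] else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# A handful of examples and labels copied from the Model Labels table.
texts = [
    "Fetch comprehensive user reviews and ratings for a mobile app across platforms.",
    "Retrieve country-wise COVID-19 vaccination rates from an authoritative source.",
    "Write a story from the perspective of a shadow that gains independence.",
    "Quantos centímetros tem 1 metro?",
]
labels = ["tool", "tool", "creativity", "basic_reasoning"]

pairs = build_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]

print(f"{len(pairs)} pairs: {len(positives)} positive, {len(negatives)} negative")
```

With four texts there are six unordered pairs, and only the two `tool` examples form a positive pair. Step 2 would then embed each text with the fine-tuned body and fit the classification head on those embeddings.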
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [ibm-granite/granite-embedding-107m-multilingual](https://huggingface.co/ibm-granite/granite-embedding-107m-multilingual)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 8

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label             | Examples |
|:------------------|:---------|
| summarization     | 'Resuma um texto acadêmico sobre psicologia do comportamento.'<br>'Summarize the timeline and outcomes of a historical event based on multiple eyewitness accounts.'<br>'Extract and summarize the key lessons learned from multiple post-project reviews.' |
| general_knowledge | 'Qual é a importância da agricultura para a economia brasileira?'<br>'Quais são os principais países membros da Organização dos Países Exportadores de Petróleo (OPEP)?'<br>'What is the mechanism by which vaccines provide immunity?' |
| roleplay          | 'Personifique um chef pâtissier criando uma sobremesa para um júri exigente.'<br>'You are a software tester devising scenarios to uncover bugs in a complex system.'<br>'Simule uma reunião de conselho editorial decidindo o rumo de uma grande publicação.' |
| creativity        | 'Write a thriller in which the protagonist communicates only through artwork.'<br>'Imagine um poema narrativo sobre a relação entre o sertão e a poesia de uma geração esquecida.'<br>'Write a story from the perspective of a shadow that gains independence.' |
| complex_reasoning | 'Analise as implicações do uso de drones autônomos para entregas em áreas urbanas densas.'<br>'Proponha um sistema para avaliação automatizada e justa de currículos em processos seletivos corporativos.'<br>'Proponha um modelo para prever o crescimento urbano sustentável considerando variáveis ambientais e sociais.' |
| coding            | 'Implemente uma função para decompor números inteiros em fatores primos eficientemente para valores grandes.'<br>'Create an integration that consumes streaming data from an external message broker and processes events in real-time with backpressure management.'<br>'Escreva um algoritmo para encontrar os pontos de articulação (cut vertices) em um grafo não direcionado.' |
| basic_reasoning   | 'Se um carro consome 12 litros de gasolina para 100 km, quantos litros usará para 150 km?'<br>'If a ladder leans against a wall forming a 60-degree angle and the ladder length is 10 feet, how high does it reach on the wall?'<br>'Quantos centímetros tem 1 metro?' |
| tool              | 'Fetch comprehensive user reviews and ratings for a mobile app across platforms.'<br>'Analyze sentiment of a tweet and classify it as positive, neutral, or negative.'<br>'Retrieve country-wise COVID-19 vaccination rates from an authoritative source.' |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.9967   |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("cnmoro/prompt-router")
# Run inference
preds = model("Get the stock price history of Tesla for the last month.")
```

## Training Details

### Training Set Metrics
| Training set | Min | Mean    | Max |
|:-------------|:----|:--------|:----|
| Word count   | 5   | 13.6792 | 38  |

| Label             | Training Sample Count |
|:------------------|:----------------------|
| summarization     | 160                   |
| tool              | 144                   |
| general_knowledge | 154                   |
| roleplay          | 145                   |
| complex_reasoning | 130                   |
| creativity        | 164                   |
| coding            | 152                   |
| basic_reasoning   | 148                   |

### Training Hyperparameters
- batch_size: (8, 8)
- num_epochs: (1, 16)
- max_steps: 2400
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- evaluation_strategy: steps
- eval_max_steps: -1
- load_best_model_at_end: True

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0004 | 1    | 0.1954        | -               |
| 0.0208 | 50   | 0.2125        | -               |
| 0.0417 | 100  | 0.2131        | -               |
| 0.0625 | 150  | 0.2072        | -               |
| 0.0833 | 200  | 0.2029        | 0.1902          |
| 0.1042 | 250  | 0.1925        | -               |
| 0.125  | 300  | 0.1764        | -               |
| 0.1458 | 350  | 0.1512        | -               |
| 0.1667 | 400  | 0.1229        | 0.1072          |
| 0.1875 | 450  | 0.1015        | -               |
| 0.2083 | 500  | 0.0862        | -               |
| 0.2292 | 550  | 0.065         | -               |
| 0.25   | 600  | 0.0505        | 0.0504          |
| 0.2708 | 650  | 0.0532        | -               |
| 0.2917 | 700  | 0.0427        | -               |
| 0.3125 | 750  | 0.0378        | -               |
| 0.3333 | 800  | 0.0357        | 0.0322          |
| 0.3542 | 850  | 0.0286        | -               |
| 0.375  | 900  | 0.0381        | -               |
| 0.3958 | 950  | 0.0333        | -               |
| 0.4167 | 1000 | 0.0307        | 0.0235          |
| 0.4375 | 1050 | 0.0245        | -               |
| 0.4583 | 1100 | 0.0245        | -               |
| 0.4792 | 1150 | 0.0217        | -               |
| 0.5    | 1200 | 0.0193        | 0.0168          |
| 0.5208 | 1250 | 0.0167        | -               |
| 0.5417 | 1300 | 0.0158        | -               |
| 0.5625 | 1350 | 0.02          | -               |
| 0.5833 | 1400 | 0.0167        | 0.0120          |
| 0.6042 | 1450 | 0.0176        | -               |
| 0.625  | 1500 | 0.0159        | -               |
| 0.6458 | 1550 | 0.0141        | -               |
| 0.6667 | 1600 | 0.0131        | 0.0094          |
| 0.6875 | 1650 | 0.0097        | -               |
| 0.7083 | 1700 | 0.0109        | -               |
| 0.7292 | 1750 | 0.0126        | -               |
| 0.75   | 1800 | 0.0115        | 0.0079          |
| 0.7708 | 1850 | 0.0122        | -               |
| 0.7917 | 1900 | 0.0104        | -               |
| 0.8125 | 1950 | 0.0111        | -               |
| 0.8333 | 2000 | 0.011         | 0.0071          |
| 0.8542 | 2050 | 0.0095        | -               |
| 0.875  | 2100 | 0.009         | -               |
| 0.8958 | 2150 | 0.0107        | -               |
| 0.9167 | 2200 | 0.0099        | 0.0067          |
| 0.9375 | 2250 | 0.0084        | -               |
| 0.9583 | 2300 | 0.0086        | -               |
| 0.9792 | 2350 | 0.0089        | -               |
| 1.0    | 2400 | 0.0098        | 0.0066          |

### Framework Versions
- Python: 3.11.11
- SetFit: 1.2.0.dev0
- Sentence Transformers: 4.0.2
- Transformers: 4.51.3
- PyTorch: 2.6.0+cu124
- Datasets: 3.5.0
- Tokenizers: 0.21.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```