---
license: apache-2.0
base_model: Falconsai/text_summarization
tags:
- generated_from_trainer
- summarization
- compression
- prompt-summarization
- prompt-compression
- text_summarization
- text_compression
metrics:
- rouge
model-index:
- name: very-small-prompt-compression
  results: []
datasets:
- gravitee-io/dolly-15k-prompt-compression
---

# very-small-prompt-compression

Interactive demo: [Very Small Prompt Compression (Space)](https://huggingface.co/spaces/gravitee-io/very-small-prompt-compression-demo)

This model is a fine-tuned version of [Falconsai/text_summarization](https://huggingface.co/Falconsai/text_summarization) on the [gravitee-io/dolly-15k-prompt-compression](https://huggingface.co/datasets/gravitee-io/dolly-15k-prompt-compression) dataset.
It achieves the following results on the evaluation set:
- Loss: 2.1583
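
For a quick start, here is a minimal usage sketch based on the standard Hugging Face Transformers summarization pipeline; the checkpoint id matches this repository, while the sample prompt and generation settings are illustrative assumptions rather than values from the training setup:

```python
# Minimal usage sketch (standard Transformers pipeline API; generation
# settings below are illustrative assumptions, not tuned values).
from transformers import pipeline

compressor = pipeline(
    "summarization",
    model="gravitee-io/very-small-prompt-compression",
)

prompt = (
    "Could you please give me a detailed, step-by-step explanation of how "
    "photosynthesis works in plants, including the role of chlorophyll?"
)

# The model rewrites the prompt into a shorter form that preserves its intent.
compressed = compressor(prompt, max_length=64, do_sample=False)[0]["summary_text"]
print(compressed)
```

Since the base model is a T5-style seq2seq checkpoint, `AutoTokenizer` and `AutoModelForSeq2SeqLM` can be used in place of the pipeline when finer control over generation is needed.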

Additional evaluation highlights:

- **Held-out compression:** On the ≤64-token evaluation split, the model reaches a mean compression ratio of 0.7395 (≈26% token reduction), with only 0.04% of generations exceeding the original length.
- **Semantic fidelity:** Cosine similarity between embeddings of the original and compressed text (`text-embedding-3-small`) averages above 0.90, indicating that key semantics are preserved.
- **Instruction alignment:** A ROUGE-L of 0.7792 against the synthetic targets shows the model closely matches the policy-compliant outputs produced during data generation.
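
For concreteness, a short sketch of how the compression statistics above can be defined. This illustrates the metric only, not the project's actual evaluation harness, and counting tokens with the model's own tokenizer is an assumption:

```python
# Illustrative metric sketch; using the model's tokenizer for token
# counts is an assumption, not a documented detail of the evaluation.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gravitee-io/very-small-prompt-compression")

def compression_stats(originals: list[str], compressed: list[str]):
    """Return (mean compression ratio, fraction of outputs longer than input)."""
    ratios = []
    overlength = 0
    for orig, comp in zip(originals, compressed):
        n_orig = len(tokenizer.encode(orig))
        n_comp = len(tokenizer.encode(comp))
        ratios.append(n_comp / n_orig)  # e.g. a 0.7395 mean -> ~26% token reduction
        overlength += int(n_comp > n_orig)
    return sum(ratios) / len(ratios), overlength / len(ratios)
```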

## License

This model is released under the Apache 2.0 License.

## Acknowledgments

- Training data sourced from [databricks/databricks-dolly-15k](https://huggingface.co/datasets/databricks/databricks-dolly-15k) and the compressed derivative [gravitee-io/dolly-15k-prompt-compression](https://huggingface.co/datasets/gravitee-io/dolly-15k-prompt-compression)
- Base model: [Falconsai/text_summarization](https://huggingface.co/Falconsai/text_summarization)

## Citation

If you use this model in your research, please cite:

```
@misc{very_small_prompt_compression_2025,
  title={Very Small Prompt Compression Model},
  author={Derek Thompson - Gravitee.io},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/gravitee-io/very-small-prompt-compression}}
}
```

## Contact

For questions, issues, or contributions, please open an issue on the model repository.

---

Generated by [dotslashderek](https://huggingface.co/dotslashderek) on 2025-10-31