+---
+license: llama2
+datasets:
+- umd-zhou-lab/recycled_alpaca_v1
+language:
+- en
+---
+# Model Card for umd-zhou-lab/recycled-alpaca-7b-v1.0
+<!-- Provide a quick summary of what the model is/does. -->
+This model was trained by fine-tuning Llama 2 7B on the Recycled Alpaca V1 data.
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** UMD Tianyi Zhou Lab
+- **Model type:** An auto-regressive language model based on the transformer architecture
+- **License:** Llama 2 Community License Agreement
+- **Finetuned from model:** [meta-llama/Llama-2-7b](https://huggingface.co/meta-llama/Llama-2-7b)
+### Model Sources
+<!-- Provide the basic links for the model. -->
+- **GitHub:** [Reflection-Tuning](https://github.com/tianyi-lab/Reflection_Tuning)
+- **Paper:** [Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning](https://arxiv.org/abs/2310.11716)
+- **Data:** [recycled_alpaca_v1](https://huggingface.co/datasets/umd-zhou-lab/recycled_alpaca_v1)
+## Uses
+The primary use of this model is research on large language models and chatbots.
+The primary intended users of the model are researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.
+## Training
+We use the prompt from [FastChat](https://github.com/lm-sys/FastChat):
+```
+A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: Hi ASSISTANT: Hello.</s>USER: Who are you? ASSISTANT: I am ...</s>......
+```
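
The template above can be assembled programmatically. The following is a minimal sketch, not the FastChat implementation: the helper name `build_prompt` and the `(user, assistant)` tuple layout are hypothetical, and the separators (a single space inside a turn, `</s>` after each assistant reply) are inferred from the sample string shown above.

```python
# System prompt copied from the template shown in this card.
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)
EOS = "</s>"  # end-of-sequence marker that closes each assistant reply


def build_prompt(turns, system=SYSTEM_PROMPT):
    """Format a conversation in the layout shown above.

    `turns` is a list of (user_message, assistant_reply) pairs. Use None as
    the reply of the last pair to leave the prompt open for generation.
    This helper is illustrative only (an assumption, not part of FastChat).
    """
    parts = [system + " "]
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg} ASSISTANT:")
        if assistant_msg is not None:
            # A completed assistant turn is followed directly by </s>.
            parts.append(f" {assistant_msg}{EOS}")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("Hi", "Hello."), ("Who are you?", None)]))
```

Leaving the final reply as `None` ends the string with `ASSISTANT:`, which is the point at which the model would be expected to continue.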
+| Hyperparameter | Global Batch Size | Learning Rate | Epochs | Max Length | Weight Decay | Warmup Rate |
+| --- | ---: | ---: | ---: | ---: | ---: | ---: |
+| Recycled Models (7B) | 128 | 2e-5 | 3 | 2048 | 0 | 0.03 |
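
To show how these values relate to a concrete run, here is a rough sketch of the arithmetic. Only the global batch size, epoch count, and warmup rate come from the table; the per-device batch size, device count, and dataset size used in the example call are hypothetical, and the ceiling-based rounding is an assumption rather than a statement about the authors' training script.

```python
import math

# Values taken from the hyperparameter table above.
GLOBAL_BATCH_SIZE = 128
EPOCHS = 3
WARMUP_RATE = 0.03


def grad_accum_steps(global_batch, per_device_batch, num_devices):
    """Gradient-accumulation steps needed to reach the global batch size."""
    per_step = per_device_batch * num_devices
    if global_batch % per_step != 0:
        raise ValueError("global batch must be a multiple of per-device batch x devices")
    return global_batch // per_step


def schedule(num_examples, global_batch=GLOBAL_BATCH_SIZE,
             epochs=EPOCHS, warmup_rate=WARMUP_RATE):
    """Return (total optimizer steps, warmup steps), rounding up (an assumption)."""
    steps_per_epoch = math.ceil(num_examples / global_batch)
    total_steps = steps_per_epoch * epochs
    warmup_steps = math.ceil(total_steps * warmup_rate)
    return total_steps, warmup_steps


if __name__ == "__main__":
    # Hypothetical hardware: 8 devices with a per-device batch of 4.
    print(grad_accum_steps(GLOBAL_BATCH_SIZE, per_device_batch=4, num_devices=8))
    # Hypothetical dataset size of 52,000 examples.
    print(schedule(num_examples=52_000))
```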
+## Performance
+The following table compares our recycled models (V1) with baseline models on the AlpacaEval Leaderboard and the Hugging Face Open LLM Leaderboard. <br>
+The Recycled Alpaca Data can be found here: [[hf-Link]](https://huggingface.co/datasets/umd-zhou-lab/recycled_alpaca_v1) <br>
+The Recycled WizardLM (70k) Data can be found here: [[hf-Link]](https://huggingface.co/datasets/umd-zhou-lab/recycled_wiz70_v1) <br>
+|                          | **AlpacaEval** || **Avg** | **ARC** | **HellaSwag** | **MMLU** | **TruthfulQA** || **Model**|
+|--------------------------|:--------------:|:-:|:-----------:|:-------:|:-------------:|:-------:|:--------------:|:-:|:-:|
+| **Alpaca 7B**            | 26.46          || 50.21       | 42.65   | 76.91         | 41.73   | 39.55          ||/|
+| **Recycled Alpaca 7B V1.0**   | 76.99          || 56.18| 53.92   | 77.68         | 47.55   | 45.55          ||[[hf-Link]](https://huggingface.co/umd-zhou-lab/recycled-alpaca-7b-v1.0)|
+| **Recycled Alpaca 13B V1.0**  | 83.42          || 58.93| 58.70   | 80.80         | 53.11   | 43.12          ||[Link]|
+|||||||||||
+| **WizardLM 7B**          | 67.64          || 54.18       | 51.60   | 77.70         | 42.70   | 44.70          ||/|
+| **Recycled WizardLM 7B V1.0** | 78.88          || 56.21       | 53.92   | 77.05         | 48.35   | 45.52         ||[[hf-Link]](https://huggingface.co/umd-zhou-lab/recycled-wizardlm-7b-v1.0)|
+|||||||||||
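
The **Avg** column appears to be the arithmetic mean of the ARC, HellaSwag, MMLU, and TruthfulQA scores, rounded half-up to two decimals. That reading is inferred from the numbers in the table, not stated in the card. A small sketch that recomputes it (the names `ROWS` and `average` are illustrative):

```python
from decimal import Decimal, ROUND_HALF_UP

# (reported Avg, [ARC, HellaSwag, MMLU, TruthfulQA]) copied from the table above.
ROWS = {
    "Alpaca 7B": ("50.21", ["42.65", "76.91", "41.73", "39.55"]),
    "Recycled Alpaca 7B V1.0": ("56.18", ["53.92", "77.68", "47.55", "45.55"]),
    "Recycled Alpaca 13B V1.0": ("58.93", ["58.70", "80.80", "53.11", "43.12"]),
    "WizardLM 7B": ("54.18", ["51.60", "77.70", "42.70", "44.70"]),
    "Recycled WizardLM 7B V1.0": ("56.21", ["53.92", "77.05", "48.35", "45.52"]),
}


def average(scores):
    """Mean of decimal score strings, rounded half-up to two places.

    Decimal is used instead of float so that values such as 56.175
    round up exactly rather than depending on binary representation.
    """
    total = sum(Decimal(s) for s in scores)
    return (total / len(scores)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    for model, (reported, scores) in ROWS.items():
        print(model, average(scores), "reported:", reported)
```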
+## Citation
+Please consider citing our paper if you find our code, data, or models useful. Thank you!
+```
+@misc{li2023reflectiontuning,
+      title={Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning},
+      author={Ming Li and Lichang Chen and Jiuhai Chen and Shwai He and Heng Huang and Jiuxiang Gu and Tianyi Zhou},
+      year={2023},
+      eprint={2310.11716},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL}
+}
+```