---
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
datasets:
- NingLab/MMECInstruct
license: cc-by-4.0
pipeline_tag: image-text-to-text
library_name: transformers
---
# CASLIE-M

This repository contains the CASLIE-M model presented in the paper [Captions Speak Louder than Images: Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data](https://huggingface.co/papers/2410.17337).

Project page: https://ninglab.github.io/CASLIE/

Code: https://github.com/ninglab/CASLIE

## CASLIE Models

The CASLIE-M model is instruction-tuned from the medium-size base model [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3).
## Sample Usage

To run inference, use `python inference.py --model_path $model_path --task $task --output_path $output_path`, where:

- `$model_path` is the path of the instruction-tuned model.
- `$task` specifies the task to be tested.
- `$output_path` specifies the path where the inference output is saved.

Example:

```bash
python inference.py --model_path NingLab/CASLIE-M --task answerability_prediction --output_path ap.json
```
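
The same call can also be written with the three placeholders filled in as shell variables, which may make it easier to swap in another model path, task, or output file. This is a minimal sketch, not part of the released code: it assumes it is run from a directory that contains `inference.py`, and `DRY_RUN` is a hypothetical variable introduced only for this example. By default the script only prints the command it would run.

```bash
#!/usr/bin/env bash
# Sketch only: wraps the documented inference command in shell variables.
set -euo pipefail

model_path="NingLab/CASLIE-M"      # path of the instruction-tuned model
task="answerability_prediction"    # task to be tested
output_path="ap.json"              # where the inference output is saved

# Build the command as an array so each argument stays a single word.
cmd=(python inference.py --model_path "$model_path" --task "$task" --output_path "$output_path")

# DRY_RUN is a hypothetical switch for this sketch (not an inference.py flag).
# DRY_RUN=1 (default) prints the command; DRY_RUN=0 executes it.
if [ "${DRY_RUN:-1}" = "1" ]; then
  echo "${cmd[*]}"
else
  "${cmd[@]}"
fi
```

Setting `DRY_RUN=0` before invoking the script executes the command instead of printing it.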
## Citation

```bibtex
@article{ling2024captions,
  title={Captions Speak Louder than Images (CASLIE): Generalizing Foundation Models for E-commerce from High-quality Multimodal Instruction Data},
  author={Ling, Xinyi and Peng, Bo and Du, Hanwen and Zhu, Zhihui and Ning, Xia},
  journal={arXiv preprint arXiv:2410.17337},
  year={2024}
}
```