GGUF version please
Could you please release the GGUF version of this model? I've been following your models and they all seem very interesting, but unfortunately you don't release GGUF versions of them.
Hi @Hoioi
Absolutely! I know how to do GPTQ, but let me have a look at the GGUF script and see how it works. I will ask you to test it just to be sure I did it correctly.
Hi @Hoioi
I have made my first GGUF. Since you already seem to know how to use this format, could you please have a look and test them? I followed the official instructions and quantized from the f16 for better accuracy, but I wanted to be sure: https://huggingface.co/MaziyarPanahi/Tess-XS-v1-3-yarn-128K-Mistral-7B-Instruct-v0.1-GGUF
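For reference, the "convert to f16, then quantize" workflow mentioned above looked roughly like this in llama.cpp at the time. This is a hedged sketch: the script and binary names (`convert.py`, `quantize`) have changed across llama.cpp versions, and the local paths and output filenames here are placeholders.

```shell
# Sketch of the GGUF pipeline described above, assuming a llama.cpp
# checkout from late 2023 and a locally cloned HF model directory.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp && make

# 1. Convert the HF checkpoint to an f16 GGUF first, so quantization
#    starts from full-precision weights rather than a lossy source.
python convert.py /path/to/Tess-XS-v1-3-yarn-128K-Mistral-7B-Instruct-v0.1 \
    --outtype f16 \
    --outfile tess-xs-f16.gguf

# 2. Quantize the f16 GGUF down to the smaller distribution formats.
./quantize tess-xs-f16.gguf tess-xs.Q3_K_M.gguf Q3_K_M
./quantize tess-xs-f16.gguf tess-xs.Q4_K_M.gguf Q4_K_M
```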
Yes, sure. I'm downloading them and will update you after testing them in a few hours.
I downloaded Tess-XS-v1-3-yarn-128K-Mistral-7B-Instruct-v0.1.Q3_K_M.gguf and loaded it in the Oobabooga web UI. It loaded successfully, but I noticed that instead of 128K, the default parameters of the model are as below:
n_ctx=32768
Truncate the prompt up to this length= 32768
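(A possible explanation for the 32768 default: the GGUF header stores the base context length of the underlying Mistral model, and YaRN-extended context typically has to be requested at load time. In llama.cpp builds of that era this looked roughly like the following; the flag names and the 64K target here are assumptions to illustrate the idea, not a tested recipe.)

```shell
# Hedged sketch: ask llama.cpp for a larger context with YaRN rope
# scaling, telling it the model's original (pre-YaRN) context size.
./main -m tess-xs.Q3_K_M.gguf \
    -c 65536 \
    --rope-scaling yarn \
    --yarn-orig-ctx 32768 \
    -p "Hello"
```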
But that's not the main issue. The main issue is that after giving the model instructions in the proper prompt format (or any other prompt format), the model doesn't return anything, and the Oobabooga web UI shows an error.
The model doesn't work in Koboldcpp either.
So it cannot be used.
Could be related to this:
- https://github.com/ggerganov/llama.cpp/issues/3867
- https://github.com/ggerganov/llama.cpp/issues/4381
There are some known issues with Mistral-based models and llama.cpp; I will look tomorrow to see if it can be fixed.
Thank you so much. I will download some of your other GGUF models and will update you. I'm pretty sure they will work fine.
@Hoioi
I tested it locally with llama.cpp and you are right, it doesn't work: it fails with a vocab error. I think I found the issue. Since this was my first time making GGUFs, I used the wrong script to convert the model. I need to update all my other GGUF models, but before doing that, could you please re-try this model, which has the fix (I re-uploaded them all): https://huggingface.co/MaziyarPanahi/Tess-XS-v1-3-yarn-128K-Mistral-7B-Instruct-v0.1-GGUF
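(For anyone hitting the same vocab error: llama.cpp ships several converter scripts, and running the wrong one for your architecture can produce a GGUF with a broken or missing vocabulary. A quick local generation run surfaces this at load time, before uploading. This is a sketch; the `main` binary name and exact behavior depend on the llama.cpp version.)

```shell
# Hedged smoke test for a freshly converted GGUF: a short generation
# run will fail immediately at model load if the vocab is broken.
./main -m Tess-XS-v1-3-yarn-128K-Mistral-7B-Instruct-v0.1.Q3_K_M.gguf \
    -p "Hello, how are you?" \
    -n 32
```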
Thank you so much for your hard work. I downloaded and tested the model and it works perfectly. I'm really interested in your great work and I hope you continue developing new models, especially the GGUF versions.
Thank you again!
@Hoioi Many thanks for your feedback, I appreciate it. I'll continue merging new models and quantizing them to GGUF at the same time. (I like how GGUF is accessible and useful to a wide range of users without GPUs!)

