New REAP GLM 4.6!!!

#2
by InfernalDread - opened

Hello,

Is it possible for you to create an MXFP4 GGUF quant of this model as well?

https://huggingface.co/cerebras/GLM-4.6-REAP-218B-A32B-FP8

I am really digging the MXFP4 quants, so I wanted to test the quality for this model as well!

Thank you for your hard work!

EDIT:

It seems there may still be an issue with the MTP layers of these new REAP models, so it would probably be best to wait until that is resolved so that the GGUF can be run in llama.cpp.

NEW EDIT:

It looks like the Cerebras team has fixed the MTP issue, so hopefully everything is ready for conversion!

It's in the works at the moment; I'm just waiting for it to finish. If all goes well, it will be available at: https://huggingface.co/sm54/GLM-4.6-REAP-218B-A32B-MXFP4_MOE

I think it'll be another hour or two before it's finished, as the instance I am using doesn't have a very fast SSD.

Edited: Okay, it's now been uploaded to HF. It's quantised from the BF16 version.
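
For anyone curious what "quantised from the BF16 version" roughly involves, here is a minimal sketch that only builds (and prints) the two llama.cpp commands one would typically run. It is not necessarily the exact procedure used for this upload. The checkpoint directory, output file names, and the `build/bin` location of `llama-quantize` are assumptions, and `build_conversion_commands` is a hypothetical helper name; the `MXFP4_MOE` type name is assumed to match the quant type in a recent llama.cpp build.

```python
from pathlib import Path
import shlex


def build_conversion_commands(model_dir, out_dir, llama_cpp_dir="llama.cpp"):
    """Return two argv lists: HF checkpoint -> BF16 GGUF -> MXFP4_MOE GGUF.

    Nothing is executed here; the commands are only assembled so they can
    be inspected first. All paths are placeholders (assumptions).
    """
    out = Path(out_dir)
    bf16_gguf = out / "model-bf16.gguf"        # hypothetical intermediate file
    mxfp4_gguf = out / "model-mxfp4_moe.gguf"  # hypothetical final file

    # Step 1: convert the BF16 Hugging Face checkpoint to a BF16 GGUF.
    convert = [
        "python",
        str(Path(llama_cpp_dir) / "convert_hf_to_gguf.py"),
        str(model_dir),
        "--outtype", "bf16",
        "--outfile", str(bf16_gguf),
    ]

    # Step 2: quantise the BF16 GGUF (assumes a default CMake build layout).
    quantize = [
        str(Path(llama_cpp_dir) / "build" / "bin" / "llama-quantize"),
        str(bf16_gguf),
        str(mxfp4_gguf),
        "MXFP4_MOE",
    ]
    return [convert, quantize]


if __name__ == "__main__":
    # "path/to/bf16-checkpoint" is a placeholder for a local BF16 download.
    for cmd in build_conversion_commands("path/to/bf16-checkpoint", "out"):
        print(shlex.join(cmd))
```

Printing the commands first makes it easy to check the paths before committing to a run that, on a slow disk, can take hours.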

You are a legend!

sm54 changed discussion status to closed
