New REAP GLM 4.6!!!

by InfernalDread - opened 22 days ago

Discussion

InfernalDread

22 days ago

•

edited 22 days ago

Hello,

Is it possible for you to create an MXFP4 GGUF quant of this model as well?

https://huggingface.co/cerebras/GLM-4.6-REAP-218B-A32B-FP8

I am really digging the MXFP4 quants, so I wanted to test the quality for this model as well!

Thank you for your hard work!

EDIT:

It seems there may still be an issue with the MTP layers of these new REAP models, so it would be best to wait until that is resolved so that the GGUF can be run in llama.cpp.

NEW EDIT:

It looks like the Cerebras team has fixed the MTP issue, so everything should be ready for conversion!

sm54

Owner 21 days ago

•

edited 21 days ago

It's in the works at the moment; I'm just waiting for it to finish. If all goes well, it will be available at: https://huggingface.co/sm54/GLM-4.6-REAP-218B-A32B-MXFP4_MOE

I think it'll be another hour or two before it's finished, as the instance I am using doesn't have a very fast SSD.

Edited: Okay, it has now been uploaded to HF. It's quantised from the BF16 version.

InfernalDread

21 days ago

You are a legend!

sm54 changed discussion status to closed 21 days ago
