How to use this model?
#1
by
1pharaxh
- opened
Hi, I was wondering how I can use this inside my Xcode project?
Thanks!
This model was built for use with https://github.com/starkdmi/ml_sharp_mlx. The weights are exactly what the tool in that repo produces, except that they have been converted to float16.
Thanks a bunch! It works flawlessly!
1pharaxh
changed discussion status to
closed
1pharaxh
changed discussion status to
open
Hi, I was wondering if you could share how you quantised the model.
Thanks
I used a naive implementation of:
import safetensors.torch as st
import os
print('Loading float32 weights...')
tensors = st.load_file('checkpoints/sharp.safetensors')
print('Converting to float16...')
tensors_fp16 = {k: v.half() for k, v in tensors.items()}
print('Saving float16 weights...')
st.save_file(tensors_fp16, 'checkpoints/sharp_fp16.safetensors')
orig_size = os.path.getsize('checkpoints/sharp.safetensors') / 1024**3
new_size = os.path.getsize('checkpoints/sharp_fp16.safetensors') / 1024**3
print(f'Original: {orig_size:.2f} GB')
print(f'Float16: {new_size:.2f} GB')
print('Done!')
The quantization could definitely be improved, but I'm not sure how much of a difference it would make. It was sufficient for my tests.
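One low-effort improvement to the cast above would be to skip non-float tensors and to flag any weights whose values do not fit in float16. The following is only a sketch of that idea, not what was used for this model: it works on a plain dict of NumPy arrays rather than torch tensors, and the function name `cast_to_fp16` is made up for illustration.

```python
import numpy as np

# Largest finite value representable in float16 (65504.0).
FP16_MAX = float(np.finfo(np.float16).max)


def cast_to_fp16(tensors):
    """Cast float32 arrays to float16 and leave every other dtype untouched.

    Returns (converted, overflowed), where `overflowed` lists the names of
    tensors holding finite values larger in magnitude than FP16_MAX.
    Such values may turn into inf after the cast.
    """
    converted = {}
    overflowed = []
    for name, arr in tensors.items():
        if arr.dtype != np.float32:
            # Integer buffers, already-half weights, etc. are passed through.
            converted[name] = arr
            continue
        finite = arr[np.isfinite(arr)]
        if finite.size and np.abs(finite).max() > FP16_MAX:
            overflowed.append(name)
        # Silence NumPy's overflow warning; overflow is reported via the list.
        with np.errstate(over="ignore"):
            converted[name] = arr.astype(np.float16)
    return converted, overflowed
```

To apply it to a checkpoint, the dict could come from `load_file` in `safetensors.numpy` and the result could be written back with `save_file` from the same module, assuming the checkpoint contains only dtypes NumPy supports. If `overflowed` comes back non-empty, those tensors are probably worth keeping in float32.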
agg23
changed discussion status to
closed