How to use this model?

by 1pharaxh - opened 11 days ago

Discussion

1pharaxh

11 days ago

Hi, I was wondering how I can use this inside my Xcode project?

thanks!

agg23

Owner 11 days ago

This model was built for use with https://github.com/starkdmi/ml_sharp_mlx. It is unchanged from what is built by the tool in that repo, other than converting to f16.

1pharaxh

8 days ago

Thanks a bunch! It works flawlessly!

1pharaxh changed discussion status to closed 8 days ago

1pharaxh changed discussion status to open 8 days ago

1pharaxh

8 days ago

Hi, I was wondering if you could share how you quantised the model?

Thanks

agg23

Owner 8 days ago

I used a naive conversion script:

import safetensors.torch as st
import os

print('Loading float32 weights...')
tensors = st.load_file('checkpoints/sharp.safetensors')

print('Converting to float16...')
tensors_fp16 = {k: v.half() for k, v in tensors.items()}

print('Saving float16 weights...')
st.save_file(tensors_fp16, 'checkpoints/sharp_fp16.safetensors')

orig_size = os.path.getsize('checkpoints/sharp.safetensors') / 1024**3
new_size = os.path.getsize('checkpoints/sharp_fp16.safetensors') / 1024**3
print(f'Original: {orig_size:.2f} GB')
print(f'Float16:  {new_size:.2f} GB')
print('Done!')

The quantization could definitely be improved, but I'm not sure how much of a difference it would make. It was sufficient for my tests.
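One possible refinement of the script above, offered as a sketch and not as what was actually run for this model: cast only the floating-point tensors to float16, leave integer or boolean tensors untouched, and report the largest rounding error per tensor so the effect of the conversion can be checked. It uses NumPy arrays through `safetensors.numpy` in place of `safetensors.torch`. The helper names `cast_floats_to_fp16`, `max_abs_error`, and `convert_file` are hypothetical and made up for this example.

```python
import numpy as np


def cast_floats_to_fp16(tensors):
    """Cast float32/float64 arrays to float16; leave all other dtypes as-is."""
    out = {}
    for name, arr in tensors.items():
        if arr.dtype in (np.float32, np.float64):
            out[name] = arr.astype(np.float16)
        else:
            # Integer, boolean, or already-float16 tensors are kept unchanged.
            out[name] = arr
    return out


def max_abs_error(original, converted):
    """Largest absolute difference introduced by the cast, per converted tensor."""
    errors = {}
    for name, arr in original.items():
        was_cast = converted[name].dtype == np.float16 and arr.dtype != np.float16
        if was_cast:
            diff = np.abs(arr.astype(np.float64) - converted[name].astype(np.float64))
            errors[name] = float(diff.max()) if diff.size else 0.0
    return errors


def convert_file(src, dst):
    """Load a safetensors file, cast it, save it, and return the error report."""
    # Imported lazily so the helpers above can be used without safetensors.
    from safetensors.numpy import load_file, save_file

    tensors = load_file(src)
    converted = cast_floats_to_fp16(tensors)
    save_file(converted, dst)
    return max_abs_error(tensors, converted)
```

Calling `convert_file('checkpoints/sharp.safetensors', 'checkpoints/sharp_fp16.safetensors')` would mirror the paths used in the script above. Values whose magnitude exceeds the float16 range (about 65504) become `inf` after the cast, which would show up as an infinite entry in the error report.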

agg23 changed discussion status to closed 8 days ago
