chiaraboretti committed on
Commit 9baaa33 · verified · 1 Parent(s): 85a0a1c

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -103,6 +103,7 @@ output_ids = generated_ids[0][len(model_inputs.input_ids[0]) :]
  print(tokenizer.decode(output_ids, skip_special_tokens=True))

  ```
+ > You can optionally compile the model’s forward pass using torch.compile, which can provide a significant speed boost (especially after the first run). Please consider that the first run will take longer because PyTorch compiles optimized kernels, but subsequent runs will be much faster.

  <details>
  <summary><span style="font-size:1.1em; font-weight:bold;">🧩 Quantization Process</span></summary>
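
The note added in this commit describes compiling the forward pass with `torch.compile` but does not show the call. The following is a minimal sketch of one way it might be done, not necessarily the author's exact setup. The helper names `compile_forward` and `time_calls` are hypothetical and introduced here for illustration. The variable `model` is assumed to be the model object loaded earlier in the README, and the `max_new_tokens` value in the usage comment is an assumption. The timing helper makes the "slow first run, faster later runs" behavior observable.

```python
import time

import torch


def compile_forward(model, **compile_kwargs):
    """Wrap model.forward with torch.compile and return the same model.

    compile_kwargs are passed straight to torch.compile (for example
    mode="reduce-overhead"); which options help depends on the model.
    """
    model.forward = torch.compile(model.forward, **compile_kwargs)
    return model


def time_calls(fn, repeats=3):
    """Call fn() `repeats` times and return the wall-clock seconds of each call."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        # Wait for queued GPU work so the measurement covers the whole call.
        if torch.cuda.is_available():
            torch.cuda.synchronize()
        timings.append(time.perf_counter() - start)
    return timings


# Hypothetical usage with the README's objects (names and values assumed):
# model = compile_forward(model)
# timings = time_calls(lambda: model.generate(**model_inputs, max_new_tokens=64))
# print(timings)  # the first entry typically includes compilation time
```

Compilation happens lazily on the first call, which is why only the first run pays the extra cost. The actual speedup depends on the model, the hardware, and the PyTorch version, so it is worth measuring rather than assuming.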