ubergarm committed on
Commit 3bbd126 · 1 Parent(s): 3bfea3f

cleaning up command a bit
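
The cleanup in this commit touches two shell details: quoting the `-ot` regex and ending each continued line with a backslash. The sketch below illustrates why both matter. It assumes bash with default globbing options and a directory in which no file name matches the pattern; `show_arg` and `count_args` are hypothetical helpers used only for illustration and are not part of `llama-server`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not part of llama-server): they only reveal what
# the shell actually hands to a program.
show_arg()   { printf '%s\n' "$1"; }   # print the first argument verbatim
count_args() { echo "$#"; }            # print how many arguments arrived

# Work in an empty scratch directory so the glob characters cannot match a file.
cd "$(mktemp -d)" || exit 1

# Unquoted: the shell consumes the backslashes, so the regex loses its
# escaped dots before the program ever sees it.
unquoted=$(show_arg blk\.[0-9]\.ffn.*=CUDA0)

# Quoted: the pattern is passed through unchanged.
quoted=$(show_arg "blk\.[0-9]\.ffn.*=CUDA0")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# A trailing backslash joins the next line to the same command, so all four
# words arrive as arguments of one invocation. Without it, the shell would
# end the command after `16` and try to run `-ub` as a separate command.
continued=$(count_args --threads 16 \
  -ub 4096)
echo "arguments received: $continued"
```

As shown in the diff below, the `-ot "blk.*\.ffn.*=CPU \` context line appears to still lack its closing double quote, so the command may need that added before it runs.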

Files changed (1)
  1. README.md +3 -4
README.md CHANGED
@@ -44,17 +44,16 @@ Compare with Perplexity of full size *wiki.te
  ## Quick Start
  Currently testing some more quants and @eaddario's imatrix corpus to decide what to relese next in the smaller sizes. Some graphs in the discussions.
  ```bash
- ./build/bin/llama-server
+ ./build/bin/llama-server \
  --model /models/IQ5_K/Qwen3-235B-A22B-Instruct-IQ5_K-00001-of-00004.gguf \
  --alias ubergarm/Qwen3-235B-A22B-Instruct-2507 \
  -fa -fmoe \
  -ctk q8_0 -ctv q8_0 \
  -c 32768 \
  -ngl 99 \
- -ot blk\.[0-9]\.ffn.*=CUDA0 \
+ -ot "blk\.[0-9]\.ffn.*=CUDA0" \
  -ot "blk.*\.ffn.*=CPU \
- -ngl 99 \
- --threads 16
+ --threads 16 \
  -ub 4096 -b 4096 \
  --host 127.0.0.1 \
  --port 8080