ubergarm/Qwen3-235B-A22B-Instruct-2507-GGUF

ubergarm committed on Jul 22

Commit 3bbd126 · 1 Parent(s): 3bfea3f

cleaning up command a bit

Files changed (1)

README.md +3 -4

README.md CHANGED

@@ -44,17 +44,16 @@ Compare with Perplexity of full size *wiki.te
 ## Quick Start
 Currently testing some more quants and @eaddario's imatrix corpus to decide what to relese next in the smaller sizes. Some graphs in the discussions.
 ```bash
-./build/bin/llama-server
+./build/bin/llama-server \
   --model /models/IQ5_K/Qwen3-235B-A22B-Instruct-IQ5_K-00001-of-00004.gguf \
   --alias ubergarm/Qwen3-235B-A22B-Instruct-2507 \
   -fa -fmoe \
   -ctk q8_0 -ctv q8_0 \
   -c 32768 \
   -ngl 99 \
-  -ot blk\.[0-9]\.ffn.*=CUDA0 \
+  -ot "blk\.[0-9]\.ffn.*=CUDA0" \
   -ot "blk.*\.ffn.*=CPU \
-  -ngl 99 \
-  --threads 16
+  --threads 16 \
   -ub 4096 -b 4096 \
   --host 127.0.0.1 \
   --port 8080
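
The three added lines change only shell syntax: a trailing `\` after the binary name and after `--threads 16`, and double quotes around the first `-ot` pattern. The sketch below is not part of the commit. It illustrates, in plain bash, what the shell does with each form before the program ever sees its arguments. The helper `count_args` is hypothetical and exists only for this illustration. It assumes bash with default options and no file in the current directory that happens to match the unquoted pattern.

```bash
#!/usr/bin/env bash

# 1. Quoting: unquoted, the shell removes the backslashes and treats
#    [0-9] and * as glob characters. With no matching file, bash (default
#    options) passes the word through without the backslashes.
unquoted=$(printf '%s' blk\.[0-9]\.ffn.*=CUDA0)

#    Inside double quotes, a backslash before '.' is kept, so the
#    regex reaches the program exactly as written.
quoted=$(printf '%s' "blk\.[0-9]\.ffn.*=CUDA0")

printf 'unquoted: %s\n' "$unquoted"
printf 'quoted:   %s\n' "$quoted"

# 2. Line continuation: a trailing backslash joins the next line to the
#    same command. Without it, the command ends at the newline and the
#    following line is run as a separate command.
count_args() { echo "$#"; }   # hypothetical helper: prints its argument count

continued=$(count_args --threads 16 \
  -ub 4096 -b 4096)

printf 'arguments seen with continuation: %s\n' "$continued"
```

With the unquoted form the pattern usually still matches the same tensor names, because an unescaped `.` matches any character, but it is looser than intended and can change if a matching filename exists. The remaining context line `-ot "blk.*\.ffn.*=CPU \` is untouched by this commit and still shows an opening quote without a closing one.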