Update README.md
#15 opened 2 days ago
by
f14
Is this cosmos?
#13 opened 5 months ago
by
ccocks-deca
sglang supports Llama-3_1-Nemotron-Ultra-253B-v1 ?
#12 opened 6 months ago
by
chuanyizjc
llama.cpp now supports Llama-3_1-Nemotron-Ultra-253B-v1
#11 opened 6 months ago
by
ymcki
Error invalid configuration argument at line 76 in file /user/...//bitsandbytes/csrc/ops.cu
#10 opened 6 months ago
by
bitmman-nch
Add link to Github repository and paper page
#9 opened 6 months ago
by
nielsr
Model usage on vLLM fails: `No available memory for the cache blocks` & `Error executing method 'determine_num_available_blocks'`
2
#8 opened 7 months ago
by
surajd
benchmark test use vllm ? input/output=500/2000 ?
2
#6 opened 7 months ago
by
chuanyizjc
FP8 and FP4
2
#5 opened 7 months ago
by
whatever1983
how to reproduce the benchmark score?
#4 opened 7 months ago
by
lincharliesun
AWQ OR GPTQ Quant
1
#2 opened 7 months ago
by
chriswritescode
"ffn_mult": null,
9
#1 opened 7 months ago
by
csabakecskemeti