nvidia/Llama-3_1-Nemotron-Ultra-253B-v1

Text Generation

Update README.md

#15 opened 2 days ago by

Is this cosmos?

#13 opened 5 months ago by

Does sglang support Llama-3_1-Nemotron-Ultra-253B-v1?

#12 opened 6 months ago by

llama.cpp now supports Llama-3_1-Nemotron-Ultra-253B-v1

#11 opened 6 months ago by

Error invalid configuration argument at line 76 in file /user/...//bitsandbytes/csrc/ops.cu

#10 opened 6 months ago by

Add link to Github repository and paper page

#9 opened 6 months ago by

Model usage on vLLM fails: `No available memory for the cache blocks` & `Error executing method 'determine_num_available_blocks'`

#8 opened 7 months ago by

Does the benchmark test use vLLM? Input/output = 500/2000?

#6 opened 7 months ago by

FP8 and FP4

#5 opened 7 months ago by

How to reproduce the benchmark score?

#4 opened 7 months ago by

AWQ OR GPTQ Quant

#2 opened 7 months ago by chriswritescode

"ffn_mult": null,

#1 opened 7 months ago by csabakecskemeti