download deepseek r1 llama 8b
#33 opened about 1 month ago by suiyumeng
Suggestions for the right learning curve for Agents using R1-distill
#32 opened about 2 months ago by D-Leap07
[Possible bug] Tokenizer removes thinking part
#31 opened 4 months ago by haritzpuerto
add AIBOM
#30 opened 5 months ago by RiccardoDav
Why is the model inference so slow?
#29 opened 6 months ago by LuYinMiao
How to disable the thinking mode?
#26 opened 7 months ago by fmmarkmq
How to solve this Warning?
#25 opened 8 months ago by KevinWangHP
Does Recommended Usage apply to the distilled models?
#24 opened 8 months ago by yarnsp
🚩 Report: Not working
#23 opened 8 months ago by laozhan
Output bug
#22 opened 9 months ago by DazWilliams
Example Prompts
#21 opened 9 months ago by agat
Duplicated bos_token when using apply_chat_template with Tokenizer
#20 opened 9 months ago by irvingjr
tokenizer.model
#19 opened 9 months ago by Lozai
Update README.md
#18 opened 9 months ago by tekno-power
<think> tag is missing in the latest revision
#17 opened 9 months ago by ajsqr
Video tutorial: fine-tuning DeepSeek-R1 for SQL-to-natural-language conversion
#16 opened 9 months ago by leo009
One more "0" in model-00001-of-000002.safetensors?
#15 opened 9 months ago by PPrimo
Excellent models! Plans for Mistral Nemo and/or Gemma 2 distills?
#14 opened 9 months ago by DavidAU
Adding Evaluation Results
#12 opened 9 months ago by Mikhil-jivus
Missing multilingual capabilities
#11 opened 9 months ago by h4rz3rk4s3
Run in Colab on a T4
#9 opened 9 months ago by rakmik
Adding Evaluation Results
#8 opened 9 months ago by T145
Add pipeline tag, link to paper
#7 opened 10 months ago by nielsr
Do the distilled models also have 128K context?
#4 opened 10 months ago by Troyanovsky
How was this quantized?
#3 opened 10 months ago by imq
missing special_tokens_map.json file
#2 opened 10 months ago by vince62s