Parameters / Experts - How to run this model ;
#16 opened 5 months ago
by
DavidAU
DeepSeek R1 0528?
#15 opened 6 months ago
by
Thireus
This model almost completely loses Chinese abilities
1
3
#14 opened 6 months ago
by
CHNtentes
Base version?
3
2
#13 opened 6 months ago
by
ToastyPigeon
Russian language is missing
1
#12 opened 7 months ago
by
Kosh69
Please, share the custom vLLM source you made
1
#11 opened 7 months ago
by
hyunw55
Update metadata
#10 opened 7 months ago
by
merve
Model seems to not be performing correctly
1
#9 opened 7 months ago
by
daniel-ltw
Larger model?
2
#8 opened 7 months ago
by
blobbybob
number of experts +
2
#7 opened 7 months ago
by
Danioken
Brainstorming
5
5
#6 opened 7 months ago
by
Downtown-Case
Further training/distillation needed?
1
1
#5 opened 7 months ago
by
mingyi456
Besides pruning..
6
#4 opened 7 months ago
by
Lockout
Context size? YaRN still supported?
2
#3 opened 7 months ago
by
Thireus
Variants
#2 opened 7 months ago
by
someone13574
code
18
#1 opened 7 months ago
by
mrfakename