Eduardo Espina
Edespina
AI & ML interests
None yet
Organizations
None yet
MWT
- deepseek-ai/DeepSeek-V3-Base
  685B • Updated • 4.96k • 1.68k
- TransMLA: Multi-head Latent Attention Is All You Need
  Paper • 2502.07864 • Published • 58
- Qwen2.5 Bakeneko 32b Instruct Awq
  Sleeping • ⚡ 2 • Generate detailed responses to text prompts
- Deepseek R1 Distill Qwen2.5 Bakeneko 32b Awq
  Sleeping • ⚡ 3 • Generate text responses to user messages in a chat interface
models
0
None public yet
datasets
0
None public yet