nw

NightwingNg

AI & ML interests

None yet

Recent Activity

liked a model 1 day ago

unsloth/Kimi-K2-Thinking-GGUF

liked a model 3 days ago

cerebras/MiniMax-M2-REAP-162B-A10B

liked a model 13 days ago

moonshotai/Kimi-K2-Thinking

Organizations

None yet

upvoted an article 2 months ago

Article

ZebraLogic: Benchmarking the Logical Reasoning Ability of Language Models

Jul 27, 2024 • 34

upvoted a paper 4 months ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 309

upvoted a collection 4 months ago

Seed-X

A powerful open-source multilingual translation language model series, including instruction and reasoning models. • 8 items • Updated Aug 22 • 65

upvoted an article 4 months ago

Article

OpenReasoning-Nemotron: A Family of State-of-the-Art Distilled Reasoning Models

Jul 18 • 50

upvoted a paper 5 months ago

MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention

Paper • 2506.13585 • Published Jun 16 • 271

upvoted 2 collections 5 months ago

MiniMax-M1

MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated about 1 month ago • 116

V-JEPA 2

A frontier video understanding model developed by FAIR, Meta, which extends the pretraining objectives of https://ai.meta.com/blog/v-jepa-yann • 8 items • Updated Jun 13 • 171

upvoted a collection 6 months ago

GRMR V3 Models

An improved set of models for grammar correction. (Chat template should work, no "responding as an LLM" anymore, that kind of stuff). • 6 items • Updated Jun 4 • 10

upvoted a paper 6 months ago

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

upvoted a collection 6 months ago

RpR Models

RpR (RolePlay with Reasoning) models which are built on RPMax datasets with properly trained multi-turn reasoning. • 8 items • Updated Jun 25 • 12

upvoted 4 collections 7 months ago

Qwen3

Qwen's new Qwen3 models. In Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 79 items • Updated 21 days ago • 234

Unsloth Dynamic 2.0 Quants

New 2.0 version of our Dynamic GGUF + Quants. Dynamic 2.0 achieves superior accuracy & SOTA quantization performance. • 54 items • Updated 13 days ago • 250

Qwen3

84 items • Updated Aug 6 • 1.44k

GLM-4-0414

GLM-4-0414 series model • 8 items • Updated Jun 30 • 133

upvoted a paper 7 months ago

DeepSeek-R1 Thoughtology: Let's <think> about LLM Reasoning

Paper • 2504.07128 • Published Apr 2 • 86

upvoted a collection 9 months ago

Deepseek Papers

Deepseek papers collection • 25 items • Updated 11 days ago • 283