KW

kevineen

AI & ML interests

None yet

Recent Activity

liked a model about 1 hour ago

allenai/Bolmo-7B

liked a model about 23 hours ago

Maincode/Maincoder-1B

liked a model about 23 hours ago

tanaos/tanaos-text-anonymizer-v1

upvoted a paper 4 days ago

TimeBill: Time-Budgeted Inference for Large Language Models

Paper • 2512.21859 • Published 8 days ago • 18

upvoted an article 6 days ago

Article

Deriving the PPO Loss from First Principles

9 days ago • 31

upvoted 3 articles 9 days ago

Article

KV Cache from scratch in nanoVLM

+3

Jun 4, 2025 • 107

Article

nanoVLM: The simplest repository to train your VLM in pure PyTorch

+5

May 21, 2025 • 247

Article

Efficient MultiModal Data Pipeline

+3

Jul 8, 2025 • 69

upvoted a collection 10 days ago

Optimal Sparsity Math

Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks • 67 items • Updated Aug 19, 2025 • 2

upvoted an article 12 days ago

Article

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

17 days ago • 35

upvoted a collection 12 days ago

Speech Language Models

20 items • Updated 11 days ago • 5

upvoted an article 14 days ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

+4

16 days ago • 91

upvoted a collection 17 days ago

Nemotron-Cascade

Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 18 items • Updated 2 days ago • 40

upvoted an article 27 days ago

Article

We Got Claude to Fine-Tune an Open Source LLM

about 1 month ago • 555

upvoted a paper about 1 month ago

Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning

Paper • 2511.19900 • Published Nov 25, 2025 • 48

upvoted 2 articles about 1 month ago

Article

Continuous batching from first principles

+1

Nov 25, 2025 • 291

Article

ChatML vs Harmony: Understanding the new Format from OpenAI 🔍

Aug 9, 2025 • 48

upvoted a paper about 2 months ago

The Path Not Taken: RLVR Provably Learns Off the Principals

Paper • 2511.08567 • Published Nov 11, 2025 • 33

upvoted a collection about 2 months ago

📄 FinePDFs

81 items • Updated Nov 11, 2025 • 26

upvoted a paper about 2 months ago

Visual Spatial Tuning

Paper • 2511.05491 • Published Nov 7, 2025 • 51

upvoted a collection about 2 months ago

Ouro

a family of pre-trained Looped Language Models. • 4 items • Updated Oct 29, 2025 • 21

upvoted a paper about 2 months ago

Context Engineering 2.0: The Context of Context Engineering

Paper • 2510.26493 • Published Oct 30, 2025 • 8

upvoted a paper 2 months ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published Oct 30, 2025 • 119