MemLoRA: Distilling Expert Adapters for On-Device Memory Systems • Paper • 2512.04763 • Published 26 days ago • 3
VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse • Paper • 2512.14531 • Published 14 days ago • 11
SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations • Paper • 2512.14080 • Published 15 days ago • 5
Improving Recursive Transformers with Mixture of LoRAs • Paper • 2512.12880 • Published 16 days ago • 5
Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning • Paper • 2512.20848 • Published 7 days ago • 28
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play • Paper • 2509.25541 • Published Sep 29 • 140