Byung-Kwan Lee

BK-Lee

https://sites.google.com/view/byungkwanlee

AI & ML interests

Vision Language Models

Recent Activity

commented on a paper 13 days ago

Unified Reinforcement and Imitation Learning for Vision-Language Models

commented on a paper 14 days ago

Unified Reinforcement and Imitation Learning for Vision-Language Models

upvoted a paper 14 days ago

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

Paper • 2510.20579 • Published 15 days ago • 54

upvoted 3 papers 15 days ago

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

Paper • 2510.18927 • Published 17 days ago • 82

Unified Reinforcement and Imitation Learning for Vision-Language Models

Paper • 2510.19307 • Published 16 days ago • 26

DeepSeek-OCR: Contexts Optical Compression

Paper • 2510.18234 • Published 17 days ago • 71

upvoted a paper 17 days ago

MultiVerse: A Multi-Turn Conversation Benchmark for Evaluating Large Vision and Language Models

Paper • 2510.16641 • Published 20 days ago • 4

upvoted 13 papers 2 months ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31 • 83

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2 • 220

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28 • 89

Skywork-R1V3 Technical Report

Paper • 2507.06167 • Published Jul 8 • 71

Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning

Paper • 2507.05255 • Published Jul 7 • 74

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published Aug 24 • 80

Autoregressive Universal Video Segmentation Model

Paper • 2508.19242 • Published Aug 26 • 28

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 202

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published Aug 20 • 82

upvoted 2 papers 3 months ago

Deep Think with Confidence

Paper • 2508.15260 • Published Aug 21 • 87

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 255
