Ming Chen

ChenMing-thu14

AI & ML interests

3D Human Pose Estimation

Organizations

None yet

Recent Activity

upvoted a paper 10 days ago

OmniTransfer: All-in-one Framework for Spatio-temporal Video Transfer

Paper • 2601.14250 • Published 11 days ago • 45

upvoted a paper 12 days ago

FrankenMotion: Part-level Human Motion Generation and Composition

Paper • 2601.10909 • Published 16 days ago • 18

upvoted a paper 15 days ago

FlowAct-R1: Towards Interactive Humanoid Video Generation

Paper • 2601.10103 • Published 16 days ago • 70

upvoted a paper 23 days ago

Klear: Unified Multi-Task Audio-Video Joint Generation

Paper • 2601.04151 • Published 24 days ago • 16

upvoted a paper 24 days ago

LTX-2: Efficient Joint Audio-Visual Foundation Model

Paper • 2601.03233 • Published 25 days ago • 141

upvoted 2 papers 25 days ago

VINO: A Unified Visual Generator with Interleaved OmniModal Context

Paper • 2601.02358 • Published 26 days ago • 29

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

Paper • 2601.02204 • Published 26 days ago • 61

upvoted a paper 26 days ago

Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Paper • 2601.00664 • Published 29 days ago • 56

upvoted 8 papers about 1 month ago

Learning from Next-Frame Prediction: Autoregressive Video Modeling Encodes Effective Representations

Paper • 2512.21004 • Published Dec 24, 2025 • 13

TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times

Paper • 2512.16093 • Published Dec 18, 2025 • 95

SemanticGen: Video Generation in Semantic Space

Paper • 2512.20619 • Published Dec 23, 2025 • 93

SAM Audio: Segment Anything in Audio

Paper • 2512.18099 • Published Dec 19, 2025 • 23

RePlan: Reasoning-guided Region Planning for Complex Instruction-based Image Editing

Paper • 2512.16864 • Published Dec 18, 2025 • 11

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Paper • 2512.16561 • Published Dec 18, 2025 • 20

Kling-Omni Technical Report

Paper • 2512.16776 • Published Dec 18, 2025 • 169

End-to-End Training for Autoregressive Video Diffusion via Self-Resampling

Paper • 2512.15702 • Published Dec 17, 2025 • 15

upvoted 4 papers about 2 months ago

MemFlow: Flowing Adaptive Memory for Consistent and Efficient Long Video Narratives

Paper • 2512.14699 • Published Dec 16, 2025 • 28

WorldPlay: Towards Long-Term Geometric Consistency for Real-Time Interactive World Modeling

Paper • 2512.14614 • Published Dec 16, 2025 • 71

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Paper • 2512.12675 • Published Dec 14, 2025 • 41

KlingAvatar 2.0 Technical Report

Paper • 2512.13313 • Published Dec 15, 2025 • 43