Harold Chen

Harold328

https://haroldchen19.github.io/

HaroldChen19

AI & ML interests

Computer Vision

Recent Activity

updated a dataset 29 days ago

Harold328/reason_cache

published a dataset 29 days ago

Harold328/reason_cache

Organizations

None yet

upvoted a paper 28 days ago

PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs

Paper • 2510.09507 • Published about 1 month ago • 10

upvoted a paper about 1 month ago

Go with Your Gut: Scaling Confidence for Autoregressive Image Generation

Paper • 2509.26376 • Published Sep 30 • 8

upvoted 2 papers 3 months ago

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13 • 53

LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation

Paper • 2508.03694 • Published Aug 5 • 50

upvoted a paper 4 months ago

StreamDiT: Real-Time Streaming Text-to-Video Generation

Paper • 2507.03745 • Published Jul 4 • 31

upvoted 3 papers 6 months ago

FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance

Paper • 2505.13437 • Published May 19 • 6

Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks

Paper • 2505.00234 • Published May 1 • 26

T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Paper • 2505.00703 • Published May 1 • 44

upvoted 2 papers 7 months ago

Packing Input Frame Context in Next-Frame Prediction Models for Video Generation

Paper • 2504.12626 • Published Apr 17 • 51

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Paper • 2504.13122 • Published Apr 17 • 20

upvoted 2 papers 8 months ago

Temporal Regularization Makes Your Video Generator Stronger

Paper • 2503.15417 • Published Mar 19 • 22

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization

Paper • 2503.08619 • Published Mar 11 • 20

upvoted a paper 10 months ago

SeFAR: Semi-supervised Fine-grained Action Recognition with Temporal Perturbation and Learning Stabilization

Paper • 2501.01245 • Published Jan 2 • 5

upvoted 3 papers 11 months ago

VideoGen-of-Thought: A Collaborative Framework for Multi-Shot Video Generation

Paper • 2412.02259 • Published Dec 3, 2024 • 60

Identity-Preserving Text-to-Video Generation by Frequency Decomposition

Paper • 2411.17440 • Published Nov 26, 2024 • 37

OmniCreator: Self-Supervised Unified Generation with Universal Editing

Paper • 2412.02114 • Published Dec 3, 2024 • 14