Peiyu Wang

OrlandoHugBot

Orlando-CS

AI & ML interests

LLM / multimodal AI / agent

Recent Activity

new activity 4 days ago

Skywork/UniPic2-Metaquery-9B: About the benchmark evaluation results

liked a model 27 days ago

deepseek-ai/DeepSeek-OCR

liked a dataset about 1 month ago

callanwu/WebWalkerQA

upvoted a collection about 2 months ago

Qwen3-VL

37 items • Updated 15 days ago • 403

upvoted 2 papers 2 months ago

HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning

Paper • 2509.08519 • Published Sep 10 • 127

EmbodiedOneVision: Interleaved Vision-Text-Action Pretraining for General Robot Control

Paper • 2508.21112 • Published Aug 28 • 75

upvoted 2 papers 3 months ago

MV-RAG: Retrieval Augmented Multiview Diffusion

Paper • 2508.16577 • Published Aug 22 • 38

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 205

upvoted 2 collections 3 months ago

Skywork-UniPic2

A Unified DiT Multimodal Model for Image Generation, Editing, and Understanding • 8 items • Updated Aug 22 • 10

SVDQuant

Models and datasets for "SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models" • 20 items • Updated May 29 • 64

upvoted a paper 3 months ago

Skywork UniPic: Unified Autoregressive Modeling for Visual Understanding and Generation

Paper • 2508.03320 • Published Aug 5 • 61

upvoted a paper 4 months ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 124

upvoted a collection 4 months ago

Skywork-UniPic

Unified Autoregressive Modeling for Visual Understanding and Generation • 2 items • Updated Aug 13 • 12

upvoted a paper 4 months ago

I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models

Paper • 2502.10458 • Published Feb 12 • 37

upvoted 3 collections 4 months ago

Skywork-R1V3

Advanced multimodal reasoning model • 7 items • Updated Aug 8 • 14

WorldPM

4 items • Updated Jul 21 • 8

Qwen3

84 items • Updated Aug 6 • 1.42k

upvoted 3 papers 6 months ago

Multimodal DeepResearcher: Generating Text-Chart Interleaved Reports From Scratch with Agentic Framework

Paper • 2506.02454 • Published Jun 3 • 7

CSVQA: A Chinese Multimodal Benchmark for Evaluating STEM Reasoning Capabilities of VLMs

Paper • 2505.24120 • Published May 30 • 49

ImgEdit: A Unified Image Editing Dataset and Benchmark

Paper • 2505.20275 • Published May 26 • 18

upvoted a collection 6 months ago

🌸 April 2025 - Open releases from the Chinese community

42 items • Updated Sep 1 • 13

upvoted a paper 6 months ago

Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning

Paper • 2505.07263 • Published May 12 • 30

upvoted a collection 6 months ago

VisionLM

1749 items • Updated 4 days ago • 131