Zhaokai Wang

wzk1015

https://www.wzk.plus

AI & ML interests

Computer Vision Music Generation Multimodal Large Language Models

Recent Activity

liked a model 15 days ago

Zhenxin-Lei/MetaCaptioner

Updated 16 days ago • 6 • 1

upvoted a paper 19 days ago

OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM

Paper • 2510.15870 • Published 22 days ago • 86

upvoted 2 papers 29 days ago

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Paper • 2510.08565 • Published 30 days ago • 19

MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

Paper • 2510.08540 • Published 30 days ago • 108

upvoted a paper about 1 month ago

Factuality Matters: When Image Generation and Editing Meet Structured Visuals

Paper • 2510.05091 • Published Oct 6 • 18

updated a dataset about 1 month ago

OpenGVLab/GenExam

Updated Oct 6 • 236 • 3

authored a paper about 2 months ago

GenExam: A Multidisciplinary Text-to-Image Exam

Paper • 2509.14232 • Published Sep 17 • 21

upvoted a paper about 2 months ago

SAIL-VL2 Technical Report

Paper • 2509.14033 • Published Sep 17 • 44

liked a model about 2 months ago

facebook/nllb-200-distilled-600M

Translation • Updated Feb 14, 2024 • 267k • 785

upvoted a paper about 2 months ago

GenExam: A Multidisciplinary Text-to-Image Exam

Paper • 2509.14232 • Published Sep 17 • 21

liked a dataset about 2 months ago

OpenGVLab/GenExam

Updated Oct 6 • 236 • 3

published a dataset about 2 months ago

OpenGVLab/GenExam

Updated Oct 6 • 236 • 3

liked a dataset 2 months ago

PhoenixZ/RISEBench

Updated May 30 • 85 • 2

upvoted a paper 2 months ago

Does DINOv3 Set a New Medical Vision Standard?

Paper • 2509.06467 • Published Sep 8 • 36

liked 2 models 2 months ago

OpenGVLab/InternVL3_5-241B-A28B-HF

Image-Text-to-Text • 241B • Updated Sep 8 • 124 • 11

OpenGVLab/InternVL3_5-GPT-OSS-20B-A4B-Preview

Image-Text-to-Text • 0.4B • Updated Aug 29 • 41.3k • 76

authored a paper 2 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 202

upvoted a paper 2 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 202

liked a model 2 months ago

OpenGVLab/InternVL3_5-241B-A28B

Image-Text-to-Text • 241B • Updated Aug 29 • 6.24k • 131

liked a Space 3 months ago

RISEBench Gallery

👀

A Gallery of Generation Results on RISEBench
