Mian Zhang

billmianz

AI & ML interests

None yet

Recent Activity

updated a model 5 days ago

billmianz/RLIF-1.7b-nothink-logicif80k

published a model 5 days ago

billmianz/RLIF-1.7b-nothink-logicif80k

updated a model 5 days ago

billmianz/RLIF-1.7b-nothink-if80k-1kseqlen

upvoted a paper about 1 month ago

Self-Improvement in Multimodal Large Language Models: A Survey

Paper • 2510.02665 • Published Oct 3 • 19

upvoted 2 papers 2 months ago

Loong: Synthesize Long Chain-of-Thoughts at Scale through Verifiers

Paper • 2509.03059 • Published Sep 3 • 24

TCIA: A Task-Centric Instruction Augmentation Method for Instruction Finetuning

Paper • 2508.20374 • Published Aug 28 • 21

upvoted 3 papers 3 months ago

MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs

Paper • 2508.18264 • Published Aug 25 • 25

LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries

Paper • 2508.15760 • Published Aug 21 • 46

Complex Logical Instruction Generation

Paper • 2508.09125 • Published Aug 12 • 39

upvoted a paper 5 months ago

Hanfu-Bench: A Multimodal Benchmark on Cross-Temporal Cultural Understanding and Transcreation

Paper • 2506.01565 • Published Jun 2 • 3

upvoted a paper 6 months ago

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29 • 96

upvoted a paper 8 months ago

Preference Learning Unlocks LLMs' Psycho-Counseling Skills

Paper • 2502.19731 • Published Feb 27 • 7

upvoted a paper 10 months ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 76

upvoted 3 papers about 1 year ago

CBT-Bench: Evaluating Large Language Models on Assisting Cognitive Behavior Therapy

Paper • 2410.13218 • Published Oct 17, 2024 • 4

Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale

Paper • 2409.08264 • Published Sep 12, 2024 • 48

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Paper • 2409.04109 • Published Sep 6, 2024 • 48

upvoted a paper over 1 year ago

Learning to Refuse: Towards Mitigating Privacy Risks in LLMs

Paper • 2407.10058 • Published Jul 14, 2024 • 31

upvoted a paper almost 2 years ago

UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations

Paper • 2311.08469 • Published Nov 14, 2023 • 11