random

fakerbaby

AI & ML interests

NLP, RL, VLM

Recent Activity

upvoted a paper 4 days ago

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

upvoted a paper 16 days ago

BAPO: Stabilizing Off-Policy Reinforcement Learning for LLMs via Balanced Policy Optimization with Adaptive Clipping

liked a model 28 days ago

zai-org/GLM-4.6

Organizations

Collections 1

Papers 9

arXiv:2403.07708

arXiv:2402.01391

arXiv:2401.11458

arXiv:2401.06080

Spaces 2

Skywork R1V3

No application file

PaI

Models 0

None public yet

Datasets 0

None public yet