liu zh

morphism42

AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

On Predictability of Reinforcement Learning Dynamics for Large Language Models

upvoted a paper 3 months ago

Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

liked a Space 6 months ago

Ki-Seki/ultrascale-playbook-zh-cn


Organizations

None yet

upvoted a paper about 2 months ago

On Predictability of Reinforcement Learning Dynamics for Large Language Models

Paper • 2510.00553 • Published Oct 1 • 8

upvoted a paper 3 months ago

Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

Paper • 2508.02193 • Published Aug 4 • 130

liked a Space 6 months ago

239

The Ultimate Guide to LLM Training | The Ultra-Scale Playbook

🔥

Learn about every aspect of LLM training

upvoted a paper 10 months ago

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

Paper • 2502.02508 • Published Feb 4 • 23

upvoted an article about 1 year ago

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Article • Sep 18, 2024 • 271

upvoted 4 articles over 1 year ago

How NuminaMath Won the 1st AIMO Progress Prize

Article • Jul 11, 2024 • 122

Illustrating Reinforcement Learning from Human Feedback (RLHF)

Article • Dec 9, 2022 • 371

Fine-tune Llama 3 with ORPO

Article • Apr 22, 2024 • 241

Personal Copilot: Train Your Own Coding Assistant

Article • Oct 27, 2023
