finetune - a ralphite Collection

ralphite 's Collections

systems

agi

ai-misc

finetune

updated Feb 21, 2024

RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback

Paper • 2309.00267 • Published Sep 1, 2023 • 51
Secrets of RLHF in Large Language Models Part I: PPO

Paper • 2307.04964 • Published Jul 11, 2023 • 29
QLoRA: Efficient Finetuning of Quantized LLMs

Paper • 2305.14314 • Published May 23, 2023 • 57