Post #934

@LinghaoCh

Parallel Experiments

Views: 831
Posted: 04/13/2025, 02:24 AM
Post content

A really good, concise deep dive into RLHF in LLM post-training, Proximal Policy Optimization (PPO), and Group Relative Policy Optimization (GRPO): https://yugeten.github.io/posts/2025/01/ppogrpo/ #llm