2025 (4)

July (4)

RL4LLM: 2. PPO Algorithm and Implementation Details

Jul-21-2025 · Last updated on Aug-26-2025 · 2 min · 655 words · Kosmo CHE

Zotero

Jul-13-2025 · Last updated on Aug-26-2025 · 1 min · 42 words · Kosmo CHE

RL4LLM: 1. A Brief Talk on DPO

Jul-12-2025 · Last updated on Aug-26-2025 · 2 min · 721 words · Kosmo CHE

About Me

Jul-05-2025 · Last updated on Aug-26-2025 · 1 min · 217 words · Kosmo CHE