RL4LLM: 3. Information Theory in Reasoning
This blog post discusses the role of information theory in LLM reasoning.
This blog post introduces the RLHF-PPO algorithm with a code implementation.
This blog post notes how to use Zotero with iCloud as the storage.
This blog post notes my understanding of Direct Preference Optimization and the mathematical derivation behind it.