Blogs

Tips on Server (The suck Jupyter kernel)

Introduction: Some commands and tips I often use when working on a server, collected here because I am tired of asking GPT again and again. 1. Kill Jupyter Notebook Kernel: When you use Jupyter Notebook on a server, especially in VSCode, the kernel sometimes gets stuck or becomes unresponsive. Unfortunately, VSCode does not provide a direct way to kill or shut down the kernel from its interface. This leaves many Jupyter processes running on the server, consuming resources and causing confusion. ...
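As a rough sketch of the cleanup described above (not necessarily the approach the full post takes), the following Python script lists the current user's leftover kernel processes and optionally terminates them. It assumes a Linux or macOS server where `ps` is available and that the kernels are standard Python kernels started through the `ipykernel_launcher` module; the function names `find_kernel_pids` and `kill_kernels` are hypothetical helpers introduced only for this illustration.

```python
import os
import signal
import subprocess


def find_kernel_pids(ps_output, marker="ipykernel_launcher"):
    """Return PIDs from `ps -o pid,args` output whose command line contains marker."""
    pids = []
    for line in ps_output.splitlines():
        parts = line.strip().split(None, 1)
        # Skip the header row and any line that does not start with a numeric PID.
        if len(parts) == 2 and parts[0].isdigit() and marker in parts[1]:
            pids.append(int(parts[0]))
    return pids


def kill_kernels(dry_run=True):
    """List this user's kernel processes; send SIGTERM unless dry_run is True."""
    # Assumption: `ps -u <uid> -o pid,args` is supported (POSIX-style ps).
    out = subprocess.run(
        ["ps", "-u", str(os.getuid()), "-o", "pid,args"],
        capture_output=True, text=True, check=True,
    ).stdout
    pids = [pid for pid in find_kernel_pids(out) if pid != os.getpid()]
    for pid in pids:
        print(("would kill" if dry_run else "killing"), pid)
        if not dry_run:
            os.kill(pid, signal.SIGTERM)  # polite termination request
    return pids
```

Calling `kill_kernels()` from a terminal session only prints the matching PIDs; `kill_kernels(dry_run=False)` actually sends the signal. Running it from a terminal rather than from inside a notebook avoids terminating the kernel that is executing the script.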

Varentropy

The Mathematical Properties of Varentropy: Now we discuss some mathematical properties of varentropy for a discrete distribution with a fixed entropy. Let $X$ be a discrete random variable with support $\{x_1, x_2, \ldots, x_k\}$ and corresponding probabilities $P(X = x_i) = p_i$, and let $H(X) = H_0$ be the entropy of $X$. The varentropy $V(X)$ can be expressed as: $$ V(X) = \sum_{i=1}^{k} p_i \left( -\log(p_i) \right)^2 - H_0^2 $$ Lagrange Multiplier Method: To bound the varentropy for a fixed entropy $H_0$, we can use the method of Lagrange multipliers. We want to find the extrema of $V(X)$ subject to the constraints that the probabilities sum to 1 and the entropy is fixed. We can consider it as the following optimization problem: ...
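The varentropy expression above can be checked numerically. The snippet below is a minimal sketch using natural logarithms; the function names `entropy` and `varentropy` are chosen here for illustration and are not taken from the post.

```python
import math


def entropy(p):
    """Shannon entropy H(X) = -sum_i p_i log p_i, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def varentropy(p):
    """Varentropy V(X) = sum_i p_i (-log p_i)^2 - H(X)^2."""
    h = entropy(p)
    # (-log p_i)^2 equals (log p_i)^2, so the sign can be dropped inside the square.
    second_moment = sum(pi * math.log(pi) ** 2 for pi in p if pi > 0)
    return second_moment - h ** 2


uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.5, 0.25, 0.25]
print(entropy(uniform), varentropy(uniform))
print(entropy(skewed), varentropy(skewed))
```

For a uniform distribution every outcome has the same surprisal $-\log p_i$, so the varentropy is zero up to floating-point error. For the skewed example, $H = 1.5 \log 2$ and the second moment is $2.5 (\log 2)^2$, which gives $V = 0.25 (\log 2)^2$.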

RL4LLM: 3. Information Theory in Reasoning

This blog post discusses the role of information theory in LLM reasoning.

RL4LLM: 2. PPO Algorithm and Implementation Details

This blog post introduces the RLHF-PPO algorithm with a code implementation.

Zotero

This blog post notes how to use Zotero with iCloud as the storage.

RL4LLM: 1. A Brief Talk on DPO

This blog post notes my understanding of Direct Preference Optimization and the math derivation behind it.