MyDRP2024 (2A) Notes on Boltzmann Distribution
Summary: This post derives the Boltzmann distribution for a system having finitely many states.
Suppose a system has finitely many states, say $n$ of them, each with an energy level $E_i$ for $i = 1, 2, \dots, n$, and suppose these energy levels are arranged from low to high (i.e., $E_1 < E_2 < \cdots < E_n$). We want to find a probability distribution $p_1^\ast, p_2^\ast, \dots, p_n^\ast$ that maximizes the entropy $S(p_1^\ast, \dots, p_n^\ast) := -k_B \cdot \sum_{i=1}^{n} p_i^\ast \ln p_i^\ast$ while the average energy of the system is held fixed (i.e., $p_1^\ast E_1 + \cdots + p_n^\ast E_n = U$ where $U$ is a constant).
Mathematically speaking, we want to find $n$ numbers, $p_1^\ast, p_2^\ast, \dots, p_n^\ast \in [0,1]$ such that the following three conditions are satisfied:
- $p_1^\ast + \cdots + p_n^\ast = 1$;
- $p_1^\ast E_1 + \cdots + p_n^\ast E_n = U$, where $E_1, \dots, E_n$ and $U$ are known constants with $E_1 < E_2 < \cdots < E_n$;
- The quantity $S(p_1^\ast, \dots, p_n^\ast) := -k_B \cdot \sum_{i=1}^{n} p_i^\ast \ln p_i^\ast$ is maximized, where $k_B$ is a constant.
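Before the derivation, the constrained problem can be explored numerically as a sanity check. For $n = 3$, the two linear constraints leave one degree of freedom, so we can scan the feasible distributions and watch the entropy vary. This is a rough sketch with made-up energy levels and $U$, working in units where $k_B = 1$:

```python
import math

E = [1.0, 2.0, 4.0]  # made-up energy levels, increasing
U = 2.0              # fixed average energy
k_B = 1.0            # work in units where k_B = 1

def entropy(p):
    """S(p) = -k_B * sum_i p_i ln p_i, with 0 ln 0 := 0."""
    return -k_B * sum(pi * math.log(pi) for pi in p if pi > 0)

def feasible(t):
    """Solve the two linear constraints for (p1, p2) given p3 = t."""
    p2 = (U - t * E[2] - (1 - t) * E[0]) / (E[1] - E[0])
    p1 = 1 - t - p2
    return (p1, p2, t)

# Scan the one-parameter feasible family and keep the entropy maximizer.
best_t, best_S = 0.0, -float("inf")
for k in range(1, 334):  # for these values, all p_i >= 0 iff t is in [0, 1/3]
    t = k / 1000
    p = feasible(t)
    if all(pi >= 0 for pi in p):
        S = entropy(p)
        if S > best_S:
            best_t, best_S = t, S

print(best_t, feasible(best_t), best_S)
```

For these values the maximizer lands strictly inside the feasible region, with all three probabilities positive, which is the situation the Lagrange-multiplier computation assumes.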
Notice that this is an optimization problem with two constraints. To find $p_1^\ast, \dots, p_n^\ast$, we can use the method of Lagrange multipliers. Let
$$ L(p_1,\dots,p_n,\lambda_1,\lambda_2) := S(p_1,\dots,p_n) + \lambda_1 \left( 1 - \sum_{i=1}^{n} p_i \right) \\ + \lambda_2 \left( U - \sum_{i=1}^{n}p_i E_i \right) $$
for all $p_1, \dots, p_n, \lambda_1, \lambda_2 \in \mathbb{R}$. We look for numbers $p_1^\ast, \dots, p_n^\ast, \lambda_1^\ast, \lambda_2^\ast$ such that:
$$ \frac{\partial L(p_1,\dots,p_n,\lambda_1,\lambda_2)}{\partial \lambda_1} \bigg|_{(p_1^\ast,\dots,p_n^\ast,\lambda_1^\ast,\lambda_2^\ast)} = 0 $$
$$ \frac{\partial L(p_1,\dots,p_n,\lambda_1,\lambda_2)}{\partial \lambda_2} \bigg|_{(p_1^\ast,\dots,p_n^\ast,\lambda_1^\ast,\lambda_2^\ast)} = 0 $$
(i.e., the vector $(p_1^\ast, \dots, p_n^\ast, \lambda_1^\ast, \lambda_2^\ast)$ is a stationary point of the function $L$). Note that these two conditions simply recover the two constraints above.
Unpacking $S(p_1,\dots,p_n)$ yields:
$$ L(p_1,\dots,p_n,\lambda_1,\lambda_2) = -k_B \sum_{i=1}^{n} p_i \ln p_i + \lambda_1 \left( 1 - \sum_{i=1}^{n} p_i \right) \\ + \lambda_2 \left( U - \sum_{i=1}^{n}p_i E_i \right) $$
Then, for each $i = 1, \dots, n$, we have:
$$ \frac{\partial L(p_1,\dots,p_n,\lambda_1,\lambda_2)}{\partial p_i} = -k_B \left( \ln p_i + 1 \right) - \lambda_1 - \lambda_2 E_i $$
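This partial derivative can be checked numerically with a central finite difference. A small sketch with made-up values for $E_i$, $U$, $\lambda_1$, $\lambda_2$, and $k_B = 1$:

```python
import math

k_B = 1.0
E = [1.0, 2.0, 4.0]    # made-up energy levels
U = 2.0                # made-up average energy
lam1, lam2 = 0.5, 0.3  # arbitrary multiplier values

def L(p):
    """The Lagrangian L(p, lam1, lam2) defined above, with k_B = 1."""
    S = -k_B * sum(pi * math.log(pi) for pi in p)
    return S + lam1 * (1 - sum(p)) + lam2 * (U - sum(pi * e for pi, e in zip(p, E)))

p = [0.5, 0.3, 0.2]
i, h = 0, 1e-6

# Central finite difference in p_i vs. the closed form -k_B(ln p_i + 1) - lam1 - lam2*E_i
q_plus, q_minus = p[:], p[:]
q_plus[i] += h
q_minus[i] -= h
numeric = (L(q_plus) - L(q_minus)) / (2 * h)
closed = -k_B * (math.log(p[i]) + 1) - lam1 - lam2 * E[i]
print(numeric, closed)
```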
Also, by the method of Lagrange multipliers, the desired $p_i^\ast$ satisfies, for each $i = 1, \dots, n$:
$$ \left. \frac{\partial L(p_1,\dots,p_n,\lambda_1,\lambda_2)}{\partial p_i} \right|_{\substack{p_i = p_i^\ast \\ \lambda_1 = \lambda_1^\ast \\ \lambda_2 = \lambda_2^\ast}} = 0 $$
Then, for each $i = 1, \dots, n$:
$$ -k_B \left( \ln p_i^\ast + 1 \right) - \lambda_1^\ast - \lambda_2^\ast E_i = 0 $$
That is, for each $i = 1, \dots, n$:
$$ p_i^\ast = \exp \left( -\frac{k_B + \lambda_1^\ast + \lambda_2^\ast E_i}{k_B} \right) $$
We can remove $\lambda_1^\ast$ from the above expression as follows: For each $i = 1, \dots, n$:
$$ p_i^\ast = \underbrace{\exp \left( -\frac{k_B + \lambda_1^\ast}{k_B} \right)}_{\text{the normalizing constant}} \cdot \exp \left( -\frac{\lambda_2^\ast E_i}{k_B} \right) $$
That is, for each $i = 1, \dots, n$:
$$ p_i^\ast \propto \exp \left( -\frac{\lambda_2^\ast E_i}{k_B} \right) $$
Since $p_1^\ast, \dots, p_n^\ast$ are probabilities, they must sum to 1, which fixes the normalizing constant. That is, for each $i = 1, \dots, n$:
$$ p_i^\ast = \frac{ \exp \left( -\frac{\lambda_2^\ast E_i}{k_B} \right) }{ \sum_{j=1}^{n} \exp \left( -\frac{\lambda_2^\ast E_j}{k_B} \right) } $$
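The remaining unknown $\lambda_2^\ast$ is pinned down by the energy constraint: the average energy under the distribution above must equal $U$. Since the average energy is strictly decreasing in $\lambda_2$, a bisection finds it. A rough sketch with made-up $E_i$ and $U$, in units where $k_B = 1$:

```python
import math

E = [1.0, 2.0, 4.0]  # made-up energy levels
U = 2.0              # target average energy, must lie in (E[0], E[-1])
k_B = 1.0

def boltzmann(lam2):
    """p_i = exp(-lam2 * E_i / k_B) / Z, the distribution derived above."""
    w = [math.exp(-lam2 * e / k_B) for e in E]
    Z = sum(w)  # the normalizing constant
    return [wi / Z for wi in w]

def mean_energy(lam2):
    return sum(p * e for p, e in zip(boltzmann(lam2), E))

# mean_energy is strictly decreasing in lam2, so bisect for mean_energy = U.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_energy(mid) > U:
        lo = mid
    else:
        hi = mid

lam2_star = (lo + hi) / 2
p_star = boltzmann(lam2_star)
print(lam2_star, p_star)
```

The resulting `p_star` satisfies both constraints by construction: it sums to 1 because of the normalizing constant, and its average energy equals $U$ because that is the equation the bisection solved.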
The quantity $\lambda_2^\ast$ has a physical meaning: in fact, $\lambda_2^\ast = 1/T$, where $T$ is the temperature of the system (I don't yet understand why this is).
Notes:
- If $n = 2$, then the above derivation is not needed: $p_1^\ast$ and $p_2^\ast$ are already uniquely determined by the two constraints, so there is nothing left to maximize.
- How about $n=3$? I don’t know.
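To make the $n = 2$ remark concrete: the two constraints form a $2 \times 2$ linear system with a unique solution, so the entropy never enters. A tiny check with made-up values:

```python
# Two states: p1 + p2 = 1 and p1*E1 + p2*E2 = U determine the
# distribution uniquely (no maximization needed).
E1, E2 = 1.0, 3.0  # made-up energy levels
U = 1.5            # must lie in [E1, E2] for a valid distribution

p1 = (E2 - U) / (E2 - E1)
p2 = (U - E1) / (E2 - E1)
assert abs(p1 + p2 - 1) < 1e-12
assert abs(p1 * E1 + p2 * E2 - U) < 1e-12
print(p1, p2)
```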
References (this post closely follows):
- Wikipedia, "Partition function". Everything above follows the section "Classical discrete system"!
- Robert P. Dobrow, *Introduction to Stochastic Processes with R*. This book is my PSTAT 160A/B textbook, and it is freely available on the Internet. See the cool Example 5.8 (page 202, Ising model) in this textbook!