
The Central Binomial Coefficient and Laplace's Method

The central binomial coefficient is the sequence down the "center" of Pascal's triangle, namely $$ \binom{2n}{n}. $$ These numbers show up, for example, in a random walk on the integers: if we start off at \(0\) and make \(2n\) independent increments of \(+1\) or \(-1\), giving each choice even odds, then we will find ourselves back at the origin with probability $$ 2^{-2n} \binom{2n}{n}. $$ Some basic Markov process reasoning tells us that a random walk like this one must be either recurrent or dissipative: trajectories will either visit each position infinitely many times, or escape to infinity, spending only a finite number of steps at each position. Now, let \(\{ X_0, X_1, \dots \}\) be our trajectory, and \(C\) be the number of times that \(X_n = 0\). Since $$ \begin{align} & E(C) = \sum_{n = 0}^\infty P(X_n = 0) \\ & = \sum_{n = 0}^\infty P(X_{2n} = 0) = \sum_{n = 0}^{\infty} 2^{-2n} \binom{2n}{n}, \end{align} $$ the question of whether the random walk we have described is recurrent comes down to whether the central binomial coefficient grows quickly enough to make this series diverge. Does it? We can also consider the random walk on \(\Z^d\), moving at each step by an increment in the set \(\{ -1, 1 \}^d\), in which case the expected number of visits to the origin is $$ E(C) = \sum_{n = 0}^{\infty} 2^{-2nd} \binom{2n}{n}^d . $$ When are these processes recurrent? To answer either question, we need to figure out something about the asymptotic growth of the central binomial coefficient.
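
Before doing any analysis, we can at least watch these sums numerically. Here's a quick sketch in Python (my own aside, not part of the argument), using the recurrence \(p_n = p_{n-1} \cdot (2n-1)/(2n)\) for the return probability \(p_n = 2^{-2n}\binom{2n}{n}\) to avoid enormous intermediate integers:

```python
# Partial sums of E(C) = sum_n p_n^d, where p_n = 2^(-2n) * C(2n, n) is the
# return probability at step 2n and d is the dimension of the walk.
# p_n satisfies p_0 = 1 and p_n = p_{n-1} * (2n - 1) / (2n).
N = 100_000
p = 1.0
partial = {1: 1.0, 2: 1.0, 3: 1.0}
for n in range(1, N + 1):
    p *= (2 * n - 1) / (2 * n)
    for d in partial:
        partial[d] += p**d
for d, total in partial.items():
    print(f"d = {d}: partial sum up to n = {N} is {total:.3f}")
```

Since \(p_n\) turns out to be on the order of \(1/\sqrt{n}\), the \(d = 1\) and \(d = 2\) sums keep growing (like \(\sqrt{N}\) and \(\log N\), respectively), while the \(d = 3\) sum settles down to a limit.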

A basic "statistical intuition" tells us that \(\binom{2n}{n}\) may on the order of \(2^{2n}/\sqrt{n}\). To see why, choose \(0 < \epsilon < 1\) and put $$ F(n, \epsilon) = P(\lvert X_n \rvert < \epsilon) $$ so that $$ F(2n, \epsilon) = P(X_{2n} = 0) = 2^{-2n} \binom{2n}{n}. $$ Meanwhile, the central limit theorem tells us that \(F(n, \sqrt{n} \epsilon)\) converges to \(P(Z \le \epsilon)\) for large \(n\), where \(Z\) is normally distributed with the variance of an increment—in our case, \(V(Z) = 1\). Since the density of \(Z\) at \(0\) is \(1/\sqrt{2 \pi}\), we get that $$ F(n, \epsilon) \approx P\left( Z \le \frac{\epsilon}{\sqrt{n}} \right) \sim \frac{\epsilon}{\sqrt{2 \pi n}}. $$ Unfortunately, it seems quite difficult to get anything rigorous from this idea. In fact, the closer I look at the approximation above, the more I despair: not only does the expression \(\epsilon/\sqrt{2 \pi n}\) depend on \(\epsilon\) while \(F(n, \epsilon)\) does not, but in fact \(F(n, \epsilon)\) cannot be asymptotic to \(C/\sqrt{n}\) for any constant \(C\) for the simple reason that \(F(n, \epsilon) = 0\) for odd \(n\).

Although the central limit theorem is useless for drawing rigorous conclusions here—as far as I can tell—it "miraculously" gives us the right order of growth. In fact, it is true that $$ \binom{2n}{n} \sim \frac{2^{2n}}{\sqrt{\pi n}}. $$ Plugging this approximation into the sums above gives the classic result that the random walk on \(\Z^d\) is recurrent for \(d \le 2\) but dissipative for all larger \(d\). Moreover, an asymptotic series can be computed for the relative error, giving (for example) the remarkably accurate approximation $$ \binom{2n}{n} = \frac{2^{2n}}{\sqrt{\pi n}} \left( 1 -\frac{1}{8} n^{-1} +\frac{1}{128} n^{-2} +\frac{5}{1024} n^{-3} + o(n^{-3}) \right) $$ whose relative error drops below \(10^{-8}\) already around \(n = 16\). (You may play the "guess the sequence" game with those coefficients at your own risk.)
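
That accuracy is easy to check directly. The sketch below divides the huge integers \(\binom{2n}{n}\) and \(4^n\) against each other before any floating-point arithmetic, so nothing overflows:

```python
from math import comb, pi, sqrt

def series(n):
    """The bracketed correction series above, truncated after the n^-3 term."""
    return 1 - 1 / (8 * n) + 1 / (128 * n**2) + 5 / (1024 * n**3)

for n in (1, 4, 16, 64, 256):
    exact = comb(2 * n, n) / 4**n          # exact value of 2^(-2n) * C(2n, n)
    approx = series(n) / sqrt(pi * n)      # the approximation above, rescaled
    print(f"n = {n:3d}: relative error {abs(approx / exact - 1):.1e}")
```

The relative error decays like \(n^{-4}\), the order of the first omitted term.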

These estimates can be derived from Stirling's approximation for \(n!\), but in this post I want to focus on a simple idea from which various asymptotic series—including the Stirling series—can be derived independently.

Laplace's Method

Our asymptotic expansion for the central binomial coefficient will rely on approximating a certain exponential integral. As a warmup, let's consider the problem of approximating a Laplace transform-type integral $$ I(x) = \int_0^\infty f(t) e^{-x t} \, dt $$ for large \(x\). Assuming that \(f(t)\) does not grow too quickly for large \(t\), the part of our integral over the interval \([\delta, +\infty)\) will vanish exponentially, so the behavior of \(I(x)\) comes down to the behavior of \(f\) near \(0\). If \(f\) is also bounded near \(0\), then \(I(x)\) tends to \(0\) overall since $$ \int_0^\infty e^{-xt} \, dt = \frac{1}{x}. $$ Cancelling out this term, \(x I(x)\) can be understood as an expectation of \(f\) over an exponential distribution with mean \(1/x\). Indeed, where \(Y\) is exponentially distributed with density \(e^{-t}\), $$ x I(x) = \int_0^\infty f(t) x e^{-x t} \, dt = \int_0^\infty f( x^{-1} t) e^{-t} \, dt = E(f(x^{-1} Y)). $$ When \(f(t) = t^k,\) this expectation is easy to compute: $$ E(x^{-k} Y^k) = x^{-k} E(Y^k) = x^{-k} k!. $$ If \(f\) admits a Taylor series to order \(n\) with remainder \(r_n(t) = o(t^n)\), $$ \begin{align} I(x) & = \frac{1}{x}\left( \sum_{k = 0}^n E\left( \frac{f^{(k)}(0)}{k!} x^{-k} Y^k \right) + E(r_n(x ^{-1} Y) )\right) \\ & = \sum_{k = 0}^n \frac{f^{(k)}(0)}{x^{k + 1}} + \frac{1}{x} E(r_n(x^{-1} Y)). \end{align} $$ All that remains is to bound the expectation of the remainder. Given \(\epsilon > 0\), we know that \(\lvert r_n(t) \rvert < \epsilon \lvert t \rvert^n\) for all \(t\) below some sufficiently small \(\delta\), so let's consider the events \(Y < x \delta\) and \(Y \ge x \delta\) separately—call them \(S\) and \(L\). We will decompose the expectation into integrals over these two sets: $$ E(r_n(x ^{-1} Y)) = E_S(r_n(x ^{-1} Y)) + E_L(r_n(x ^{-1} Y)). $$ For the expectation over \(L\), we only need a bound like \(\lvert r_n(t) \rvert \le C t^K\) for some \(C\) and \(K\), since then $$ \lvert E_L(r_n(x ^{-1} Y)) \rvert \le \int_{\delta x}^\infty \lvert r_n(x ^{-1} t) \rvert e^{-t}\, dt \le \int_{\delta x}^\infty C x^{-K} t^K e^{-t}\, dt $$ is guaranteed to vanish exponentially for large \(x,\) as we mentioned above. (We call this kind of error beyond all orders, since it is dominated by \(x^{-n}\) for all \(n\).) Meanwhile, to bound the expectation over \(S\) we can use our bound on \(r_n\): $$ \lvert E_S(r_n(x ^{-1} Y))\rvert \le \epsilon E_S(x^{-n} Y^n) \le \epsilon \frac{E(Y^n)}{x^n}. $$ We conclude overall that $$ E(r_n(x ^{-1} Y)) = o(x^{-n}), $$ and so the whole approximation is $$ I(x) = \sum_{k = 0}^n \frac{f^{(k)}(0)}{x^{k + 1}} + o(x^{-n-1}). $$ The reasoning above can naturally be generalized to any variable \(Y\) with quickly decaying tails.
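
For a concrete check of this expansion (my example, not part of the derivation), take \(f(t) = 1/(1+t)\), so that \(f^{(k)}(0) = (-1)^k k!\) and the series predicts \(I(x) \approx \sum_k (-1)^k k!/x^{k+1}\). A sketch, assuming scipy is available for the quadrature:

```python
from math import exp, factorial, inf
from scipy.integrate import quad

# Check the expansion with f(t) = 1/(1 + t), where f^(k)(0) = (-1)^k * k!.
def I(x):
    value, _ = quad(lambda t: exp(-x * t) / (1 + t), 0, inf)
    return value

def series(x, n):
    return sum((-1)**k * factorial(k) / x**(k + 1) for k in range(n + 1))

for x in (5, 10, 50):
    print(f"x = {x:2d}: I(x) = {I(x):.8f}, series to n = 3: {series(x, 3):.8f}")
```

The agreement improves like \(x^{-5}\) as \(x\) grows, the order of the first omitted term.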

Proposition: Suppose \(f(t)\) is of polynomial order for large \(t\) and satisfies $$ f(t) = \sum_{k = 0}^n a_k t^k + o(t^n) $$ for small \(t\). Then, for any variable \(Y\) with exponentially bounded tails in the sense that \(P(\lvert Y \rvert \ge M) \le C e^{-\delta M}\) for some \(C > 0\), \(\delta > 0\), we also have $$ E(f(t Y)) = \sum_{k = 0}^n E(Y^k) a_k t^k + o(t^{n}). $$

This idea comes in handy in the general task of approximating an integral like $$ I(x) = \int_D f(t) e^{-x \phi(t)} \, dt $$ for large \(x\). Again, if \(f\) does not grow quickly for large \(t\), this integral is determined (up to an exponentially vanishing error) by the integrand near the minimizers of \(\phi.\) For an isolated minimum at \(t_0\) with \(\phi(t) = (t - t_0)^k + O((t - t_0)^{k + 1}),\) changing variables to \(\phi(t) = (s - t_0)^k\) in a neighborhood of \(t_0\) gives us an integral we can evaluate with the proposition above. This technique is known as Laplace's method. We'll see an example of this now.
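
As a quick illustration (my own, and using only the leading term), consider \(\Gamma(x + 1) = \int_0^\infty t^x e^{-t}\,dt\). Substituting \(t = xs\) turns this into \(x^{x+1} \int_0^\infty e^{-x(s - \ln s)}\,ds\), where \(\phi(s) = s - \ln s\) has an isolated minimum at \(s = 1\) with \(\phi(1) = \phi''(1) = 1\). The standard leading-order consequence of the recipe above for a quadratic minimum, \(I(x) \approx f(t_0)\, e^{-x \phi(t_0)} \sqrt{2\pi/(x \phi''(t_0))}\), then recovers Stirling's approximation:

```python
from math import exp, inf, log, pi, sqrt
from scipy.integrate import quad

# Laplace's method, leading order, on J(x) = integral of exp(-x*(s - log(s)))
# over (0, inf); phi(s) = s - log(s) has its minimum at s = 1.
def J(x):
    value, _ = quad(lambda s: exp(-x * (s - log(s))), 1e-12, inf)
    return value

for x in (10, 50, 200):
    leading = exp(-x) * sqrt(2 * pi / x)   # e^(-x phi(1)) * sqrt(2 pi/(x phi''(1)))
    print(f"x = {x:3d}: J(x)/leading = {J(x) / leading:.6f}")
# The ratio tends to 1; since x^(x+1) * J(x) = Gamma(x + 1), the leading term
# reproduces Stirling's approximation sqrt(2 pi x) * x^x * e^(-x).
```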

Estimating the Central Binomial Coefficient

Let's return to our random walk on the integers. The characteristic function of our increment is $$ \frac{e^{-i t} + e^{i t}}{2} = \cos(t), $$ and so \(\cos^n(t)\) will be the characteristic function of \(X_n\). Since \(X_n\) is supported on the integers, the characteristic function is a Fourier polynomial and we can extract the probability mass function easily; for example, $$ P(X_n = 0) = \frac{1}{2 \pi} \int_{-\pi}^\pi \cos^n(\theta) \, d \theta. $$ The integrand vanishes exponentially except near the points \(\theta = 0\) and \(\theta = \pm \pi.\) For odd \(n\) these two peaks cancel out—indeed, \(P(X_n = 0) = 0\) for odd \(n\)—and for even \(n\) they contribute equally. We conclude that $$ P(X_{2n} = 0) \approx \frac{1}{\pi} \int_{-\epsilon}^\epsilon \cos^{2n}(\theta) \, d\theta $$ for \(\epsilon < \pi\), ignoring an error beyond all orders of \(n\).
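
Both displays are easy to sanity-check numerically. A sketch (scipy's quad again), taking \(\epsilon = 1\) for the truncated integral:

```python
from math import comb, cos, pi
from scipy.integrate import quad

# Compare (1/(2 pi)) * integral of cos^(2n) over [-pi, pi] and the truncated
# (1/pi) * integral over [-1, 1] against the exact value 2^(-2n) * C(2n, n).
for n in (1, 5, 25):
    full, _ = quad(lambda t: cos(t) ** (2 * n), -pi, pi)
    near, _ = quad(lambda t: cos(t) ** (2 * n), -1, 1)
    exact = comb(2 * n, n) / 4**n
    print(f"n = {n:2d}: full = {full / (2 * pi):.12f}, "
          f"truncated = {near / pi:.12f}, exact = {exact:.12f}")
```

The full integral matches exactly, while the truncated one closes in on the exact value extremely fast as \(n\) grows, as the "beyond all orders" error estimate predicts.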

Since \(\ln \cos(\theta) \approx -\theta^2/2\), we can reparameterize by a variable \(s\) solving the equation $$ \ln \cos(\theta) = -\frac{s^2}{2}. $$ Near \(\theta = s = 0\), the solutions to this equation are a union of two smooth curves. We can choose one branch by defining $$ s = \theta \sqrt{\frac{-2 \ln \cos(\theta)}{\theta^2}} = \theta\sqrt{1 + O(\theta^2)}. $$ We can now change variables from \(\theta\) to \(s\) and write our integral as a certain expectation over a Gaussian variable: $$ \begin{align} P(X_{2n} = 0) & \approx \frac{1}{\pi}\int_{-\epsilon}^\epsilon \exp(2n \ln \cos(\theta)) \, d\theta \\ & = \frac{1}{\pi}\int_{-\epsilon}^\epsilon \frac{d\theta}{ds} \exp(-n s^2)\, ds \\ & = \frac{1}{\pi \sqrt{2n}} \int_{-\sqrt{2n} \epsilon}^{\sqrt{2n} \epsilon} \frac{d\theta}{ds}\left( \frac{s}{\sqrt{2n}} \right) \exp(-s^2/2) \, ds \\ & = \frac{1}{\sqrt{\pi n}} E\left( \frac{d\theta}{ds}((2n)^{-1/2} Z) \right). \end{align} $$ (In the last expectation, we're defining \(d\theta/ds(s) = 0\) for \(\lvert s \rvert \ge \epsilon\) as a convenience of notation.) Our leading-order approximation for \(P(X_{2n} = 0)\) is thus $$ P(X_{2n} = 0) = \frac{1}{\sqrt{\pi n}} + O(n^{-3/2}), $$ and in particular we have an asymptotic equivalence $$ P(X_{2n} = 0) \sim \frac{1}{\sqrt{\pi n}}. $$ In terms of the central binomial coefficient, this proves $$ \binom{2n}{n} = 2^{2n} P(X_{2n} = 0) \sim \frac{2^{2n}}{\sqrt{\pi n}}, $$ as promised. The full asymptotic expansion will be $$ \begin{align} \binom{2n}{n} \approx \frac{2^{2n}}{\sqrt{\pi n}}E\left( \frac{d\theta}{ds}((2n)^{-1/2} Z) \right) = \frac{2^{2n}}{\sqrt{\pi n}} \left( \sum_{k = 0}^N \frac{d^{k + 1}\theta}{ds^{k + 1}}(0) \frac{E(Z^k)}{k! (2n)^{k/2}} + o(n^{-N/2})\right). \end{align} $$ Since the odd moments of \(Z\) vanish and the even moments are \(E(Z^k) = (k - 1)!!\) for even \(k > 0\), we can rewrite this as $$ \frac{2^{2n}}{\sqrt{\pi n}} \left( 1 + \sum_{k = 1}^N \frac{d^{2k + 1}\theta}{ds^{2k + 1}}(0) \frac{1}{(2k)!! 2^k} n^{-k} + o(n^{-N}) \right). $$ All that remains is to compute the series expansion of \(\theta\) in terms of \(s\). As far as I know, there is no simple closed form for this series. Fortunately, if we happen to want coefficients, these computations can easily be done by a computer algebra system. Expanding $$ s = \theta \sqrt{\frac{-2 \ln \cos(\theta)}{\theta^2}} = \theta +\frac{\theta ^3}{12}+\frac{3 \theta ^5}{160}+\frac{209 \theta ^7}{40320}+O\left(\theta^9\right) $$ and inverting the series gives $$ \begin{align} \theta & = s-\frac{s^3}{12}+\frac{s^5}{480}+\frac{s^7}{2688}+O\left(s^9\right) \\ & = s - \frac{1}{2} \frac{s^3}{3!} + \frac{1}{4} \frac{s^5}{5!} + \frac{15}{8} \frac{s^7}{7!} + O(s^9). \end{align} $$ Altogether, we have $$ \begin{align} \binom{2n}{n} & = \frac{2^{2n}}{\sqrt{\pi n}} \left( 1 -\frac{1}{2} \frac{1}{2!! \times 2} n^{-1} + \frac{1}{4} \frac{1}{4!! \times 4} n^{-2}+ \frac{15}{8} \frac{1}{6!! \times 8} n^{-3} + o(n^{-3}) \right) \\ & = \frac{2^{2n}}{\sqrt{\pi n}} \left( 1 -\frac{1}{8} n^{-1} + \frac{1}{128} n^{-2}+ \frac{5}{1024} n^{-3} + o(n^{-3}) \right), \end{align} $$ as claimed above.
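
For the record, here is a sympy sketch of that computer algebra step. It expands \(s(\theta)\), then inverts by plugging an odd-series ansatz into \(s(\theta)\) and solving for the coefficients:

```python
import sympy as sp

theta, s = sp.symbols('theta s')

# Expand s(theta) = theta * sqrt(-2*log(cos(theta)) / theta^2) around theta = 0.
s_of_theta = sp.series(
    theta * sp.sqrt(-2 * sp.log(sp.cos(theta)) / theta**2), theta, 0, 9
).removeO()
print(sp.expand(s_of_theta))
# theta + theta^3/12 + 3*theta^5/160 + 209*theta^7/40320

# Invert with the ansatz theta = s + a3 s^3 + a5 s^5 + a7 s^7: composing it
# with s(theta) must give back s through order s^7.
a3, a5, a7 = sp.symbols('a3 a5 a7')
ansatz = s + a3 * s**3 + a5 * s**5 + a7 * s**7
residual = sp.expand(sp.series(s_of_theta.subs(theta, ansatz) - s, s, 0, 8).removeO())
sol = sp.solve([residual.coeff(s, k) for k in (3, 5, 7)], [a3, a5, a7])
print(sol)  # solution: a3 = -1/12, a5 = 1/480, a7 = 1/2688
```

Reading off \(\theta^{(2k+1)}(0)\) from these coefficients and dividing by \((2k)!!\, 2^k\) for \(k = 1, 2, 3\) reproduces the coefficients \(-1/8\), \(1/128\), and \(5/1024\).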
