Stirling's approximation
{{short description|Approximation for factorials}}
{{Dark mode invert|[[File:Mplwp factorial gamma stirling.svg|thumb|right|upright=1.35|Comparison of Stirling's approximation (pink) with the factorial (blue)]]}}
In [[mathematics]], '''Stirling's approximation''' (or '''Stirling's formula''') is an [[Asymptotic analysis|asymptotic]] approximation for [[factorial]]s. It is a good approximation, leading to accurate results even for small values of <math>n</math>. It is named after [[James Stirling (mathematician)|James Stirling]], though a related but less precise result was first stated by [[Abraham de Moivre]].{{r|dutka|LeCam1986|Pearson1924}}
One way of stating the approximation involves the [[logarithm]] of the factorial:
<math display=block>\ln n! = n\ln n - n + O(\ln n),</math>
where the [[big O notation]] means that, for all sufficiently large values of <math>n</math>, the difference between <math>\ln n!</math> and <math>n\ln n-n</math> will be at most proportional to the logarithm of <math>n</math>. In computer science applications such as the [[Comparison sort#Number of comparisons required to sort a list|worst-case lower bound for comparison sorting]], it is convenient to instead use the [[binary logarithm]], giving the equivalent form
<math display=block>\log_2 n! = n\log_2 n - n\log_2 e + O(\log_2 n).</math>
The error term in either base can be expressed more precisely as <math>\tfrac12\log 2\pi n + O(\tfrac1n)</math>, corresponding to an approximate formula for the factorial itself,
<math display=block>n! \sim \sqrt{2 \pi n}\left(\frac{n}{e}\right)^n.</math>
Here the sign <math>\sim</math> means that the two quantities are asymptotic, that is, their ratio tends to 1 as <math>n</math> tends to infinity.
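For illustration, the quality of the approximation even for small <math>n</math> can be checked with a short computation; the following Mathematica sketch (the helper name <code>stirling</code> is only illustrative) compares <math>n!</math> with <math>\sqrt{2\pi n}\,(n/e)^n</math>:
<syntaxhighlight lang="mathematica">
(* Compare the factorial with the basic Stirling approximation. *)
stirling[n_] := Sqrt[2 Pi n] (n/E)^n;
Table[{n, n!, N[stirling[n]], N[stirling[n]/n!]}, {n, {1, 2, 5, 10, 20}}] // TableForm
(* The ratio in the last column tends to 1: about 0.92 at n = 1 and 0.996 at n = 20. *)
</syntaxhighlight>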
== History ==
The formula was first discovered by [[Abraham de Moivre]]{{r|LeCam1986}} in 1721 in the form
<math display=block>n! \sim [{\rm constant}] \cdot n^{n+\frac12} e^{-n}.</math>
De Moivre gave an approximate rational-number expression for the natural logarithm of the constant. Stirling's contribution in 1730 consisted of showing that the constant is precisely <math>\sqrt{2\pi}</math>.{{r|Pearson1924}}<ref>[https://archive.org/details/bim_eighteenth-century_methodus-differentialis_stirling-james-frs-_1730/mode/2up Methodus Differentialis: Sive Tractatus de Summatione et Interpolatione Serierum Infinitarum], Jacob Stirling, London, 1730</ref>
== Derivation ==
Roughly speaking, the simplest version of Stirling's formula can be quickly obtained by approximating the sum
<math display=block>\ln n! = \sum_{j=1}^n \ln j</math>
with an [[integral]]:
<math display=block>\sum_{j=1}^n \ln j \approx \int_1^n \ln x \,{\rm d}x = n\ln n - n + 1.</math>
The full formula, together with precise estimates of its error, can be derived as follows. Instead of approximating <math>n!</math>, one considers its [[natural logarithm]], as this is a [[slowly varying function]]:
<math display=block>\ln n! = \ln 1 + \ln 2 + \cdots + \ln n.</math>
The right-hand side of this equation minus
<math display=block>\tfrac{1}{2}(\ln 1 + \ln n) = \tfrac{1}{2}\ln n</math>
is the approximation by the [[trapezoid rule]] of the integral
<math display=block>\ln n! - \tfrac{1}{2}\ln n \approx \int_1^n \ln x\,{\rm d}x = n \ln n - n + 1,</math>
and the error in this approximation is given by the [[Euler–Maclaurin formula]]:
<math display=block>\begin{align}
\ln n! - \tfrac{1}{2}\ln n & = \ln 1 + \ln 2 + \ln 3 + \cdots + \ln(n-1) + \tfrac{1}{2}\ln n\\
& = n \ln n - n + 1 + \sum_{k=2}^{m} \frac{(-1)^k B_k}{k(k-1)} \left( \frac{1}{n^{k-1}} - 1 \right) + R_{m,n},
\end{align}</math>
where <math>B_k</math> is a [[Bernoulli number]], and {{math|''R''<sub>''m'',''n''</sub>}} is the remainder term in the Euler–Maclaurin formula. Take limits to find that
<math display=block>\lim_{n \to \infty} \left( \ln n! - n \ln n + n - \tfrac{1}{2}\ln n \right) = 1 - \sum_{k=2}^{m} \frac{(-1)^k B_k}{k(k-1)} + \lim_{n \to \infty} R_{m,n}.</math>
Denote this limit as <math>y</math>. Because the remainder {{math|''R''<sub>''m'',''n''</sub>}} in the Euler–Maclaurin formula satisfies
<math display=block>R_{m,n} = \lim_{n \to \infty} R_{m,n} + O\!\left(\frac{1}{n^m}\right),</math>
where [[big-O notation]] is used, combining the equations above yields the approximation formula in its logarithmic form:
<math display=block>\ln n! = n \ln \left( \frac{n}{e} \right) + \tfrac{1}{2}\ln n + y + \sum_{k=2}^{m} \frac{(-1)^k B_k}{k(k-1)n^{k-1}} + O\!\left(\frac{1}{n^m}\right).</math>
Taking the exponential of both sides and choosing any positive integer <math>m</math>, one obtains a formula involving an unknown quantity <math>e^y</math>. For {{math|''m'' {{=}} 1}}, the formula is
<math display=block>n! = e^y \sqrt{n} \left( \frac{n}{e} \right)^n \left( 1 + O\!\left(\frac{1}{n}\right) \right).</math>
The quantity <math>e^y</math> can be found by taking the limit on both sides as <math>n</math> tends to infinity and using [[Wallis product|Wallis' product]], which shows that <math>e^y=\sqrt{2\pi}</math>. Therefore, one obtains Stirling's formula:
<math display=block>n! = \sqrt{2 \pi n} \left( \frac{n}{e} \right)^n \left( 1 + O\!\left(\frac{1}{n}\right) \right).</math>
== Alternative derivations ==
An alternative formula for <math>n!</math> using the [[gamma function]] is
<math display=block> n! = \int_0^\infty x^n e^{-x}\,{\rm d}x</math>
(as can be seen by repeated integration by parts). Rewriting and changing variables {{math|''x'' {{=}} ''ny''}}, one obtains
<math display=block> n! = \int_0^\infty e^{n\ln x-x}\,{\rm d}x = e^{n \ln n} n \int_0^\infty e^{n(\ln y -y)}\,{\rm d}y.</math>
Applying [[Laplace's method]] one has
<math display=block>\int_0^\infty e^{n(\ln y -y)}\,{\rm d}y \sim \sqrt{\frac{2\pi}{n}} e^{-n},</math>
which recovers Stirling's formula:
<math display=block>n! \sim e^{n \ln n} n \sqrt{\frac{2\pi}{n}} e^{-n} = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n.</math>
=== Higher orders ===
In fact, further corrections can also be obtained using Laplace's method. From the previous result, we know that <math>\Gamma(x)</math> is dominated by the factor <math>x^x e^{-x}</math>, so we "peel off" this dominant term, then perform two changes of variables, to obtain:<math display="block">x^{-x}e^x\Gamma(x) = \int_\R e^{x(1+t-e^t)}\,dt.</math>To verify this: <math>\int_\R e^{x(1+t-e^t)}\,dt \overset{t \mapsto \ln t}{=} e^x \int_0^\infty t^{x-1} e^{-xt}\, dt \overset{t \mapsto t/x}{=} x^{-x} e^x \int_0^\infty e^{-t} t^{x-1}\, dt = x^{-x} e^x \Gamma(x)</math>.
Now the function <math>t \mapsto 1+t - e^t</math> is unimodal, with maximum value zero. Locally around zero, it looks like <math>-t^2/2</math>, which is why we are able to perform Laplace's method. In order to extend Laplace's method to higher orders, we perform another change of variables by <math>1+t-e^t = -\tau^2/2</math>. This equation cannot be solved in closed form, but it can be solved by series expansion, which gives us <math>t = \tau - \tau^2/6 + \tau^3/36 + a_4 \tau^4 + O(\tau^5)</math>. Plugging this back into the integral gives<math display="block">x^{-x}e^x\Gamma(x) = \int_\R e^{-x\tau^2/2}\left(1-\tau/3 + \tau^2/12 + 4a_4 \tau^3 + O(\tau^4)\right) d\tau = \sqrt{2\pi}\left(x^{-1/2} + x^{-3/2}/12\right) + O(x^{-5/2});</math>note that we do not actually need to find <math>a_4</math>, since its contribution, like that of every odd power of <math>\tau</math>, is cancelled by the integral. Higher orders can be achieved by computing more terms in <math>t = \tau + \cdots</math>, which can be obtained programmatically.{{NoteTag|note=For example, a program in Mathematica: <syntaxhighlight lang="mathematica">
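(* Sketch of one possible computation (an illustrative reconstruction, not
   necessarily the program referred to above): invert 1 + t - E^t == -s^2/2 as a
   power series t(s), then integrate E^(-x s^2/2) t'(s) term by term. *)
order = 9;
tOfS = InverseSeries[Series[Sqrt[2 (Exp[t] - 1 - t)], {t, 0, order}], s];
Integrate[Exp[-x s^2/2] D[Normal[tOfS], s], {s, -Infinity, Infinity},
  Assumptions -> x > 0] // Expand
(* Gives Sqrt[2 Pi] (x^(-1/2) + x^(-3/2)/12 + x^(-5/2)/288 - ...), i.e. the
   coefficients of the Stirling series. *)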
</syntaxhighlight>|name=mathematica-program|content=content|text=text}}
Thus we get Stirling's formula to two orders:<math display="block"> n! = \sqrt{2\pi n}\left(\frac{n}{e}\right)^n \left(1 + \frac{1}{12 n}+O\!\left(\frac{1}{n^2}\right) \right).
</math>
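As a numerical illustration of this correction term, the following sketch (in Mathematica; the helper names are illustrative) compares the relative error of the basic formula with that of the formula including the <math>\tfrac{1}{12n}</math> term:
<syntaxhighlight lang="mathematica">
(* Relative error without and with the 1/(12 n) correction. *)
base[n_]      := Sqrt[2 Pi n] (n/E)^n;
corrected[n_] := base[n] (1 + 1/(12 n));
Table[{n, N[1 - base[n]/n!], N[1 - corrected[n]/n!]}, {n, {5, 10, 50}}] // TableForm
(* At n = 10 the errors are about 8*10^-3 and 3*10^-5 respectively, consistent
   with the next omitted term being of order 1/(288 n^2). *)
</syntaxhighlight>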
=== Complex-analytic version ===
A complex-analysis version of this method is to consider <math>\frac{1}{n!}</math> as a Taylor coefficient of the exponential function <math>e^z = \sum_{n=0}^\infty \frac{z^n}{n!}</math>, computed by [[Cauchy's integral formula]] as
<math display=block>\frac{1}{n!} = \frac{1}{2\pi i} \oint_{|z|=r} \frac{e^z}{z^{n+1}}\,{\rm d}z.</math>
This line integral can then be approximated using the [[saddle-point method]] with an appropriate choice of contour radius <math>r = r_n</math>. The dominant portion of the integral near the saddle point is then approximated by a real integral and Laplace's method, while the remaining portion of the integral can be bounded above to give an error term.
=== Using the Central Limit Theorem and the Poisson distribution ===
An alternative version uses the fact that the [[Poisson distribution]] converges to a [[normal distribution]] by the [[Central limit theorem|Central Limit Theorem]].<ref>{{Cite book |last=MacKay |first=David J. C. |title=Information theory, inference, and learning algorithms |date=2019 |publisher=Cambridge University Press |isbn=978-0-521-64298-9 |edition=22nd printing |location=Cambridge}}</ref>
Since the Poisson distribution with parameter <math>\mu</math> converges to a normal distribution with mean <math>\mu</math> and variance <math>\mu</math>, their [[Probability density function|density functions]] will be approximately the same:
<math display="block">\frac{\exp(-\mu)\mu^x}{x!}\approx \frac{1}{\sqrt{2\pi\mu}}\exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sqrt{\mu}}\right)^{2}\right).</math>
Evaluating this expression at the mean <math>x=\mu</math>, where the approximation is particularly accurate, simplifies it to
<math display="block">\frac{\exp(-\mu)\mu^\mu}{\mu!}\approx \frac{1}{\sqrt{2\pi\mu}}.</math>
Taking logarithms then gives
<math display="block">-\mu+\mu\ln\mu-\ln\mu!\approx -\frac{1}{2}\ln 2\pi\mu,</math>
which can easily be rearranged to give
<math display="block">\ln\mu!\approx \mu\ln\mu - \mu + \frac{1}{2}\ln 2\pi\mu.</math>
Evaluating at <math>\mu=n</math> gives the usual, more precise form of Stirling's approximation.
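The central step above, namely that the Poisson probability mass at its mean is close to <math>\tfrac{1}{\sqrt{2\pi\mu}}</math>, can be illustrated numerically with a short sketch (in Mathematica; the helper name is illustrative):
<syntaxhighlight lang="mathematica">
(* Poisson pmf at its mean versus the normal density value 1/Sqrt[2 Pi mu]. *)
poissonAtMean[mu_] := Exp[-mu] mu^mu/mu!;
Table[{mu, N[poissonAtMean[mu]], N[1/Sqrt[2 Pi mu]],
    N[poissonAtMean[mu] Sqrt[2 Pi mu]]}, {mu, {5, 20, 100}}] // TableForm
(* The ratio in the last column tends to 1, e.g. about 0.9958 at mu = 20. *)
</syntaxhighlight>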
== Speed of convergence and error estimates ==
{{Dark mode invert|[[File:Stirling series relative error.svg|thumb|upright=1.8|The relative error in a truncated Stirling series vs. <math>n</math>, for 0 to 5 terms. The kinks in the curves represent points where the truncated series coincides with {{math|Γ(''n'' + 1)}}.]]}}
Stirling's formula is in fact the first approximation to the following series (now called the '''Stirling series'''):{{r|nist}}
<math display=block>
n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^n \left(1 +\frac{1}{12n}+\frac{1}{288n^2} - \frac{139}{51840n^3} - \frac{571}{2488320n^4} + \frac{163879}{209018880n^5} - \cdots \right).</math>
An explicit formula for the coefficients in this series was given by G. Nemes.{{r|Nemes2010-2}} Further terms are listed in the [[On-Line Encyclopedia of Integer Sequences]] as {{OEIS link|A001163}} and {{OEIS link|A001164}}. The first graph in this section shows the [[Approximation error|relative error]] vs. <math>n</math>, for 1 through all 5 terms listed above. Bender and Orszag<ref>{{Cite book |last1=Bender |first1=Carl M. |title=Advanced mathematical methods for scientists and engineers. 1: Asymptotic methods and perturbation theory |last2=Orszag |first2=Steven A. |date=2009 |publisher=Springer |isbn=978-0-387-98931-0 |edition=Nachdr. |location=New York, NY}}</ref> (p. 218) give the asymptotic formula for the coefficients:
<math display="block">A_{2 j+1} \sim \frac{(-1)^j\, 2\,(2 j)!}{(2 \pi)^{2(j+1)}},</math>
which shows that the coefficients grow superexponentially, and that by the [[ratio test]] the [[radius of convergence]] of the series is zero.
However, the representation obtained directly from the Euler–Maclaurin approximation, in which the correction term itself is the argument of the exponential function, converges much faster (it needs half the number of correction terms for the same accuracy):
<math display=block>
n! \sim \sqrt{2\pi n}\left(\frac{n}{e}\right)^n \exp\bigg(\frac{1}{12n} - \frac{1}{360n^3} + \frac{1}{1260n^5} - \frac{1}{1680n^7} + \frac{1}{1188n^9} - \cdots\bigg).</math>
The coefficient of <math>1/n^{2k-1}</math> in this exponent is given directly in terms of [[Bernoulli number]]s as <math>c_k = \tfrac{B_{2k}}{2k(2k-1)}.</math>
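These coefficients can be generated directly from the Bernoulli numbers; the following sketch (in Mathematica; the helper names are illustrative) reproduces the terms shown above and checks the resulting approximation of <math>\ln n!</math>:
<syntaxhighlight lang="mathematica">
(* Coefficients c_k = B_(2k)/(2k (2k-1)) of the exponential correction above. *)
c[k_] := BernoulliB[2 k]/(2 k (2 k - 1));
Table[c[k], {k, 1, 5}]   (* {1/12, -1/360, 1/1260, -1/1680, 1/1188} *)
logStirling[n_, m_] := n Log[n] - n + Log[2 Pi n]/2 + Sum[c[k]/n^(2 k - 1), {k, 1, m}];
N[{Log[20!], logStirling[20, 3]}, 20]   (* agree to more than 12 significant digits *)
</syntaxhighlight>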
{{Dark mode invert|[[File:Stirling error vs number of terms.svg|thumb|upright=1.8|The relative error in a truncated Stirling series vs. the number of terms used]]}}
As {{math|''n'' → ∞}}, the error in the truncated series is asymptotically equal to the first omitted term. This is an example of an [[asymptotic expansion]]. It is not a [[convergent series]]; for any ''particular'' value of <math>n</math> there are only so many terms of the series that improve accuracy, after which accuracy worsens. This is shown in the next graph, which shows the relative error versus the number of terms in the series, for larger numbers of terms. More precisely, let {{math|''S<sub>t</sub>''(''n'')}} be the Stirling series to <math>t</math> terms evaluated at <math>n</math>. The graphs show
<math display=block>\left| \ln \frac{S_t(n)}{n!} \right|,</math>
which, when small, is essentially the relative error.
Writing Stirling's series in the form
<math display=block>\ln n! \sim n\ln n - n + \tfrac12\ln 2\pi n +\frac{1}{12n} - \frac{1}{360n^3} + \frac{1}{1260n^5} - \frac{1}{1680n^7} + \cdots,</math>
it is known that the error in truncating the series is always of the opposite sign and at most the same magnitude as the first omitted term.{{Citation needed|date=December 2024}}
Other bounds, due to Robbins,{{r|Robbins1955}} valid for all positive integers <math>n</math>, are
<math display=block>\sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{\frac{1}{12n + 1}} < n! < \sqrt{2\pi n}\left(\frac{n}{e}\right)^n e^{\frac{1}{12n}}.</math>
This upper bound corresponds to stopping the above series for <math>\ln n!</math> after the <math>\tfrac{1}{n}</math> term. The lower bound is weaker than that obtained by stopping the series after the <math>\tfrac{1}{n^3}</math> term. A looser version of this bound is that <math>\frac{n!\, e^n}{n^{n+\tfrac{1}{2}}} \in (\sqrt{2 \pi}, e]</math> for all <math>n \ge 1</math>.
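Robbins' bounds pin <math>n!</math> down to a relative width of roughly <math>\tfrac{1}{144n^2}</math>; they can be checked numerically with a short sketch (in Mathematica; the helper names are illustrative):
<syntaxhighlight lang="mathematica">
(* Check Robbins' bounds for a few values of n. *)
lowerR[n_] := Sqrt[2 Pi n] (n/E)^n Exp[1/(12 n + 1)];
upperR[n_] := Sqrt[2 Pi n] (n/E)^n Exp[1/(12 n)];
Table[{n, N[lowerR[n] < n! < upperR[n]], N[upperR[n]/lowerR[n] - 1]}, {n, {1, 10, 100}}]
(* Each comparison returns True; the relative gap between the bounds is about 1/(144 n^2). *)
</syntaxhighlight>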
==Stirling's formula for the gamma function==
For all positive integers,
<math display=block>n! = \Gamma(n + 1),</math>
where <math>\Gamma</math> denotes the [[gamma function]].
However, the gamma function, unlike the factorial, is more broadly defined for all complex numbers other than non-positive integers; nevertheless, Stirling's formula may still be applied. If {{math|Re(''z'') > 0}}, then
<math display=block>\ln\Gamma (z) = z\ln z - z + \tfrac12\ln\frac{2\pi}{z} + \int_0^\infty\frac{2\arctan\left(\frac{t}{z}\right)}{e^{2\pi t}-1}\,{\rm d}t.</math>
Repeated integration by parts gives
<math display=block>\ln\Gamma(z) \sim z\ln z - z + \tfrac12\ln\frac{2\pi}{z} + \sum_{n=1}^{N-1} \frac{B_{2n}}{2n(2n-1)z^{2n-1}},</math>
where <math>B_{2n}</math> is the <math>2n</math>-th [[Bernoulli number]] (note that the limit of the sum as <math>N\to\infty</math> is not convergent, so this formula is just an [[asymptotic expansion]]). The formula is valid for <math>z</math> large enough in absolute value, when <math>|\arg(z)| < \pi - \varepsilon</math>, where <math>\varepsilon</math> is positive, with an error term of <math>O(z^{-2N+1})</math>. The corresponding approximation may now be written:
<math display=block>\Gamma(z) = \sqrt{\frac{2\pi}{z}}\left(\frac{z}{e}\right)^z \left(1 + O\!\left(\frac{1}{z}\right)\right),</math>
where the expansion is identical to that of Stirling's series above for <math>n!</math>, except that <math>n</math> is replaced with <math>z-1</math>.
A further application of this asymptotic expansion is for complex argument <math>z</math> with constant <math>\operatorname{Re}(z)</math>. See for example the Stirling formula applied in <math>\operatorname{Im}(z) = t</math> of the [[Riemann–Siegel theta function]] on the straight line <math>\tfrac14 + it</math>.
== A convergent version of Stirling's formula ==
[[Thomas Bayes]] showed, in a letter to [[John Canton]] published by the [[Royal Society]] in 1763, that Stirling's formula did not give a [[convergent series]]. Obtaining a convergent version of Stirling's formula entails evaluating Binet's formula:
<math display=block>\int_0^\infty \frac{2\arctan\left(\frac{t}{x}\right)}{e^{2\pi t}-1}\,{\rm d}t = \ln\Gamma(x) - x\ln x + x - \tfrac12\ln\frac{2\pi}{x}.</math>
One way to do this is by means of a convergent series of inverted [[rising factorial]]s, with coefficients that can be expressed in terms of the [[Stirling numbers of the first kind]]; the resulting version of Stirling's series converges when <math>\operatorname{Re}(x) > 0</math>. Stirling's formula may also be given in convergent form.
== Versions suitable for calculators ==
The approximation
<math display=block>\Gamma(z) \approx \sqrt{\frac{2\pi}{z}}\left(\frac{z}{e}\sqrt{z\sinh\frac{1}{z} + \frac{1}{810z^6}}\right)^{z}</math>
and its equivalent logarithmic form can be obtained by rearranging Stirling's extended formula and observing a coincidence between the resultant power series and the [[Taylor series]] expansion of the [[hyperbolic sine]] function. This approximation is good to more than 8 decimal digits for <math>z</math> with a real part greater than 8. Robert H. Windschitl suggested it in 2002 for computing the gamma function with fair accuracy on calculators with limited program or register memory.
Gergő Nemes proposed in 2007 an approximation which gives the same number of exact digits as the Windschitl approximation but is much simpler:
<math display=block>\Gamma(z) \approx \sqrt{\frac{2\pi}{z}}\left(\frac{1}{e}\left(z + \frac{1}{12z - \frac{1}{10z}}\right)\right)^{z},</math>
or equivalently,
<math display=block>\ln\Gamma(z) \approx \tfrac{1}{2}\left(\ln 2\pi - \ln z\right) + z\left(\ln\left(z + \frac{1}{12z - \frac{1}{10z}}\right) - 1\right).</math>
An alternative approximation for the gamma function stated by [[Srinivasa Ramanujan]] in [[Ramanujan's lost notebook]] is
<math display=block>\Gamma(1+x) \approx \sqrt{\pi}\left(\frac{x}{e}\right)^x \left(8x^3 + 4x^2 + x + \frac{1}{30}\right)^{1/6}</math>
for <math>x \ge 0</math>. The equivalent approximation for <math>\ln n!</math> has an asymptotic error of <math>\tfrac{1}{1400n^3}</math> and is given by
<math display=block>\ln n! \approx n\ln n - n + \tfrac{1}{6}\ln\left(8n^3 + 4n^2 + n + \tfrac{1}{30}\right) + \tfrac{1}{2}\ln\pi.</math>
The approximation may be made precise by giving paired upper and lower bounds; one such inequality is{{r|E.A.Karatsuba|Mortici2011-1|Mortici2011-2|Mortici2011-3}}
<math display=block> \sqrt{\pi} \left(\frac{x}{e}\right)^x \left( 8x^3 + 4x^2 + x + \frac{1}{100} \right)^{1/6} < \Gamma(1+x) < \sqrt{\pi} \left(\frac{x}{e}\right)^x \left( 8x^3 + 4x^2 + x + \frac{1}{30} \right)^{1/6}.</math>
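This pair of bounds, together with Ramanujan's approximation above, can be checked numerically with a short sketch (in Mathematica; the helper names are illustrative):
<syntaxhighlight lang="mathematica">
(* Check the paired bounds around Gamma[1 + x]. *)
lowerB[x_] := Sqrt[Pi] (x/E)^x (8 x^3 + 4 x^2 + x + 1/100)^(1/6);
upperB[x_] := Sqrt[Pi] (x/E)^x (8 x^3 + 4 x^2 + x + 1/30)^(1/6);
Table[{x, lowerB[x] < Gamma[1 + x] < upperB[x], N[upperB[x]/Gamma[1 + x] - 1]},
  {x, {1, 2, 10, 50}}] // N
(* The inequality holds at each sample point; the upper (Ramanujan) expression is
   within about 10^-6 of Gamma[1 + x] already at x = 10. *)
</syntaxhighlight>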
== See also ==
== References ==
== External links ==
* Peter Luschny, ''Approximation formulas for the factorial function n!''
* Stirling's approximation at [[PlanetMath]]