Box–Muller transform: Difference between revisions

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search
imported>Shuisman
Undid revision 1294400558 by Shuisman (talk)
imported>LoisPaulin
m Requires citation for non encyclopedic sentence
 
Line 2: Line 2:
[[File:Box-Muller transform visualisation.svg|thumb|300px|Visualisation of the Box–Muller transform — the coloured points in the unit square (u<sub>1</sub>, u<sub>2</sub>), drawn as circles, are mapped to a 2D Gaussian (z<sub>0</sub>, z<sub>1</sub>), drawn as crosses. The plots at the margins are the probability distribution functions of z0 and z1. z0 and z1 are unbounded; they appear to be in {{closed-closed|&minus;2.5, 2.5}} due to the choice of the illustrated points. In [https://upload.wikimedia.org/wikipedia/commons/1/1f/Box-Muller_transform_visualisation.svg the SVG file], hover over a point to highlight it and its corresponding point.]]
[[File:Box-Muller transform visualisation.svg|thumb|300px|Visualisation of the Box–Muller transform — the coloured points in the unit square (u<sub>1</sub>, u<sub>2</sub>), drawn as circles, are mapped to a 2D Gaussian (z<sub>0</sub>, z<sub>1</sub>), drawn as crosses. The plots at the margins are the probability distribution functions of z0 and z1. z0 and z1 are unbounded; they appear to be in {{closed-closed|&minus;2.5, 2.5}} due to the choice of the illustrated points. In [https://upload.wikimedia.org/wikipedia/commons/1/1f/Box-Muller_transform_visualisation.svg the SVG file], hover over a point to highlight it and its corresponding point.]]


The '''Box–Muller transform''', by [[George E. P. Box|George Edward Pelham Box]] and [[Mervin Edgar Muller]],<ref>{{Cite journal |doi=10.1214/aoms/1177706645 |jstor=2237361|title=A Note on the Generation of Random Normal Deviates|journal=The Annals of Mathematical Statistics|volume=29|issue=2|pages=610–611|last1=Box|first1=G. E. P.|last2=Muller|first2=Mervin E.|year=1958|doi-access=free}}</ref> is a [[random number sampling]] method for generating pairs of [[statistical independence|independent]], standard, [[normal distribution|normally distributed]] (zero [[expected value|expectation]], unit [[variance]]) random numbers, given a source of [[Uniform distribution (continuous)|uniformly distributed]] random numbers. The method was first mentioned explicitly by [[Raymond Paley|Raymond E. A. C. Paley]] and [[Norbert Wiener]] in their 1934 treatise on Fourier transforms in the complex domain.<ref>Raymond E. A. C. Paley and Norbert Wiener ''Fourier Transforms in the Complex Domain,'' New York: American Mathematical Society (1934) §37.</ref> Given the status of these latter authors and the widespread availability and use of their treatise, it is almost certain that Box and Muller were well aware of its contents.
The '''Box–Muller transform''', by [[George E. P. Box|George Edward Pelham Box]] and [[Mervin Edgar Muller]],<ref>{{Cite journal |doi=10.1214/aoms/1177706645 |jstor=2237361|title=A Note on the Generation of Random Normal Deviates|journal=The Annals of Mathematical Statistics|volume=29|issue=2|pages=610–611|last1=Box|first1=G. E. P.|last2=Muller|first2=Mervin E.|year=1958|doi-access=free}}</ref> is a [[random number sampling]] method for generating pairs of [[statistical independence|independent]], standard, [[normal distribution|normally distributed]] (zero [[expected value|expectation]], unit [[variance]]) random numbers, given a source of [[Uniform distribution (continuous)|uniformly distributed]] random numbers. The method was first mentioned explicitly by [[Raymond Paley|Raymond E. A. C. Paley]] and [[Norbert Wiener]] in their 1934 treatise on [[Fourier transform|Fourier transforms]] in the complex domain.<ref>Raymond E. A. C. Paley and Norbert Wiener ''Fourier Transforms in the Complex Domain,'' New York: American Mathematical Society (1934) §37.</ref> Given the status of these latter authors and the widespread availability and use of their treatise, it is almost certain that Box and Muller were well aware of its contents{{Citation needed|date=August 2025|reason=Encyclopedic value is unclear unless there was a discourse on the subject that should be cited and treated as historical information}}.


The Box–Muller transform is commonly expressed in two forms. The basic form as given by Box and Muller takes two samples from the uniform distribution on the interval {{open-open|0,1}} and maps them to two standard, normally distributed samples.  The polar form takes two samples from a different interval, {{closed-closed|−1,+1}}, and maps them to two normally distributed samples without the use of sine or cosine functions.
The Box–Muller transform is commonly expressed in two forms. The basic form as given by Box and Muller takes two samples from the uniform distribution on the [[Interval (mathematics)|interval]] {{open-open|0,1}} and maps them to two standard, normally distributed samples.  The polar form takes two samples from a different interval, {{closed-closed|−1,+1}}, and maps them to two normally distributed samples without the use of sine or cosine functions.


The Box–Muller transform was developed as a more computationally efficient alternative to the [[inverse transform sampling method]].<ref>Kloeden and Platen, ''Numerical Solutions of Stochastic Differential Equations'', pp. 11–12</ref> The [[ziggurat algorithm]] gives a more efficient method for scalar processors (e.g. old CPUs), while the Box–Muller transform is superior for processors with vector units (e.g. GPUs or modern CPUs).<ref>{{Cite book | last1 = Howes | first1 = Lee | last2 = Thomas | first2 = David | title = GPU Gems 3 - Efficient Random Number Generation and Application Using CUDA | publisher = Pearson Education, Inc. | year = 2008 | isbn = 978-0-321-51526-1 }}</ref>
The Box–Muller transform was developed as a more computationally efficient alternative to the [[inverse transform sampling method]].<ref>Kloeden and Platen, ''Numerical Solutions of Stochastic Differential Equations'', pp. 11–12</ref> The [[ziggurat algorithm]] gives a more efficient method for scalar processors (e.g. old CPUs), while the Box–Muller transform is superior for processors with vector units (e.g. [[Graphics processing unit|GPUs]] or modern [[Central processing unit|CPUs]]).<ref>{{Cite book | last1 = Howes | first1 = Lee | last2 = Thomas | first2 = David | title = GPU Gems 3 - Efficient Random Number Generation and Application Using CUDA | publisher = Pearson Education, Inc. | year = 2008 | isbn = 978-0-321-51526-1 }}</ref>


==Basic form==
==Basic form==
Line 27: Line 27:
[[Image:BoxMullerTransformUsingPolarCoordinates.png|thumb|400px|Two uniformly distributed values, ''u'' and ''v'' are used to produce the value {{math|1=''s'' = ''R''<sup>2</sup>}}, which is likewise uniformly distributed. The definitions of the sine and cosine are then applied to the basic form of the Box–Muller transform to avoid using trigonometric functions.]] The polar form was first proposed by J. Bell<ref name=Bell68>{{Cite journal | url=http://portal.acm.org/citation.cfm?doid=363397.363547 | doi=10.1145/363397.363547| title=Algorithm 334: Normal random deviates| journal = Communications of the ACM| volume=11| issue=7| pages=498| year=1968| last1=Bell| first1=James R.| doi-access=free}}</ref> and then modified by R. Knop.<ref>{{Cite journal | url=http://portal.acm.org/citation.cfm?doid=362946.362996 | doi=10.1145/362946.362996| title=Remark on algorithm 334 &#91;G5&#93;: Normal random deviates| journal=Communications of the ACM| volume=12| issue=5| pages=281| year=1969| last1=Knop| first1=R.| doi-access=free}}</ref> While several different versions of the polar method have been described, the version of R. Knop will be described here because it is the most widely used, in part due to its inclusion in ''[[Numerical Recipes]]''.  A slightly different form is described as "Algorithm P" by D. Knuth in ''[[The Art of Computer Programming]]''.<ref>{{cite book|first1=Donald |last1=Knuth |author-link=Donald Knuth |title=The Art of Computer Programming: Volume 2: Seminumerical Algorithms |year=1998 |page=122 |publisher=Addison-Wesley |isbn=0-201-89684-2}}</ref>
[[Image:BoxMullerTransformUsingPolarCoordinates.png|thumb|400px|Two uniformly distributed values, ''u'' and ''v'' are used to produce the value {{math|1=''s'' = ''R''<sup>2</sup>}}, which is likewise uniformly distributed. The definitions of the sine and cosine are then applied to the basic form of the Box–Muller transform to avoid using trigonometric functions.]] The polar form was first proposed by J. Bell<ref name=Bell68>{{Cite journal | url=http://portal.acm.org/citation.cfm?doid=363397.363547 | doi=10.1145/363397.363547| title=Algorithm 334: Normal random deviates| journal = Communications of the ACM| volume=11| issue=7| pages=498| year=1968| last1=Bell| first1=James R.| doi-access=free}}</ref> and then modified by R. Knop.<ref>{{Cite journal | url=http://portal.acm.org/citation.cfm?doid=362946.362996 | doi=10.1145/362946.362996| title=Remark on algorithm 334 &#91;G5&#93;: Normal random deviates| journal=Communications of the ACM| volume=12| issue=5| pages=281| year=1969| last1=Knop| first1=R.| doi-access=free}}</ref> While several different versions of the polar method have been described, the version of R. Knop will be described here because it is the most widely used, in part due to its inclusion in ''[[Numerical Recipes]]''.  A slightly different form is described as "Algorithm P" by D. Knuth in ''[[The Art of Computer Programming]]''.<ref>{{cite book|first1=Donald |last1=Knuth |author-link=Donald Knuth |title=The Art of Computer Programming: Volume 2: Seminumerical Algorithms |year=1998 |page=122 |publisher=Addison-Wesley |isbn=0-201-89684-2}}</ref>


Given {{mvar|u}} and {{mvar|v}}, independent and uniformly distributed in the closed interval {{closed-closed|−1, +1}}, set {{math|1=''s'' = ''R''<sup>2</sup> = ''u''<sup>2</sup> + ''v''<sup>2</sup>}}. If {{math|1=''s'' = 0}} or {{math|''s'' ≥ 1}}, discard ''u'' and ''v'', and try another pair {{math|(''u'', ''v'')}}.  Because {{mvar|u}} and {{mvar|v}} are uniformly distributed and because only points within the unit circle have been admitted, the values of ''s'' will be uniformly distributed in the open interval {{open-open|0, 1}}, too.  The latter can be seen by calculating the cumulative distribution function for ''s'' in the interval {{open-open|0, 1}}.  This is the area of a circle with radius <math display="inline"> \sqrt{s}</math>, divided by <math>\pi</math>. From this we find the probability density function to have the constant value 1 on the interval {{open-open|0, 1}}. Equally so, the angle θ divided by <math> 2 \pi</math> is uniformly distributed in the interval {{closed-open|0, 1}} and independent of {{mvar|s}}.
Given {{mvar|u}} and {{mvar|v}}, independent and uniformly distributed in the closed interval {{closed-closed|−1, +1}}, set {{math|1=''s'' = ''R''<sup>2</sup> = ''u''<sup>2</sup> + ''v''<sup>2</sup>}}. If {{math|1=''s'' = 0}} or {{math|''s'' ≥ 1}}, discard ''u'' and ''v'', and try another pair {{math|(''u'', ''v'')}}.  Because {{mvar|u}} and {{mvar|v}} are uniformly distributed and because only points within the [[unit circle]] have been admitted, the values of ''s'' will be uniformly distributed in the open interval {{open-open|0, 1}}, too.  The latter can be seen by calculating the cumulative distribution function for ''s'' in the interval {{open-open|0, 1}}.  This is the [[area of a circle]] with radius <math display="inline"> \sqrt{s}</math>, divided by <math>\pi</math>. From this we find the [[probability density function]] to have the constant value 1 on the interval {{open-open|0, 1}}. Equally so, the angle θ divided by <math> 2 \pi</math> is uniformly distributed in the interval {{closed-open|0, 1}} and independent of {{mvar|s}}.


We now identify the value of ''s'' with that of ''U''<sub>1</sub> and <math> \theta/(2 \pi)</math> with that of ''U''<sub>2</sub> in the basic form.  As shown in the figure, the values of <math> \cos \theta = \cos 2 \pi U_2</math> and <math> \sin \theta = \sin 2 \pi U_2</math> in the basic form can be replaced with the ratios <math>\cos \theta = u/R = u/\sqrt{s}</math> and <math display="inline">\sin \theta = v/R = v/\sqrt{s}</math>, respectively.  The advantage is that calculating the trigonometric functions directly can be avoided.  This is helpful when trigonometric functions are more expensive to compute than the single division that replaces each one.
We now identify the value of ''s'' with that of ''U''<sub>1</sub> and <math> \theta/(2 \pi)</math> with that of ''U''<sub>2</sub> in the basic form.  As shown in the figure, the values of <math> \cos \theta = \cos 2 \pi U_2</math> and <math> \sin \theta = \sin 2 \pi U_2</math> in the basic form can be replaced with the ratios <math>\cos \theta = u/R = u/\sqrt{s}</math> and <math display="inline">\sin \theta = v/R = v/\sqrt{s}</math>, respectively.  The advantage is that calculating the trigonometric functions directly can be avoided.  This is helpful when trigonometric functions are more expensive to compute than the single division that replaces each one.

Latest revision as of 09:33, 24 August 2025

Template:Short description

File:Box-Muller transform visualisation.svg
Visualisation of the Box–Muller transform — the coloured points in the unit square (u1, u2), drawn as circles, are mapped to a 2D Gaussian (z0, z1), drawn as crosses. The plots at the margins are the probability distribution functions of z0 and z1. z0 and z1 are unbounded; they appear to be in Template:Closed-closed due to the choice of the illustrated points. In the SVG file, hover over a point to highlight it and its corresponding point.

The Box–Muller transform, by George Edward Pelham Box and Mervin Edgar Muller,[1] is a random number sampling method for generating pairs of independent, standard, normally distributed (zero expectation, unit variance) random numbers, given a source of uniformly distributed random numbers. The method was first mentioned explicitly by Raymond E. A. C. Paley and Norbert Wiener in their 1934 treatise on Fourier transforms in the complex domain.[2] Given the status of these latter authors and the widespread availability and use of their treatise, it is almost certain that Box and Muller were well aware of its contentsScript error: No such module "Unsubst"..

The Box–Muller transform is commonly expressed in two forms. The basic form as given by Box and Muller takes two samples from the uniform distribution on the interval Template:Open-open and maps them to two standard, normally distributed samples. The polar form takes two samples from a different interval, Template:Closed-closed, and maps them to two normally distributed samples without the use of sine or cosine functions.

The Box–Muller transform was developed as a more computationally efficient alternative to the inverse transform sampling method.[3] The ziggurat algorithm gives a more efficient method for scalar processors (e.g. old CPUs), while the Box–Muller transform is superior for processors with vector units (e.g. GPUs or modern CPUs).[4]

Basic form

Suppose Template:Math and Template:Math are independent samples chosen from the uniform distribution on the unit interval Template:Open-open. Let Z0=Rcos(Θ)=2lnU1cos(2πU2) and Z1=Rsin(Θ)=2lnU1sin(2πU2).

Then Z0 and Z1 are independent random variables with a standard normal distribution.

The derivation[5] is based on a property of a two-dimensional Cartesian system, where X and Y coordinates are described by two independent and normally distributed random variables, the random variables for Template:Math and Template:Mvar (shown above) in the corresponding polar coordinates are also independent and can be expressed as R2=2lnU1 and Θ=2πU2.

Because Template:Math is the square of the norm of the standard bivariate normal variable Template:Math, it has the chi-squared distribution with two degrees of freedom. In the special case of two degrees of freedom, the chi-squared distribution coincides with the exponential distribution, and the equation for Template:Math above is a simple way of generating the required exponential variate.

Polar form

Script error: No such module "Labelled list hatnote".

File:BoxMullerTransformUsingPolarCoordinates.png
Two uniformly distributed values, u and v are used to produce the value Template:Math, which is likewise uniformly distributed. The definitions of the sine and cosine are then applied to the basic form of the Box–Muller transform to avoid using trigonometric functions.

The polar form was first proposed by J. Bell[6] and then modified by R. Knop.[7] While several different versions of the polar method have been described, the version of R. Knop will be described here because it is the most widely used, in part due to its inclusion in Numerical Recipes. A slightly different form is described as "Algorithm P" by D. Knuth in The Art of Computer Programming.[8]

Given Template:Mvar and Template:Mvar, independent and uniformly distributed in the closed interval Template:Closed-closed, set Template:Math. If Template:Math or Template:Math, discard u and v, and try another pair Template:Math. Because Template:Mvar and Template:Mvar are uniformly distributed and because only points within the unit circle have been admitted, the values of s will be uniformly distributed in the open interval Template:Open-open, too. The latter can be seen by calculating the cumulative distribution function for s in the interval Template:Open-open. This is the area of a circle with radius s, divided by π. From this we find the probability density function to have the constant value 1 on the interval Template:Open-open. Equally so, the angle θ divided by 2π is uniformly distributed in the interval Template:Closed-open and independent of Template:Mvar.

We now identify the value of s with that of U1 and θ/(2π) with that of U2 in the basic form. As shown in the figure, the values of cosθ=cos2πU2 and sinθ=sin2πU2 in the basic form can be replaced with the ratios cosθ=u/R=u/s and sinθ=v/R=v/s, respectively. The advantage is that calculating the trigonometric functions directly can be avoided. This is helpful when trigonometric functions are more expensive to compute than the single division that replaces each one.

Just as the basic form produces two standard normal deviates, so does this alternate calculation. z0=2lnU1cos(2πU2)=2lns(us)=u2lnss and z1=2lnU1sin(2πU2)=2lns(vs)=v2lnss.

Contrasting the two forms

The polar method differs from the basic method in that it is a type of rejection sampling. It discards some generated random numbers, but can be faster than the basic method because it is simpler to compute (provided that the random number generator is relatively fast) and is more numerically robust.[9] Avoiding the use of expensive trigonometric functions improves speed over the basic form.[6] It discards Template:Math of the total input uniformly distributed random number pairs generated, i.e. discards Template:Math uniformly distributed random number pairs per Gaussian random number pair generated, requiring Template:Math input random numbers per output random number.

The basic form requires two multiplications, 1/2 logarithm, 1/2 square root, and one trigonometric function for each normal variate.[10] On some processors, the cosine and sine of the same argument can be calculated in parallel using a single instruction. Notably for Intel-based machines, one can use the fsincos assembler instruction or the expi instruction (usually available from C as an intrinsic function), to calculate complex exp(iz)=eiz=cosz+isinz, and just separate the real and imaginary parts.

Note: To explicitly calculate the complex-polar form use the following substitutions in the general form,

Let r=ln(u1) and z=2πu2. Then reiz=ln(u1)ei2πu2=2ln(u1)[cos(2πu2)+isin(2πu2)].

The polar form requires 3/2 multiplications, 1/2 logarithm, 1/2 square root, and 1/2 division for each normal variate. The effect is to replace one multiplication and one trigonometric function with a single division and a conditional loop.

Tails truncation

When a computer is used to produce a uniform random variable it will inevitably have some inaccuracies because there is a lower bound on how close numbers can be to 0. If the generator uses 32 bits per output value, the smallest non-zero number that can be generated is 232. When U1 and U2 are equal to this the Box–Muller transform produces a normal random deviate equal to δ=2ln(232)cos(2π232)6.660. This means that the algorithm will not produce random variables more than 6.660 standard deviations from the mean. This corresponds to a proportion of 2(1Φ(δ))2.738×1011 lost due to the truncation, where Φ(δ) is the standard cumulative normal distribution. With 64 bits the limit is pushed to δ=9.419 standard deviations, for which 2(1Φ(δ))<5×1021.

Implementation

C++

The standard Box–Muller transform generates values from the standard normal distribution (i.e. standard normal deviates) with mean 0 and standard deviation 1. The implementation below in standard C++ generates values from any normal distribution with mean μ and variance σ2. If Z is a standard normal deviate, then X=Zσ+μ will have a normal distribution with mean μ and standard deviation σ. The random number generator has been seeded to ensure that new, pseudo-random values will be returned from sequential calls to the generateGaussianNoise function.

#include <cmath>
#include <limits>
#include <random>
#include <utility>

//"mu" is the mean of the distribution, and "sigma" is the standard deviation.
std::pair<double, double> generateGaussianNoise(double mu, double sigma)
{
    constexpr double two_pi = 2.0 * M_PI;

    //initialize the random uniform number generator (runif) in a range 0 to 1
    static std::mt19937 rng(std::random_device{}()); // Standard mersenne_twister_engine seeded with rd()
    static std::uniform_real_distribution<> runif(0.0, 1.0);

    //create two random numbers, make sure u1 is greater than zero
    double u1, u2;
    do
    {
        u1 = runif(rng);
    }
    while (u1 == 0);
    u2 = runif(rng);

    //compute z0 and z1
    auto mag = sigma * sqrt(-2.0 * log(u1));
    auto z0  = mag * cos(two_pi * u2) + mu;
    auto z1  = mag * sin(two_pi * u2) + mu;

    return std::make_pair(z0, z1);
}

JavaScript

/* Syntax:
 *
 *   [ x, y ]  =   rand_normal();
 *     x       =   rand_normal()[0];
 *     y       =   rand_normal()[1];
 */
function rand_normal() {
    let theta  = 2 * Math.PI * Math.random();
    let R   = Math.sqrt(-2 * Math.log(Math.random()));
    let x   = R * Math.cos(theta);
    let y   = R * Math.sin(theta);

    return [ x, y ];
}

Julia

"""
    boxmullersample(N)

Generate `2N` samples from the standard normal distribution using the Box-Muller method.
"""
function boxmullersample(N)
    z = Array{Float64}(undef,N,2);
    for i in axes(z,1)
        z[i,:] .= sincospi(2 * rand());
        z[i,:] .*= sqrt(-2 * log(rand()));
    end
    vec(z)
end

"""
    boxmullersample(n,μ,σ)

Generate `n` samples from the normal distribution with mean `μ` and standard deviation `σ` using the Box-Muller method.
"""
function boxmullersample(n,μ,σ)
    μ .+ σ*boxmullersample(cld(n,2))[1:n];
end

See also

References

  1. Script error: No such module "Citation/CS1".
  2. Raymond E. A. C. Paley and Norbert Wiener Fourier Transforms in the Complex Domain, New York: American Mathematical Society (1934) §37.
  3. Kloeden and Platen, Numerical Solutions of Stochastic Differential Equations, pp. 11–12
  4. Script error: No such module "citation/CS1".
  5. Sheldon Ross, A First Course in Probability, (2002), pp. 279–281
  6. a b Script error: No such module "Citation/CS1".
  7. Script error: No such module "Citation/CS1".
  8. Script error: No such module "citation/CS1".
  9. Everett F. Carter, Jr., The Generation and Application of Random Numbers, Forth Dimensions (1994), Vol. 16, No. 1 & 2.
  10. The evaluation of 2Template:PiU1 is counted as one multiplication because the value of 2Template:Pi can be computed in advance and used repeatedly.

External links