Range (statistics): Difference between revisions
Latest revision as of 07:19, 17 November 2025
In descriptive statistics, the range of a set of data is the size of the narrowest interval that contains all the data. It is calculated as the difference between the largest and smallest values (also known as the sample maximum and minimum).[1] It is expressed in the same units as the data.
The range provides an indication of statistical dispersion. Closely related alternative measures are the interdecile range and the interquartile range.
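As a minimal sketch of the definition, the range of a finite data set can be computed as the maximum minus the minimum. The function name `sample_range` and the data values below are made up for illustration.

```python
def sample_range(data):
    """Return the largest value minus the smallest value of a data set."""
    values = list(data)
    if not values:
        raise ValueError("the range is undefined for an empty data set")
    return max(values) - min(values)


# Hypothetical measurements in centimetres; the result is also in centimetres.
heights_cm = [162.0, 175.5, 158.2, 181.0, 169.3]
print(sample_range(heights_cm))
```

A data set with a single value has range 0, since its maximum and minimum coincide.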
Range of continuous IID random variables
For n independent and identically distributed continuous random variables X1, X2, ..., Xn with cumulative distribution function G(x) and probability density function g(x), let T denote their range, that is, T = max(X1, X2, ..., Xn) − min(X1, X2, ..., Xn).
Distribution
The range, T, has the cumulative distribution function[2][3]
::<math>F(t)= n \int_{-\infty}^\infty g(x)[G(x+t)-G(x)]^{n-1} \, \text{d}x.</math>
Gumbel notes that the "beauty of this formula is completely marred by the facts that, in general, we cannot express G(x + t) by G(x), and that the numerical integration is lengthy and tiresome."[2] (p. 385)
If the distribution of each Xi is limited to the right (or left) then the asymptotic distribution of the range is equal to the asymptotic distribution of the largest (smallest) value. For more general distributions the asymptotic distribution can be expressed as a Bessel function.[2]
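The integral F(t) = n ∫ g(x) [G(x + t) − G(x)]^(n−1) dx is straightforward to approximate on a modern computer. The following is a rough sketch using composite Simpson's rule. The function names, the truncation of the integration domain to a finite interval [lo, hi], and the step count are assumptions made for this illustration and are not taken from the cited sources.

```python
import math


def range_cdf(t, n, pdf, cdf, lo, hi, steps=4000):
    """Approximate F(t) = n * int g(x) [G(x+t) - G(x)]^(n-1) dx on [lo, hi]."""
    if t <= 0.0:
        return 0.0
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (hi - lo) / steps

    def integrand(x):
        return pdf(x) * (cdf(x + t) - cdf(x)) ** (n - 1)

    total = integrand(lo) + integrand(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lo + k * h)
    return n * total * h / 3.0


def normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def uniform_pdf(x):
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def uniform_cdf(x):
    return min(max(x, 0.0), 1.0)


# Standard normal, n = 2: the tails beyond +/-10 are negligible.
print(range_cdf(1.0, 2, normal_pdf, normal_cdf, -10.0, 10.0))
# Uniform on [0, 1], n = 3: the density vanishes outside [0, 1].
print(range_cdf(0.5, 3, uniform_pdf, uniform_cdf, 0.0, 1.0))
```

Two closed forms give a sanity check. For two standard normal variables the range is |X1 − X2| with X1 − X2 normal with variance 2, so F(t) = erf(t/2). For the uniform distribution on [0, 1], F(t) = n t^(n−1) − (n−1) t^n.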
Moments
The mean range is given by[4]
::<math>n \int_0^1 x(G)[G^{n-1}-(1-G)^{n-1}] \,\text{d}G</math>
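where x(G) is the inverse function of G. In the case where each of the Xi has a standard normal distribution, the mean range is given by[5]
::<math>\int_{-\infty}^\infty \left(1-(1-\Phi(x))^n-\Phi(x)^n\right) \,\text{d}x,</math>
where Φ is the standard normal cumulative distribution function.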
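Both expressions for the mean range can be evaluated numerically and compared. The sketch below does this for the standard normal case. It uses the quantile form n ∫ x(G) [G^(n−1) − (1−G)^(n−1)] dG and the equivalent form ∫ (1 − Φ(x)^n − (1 − Φ(x))^n) dx, which follows from the first by integration by parts. The function names, integration limits and step counts are assumptions chosen for the illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def mean_range_normal(n, lo=-10.0, hi=10.0, steps=4000):
    """Simpson's rule for the integral of 1 - Phi(x)^n - (1 - Phi(x))^n."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = (hi - lo) / steps

    def integrand(x):
        p = STD_NORMAL.cdf(x)
        return 1.0 - p ** n - (1.0 - p) ** n

    total = integrand(lo) + integrand(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lo + k * h)
    return total * h / 3.0


def mean_range_quantile(n, steps=100000):
    """Midpoint rule for n * int_0^1 x(G) [G^(n-1) - (1-G)^(n-1)] dG."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * h  # midpoints avoid the endpoints 0 and 1
        total += STD_NORMAL.inv_cdf(p) * (p ** (n - 1) - (1.0 - p) ** (n - 1))
    return n * total * h


for n in (2, 3, 5):
    print(n, mean_range_normal(n), mean_range_quantile(n))
```

For n = 2 the exact value is 2/√π, and for n = 3 it is 3/√π, which gives a check on both routines. The midpoint rule converges slowly here because the quantile function is unbounded near 0 and 1, so the two results agree only to a few decimal places.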
Derivation of the distribution
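The following derivation is informal and treats the probabilities of infinitesimal events loosely.
Let <math>m</math> and <math>M</math> denote respectively the minimum and maximum of the random variables <math>X_1, \dots, X_n</math>.
The event that the range is smaller than <math>t</math> can be decomposed into smaller events according to:
- the index <math>i</math> of the minimum value,
- and the value <math>x</math> of the minimum.
For a given index <math>i</math> and minimum value <math>x</math>, the probability of the joint event:
- <math>X_i</math> is the minimum,
- and <math>X_i \in [x, x + \text{d}x]</math>,
- and the range is smaller than <math>t</math>,
is:
::<math>g(x)[G(x+t)-G(x)]^{n-1} \, \text{d}x,</math>
because <math>X_i</math> falls in <math>[x, x + \text{d}x]</math> with probability <math>g(x)\,\text{d}x</math>, and each of the other <math>n-1</math> variables must independently fall between <math>x</math> and <math>x+t</math>, which happens with probability <math>G(x+t)-G(x)</math>.
Summing over the <math>n</math> indices and integrating over <math>x</math> yields the total probability of the event "the range is smaller than <math>t</math>", which is exactly the cumulative distribution function of the range:
::<math>F(t)= n \int_{-\infty}^\infty g(x)[G(x+t)-G(x)]^{n-1} \, \text{d}x,</math>
which concludes the proof.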
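As a worked illustration of the result F(t) = n ∫ g(x) [G(x + t) − G(x)]^(n−1) dx, consider the special case, assumed here only for the example, where each variable is uniform on [0, 1]. The integral then splits at x = 1 − t, where x + t reaches the upper end of the support:

```latex
% Assumption for this example: X_i uniform on [0, 1], so g(x) = 1 and G(x) = x there.
\begin{align*}
G(x+t) - G(x) &=
  \begin{cases}
    t,     & 0 \le x \le 1 - t, \\
    1 - x, & 1 - t < x \le 1,
  \end{cases}
  \qquad 0 \le t \le 1, \\
F(t) &= n \int_0^{1-t} t^{\,n-1} \, \mathrm{d}x
      + n \int_{1-t}^{1} (1-x)^{n-1} \, \mathrm{d}x \\
     &= n (1-t)\, t^{\,n-1} + t^{\,n} \\
     &= n\, t^{\,n-1} - (n-1)\, t^{\,n}.
\end{align*}
```

The result satisfies F(0) = 0 and F(1) = 1, as a cumulative distribution function on [0, 1] should.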
The range in other models
Outside of the IID case with continuous random variables, other cases have explicit formulas. These cases are of marginal interest.
- non-IID continuous random variables.[3]
- Discrete variables supported on <math>\mathbb{N}</math>.[6][7] A key difficulty for discrete variables is that the range is itself discrete, so deriving the formula requires combinatorics.
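For the discrete IID case, one way to see the combinatorial flavour is to condition on the exact value x of the minimum: all n draws must land in [x, x + t], but not all of them in (x, x + t]. The sketch below implements that counting argument with exact fractions and checks it against brute-force enumeration. It is an illustration under these assumptions and is not the formula from the cited references; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def discrete_range_cdf(pmf, n, t):
    """P(range <= t) for n IID draws; pmf maps integer values to probabilities."""
    total = Fraction(0)
    for x in pmf:
        # Probability that one draw lands in [x, x + t], and in (x, x + t].
        closed = sum(p for v, p in pmf.items() if x <= v <= x + t)
        open_left = closed - pmf[x]
        # All n draws in [x, x + t], minus the cases where none equals x.
        total += closed ** n - open_left ** n
    return total


def brute_force_range_cdf(pmf, n, t):
    """Same probability by enumerating every outcome (only for tiny cases)."""
    total = Fraction(0)
    for outcome in product(pmf, repeat=n):
        if max(outcome) - min(outcome) <= t:
            prob = Fraction(1)
            for v in outcome:
                prob *= pmf[v]
            total += prob
    return total


# A fair six-sided die as the example distribution.
die = {face: Fraction(1, 6) for face in range(1, 7)}
print(discrete_range_cdf(die, 2, 1))
print(brute_force_range_cdf(die, 2, 1))
```

The enumeration grows as the support size to the power n, so it serves only as a check for small cases, whereas the sum over minimum values stays cheap.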
Related quantities
The range is a function of the order statistics: it is the difference between the largest and the smallest of them. Because it is a linear function of order statistics, it falls within the scope of L-estimation.
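To make the link to L-estimation concrete, the range can be written as a weighted sum of the sorted sample with weight −1 on the smallest value, +1 on the largest, and 0 elsewhere. The helper names below are hypothetical and the snippet is only a sketch of that idea.

```python
def l_estimate(sample, weights):
    """Linear combination of order statistics: sum of w_k * x_(k)."""
    ordered = sorted(sample)
    if len(ordered) != len(weights):
        raise ValueError("need exactly one weight per observation")
    return sum(w * x for w, x in zip(weights, ordered))


def range_weights(n):
    """Weights that turn l_estimate into the sample range."""
    if n < 1:
        raise ValueError("need at least one observation")
    if n == 1:
        return [0.0]  # a single observation has range 0
    return [-1.0] + [0.0] * (n - 2) + [1.0]


data = [4, 9, 1, 7]
print(l_estimate(data, range_weights(len(data))))
```

Other L-estimators, such as the median or a trimmed mean, arise from the same function with a different weight vector.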
References
- ↑ Script error: No such module "citation/CS1".
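- ↑ a b Gumbel, E. J. (1947). "The Distribution of the Range". The Annals of Mathematical Statistics. 18 (3): 384–412. doi:10.1214/aoms/1177730387. JSTOR 2235736.
- ↑ a b Tsimashenka, I.; Knottenbelt, W.; Harrison, P. (2012). "Controlling Variability in Split-Merge Systems". Analytical and Stochastic Modeling Techniques and Applications. Lecture Notes in Computer Science. Vol. 7314. p. 165. doi:10.1007/978-3-642-30782-9_12. ISBN 978-3-642-30781-2. https://www.doc.ic.ac.uk/~wjk/publications/tsimashenka-knottenbelt-harrison-asmta-2012.pdf
- ↑ Hartley, H. O.; David, H. A. (1954). "Universal Bounds for Mean Range and Extreme Observation". The Annals of Mathematical Statistics. 25 (1): 85–99. doi:10.1214/aoms/1177728848. JSTOR 2236514.
- ↑ Tippett, L. H. C. (1925). "On the Extreme Individuals and the Range of Samples Taken from a Normal Population". Biometrika. 17 (3/4): 364–387. doi:10.1093/biomet/17.3-4.364. JSTOR 2332087.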
- ↑ Script error: No such module "Citation/CS1".
- ↑ Script error: No such module "Citation/CS1".