Tukey's range test


Tukey's range test, also known as Tukey's test, Tukey method, Tukey's honest significance test, or Tukey's HSD (honestly significant difference) test,^[1] is a single-step multiple comparison procedure and statistical test. It can be used to correctly interpret the statistical significance of the difference between means that have been selected for comparison because of their extreme values.

The method was initially developed and introduced by John Tukey for use in analysis of variance (ANOVA), and has usually been taught only in connection with ANOVA. However, the studentized range distribution used to determine the level of significance of the differences considered in Tukey's test has much broader application. It is useful for researchers who have searched their collected data for remarkable differences between groups but then cannot validly determine how significant the discovered stand-out difference is using the standard distributions of other conventional statistical tests, which require the data to have been selected at random. Stand-out data are by definition not selected at random but chosen because they are extreme, so they need a different, stricter interpretation, which is provided by the likely frequency and size of the studentized range. The modern practice of "data mining" is an example where it is used.

Development

The test is named after John Tukey.^[2] It compares all possible pairs of means and is based on a studentized range distribution ($q$). This distribution is similar to the distribution of $t$ from the t-test (see below).^[3]

Tukey's test compares the means of every treatment to the means of every other treatment; that is, it applies simultaneously to the set of all pairwise comparisons

μ_{i} - μ_{j},

and identifies any difference between two means that is greater than the expected standard error. The confidence coefficient for the set, when all sample sizes are equal, is exactly $1 - α$ for any $α : 0 \leq α \leq 1 .$ For unequal sample sizes, the confidence coefficient is greater than $1 - α .$ In other words, the Tukey method is conservative when there are unequal sample sizes.

This test is often followed by the Compact Letter Display (CLD) statistical procedure to render the output of this test more transparent to non-statistician audiences.

Assumptions

The observations being tested are independent within and among the groups.
The subgroups associated with each mean in the test are normally distributed.
There is equal within-subgroup variance across the subgroups associated with each mean in the test (homogeneity of variance).

The test statistic

Tukey's test is based on a formula very similar to that of Student's t-test. In fact, Tukey's test is essentially a t-test, except that it corrects for the family-wise error rate.

The formula for Tukey's test is

q_{s} = \frac{| Y_{A} - Y_{B} |}{S E},

where $Y_A$ and $Y_B$ are the two means being compared, and $SE$ is the standard error for the sum of the means. The value $q_s$ is the sample's test statistic. (The notation $|x|$ means the absolute value of $x$: the magnitude of $x$ with the sign set to $+$, regardless of the original sign of $x$.)

This $q_s$ test statistic can then be compared to a $q$ value for the chosen significance level $α$ from a table of the studentized range distribution. If the $q_s$ value is larger than the critical value $q_α$ obtained from the distribution, the two means are said to be significantly different at level $α : 0 \leq α \leq 1 .$ ^[3]

Since the null hypothesis for Tukey's test states that all means being compared are from the same population (i.e. $μ_1 = μ_2 = μ_3 = \dots = μ_k$), the means should be normally distributed (according to the central limit theorem) with the same model standard deviation $σ$, estimated by the merged standard error, $SE$, for all the samples; its calculation is discussed in the following sections. This gives rise to the normality assumption of Tukey's test.
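As an illustration of the test statistic, the following Python sketch computes $q_s$ for every pair of groups. It is a minimal example under stated assumptions, not a reference implementation: the function name `tukey_q_statistics` and the sample data are hypothetical, all groups are assumed to have the same size $n$, and $SE$ is taken to be $\sqrt{MS_{within} / n}$, the scaling that agrees with the equal-sample-size confidence limits given later in this article (where $\frac{q}{\sqrt{2}} {\hat{σ}}_{ε} \sqrt{2 / n} = q \, {\hat{σ}}_{ε} / \sqrt{n}$).

```python
from itertools import combinations
from math import sqrt
from statistics import fmean


def tukey_q_statistics(groups):
    """Return ({(a, b): q_s}, df) for equal-sized groups.

    groups : dict mapping a group name to a list of observations.
    Assumption of this sketch: SE = sqrt(MS_within / n).
    """
    sizes = {len(obs) for obs in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal group sizes")
    n = sizes.pop()
    k = len(groups)
    df = n * k - k  # N - k degrees of freedom

    means = {name: fmean(obs) for name, obs in groups.items()}
    # Pooled within-group sum of squares over all groups.
    ss_within = sum(
        (x - means[name]) ** 2 for name, obs in groups.items() for x in obs
    )
    se = sqrt((ss_within / df) / n)

    q_values = {
        (a, b): abs(means[a] - means[b]) / se
        for a, b in combinations(groups, 2)
    }
    return q_values, df


# Hypothetical data: three treatments, five observations each.
data = {
    "A": [4, 5, 6, 5, 5],
    "B": [7, 8, 9, 8, 8],
    "C": [5, 6, 7, 6, 6],
}
q_values, df = tukey_q_statistics(data)
for pair, q_s in q_values.items():
    print(pair, round(q_s, 3))
```

Each $q_s$ is then compared with the critical value for the chosen $α$, $k = 3$ and $df = 12$; the tabulated value for $α = 0.05$ is approximately 3.77. With these data the A and C means ($q_s ≈ 3.16$) are not declared different, while the other two pairs are.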

The studentized range (q) distribution

The Tukey method uses the studentized range distribution. Suppose that we take a sample of size $n$ from each of $k$ populations with the same normal distribution $N(μ, σ^2)$, and suppose that ${\bar{y}}_{min}$ is the smallest of these sample means, ${\bar{y}}_{max}$ is the largest of these sample means, and $S^2$ is the pooled sample variance from these samples. Then the following random variable has a studentized range distribution:

q \equiv \frac{{\overline{y}}_{max} - {\overline{y}}_{min}}{S \sqrt{2 / n}}

This definition of the statistic $q$ is the basis of the critical value $q_α$ discussed below, which depends on these three factors:

α

the Type I error rate, or the probability of rejecting a true null hypothesis;

k

the number of sub-populations being compared;

df

the number of degrees of freedom for each mean ($df = N - k$, where $N$ is the total number of observations).

The distribution of $q$ has been tabulated and appears in many textbooks on statistics. In some tables the distribution of $q$ has been tabulated without the $\sqrt{2}$ factor. To understand which table it is, we can compute the result for $k = 2$ and compare it to the result of Student's t-distribution with the same degrees of freedom and the same $α$. In addition, R offers a cumulative distribution function (ptukey) and a quantile function (qtukey) for $q$.
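The $k = 2$ check can be written out explicitly. The following worked sketch is an illustration rather than part of the original derivation; $t$ denotes the usual pooled two-sample statistic, and $q'$ is a label introduced here for the variant of the statistic scaled by $S / \sqrt{n}$ instead of $S \sqrt{2 / n}$.

```latex
% For k = 2 the range of the sample means is |\bar{y}_1 - \bar{y}_2|.
t = \frac{\bar{y}_1 - \bar{y}_2}{S \sqrt{2 / n}}, \qquad
q = \frac{\bar{y}_{max} - \bar{y}_{min}}{S \sqrt{2 / n}} = |t|, \qquad
q' = \frac{\bar{y}_{max} - \bar{y}_{min}}{S / \sqrt{n}} = \sqrt{2} \, |t| .
```

A table whose $k = 2$ entry equals the two-sided critical value of $t$ therefore follows the first scaling, and a table whose entry is $\sqrt{2}$ times larger follows the second. For example, with $α = 0.05$ and $df = 10$ the two-sided critical value of $t$ is about 2.228, and $\sqrt{2} \times 2.228 \approx 3.151$.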

Confidence limits

The Tukey confidence limits for all pairwise comparisons with confidence coefficient of at least $1 - α$ are

{\bar{y}}_{i ∙} - {\bar{y}}_{j ∙} \pm \frac{q_{α; k; N - k}}{\sqrt{2}} {\hat{σ}}_{ε} \sqrt{\frac{2}{n}} , \qquad i, j = 1, \dots, k , \quad i \neq j .

Notice that the point estimator and the estimated variance are the same as those for a single pairwise comparison. The only difference between the confidence limits for simultaneous comparisons and those for a single comparison is the multiple of the estimated standard deviation.

Also note that, strictly, the studentized range approach requires equal sample sizes. ${\hat{σ}}_{ε}$ is the standard deviation of the entire design, not just that of the two groups being compared. It is nevertheless possible to work with unequal sample sizes: in this case, one has to calculate the estimated standard deviation for each pairwise comparison, as formalized by Clyde Kramer in 1956. The procedure for unequal sample sizes is therefore sometimes referred to as the Tukey–Kramer method, which is as follows:

{\bar{y}}_{i ∙} - {\bar{y}}_{j ∙} \pm \frac{q_{α; k; N - k}}{\sqrt{2}} {\hat{σ}}_{ε} \sqrt{\frac{1}{n_{i}} + \frac{1}{n_{j}}}

where $n_i$ and $n_j$ are the sizes of groups $i$ and $j$ respectively. The degrees of freedom for the whole design are also applied.
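The Tukey–Kramer limits above can be computed directly once a critical value is available. The following Python sketch is a minimal illustration under stated assumptions: the function name `tukey_kramer_intervals` and the sample data are hypothetical, and the critical value $q_{α; k; N - k}$ is not computed here but must be supplied by the caller, for example from a table or from R's qtukey.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean


def tukey_kramer_intervals(groups, q_crit):
    """Return {(a, b): (lower, upper)} for every pair of groups.

    groups : dict mapping a group name to a list of observations
    q_crit : critical value q_{alpha; k; N - k}, supplied by the caller
    """
    k = len(groups)
    n_total = sum(len(obs) for obs in groups.values())
    df = n_total - k  # degrees of freedom of the whole design

    means = {name: fmean(obs) for name, obs in groups.items()}
    ss_within = sum(
        (x - means[name]) ** 2 for name, obs in groups.items() for x in obs
    )
    sigma_hat = sqrt(ss_within / df)  # pooled standard deviation

    intervals = {}
    for a, b in combinations(groups, 2):
        diff = means[a] - means[b]
        half_width = (q_crit / sqrt(2)) * sigma_hat * sqrt(
            1 / len(groups[a]) + 1 / len(groups[b])
        )
        intervals[(a, b)] = (diff - half_width, diff + half_width)
    return intervals


# Hypothetical unbalanced data: N = 12, k = 3, so df = 9.
samples = {"A": [4, 5, 6], "B": [7, 8, 9, 8, 8], "C": [5, 6, 7, 6]}
# 3.95 is the approximate tabulated critical value for alpha = 0.05, k = 3, df = 9.
for pair, (lower, upper) in tukey_kramer_intervals(samples, q_crit=3.95).items():
    verdict = "differ" if lower > 0 or upper < 0 else "not separated"
    print(pair, round(lower, 3), round(upper, 3), verdict)
```

A pair of means is declared different when its interval excludes zero. Passing the critical value in as a parameter keeps the sketch free of any numerical routine for the studentized range distribution.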

Comparing ANOVA and Tukey–Kramer tests

Both ANOVA and Tukey–Kramer tests are based on the same assumptions. However, these two tests for $k$ groups (i.e. $μ_1 = μ_2 = \dots = μ_k$) may result in logical contradictions when $k > 2$, even if the assumptions do hold.

It is possible to generate a set of pseudorandom samples of strictly positive measure such that hypothesis $μ_1 = μ_2$ is rejected at significance level $1 - α > 0.95$ while $μ_1 = μ_2 = μ_3$ is not rejected even at $1 - α = 0.975 .$ ^[4]

References

[Vassar-1] Also occasionally described as "honestly".
[2]
[Calgary-3]
[GurvichNaumova-4]
