Concentration of measure

In mathematics, '''concentration of measure''' (about a [[median]]) is a principle that is applied in [[measure theory]], [[probability]] and [[combinatorics]], and has consequences for other fields such as [[Banach space]] theory. Informally, it states that "A random variable that depends in a Lipschitz way on many independent variables (but not too much on any of them) is essentially constant".<ref>{{cite journal | last=Talagrand | first=Michel | author-link=Michel Talagrand | title=A new look at independence | journal=Annals of Probability | volume=24 | issue=1 | year=1996 | pages=1–34 | doi=10.1214/aop/1042644705 | doi-access=free}}</ref>


The concentration of measure phenomenon was put forth in the early 1970s by [[Vitali Milman]] in his works on the local theory of [[Banach space]]s, extending an idea going back to the work of [[Paul Lévy (mathematician)|Paul Lévy]].<ref>"''The concentration of <math>f_\ast(\mu)</math>, ubiquitous in the probability theory and statistical mechanics, was brought to geometry (starting from Banach spaces) by Vitali Milman, following the earlier work by Paul Lévy''" &ndash; [[Mikhail Gromov (mathematician)|M. Gromov]], Spaces and questions, GAFA 2000 (Tel Aviv, 1999), Geom. Funct. Anal. 2000, Special Volume, Part I, 118&ndash;161.</ref><ref>"''The idea of concentration of measure (which was discovered by V.Milman) is arguably one of the great ideas of analysis in our times. While its impact on Probability is only a small part of the whole picture, this impact should not be ignored.''" &ndash; [[Michel Talagrand|M. Talagrand]], A new look at independence, Ann. Probab. 24 (1996), no. 1, 1&ndash;34.</ref> It was further developed in the works of Milman and [[Mikhail Gromov (mathematician)|Gromov]], [[Bernard Maurey|Maurey]], [[Gilles Pisier|Pisier]], [[Gideon Schechtman|Schechtman]], [[Michel Talagrand|Talagrand]], [[Michel Ledoux|Ledoux]], and others.


==The general setting==


Let <math>(X, d)</math> be a [[metric space]] with a [[Measure (mathematics)|measure]] <math>\mu</math> on the [[Borel set]]s with <math>\mu(X) = 1</math>.
Let
:<math>\alpha(\varepsilon) = \sup \left\{\mu( X \smallsetminus A_\varepsilon) \mid A \text{ is a Borel set and } \mu(A) \geq 1/2 \right\},</math>
where
:<math>A_\varepsilon = \left\{ x \mid d(x, A) < \varepsilon \right\} </math>
is the <math>\varepsilon</math>-''extension'' (also called <math>\varepsilon</math>-fattening in the context of [[Hausdorff_distance#Definition|the Hausdorff distance]]) of a set <math>A</math>.


The function <math>\alpha(\cdot)</math> is called the ''concentration rate'' of the space <math>X</math>. The following equivalent definition has many applications:
:<math>\alpha(\varepsilon) = \sup \left\{ \mu( \{ F \geq M + \varepsilon \}) \right\},</math>
where the supremum is over all 1-Lipschitz functions <math>F: X \to \mathbb{R}</math>, and
the median (or Lévy mean) <math> M = \operatorname{Med} F </math> is defined by the inequalities
:<math>\mu \{ F \geq M \} \geq 1/2, \, \mu \{ F \leq M \} \geq 1/2.</math>


Informally, the space <math>X</math> exhibits a concentration phenomenon if
<math>\alpha(\varepsilon)</math> decays very fast as <math>\varepsilon</math> grows. More formally,
a family of metric measure spaces <math>(X_n, d_n, \mu_n)</math> is called a ''Lévy family'' if
the corresponding concentration rates <math>\alpha_n</math> satisfy
:<math>\forall \varepsilon > 0 \,\, \alpha_n(\varepsilon) \to 0 \text{ as } n\to \infty,</math>
and a ''normal Lévy family'' if
:<math>\forall \varepsilon > 0 \,\, \alpha_n(\varepsilon) \leq C \exp(-c n \varepsilon^2)</math>
for some constants <math>c,C>0</math>. For examples see below.
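
A minimal numerical sketch of the equivalent definition above, using one concrete space as an assumed example: the Hamming cube <math>\{0,1\}^n</math> with the normalized Hamming distance and the uniform measure. The coordinate average <math>F(x) = \tfrac{1}{n}\sum_i x_i</math> is 1-Lipschitz for this metric, <math>M = 1/2</math> is a median, and Hoeffding's inequality bounds the tail <math>\mu\{F \geq M + \varepsilon\}</math> by <math>\exp(-2n\varepsilon^2)</math>. The sample sizes below are arbitrary.

<syntaxhighlight lang="python">
import numpy as np

def tail_beyond_median(n, eps, samples=50_000, seed=0):
    """Estimate mu{F >= M + eps} on the Hamming cube {0,1}^n with uniform measure,
    where F(x) = mean(x) is 1-Lipschitz for the normalized Hamming distance."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(samples, n), dtype=np.uint8)  # uniform random points of the cube
    f = x.mean(axis=1)
    med = np.median(f)                                         # empirical median (Lévy mean) M
    return np.mean(f >= med + eps)

eps = 0.1
for n in (10, 100, 1000):
    print(f"n={n:5d}  empirical tail={tail_beyond_median(n, eps):.5f}  "
          f"Hoeffding bound={np.exp(-2 * n * eps**2):.5f}")
</syntaxhighlight>

The empirical tail for this particular <math>F</math> collapses as <math>n</math> grows, matching the <math>C\exp(-cn\varepsilon^2)</math> shape; the full normal Lévy family statement for the cubes requires a bound uniform over all 1-Lipschitz functions (equivalently, over all Borel sets <math>A</math>), which this sketch does not check.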


==Concentration on the sphere==


The first example goes back to [[Paul Lévy (mathematician)|Paul Lévy]]. According to the [[spherical isoperimetric inequality]], among all subsets <math>A</math> of the sphere <math>S^n</math> with prescribed [[spherical measure]] <math>\sigma_n(A)</math>, the spherical cap
:<math> \left\{ x \in S^n \mid \operatorname{dist}(x, x_0) \leq R \right\}, </math>
for suitable <math>R</math>, has the smallest <math>\varepsilon</math>-extension <math>A_\varepsilon</math> (for any <math>\varepsilon > 0</math>).


Applying this to sets of measure <math>\sigma_n(A) = 1/2</math> (where
<math>\sigma_n(S^n) = 1</math>), one can deduce the following [[concentration inequality]]:
:<math>\sigma_n(A_\varepsilon) \geq 1 - C \exp(- c n \varepsilon^2), </math>
where <math>C,c</math> are universal constants. Therefore, <math>(S^n)_n</math> meets the definition above of a normal Lévy family.

Vitali Milman applied this fact to several problems in the local theory of Banach spaces, in particular, to give a new proof of [[Dvoretzky's theorem]].
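
A small Monte Carlo check of this inequality, with illustrative constants <math>C = 1</math> and <math>c = 1/2</math> (chosen for the comparison, not the sharp values): take <math>A</math> to be the hemisphere <math>\{x \in S^n : x_1 \leq 0\}</math>, whose geodesic <math>\varepsilon</math>-extension is <math>\{x : x_1 < \sin\varepsilon\}</math>, so that <math>1 - \sigma_n(A_\varepsilon)</math> is the probability that the first coordinate of a uniform random point exceeds <math>\sin\varepsilon</math>.

<syntaxhighlight lang="python">
import numpy as np

def cap_complement(n, eps, samples=20_000, seed=0):
    """Estimate 1 - sigma_n(A_eps) for the hemisphere A = {x in S^n : x_1 <= 0},
    whose geodesic eps-extension is {x : x_1 < sin(eps)}."""
    rng = np.random.default_rng(seed)
    g = rng.standard_normal((samples, n + 1))   # Gaussian vectors in R^{n+1}
    x1 = g[:, 0] / np.linalg.norm(g, axis=1)    # first coordinate of a uniform point on S^n
    return np.mean(x1 >= np.sin(eps))

eps = 0.3
for n in (10, 100, 1000):
    print(f"n={n:5d}  1 - sigma_n(A_eps) ~ {cap_complement(n, eps):.5f}  "
          f"exp(-n*eps^2/2) = {np.exp(-n * eps**2 / 2):.2e}")
</syntaxhighlight>

The estimated complement stays below the illustrative bound and shrinks rapidly with <math>n</math>, in line with the normal Lévy family behaviour of the spheres.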


==Concentration of measure in physics==


All classical statistical physics is based on the concentration of measure phenomenon:
The fundamental idea (‘theorem’) about the equivalence of ensembles in the thermodynamic limit ([[Josiah Willard Gibbs|Gibbs]], 1902<ref>{{cite book |last= Gibbs |first= Josiah Willard |date=1902 |title= Elementary Principles in Statistical Mechanics |url= https://www-liphy.ujf-grenoble.fr/pagesperso/bahram/Phys_Stat/Biblio/gibbs_1902.pdf |location= New York, NY |publisher= Charles Scribner's Sons |page= <!-- or pages= --> }}</ref> and [[Albert Einstein|Einstein]], 1902–1904<ref>{{cite journal | last = Einstein | first = Albert | title = Kinetische Theorie des Wärmegleichgewichtes und des zweiten Hauptsatzes der Thermodynamik [Kinetic Theory of Thermal Equilibrium and of the Second Law of Thermodynamics]| journal = Annalen der Physik |series=Series 4| volume = 9 | pages = 417–433| date = 1902| url = http://myweb.rz.uni-augsburg.de/~eckern/adp/history/einstein-papers/1902_9_417-433.pdf| doi = 10.1002/andp.19023141007| access-date = 21 January 2020 }}</ref><ref>{{cite journal | last = Einstein | first = Albert | title = Eine Theorie der Grundlagen der Thermodynamik [A Theory of the Foundations of Thermodynamics]| journal = Annalen der Physik |series=Series 4| volume = 11 | pages = 417–433| date = 1904 | url = http://myweb.rz.uni-augsburg.de/~eckern/adp/history/einstein-papers/1904_14_354-362.pdf | access-date = 21 January 2020 }}</ref><ref>{{cite journal | last = Einstein | first = Albert | title = Allgemeine molekulare Theorie der Wärme [On the General Molecular Theory of Heat]| journal = Annalen der Physik |series=Series 4| volume = 14 | pages = 354–362| date = 1904 | url = http://myweb.rz.uni-augsburg.de/~eckern/adp/history/einstein-papers/1904_14_354-362.pdf | doi = 10.1002/andp.19043190707| access-date = 21 January 2020}}</ref>) is exactly the thin-shell concentration theorem. For each mechanical system, consider the [[phase space]] equipped with the invariant [[Liouville measure]] (the phase volume) and a conserved energy ''E''. The [[microcanonical ensemble]] is just an invariant distribution over the surface of constant energy ''E'', obtained by Gibbs as the limit of distributions in [[phase space]] with constant density in thin layers between the surfaces of states with energy ''E'' and with energy ''E''&nbsp;+&nbsp;Δ''E''. The [[canonical ensemble]] is given by the probability density in the phase space (with respect to the phase volume)
<math>\rho = e^{(F - E)/(kT)},</math>
where the constants ''F'' and ''T'' are determined by the normalisation of the probability and by the given expectation of the energy&nbsp;''E''.


When the number of particles is large, the difference between the average values of the macroscopic variables for the canonical and microcanonical ensembles tends to zero, and their [[Thermal fluctuations|fluctuations]] can be evaluated explicitly. These results were proven rigorously under some regularity conditions on the energy function ''E'' by [[Aleksandr Khinchin|Khinchin]] (1943).<ref>{{cite book |last= Khinchin |first= Aleksandr Y. |date=1949 |title= Mathematical foundations of statistical mechanics [English translation from the Russian edition, Moscow, Leningrad, 1943]|url= https://books.google.com/books?id=D7oEAAAAMAAJ |location= New York, NY |publisher= Courier Corporation  |page= <!-- or pages= --> | access-date = 21 January 2020}}</ref>
The simplest particular case, when ''E'' is a sum of squares, was well known in detail before [[Aleksandr Khinchin|Khinchin]] and Lévy, and even before Gibbs and Einstein. This is the [[Maxwell–Boltzmann distribution]] of the particle energy in an ideal gas.
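
A minimal sketch of this simplest case, assuming unit masses and <math>kT = 1</math>: for <math>E = \tfrac{1}{2}\sum_i p_i^2</math> the canonical ensemble makes the momenta independent standard Gaussians, and the energy per particle concentrates around <math>kT/2</math> with fluctuations of order <math>1/\sqrt{n}</math>, which is the thin-shell statement behind the equivalence of ensembles. The trial counts below are arbitrary.

<syntaxhighlight lang="python">
import numpy as np

def energy_per_particle(n, trials=1_000, seed=0):
    """Sample E/n for E = (1/2) * sum(p_i^2), with the momenta drawn from the
    canonical ensemble at kT = 1, i.e. as independent standard Gaussians."""
    rng = np.random.default_rng(seed)
    p = rng.standard_normal((trials, n))
    return 0.5 * (p ** 2).mean(axis=1)

for n in (10, 100, 1000, 10_000):
    e = energy_per_particle(n)
    print(f"n={n:6d}  mean(E/n)={e.mean():.4f}  std(E/n)={e.std():.4f}  "
          f"predicted std 1/sqrt(2n)={1 / np.sqrt(2 * n):.4f}")
</syntaxhighlight>

As <math>n</math> grows, the canonical samples pile up in a thin shell around the surface <math>E = nkT/2</math>, which is why canonical and microcanonical averages of macroscopic variables agree in the limit.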


The microcanonical ensemble is very natural from the naïve physical point of view: it is just the natural equidistribution on the isoenergetic hypersurface. The canonical ensemble is very useful because of an important property: if a system consists of two non-interacting subsystems, i.e. if the energy ''E'' is the sum <math>E=E_1(X_1)+E_2(X_2)</math>, where <math>X_1, X_2</math> are the states of the subsystems, then the equilibrium states of the subsystems are independent, and the equilibrium distribution of the whole system is the product of the equilibrium distributions of the subsystems with the same&nbsp;''T''. The equivalence of these ensembles is the cornerstone of the mechanical foundations of thermodynamics.
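
The independence claim can be read off from the form of the canonical density given above: if <math>E = E_1(X_1) + E_2(X_2)</math>, then, writing <math>F = F_1 + F_2</math> with <math>F_1, F_2</math> the normalising constants of the two subsystems,
:<math>\rho = e^{(F - E_1(X_1) - E_2(X_2))/(kT)} = e^{(F_1 - E_1(X_1))/(kT)} \, e^{(F_2 - E_2(X_2))/(kT)},</math>
which is a product of two canonical densities at the same temperature ''T''.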


==Other examples==
* [[McDiarmid's inequality]]
* [[Talagrand's concentration inequality]]
* [[Asymptotic equipartition property]]


==References==
{{Reflist}}
