Information geometry

From Wikipedia, the free encyclopedia
Latest revision as of 00:43, 29 September 2025


[Figure: Normal Distribution PDF.svg — The set of all normal distributions forms a statistical manifold with hyperbolic geometry.]

Information geometry is an interdisciplinary field that applies the techniques of differential geometry to study probability theory and statistics.[1] It studies statistical manifolds, which are Riemannian manifolds whose points correspond to probability distributions.
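As a concrete illustration (a sketch, not part of the original article): the Fisher information metric on the two-parameter family of normal distributions N(μ, σ²) can be computed symbolically. Up to a constant factor it is the Poincaré half-plane metric, which is why the geometry of the normal family is described as hyperbolic.

```python
# Sketch: compute the Fisher information metric of the normal family
# N(mu, sigma^2) symbolically with SymPy. The result is
# diag(1/sigma^2, 2/sigma^2), which is (up to scale) the hyperbolic
# Poincare half-plane metric ds^2 = (d mu^2 + 2 d sigma^2) / sigma^2.
import sympy as sp
from sympy.stats import Normal, E

mu = sp.Symbol('mu', real=True)
sigma = sp.Symbol('sigma', positive=True)
x = sp.Symbol('x', real=True)

# log-density of N(mu, sigma^2)
logp = -(x - mu)**2 / (2 * sigma**2) - sp.log(sigma) - sp.log(2 * sp.pi) / 2

X = Normal('X', mu, sigma)   # random variable the expectation is taken over
params = (mu, sigma)

# g_ij = E[ (d_i log p)(d_j log p) ] evaluated at x ~ N(mu, sigma^2)
G = sp.Matrix(2, 2, lambda i, j: sp.simplify(
    E(sp.diff(logp, params[i]).subs(x, X) * sp.diff(logp, params[j]).subs(x, X))
))
print(G)  # diagonal matrix with entries 1/sigma**2 and 2/sigma**2
```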

Introduction


Historically, information geometry can be traced back to the work of C. R. Rao, who was the first to treat the Fisher matrix as a Riemannian metric.[2][3] The modern theory is largely due to Shun'ichi Amari, whose work has greatly influenced the development of the field.[4]

Classically, information geometry treated a parametrized statistical model as a Riemannian manifold, possibly carrying additional structure: a conjugate connection, a statistical structure, or dual flatness. Unlike ordinary smooth manifolds equipped only with a metric tensor and its Levi-Civita connection, these structures also take into account a pair of conjugate connections, torsion, and the Amari–Chentsov tensor.[5] All of these geometric structures find application in information theory and machine learning. For such models there is a natural choice of Riemannian metric, known as the Fisher information metric. In the special case that the statistical model is an exponential family, the statistical manifold can be endowed with a Hessian metric (i.e., a Riemannian metric given as the Hessian of a convex potential function). In this case the manifold naturally inherits two flat affine connections, as well as a canonical Bregman divergence. Historically, much of the work was devoted to studying the associated geometry of these examples. In the modern setting, information geometry applies in a much wider context, including non-exponential families, nonparametric statistics, and even abstract statistical manifolds not induced from a known statistical model. The results combine techniques from information theory, affine differential geometry, convex analysis and many other fields. One of the most promising applications of information geometry is to machine learning, for example in the development of information-geometric optimization methods such as mirror descent[6] and natural gradient descent[7].
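To sketch the natural-gradient idea mentioned above (illustrative code, not from the article): the natural gradient preconditions the ordinary gradient by the inverse Fisher matrix, θ ← θ − η G(θ)⁻¹ ∇L(θ). For the normal family N(μ, σ²) the Fisher matrix is known in closed form, diag(1/σ², 2/σ²), so the update can be written directly. All variable names below are illustrative.

```python
# Minimal natural gradient descent sketch: fit a normal distribution
# by minimizing the average negative log-likelihood, preconditioning
# the gradient with the inverse Fisher matrix diag(sigma^2, sigma^2/2).
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=3.0, size=10_000)

def neg_log_lik_grad(mu, sigma, x):
    """Gradient of the average negative log-likelihood of N(mu, sigma^2)."""
    d_mu = -np.mean(x - mu) / sigma**2
    d_sigma = 1.0 / sigma - np.mean((x - mu)**2) / sigma**3
    return np.array([d_mu, d_sigma])

mu, sigma = 0.0, 1.0   # initial parameter guess
lr = 0.5               # step size eta
for _ in range(100):
    grad = neg_log_lik_grad(mu, sigma, data)
    # inverse Fisher information of N(mu, sigma^2) in coordinates (mu, sigma)
    G_inv = np.diag([sigma**2, sigma**2 / 2.0])
    mu, sigma = np.array([mu, sigma]) - lr * G_inv @ grad

print(mu, sigma)  # converges to the sample mean and (ddof=0) sample std
```

The preconditioning makes the μ-update step-size independent of σ, which is the practical appeal of the natural gradient: steps are measured in the Fisher geometry rather than in raw parameter coordinates.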

The standard references in the field are Shun’ichi Amari and Hiroshi Nagaoka's book, Methods of Information Geometry,[8] and the more recent book by Nihat Ay and others.[9] A gentle introduction is given in the survey by Frank Nielsen.[10] In 2018, the journal Information Geometry was launched, devoted to the field.

Contributors

The history of information geometry is associated with the discoveries of at least the following people, and many others.


Applications


As an interdisciplinary field, information geometry has been used in various applications.

Here is an incomplete list:

  • Statistical inference[11]
  • Time series and linear systems
  • Filtering problem[12]
  • Quantum systems[13]
  • Neural networks[14]
  • Machine learning
  • Statistical mechanics
  • Biology
  • Statistics[15][16]
  • Mathematical finance[17]

See also

References


  1. Script error: No such module "Citation/CS1".
  2. Rao, C. R. (1945). "Information and Accuracy Attainable in the Estimation of Statistical Parameters". Bulletin of the Calcutta Mathematical Society. 37: 81–91. Reprinted in Breakthroughs in Statistics. Springer, 1992, pp. 235–247. doi:10.1007/978-1-4612-0919-5_16.
  3. Nielsen, F. (2013). "Cramér–Rao Lower Bound and Information Geometry". In Bhatia, R.; Rajan, C. S. (eds.). Connected at Infinity II: On the Work of Indian Mathematicians. Texts and Readings in Mathematics. Hindustan Book Agency. pp. 18–37. doi:10.1007/978-93-86279-56-9_2. arXiv:1301.3578.
  4. Amari, Shun'ichi (1983). "A foundation of information geometry". Electronics and Communications in Japan. 66 (6): 1–10. doi:10.1002/ecja.4400660602.
  5. Bauer, Martin; Le Brigant, Alice; Lu, Yuxiu; Maor, Cy (2024). "The L^p-Fisher–Rao metric and Amari–Čencov α-Connections". Calculus of Variations and Partial Differential Equations. 63 (2): 56. doi:10.1007/s00526-024-02660-5. arXiv:2306.14533.
  6. Raskutti, Garvesh; Mukherjee, Sayan (2015). "The Information Geometry of Mirror Descent". IEEE Transactions on Information Theory. 61 (3): 1451–1457. doi:10.1109/TIT.2015.2388583. arXiv:1310.7780.
  7. Abdulkadirov, Ruslan; Lyakhov, Pavel; Nagornov, Nikolay (2022). "Accelerating Extreme Search of Multidimensional Functions Based on Natural Gradient Descent with Dirichlet Distributions". Mathematics. 10 (19): 3556. doi:10.3390/math10193556.
  8. Amari, Shun'ichi; Nagaoka, Hiroshi (2000). Methods of Information Geometry. Translations of Mathematical Monographs. 191. American Mathematical Society. ISBN 0-8218-0531-2.
  9. Ay, Nihat; Jost, Jürgen; Lê, Hông Vân; Schwachhöfer, Lorenz (2017). Information Geometry. Ergebnisse der Mathematik und ihrer Grenzgebiete. 64. Springer. ISBN 978-3-319-56477-7.
  10. Nielsen, Frank (2018). "An Elementary Introduction to Information Geometry". Entropy. 22 (10): 1100.
  11. Kass, R. E.; Vos, P. W. (1997). Geometrical Foundations of Asymptotic Inference. Series in Probability and Statistics. Wiley. ISBN 0-471-82668-5.
  12. Brigo, Damiano; Hanzon, Bernard; LeGland, François (1998). "A differential geometric approach to nonlinear filtering: the projection filter". IEEE Transactions on Automatic Control. 43 (2): 247–252. doi:10.1109/9.661075.
  13. van Handel, Ramon; Mabuchi, Hideo (2005). "Quantum projection filter for a highly nonlinear model in cavity QED". Journal of Optics B: Quantum and Semiclassical Optics. 7 (10): S226–S236. doi:10.1088/1464-4266/7/10/005. arXiv:quant-ph/0503222.
  14. Zlochin, Mark; Baram, Yoram (2001). "Manifold Stochastic Dynamics for Bayesian Learning". Neural Computation. 13 (11): 2549–2572. doi:10.1162/089976601753196021.
  15. Script error: No such module "citation/CS1".
  16. Script error: No such module "citation/CS1".
  17. Script error: No such module "citation/CS1".


External links
