Dendrogram: Difference between revisions

From Wikipedia, the free encyclopedia
imported>AnomieBOT
m Dating maintenance tags: {{Merge to}}
 
imported>Krauss
add History and methods
 
Line 5: Line 5:
{{merge to|Tree structure|date=May 2025}}


[[File:UPGMA Dendrogram Hierarchical.svg|thumb|Dendrogram of a hierarchical clustering (UPGMA) with the height of the nodes (adapted from bacterial 5S rRNA sequence data<ref name="Swofford1996">{{cite book|title=Molecular Systematics, 2nd edition|last1=Swofford|first1=David L.|last2=Olsen|first2=Gary J.|last3=Waddell|first3=Peter J.|last4=Hillis|first4=David M. | name-list-style = vanc |year=1996|isbn=9780878932825|editor-last1=Hillis|editor-first1=David M.|editor-last2=Moritz|editor-first2=Craig|editor-last3=Mable|editor-first3=Barbara K.|publisher=Sinauer|location=Sunderland, MA|pages=407–514|chapter=Phylogenetic inference}}</ref>).]]
[[File:UPGMA Dendrogram Hierarchical.svg|thumb|320px|Dendrogram of a hierarchical clustering (UPGMA) with the height of the nodes (adapted from bacterial 5S rRNA sequence data<ref name="Swofford1996">{{cite book|title=Molecular Systematics, 2nd edition|last1=Swofford|first1=David L.|last2=Olsen|first2=Gary J.|last3=Waddell|first3=Peter J.|last4=Hillis|first4=David M. | name-list-style = vanc |year=1996|isbn=9780878932825|editor-last1=Hillis|editor-first1=David M.|editor-last2=Moritz|editor-first2=Craig|editor-last3=Mable|editor-first3=Barbara K.|publisher=Sinauer|location=Sunderland, MA|pages=407–514|chapter=Phylogenetic inference}}</ref>).]]
[[File:Global-Diversity-of-Sponges-(Porifera)-pone.0035105.s008.tif|thumb|Dendrogram output for hierarchical clustering of marine provinces using presence / absence of sponge species.<ref name="VanSoest2012">{{Cite journal |vauthors=Van Soest R, Boury-Esnault N, Vacelet J, Dohrmann M, Erpenbeck D, De Voogd N, Santodomingo N, Vanhoorne B, Kelly M, Hooper J| title = Global Diversity of Sponges (Porifera) | doi = 10.1371/journal.pone.0035105 | journal = PLOS ONE | volume = 7 | issue = 4 | pages = e35105 | year = 2012 | pmid = 22558119 | pmc = 3338747 | bibcode = 2012PLoSO...735105V| doi-access = free }}
</ref>]]
[[File:Phylogenetic tree.svg|thumb|A dendrogram of the [[Tree of Life]]. This phylogenetic tree is adapted from Woese et al. rRNA analysis.<ref name="Woese_1990">{{cite journal | last1 = Woese | first1 = Carl R.| author-link1 = Carl Woese | last2 =  Kandler | first2 = O | last3 = Wheelis | first3= M | title = Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya | url=http://www.pnas.org/content/87/12/4576.full.pdf | journal = Proc Natl Acad Sci USA | volume = 87 | issue = 12 | pages = 4576–4579 | year = 1990 | pmid = 2112744 | doi = 10.1073/pnas.87.12.4576 | pmc = 54159 | bibcode=1990PNAS...87.4576W| author2-link = Otto Kandler| doi-access = free}}</ref> The vertical line at bottom represents the [[last universal common ancestor]] (LUCA).]]
[[File:Heatmap RNAseqV2 1.png|thumb|Heatmap of [[RNA-Seq]] data showing two dendrograms in the left and top margins.]]


A '''dendrogram''' is a [[diagram]] representing a [[Tree (graph theory)|tree graph]]. This diagrammatic representation is frequently used in different contexts:
A '''dendrogram''' is a [[diagram]] representing a [[Tree (graph theory)|tree graph]] and a [[Similarity measure|similarity metric]], based on [[numerical taxonomy]] methods. This diagrammatic representation is frequently used in different contexts:
* in [[hierarchical clustering]], it illustrates the arrangement of the clusters produced by the corresponding analyses.<ref>{{cite book |last= Everitt |first= Brian |date= 1998 |title= Dictionary of Statistics |location= Cambridge, UK |publisher= Cambridge University Press |page= [https://archive.org/details/cambridgediction00ever_0/page/96 96] |isbn= 0-521-59346-8 |url-access= registration |url= https://archive.org/details/cambridgediction00ever_0/page/96 }}</ref>
* in [[computational biology]], it shows the clustering of [[gene]]s or samples, sometimes in the margins of [[heat map|heatmaps]].<ref name="Wilkinson2009">{{cite journal|last1=Wilkinson|first1=Leland|last2=Friendly|first2=Michael|title=The History of the Cluster Heat Map|journal=The American Statistician|date=May 2009|volume=63|issue=2|pages=179–184|doi=10.1198/tas.2009.0033|s2cid=122792460|citeseerx=10.1.1.165.7924}}</ref>
* in [[phylogenetics]], it displays the [[evolution]]ary relationships among various biological [[taxa]]. In this case, the dendrogram is also called a [[phylogenetic tree]].<ref>{{Cite encyclopedia|url=https://www.britannica.com/science/phylogenetic-tree|title=Phylogenetic tree (biology)|encyclopedia=Encyclopedia Britannica|access-date=2018-10-22|language=en}}</ref>


The name ''dendrogram'' derives from the two [[Ancient Greek]] words {{wikt-lang|grc|δένδρον}} ({{grc-transl|δένδρον}}), meaning "tree", and {{wikt-lang|grc|γράμμα}} ({{grc-transl|γράμμα}}), meaning "drawing, mathematical figure".<ref>{{Cite book |title=Abrégé du dictionnaire grec français |last=Bailly |first=Anatole |date=1981-01-01 |publisher=Hachette |isbn=2010035283 |location=Paris |oclc=461974285 }}</ref><ref>{{Cite web |url=http://www.tabularium.be/bailly/ |title=Greek-french dictionary online |last=Bailly |first=Anatole |website=www.tabularium.be |access-date=October 20, 2018}}</ref>
The name ''dendrogram'' derives from the two [[Ancient Greek]] words {{wikt-lang|grc|δένδρον}} ({{grc-transl|δένδρον}}), meaning "tree", and {{wikt-lang|grc|γράμμα}} ({{grc-transl|γράμμα}}), meaning "drawing, mathematical figure".<ref>{{Cite book |title=Abrégé du dictionnaire grec français |last=Bailly |first=Anatole |date=1981-01-01 |publisher=Hachette |isbn=2010035283 |location=Paris |oclc=461974285 }}</ref><ref>{{Cite web |url=http://www.tabularium.be/bailly/ |title=Greek-french dictionary online |last=Bailly |first=Anatole |website=www.tabularium.be |access-date=October 20, 2018}}</ref> A typical one is shown below:


== Clustering example ==
[[File:Global-Diversity-of-Sponges-(Porifera)-pone.0035105.s008.tif|center]]


For a clustering example, suppose that five taxa (<math>a</math> to <math>e</math>) have been clustered by [[UPGMA]] based on a matrix of [[genetic distances]]. The [[hierarchical clustering]] dendrogram would show a column of five nodes representing the initial data (here individual taxa), and the remaining nodes represent the clusters to which the data belong, with the arrows representing the distance (dissimilarity). The distance between merged clusters is monotone, increasing with the level of the merger: the height of each node in the plot is proportional to the value of the intergroup dissimilarity between its two daughters (the nodes on the right representing individual observations all plotted at zero height).
The figure is a dendrogram output for hierarchical clustering of marine provinces using presence / absence of sponge species.<ref name="VanSoest2012">{{Cite journal |vauthors=Van Soest R, Boury-Esnault N, Vacelet J, Dohrmann M, Erpenbeck D, De Voogd N, Santodomingo N, Vanhoorne B, Kelly M, Hooper J| title = Global Diversity of Sponges (Porifera) | doi = 10.1371/journal.pone.0035105 | journal = PLOS ONE | volume = 7 | issue = 4 | pages = e35105 | year = 2012 | pmid = 22558119 | pmc = 3338747 | bibcode = 2012PLoSO...735105V| doi-access = free }}
</ref>
 
== History and methods ==
Dendrograms were popularized, and the methods behind them refined, by [[Robert R. Sokal]] and [[Peter H. A. Sneath]] in the 1960s as part of the toolkit of [[numerical taxonomy]]. Although they arose from the needs of [[Taxonomy (biology)|biological taxonomy]], they were adopted in statistics as a [[data visualization]] method for [[cluster analysis]].
 
Dendrograms are now a key tool in [[hierarchical clustering]], and the methods used to build them are known as [[Hierarchical_clustering#Cluster_Linkage|cluster linkage]].
 
== Other examples ==
Diagram variations.
 
== Phylogenetic ==
 
The figure is a dendrogram of the [[Tree of Life]]. It has no metric axis; only the approximate distance along the tree is used to show evolutionary order.
 
[[File:Phylogenetic tree.svg|center]]
 
This phylogenetic tree is adapted from Woese et al. rRNA analysis.<ref name="Woese_1990">{{cite journal | last1 = Woese | first1 = Carl R.| author-link1 = Carl Woese | last2 =  Kandler | first2 = O | last3 = Wheelis | first3= M | title = Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya | url=http://www.pnas.org/content/87/12/4576.full.pdf | journal = Proc Natl Acad Sci USA | volume = 87 | issue = 12 | pages = 4576–4579 | year = 1990 | pmid = 2112744 | doi = 10.1073/pnas.87.12.4576 | pmc = 54159 | bibcode=1990PNAS...87.4576W| author2-link = Otto Kandler| doi-access = free}}</ref> The vertical line at bottom represents the [[last universal common ancestor]] (LUCA).
 
== Clustering ==
For a clustering example, suppose that five taxa (<math>a</math> to <math>e</math>) have been clustered by [[UPGMA]] based on a matrix of [[genetic distances]]. The matrix is shown as a [[heatmap]] of [[RNA-Seq]] data, with two dendrograms in the left and top margins.
 
[[File:Heatmap RNAseqV2 1.png|center]]
 
The [[hierarchical clustering]] dendrogram would show a column of five nodes representing the initial data (here individual taxa), and the remaining nodes represent the clusters to which the data belong, with the arrows representing the distance (dissimilarity). The distance between merged clusters is monotone, increasing with the level of the merger: the height of each node in the plot is proportional to the value of the intergroup dissimilarity between its two daughters (the nodes on the right representing individual observations all plotted at zero height).


== See also ==
Line 29: Line 49:
* [[yEd]], a freeware tool for drawing and automatically arranging dendrograms
* [[Taxonomy]]
* [[Numerical taxonomy]]


== References ==

Latest revision as of 11:32, 23 June 2025

File:UPGMA Dendrogram Hierarchical.svg
Dendrogram of a hierarchical clustering (UPGMA) with the height of the nodes (adapted from bacterial 5S rRNA sequence data[1]).

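A dendrogram is a diagram representing a tree graph and a similarity metric, based on numerical taxonomy methods. This diagrammatic representation is frequently used in different contexts:

  • in hierarchical clustering, it illustrates the arrangement of the clusters produced by the corresponding analyses.[2]
  • in computational biology, it shows the clustering of genes or samples, sometimes in the margins of heatmaps.[3]
  • in phylogenetics, it displays the evolutionary relationships among various biological taxa. In this case, the dendrogram is also called a phylogenetic tree.[4]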

The name dendrogram derives from the two Ancient Greek words δένδρον (déndron), meaning "tree", and γράμμα (grámma), meaning "drawing, mathematical figure".[5][6] A typical one is shown below:

File:Global-Diversity-of-Sponges-(Porifera)-pone.0035105.s008.tif

The figure is a dendrogram output for hierarchical clustering of marine provinces using presence / absence of sponge species.[7]

History and methods

Dendrograms were popularized, and the methods behind them refined, by Robert R. Sokal and Peter H. A. Sneath in the 1960s as part of the toolkit of numerical taxonomy. Although they arose from the needs of biological taxonomy, they were adopted in statistics as a data visualization method for cluster analysis.

Dendrograms are now a key tool in hierarchical clustering, and the methods used to build them are known as cluster linkage.
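
As a brief illustration of what a linkage method specifies, the dissimilarity between two clusters A and B can be derived from the pairwise dissimilarities d(x, y) of their members in several ways. The three forms below are the usual textbook definitions of single, complete and average linkage; they are given here as background and are not taken from this article's sources.

```latex
\begin{align}
D_{\text{single}}(A,B)   &= \min_{x \in A,\ y \in B} d(x,y) \\
D_{\text{complete}}(A,B) &= \max_{x \in A,\ y \in B} d(x,y) \\
D_{\text{average}}(A,B)  &= \frac{1}{|A|\,|B|} \sum_{x \in A} \sum_{y \in B} d(x,y)
\end{align}
```

The average form is the criterion used by UPGMA, the method named in the clustering example.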

Other examples

Diagram variations.

Phylogenetic

The figure is a dendrogram of the Tree of Life. It has no metric axis; only the approximate distance along the tree is used to show evolutionary order.

File:Phylogenetic tree.svg

This phylogenetic tree is adapted from Woese et al. rRNA analysis.[8] The vertical line at bottom represents the last universal common ancestor (LUCA).

Clustering

For a clustering example, suppose that five taxa (a to e) have been clustered by UPGMA based on a matrix of genetic distances. The matrix is shown as a heatmap of RNA-Seq data, with two dendrograms in the left and top margins.

File:Heatmap RNAseqV2 1.png

The hierarchical clustering dendrogram would show a column of five nodes representing the initial data (here individual taxa), and the remaining nodes represent the clusters to which the data belong, with the arrows representing the distance (dissimilarity). The distance between merged clusters is monotone, increasing with the level of the merger: the height of each node in the plot is proportional to the value of the intergroup dissimilarity between its two daughters (the nodes on the right representing individual observations all plotted at zero height).
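
The clustering example above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the distance values in DIST are hypothetical and are not read from the figure, the function names are invented for this sketch, and the node height is taken to be the intergroup dissimilarity itself (plotting tools may scale it differently).

```python
from itertools import combinations

# Hypothetical dissimilarities between five taxa a..e (illustrative values only).
LABELS = ["a", "b", "c", "d", "e"]
DIST = [
    [0, 17, 21, 31, 23],
    [17, 0, 30, 34, 21],
    [21, 30, 0, 28, 39],
    [31, 34, 28, 0, 43],
    [23, 21, 39, 43, 0],
]


def average_distance(members_a, members_b, dist):
    """Mean pairwise dissimilarity between two clusters (the UPGMA criterion)."""
    total = sum(dist[i][j] for i in members_a for j in members_b)
    return total / (len(members_a) * len(members_b))


def upgma(labels, dist):
    """Return the merges as (left_name, right_name, height), in merge order."""
    # Each cluster is (member indices, display name); start with one per taxon.
    clusters = [((i,), name) for i, name in enumerate(labels)]
    merges = []
    while len(clusters) > 1:
        # Find the two clusters with the smallest average dissimilarity.
        left, right = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(pair[0][0], pair[1][0], dist),
        )
        height = average_distance(left[0], right[0], dist)
        merges.append((left[1], right[1], height))
        # Replace the two daughters with their union.
        clusters = [c for c in clusters if c not in (left, right)]
        clusters.append((left[0] + right[0], f"({left[1]},{right[1]})"))
    return merges


for left, right, height in upgma(LABELS, DIST):
    print(f"{left} + {right} merge at height {height:g}")
```

With these illustrative distances the four internal nodes are created at heights 17, 22, 28 and 33, so the heights increase with each merger, as the paragraph above describes. The five leaves correspond to the individual observations plotted at zero height.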

See also

References

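Citations

  1. Swofford DL, Olsen GJ, Waddell PJ, Hillis DM (1996). "Phylogenetic inference". In Hillis DM, Moritz C, Mable BK (eds.). Molecular Systematics, 2nd edition. Sunderland, MA: Sinauer. pp. 407–514. ISBN 9780878932825.
  2. Everitt, Brian (1998). Dictionary of Statistics. Cambridge, UK: Cambridge University Press. p. 96. ISBN 0-521-59346-8.
  3. Wilkinson, Leland; Friendly, Michael (May 2009). "The History of the Cluster Heat Map". The American Statistician. 63 (2): 179–184. doi:10.1198/tas.2009.0033.
  4. "Phylogenetic tree (biology)". Encyclopedia Britannica. https://www.britannica.com/science/phylogenetic-tree. Retrieved 2018-10-22.
  5. Bailly, Anatole (1981). Abrégé du dictionnaire grec français. Paris: Hachette. ISBN 2010035283. OCLC 461974285.
  6. Bailly, Anatole. "Greek-french dictionary online". http://www.tabularium.be/bailly/. Retrieved October 20, 2018.
  7. Van Soest R, Boury-Esnault N, Vacelet J, Dohrmann M, Erpenbeck D, De Voogd N, Santodomingo N, Vanhoorne B, Kelly M, Hooper J (2012). "Global Diversity of Sponges (Porifera)". PLOS ONE. 7 (4): e35105. doi:10.1371/journal.pone.0035105. PMID 22558119.
  8. Woese CR, Kandler O, Wheelis M (1990). "Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya". Proc Natl Acad Sci USA. 87 (12): 4576–4579. doi:10.1073/pnas.87.12.4576. PMID 2112744.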