Dendrogram
Latest revision as of 11:32, 23 June 2025
[Figure: Dendrogram of a hierarchical clustering (UPGMA) with the height of the nodes, adapted from bacterial 5S rRNA sequence data.[1]]
A dendrogram is a diagram representing a tree graph and a similarity metric, based on numerical taxonomy methods. This diagrammatic representation is frequently used in different contexts:
- in hierarchical clustering, it illustrates the arrangement of the clusters produced by the corresponding analyses.[2]
- in computational biology, it shows the clustering of genes or samples, sometimes in the margins of heatmaps.[3]
- in phylogenetics, it displays the evolutionary relationships among various biological taxa. In this case, the dendrogram is also called a phylogenetic tree.[4]
The name dendrogram derives from the two Ancient Greek words δένδρον (déndron), meaning "tree", and γράμμα (grámma), meaning "drawing, mathematical figure".[5][6] A typical example is shown below:
The figure shows a dendrogram produced by hierarchical clustering of marine provinces, based on the presence or absence of sponge species.[7]
History and methods
Dendrograms were popularized, and the methods behind them refined, by Robert R. Sokal and Peter H. A. Sneath in the 1960s as part of the toolkit of numerical taxonomy. Although they arose from the needs of biological taxonomy, they were adopted in statistics as a data visualization method for cluster analysis.
Today dendrograms are a key tool in hierarchical clustering, where the methods used to build them are known as cluster linkage.
Other examples
Diagram variations.
Phylogenetic
A dendrogram of the Tree of Life. The figure has no metric axis; only the approximate distance along the tree indicates evolutionary order.
This phylogenetic tree is adapted from the rRNA analysis of Woese et al.[8] The vertical line at the bottom represents the last universal common ancestor (LUCA).
Clustering
For a clustering example, suppose that five taxa (a to e) have been clustered by UPGMA based on a matrix of genetic distances. The matrix is shown as a heatmap of RNA-Seq data, with two dendrograms in the left and top margins.
The hierarchical clustering dendrogram would show a column of five nodes representing the initial data (here individual taxa), while the remaining nodes represent the clusters to which the data belong, with the arrows representing the distance (dissimilarity). The distance between merged clusters is monotone, increasing with the level of the merger: the height of each node in the plot is proportional to the value of the intergroup dissimilarity between its two daughters (the nodes on the right, representing individual observations, are all plotted at zero height).
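The merging process described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used for any figure in this article: the distance values in DIST are hypothetical, and the names upgma, LABELS and DIST are introduced here purely for the example. It applies average linkage (UPGMA) to five taxa and records the dissimilarity at which each pair of clusters is merged, which is the quantity a dendrogram plots as node height.

```python
from itertools import combinations

# Hypothetical pairwise distances between five taxa (illustrative values only).
LABELS = ["a", "b", "c", "d", "e"]
DIST = {
    frozenset(pair): d
    for pair, d in {
        ("a", "b"): 17, ("a", "c"): 21, ("a", "d"): 31, ("a", "e"): 23,
        ("b", "c"): 30, ("b", "d"): 34, ("b", "e"): 21,
        ("c", "d"): 28, ("c", "e"): 39,
        ("d", "e"): 43,
    }.items()
}


def upgma(labels, dist):
    """Return the merges as (cluster_1, cluster_2, dissimilarity) tuples."""
    # Each cluster is a tuple of the original labels it contains.
    clusters = [(label,) for label in labels]

    def average_distance(c1, c2):
        # UPGMA: unweighted mean of all pairwise distances between members.
        total = sum(dist[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        # Pick the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        merges.append((c1, c2, average_distance(c1, c2)))
        # Replace the two merged clusters by their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


merges = upgma(LABELS, DIST)
heights = [height for _, _, height in merges]

for c1, c2, height in merges:
    print(f"merge {c1} + {c2} at dissimilarity {height}")
# Merge dissimilarities for this matrix: 17.0, 22.0, 28.0, 33.0
```

With these values the merge dissimilarities never decrease from one merger to the next, which is the monotone property that lets them be drawn as node heights. The leaves a to e would sit at zero height.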
See also
- Cladogram
- Distance matrices in phylogeny
- Hierarchical clustering
- MEGA, freeware for drawing dendrograms
- yEd, freeware for drawing and automatically arranging dendrograms
- Taxonomy
- Numerical taxonomy
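References
Citations
- [1] Swofford, David L.; Olsen, Gary J.; Waddell, Peter J.; Hillis, David M. (1996). "Phylogenetic inference". In Hillis, David M.; Moritz, Craig; Mable, Barbara K. (eds.). Molecular Systematics, 2nd edition. Sunderland, MA: Sinauer. pp. 407–514. ISBN 9780878932825.
- [2] Everitt, Brian (1998). Dictionary of Statistics. Cambridge, UK: Cambridge University Press. p. 96. ISBN 0-521-59346-8. https://archive.org/details/cambridgediction00ever_0/page/96
- [3] Wilkinson, Leland; Friendly, Michael (May 2009). "The History of the Cluster Heat Map". The American Statistician. 63 (2): 179–184. doi:10.1198/tas.2009.0033.
- [4] "Phylogenetic tree (biology)". Encyclopedia Britannica. https://www.britannica.com/science/phylogenetic-tree. Retrieved 2018-10-22.
- [5] Bailly, Anatole (1981-01-01). Abrégé du dictionnaire grec français. Paris: Hachette. ISBN 2010035283. OCLC 461974285.
- [6] Bailly, Anatole. "Greek-french dictionary online". www.tabularium.be. http://www.tabularium.be/bailly/. Retrieved October 20, 2018.
- [7] Van Soest R, Boury-Esnault N, Vacelet J, Dohrmann M, Erpenbeck D, De Voogd N, Santodomingo N, Vanhoorne B, Kelly M, Hooper J (2012). "Global Diversity of Sponges (Porifera)". PLOS ONE. 7 (4): e35105. doi:10.1371/journal.pone.0035105. PMID 22558119. PMC 3338747.
- [8] Woese, Carl R.; Kandler, O; Wheelis, M (1990). "Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya". Proc Natl Acad Sci USA. 87 (12): 4576–4579. doi:10.1073/pnas.87.12.4576. PMID 2112744. PMC 54159. http://www.pnas.org/content/87/12/4576.full.pdf
External links
- Iris dendrogram - Example of using a dendrogram to visualize the 3 clusters from hierarchical clustering using the "complete" method vs the real species category (using R).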