Dendrogram
A dendrogram is a diagram that represents a tree together with a similarity metric, following the methods of numerical taxonomy. It is used in several contexts:
- in hierarchical clustering, it illustrates the arrangement of the clusters produced by the corresponding analyses.[2]
- in computational biology, it shows the clustering of genes or samples, sometimes in the margins of heatmaps.[3]
- in phylogenetics, it displays the evolutionary relationships among various biological taxa. In this case, the dendrogram is also called a phylogenetic tree.[4]
The name dendrogram derives from the two Ancient Greek words δένδρον (déndron), meaning "tree", and γράμμα (grámma), meaning "drawing, mathematical figure".[5][6]
Figure: a typical dendrogram, the output of hierarchical clustering of marine provinces using presence/absence of sponge species.[7]
History and methods
Dendrograms were popularized, and the methods for building them refined, by Robert R. Sokal and Peter H. A. Sneath in the 1960s as part of the toolkit of numerical taxonomy. Although they arose from the needs of biological taxonomy, they were adopted in statistics as a data visualization method for cluster analysis.
Today dendrograms are key tools in hierarchical clustering, where the rules used to build them are known as cluster linkage methods.
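To make the notion of cluster linkage concrete, the following minimal sketch computes the dissimilarity between two clusters under three commonly used linkage criteria (single, complete and average). The function name linkage_distance, the helper abs_diff and the sample points are hypothetical choices made for this illustration only; they are not taken from any particular library.

```python
from itertools import product


def linkage_distance(cluster_a, cluster_b, dist, method="average"):
    """Dissimilarity between two clusters under a given linkage criterion.

    `dist` is any function returning the dissimilarity of two observations.
    """
    # All pairwise dissimilarities between members of the two clusters.
    pairwise = [dist(p, q) for p, q in product(cluster_a, cluster_b)]
    if method == "single":      # closest pair of members
        return min(pairwise)
    if method == "complete":    # farthest pair of members
        return max(pairwise)
    if method == "average":     # mean over all pairs (the criterion used by UPGMA)
        return sum(pairwise) / len(pairwise)
    raise ValueError(f"unknown linkage method: {method}")


def abs_diff(p, q):
    """Hypothetical dissimilarity for one-dimensional observations."""
    return abs(p - q)


# Hypothetical one-dimensional observations, for illustration only.
left, right = [1.0, 2.0], [5.0, 9.0]
single = linkage_distance(left, right, abs_diff, "single")      # 3.0
complete = linkage_distance(left, right, abs_diff, "complete")  # 8.0
average = linkage_distance(left, right, abs_diff, "average")    # 5.5
```

The choice of criterion determines the height at which two clusters are joined in the dendrogram, so the same data can yield differently shaped trees under different linkage methods.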
Other examples
Diagram variations.
Phylogenetic
A dendrogram of the Tree of Life. The figure does not use a metric axis, only the approximate distance on the tree, to show evolutionary order.
This phylogenetic tree is adapted from the rRNA analysis of Woese et al.[8] The vertical line at the bottom represents the last universal common ancestor (LUCA).
Clustering
For a clustering example, suppose that five taxa have been clustered by UPGMA based on a matrix of genetic distances. The matrix is shown as a heatmap of RNA-Seq data, with two dendrograms in the left and top margins.
The hierarchical clustering dendrogram would show a column of five nodes representing the initial data (here individual taxa), while the remaining nodes represent the clusters to which the data belong, with the arrows representing the distance (dissimilarity). The distance between merged clusters is monotone, increasing with the level of the merger: the height of each node in the plot is proportional to the value of the intergroup dissimilarity between its two daughters (the nodes on the right, representing individual observations, are all plotted at zero height).
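The construction described above can be sketched in a few lines of Python. This is a simplified illustration of UPGMA (average linkage), not a reference implementation: the taxon labels a to e and the distance values are hypothetical numbers chosen for the example, and the function name upgma and the nested-tuple tree encoding are assumptions of this sketch. Leaves sit implicitly at height zero, and each internal node stores the dissimilarity between its two daughters as its height.

```python
from itertools import combinations


def upgma(labels, dist):
    """Agglomerate leaves by average linkage and return (root, merges).

    `dist` maps frozenset({x, y}) to the dissimilarity between leaves x and y.
    A tree node is either a leaf label or a tuple (left, right, height).
    """
    # Active clusters: tree node -> tuple of the leaf labels it contains.
    clusters = {label: (label,) for label in labels}
    merges = []

    def average(pair):
        # Mean dissimilarity over every pair of leaves in the two clusters.
        a, b = pair
        total = sum(dist[frozenset((x, y))]
                    for x in clusters[a] for y in clusters[b])
        return total / (len(clusters[a]) * len(clusters[b]))

    while len(clusters) > 1:
        # Join the two clusters with the smallest average dissimilarity.
        a, b = min(combinations(clusters, 2), key=average)
        height = average((a, b))
        members = clusters.pop(a) + clusters.pop(b)
        clusters[(a, b, height)] = members
        merges.append((members, height))

    (root,) = clusters
    return root, merges


# Hypothetical genetic distances between five taxa, for illustration only.
raw = {("a", "b"): 17, ("a", "c"): 21, ("a", "d"): 31, ("a", "e"): 23,
       ("b", "c"): 30, ("b", "d"): 34, ("b", "e"): 21,
       ("c", "d"): 28, ("c", "e"): 39, ("d", "e"): 43}
dist = {frozenset(pair): d for pair, d in raw.items()}

root, merges = upgma(list("abcde"), dist)
heights = [h for _, h in merges]
print(heights)  # [17.0, 22.0, 28.0, 33.0]
```

With these hypothetical values, a and b are joined first at height 17, e joins them at 22, c and d are joined at 28, and the two remaining clusters merge at 33. The heights never decrease from one merger to the next, which is the monotone property noted above.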
See also
- Cladogram
- Distance matrices in phylogeny
- Hierarchical clustering
- MEGA, freeware for drawing dendrograms
- yEd, freeware for drawing and automatically arranging dendrograms
- Taxonomy
- Numerical taxonomy
External links
- Iris dendrogram - Example of using a dendrogram to visualize the 3 clusters from hierarchical clustering using the "complete" method vs the real species category (using R).