De Bruijn graph: Difference between revisions

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search
imported>MrSwedishMeatballs
 
imported>WikiCleanerBot
m v2.05b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation)
 
Line 34: Line 34:
* The [[distributed hash table]] protocol [[Koorde]] uses a De Bruijn graph.
* The [[distributed hash table]] protocol [[Koorde]] uses a De Bruijn graph.
* In [[bioinformatics]], De Bruijn graphs are used for [[De novo sequence assemblers|''de novo'' assembly]] of sequencing reads into a [[genome]].<ref name="Pevzner2001a"/><ref name="Pevzner2001b"/><ref name="zerbino2008"/><ref name="chikhi2014representation"/><ref name="ukmss-37901"/> Instead of the complete De Bruijn graphs described above that contain all possible k-mers, de novo sequence assemblers make use of De Bruijn subgraphs that contain only the k-mers observed in a sequencing dataset.
* In [[bioinformatics]], De Bruijn graphs are used for [[De novo sequence assemblers|''de novo'' assembly]] of sequencing reads into a [[genome]].<ref name="Pevzner2001a"/><ref name="Pevzner2001b"/><ref name="zerbino2008"/><ref name="chikhi2014representation"/><ref name="ukmss-37901"/> Instead of the complete De Bruijn graphs described above that contain all possible k-mers, de novo sequence assemblers make use of De Bruijn subgraphs that contain only the k-mers observed in a sequencing dataset.
* In [[time series]] [[forecasting]], De Bruijn graphs have been adapted to encode temporal patterns by mapping discrete subsequences (n-grams) of observations to graph nodes. This enables the modeling of sequential dependencies in symbolic or [[discretized]] time series data.<ref name="cakiroglu2024"/><ref name="cakiroglu2024hypo"/> [[Multivariate statistics|Multivariate]] De Bruijn graphs extend this idea by jointly encoding patterns across multiple correlated variables, allowing for the representation of complex inter-variable temporal dynamics in multivariate time series.<ref name="cakiroglu2025"/>


== See also ==
== See also ==
Line 205: Line 206:
   | pmid = 22231483}}
   | pmid = 22231483}}
</ref>
</ref>
<ref name="cakiroglu2025">
{{cite arXiv
| title = Multivariate de Bruijn Graphs: A Symbolic Graph Framework for Time Series Forecasting
| author = Mert Onur Cakiroglu, Idil Bilge Altun, Hasan Kurban, Elham Buxton, Mehmet Dalkilic
| year = 2025
| class = cs.LG
| eprint = 2505.22768
}}
</ref>
<ref name="cakiroglu2024">
{{cite conference
| last1 = Cakiroglu | first1 = Mert Onur
| last2 = Kurban | first2 = Hasan
| last3 = Buxton | first3 = Elham Khorasani
| last4 = Dalkilic | first4 = Mehmet
| title = A Novel Discrete Time Series Representation with De Bruijn Graphs for Enhanced Forecasting Using TimesNet (Extended Abstract)
| conference = Proceedings of the 2024 IEEE 11th International Conference on Data Science and Advanced Analytics (DSAA)
| year = 2024
| pages = 1–3
| doi = 10.1109/DSAA61799.2024.10722826
}}
</ref>
<ref name="cakiroglu2024hypo">
{{cite journal
| last1 = Cakiroglu | first1 = Mert Onur
| last2 = Kurban | first2 = Hasan
| last3 = Aljihmani | first3 = Lilia
| last4 = Qaraqe | first4 = Khalid
| last5 = Petrovski | first5 = Goran
| last6 = Dalkilic | first6 = Mehmet M.
| title = A reinforcement learning approach to effective forecasting of pediatric hypoglycemia in diabetes I patients using an extended de Bruijn graph
| journal = Scientific Reports
| volume = 14
| issue = 1
| pages = 31251
| year = 2024
| doi = 10.1038/s41598-024-82649-4
| pmid = 39732907
| pmc = 11682413
| bibcode = 2024NatSR..1431251C
| url = https://doi.org/10.1038/s41598-024-82649-4
}}
</ref>
}}
}}


==External links==
==External links==
* [https://web.archive.org/web/20141030124239/http://www.homolog.us/Tutorials/index.php?p=2.1&s=1 Tutorial on using De Bruijn Graphs in Bioinformatics] by Homolog.us
* [https://web.archive.org/web/20141030124239/http://www.homolog.us/Tutorials/index.php?p=2.1&s=1 Tutorial on using De Bruijn Graphs in Bioinformatics] by Homolog.us
* [https://kurbanintelligencelab.com/2024/07/08/de-bruijn-graph/ De Bruijn Graph algorithm tutorial] by Kurban Intelligence Lab


[[Category:Dynamical systems]]
[[Category:Dynamical systems]]

Latest revision as of 05:16, 28 June 2025

Template:Short description In graph theory, an Template:Mvar-dimensional De Bruijn graph of Template:Mvar symbols is a directed graph representing overlaps between sequences of symbols. It has Template:Mvar vertices, consisting of all possible length-Template:Mvar sequences of the given symbols; the same symbol may appear multiple times in a sequence. For a set of Template:Mvar symbols Template:Math the set of vertices is:

V=Sn={(s1,,s1,s1),(s1,,s1,s2),,(s1,,s1,sm),(s1,,s2,s1),,(sm,,sm,sm)}.

If one of the vertices can be expressed as another vertex by shifting all its symbols by one place to the left and adding a new symbol at the end of this vertex, then the latter has a directed edge to the former vertex. Thus the set of arcs (that is, directed edges) is

E={((t1,t2,,tn),(t2,,tn,sj)):tiS,1in,1jm}.

Although De Bruijn graphs are named after Nicolaas Govert de Bruijn, they were invented independently by both de Bruijn[1] and I. J. Good.[2] Much earlier, Camille Flye Sainte-Marie[3] implicitly used their properties.

Properties

The line graph construction of the three smallest binary De Bruijn graphs is depicted below. As can be seen in the illustration, each vertex of the Template:Mvar-dimensional De Bruijn graph corresponds to an edge of the Template:Math-dimensional De Bruijn graph, and each edge in the Template:Mvar-dimensional De Bruijn graph corresponds to a two-edge path in the Template:Math-dimensional De Bruijn graph.

Template:Wide image

Dynamical systems

Binary De Bruijn graphs can be drawn in such a way that they resemble objects from the theory of dynamical systems, such as the Lorenz attractor: Template:Multiple image This analogy can be made rigorous: the Template:Mvar-dimensional Template:Mvar-symbol De Bruijn graph is a model of the Bernoulli map

xmx mod 1.

The Bernoulli map (also called the Template:Math map for Template:Math) is an ergodic dynamical system, which can be understood to be a single shift of a [[p-adic|Template:Mvar-adic number]].[5] The trajectories of this dynamical system correspond to walks in the De Bruijn graph, where the correspondence is given by mapping each real Template:Mvar in the interval Template:Math to the vertex corresponding to the first Template:Mvar digits in the base-Template:Mvar representation of Template:Mvar. Equivalently, walks in the De Bruijn graph correspond to trajectories in a one-sided subshift of finite type.

File:De Bruijn binary graph.svg
Directed graphs of two Template:Math de Bruijn sequences and a Template:Math sequence. In Template:Math, each vertex is visited once, whereas in Template:Math, each edge is traversed once.

Embeddings resembling this one can be used to show that the binary De Bruijn graphs have queue number 2[6] and that they have book thickness at most 5.[7]

Uses

  • Some grid network topologies are De Bruijn graphs.
  • The distributed hash table protocol Koorde uses a De Bruijn graph.
  • In bioinformatics, De Bruijn graphs are used for de novo assembly of sequencing reads into a genome.[8][9][10][11][12] Instead of the complete De Bruijn graphs described above that contain all possible k-mers, de novo sequence assemblers make use of De Bruijn subgraphs that contain only the k-mers observed in a sequencing dataset.
  • In time series forecasting, De Bruijn graphs have been adapted to encode temporal patterns by mapping discrete subsequences (n-grams) of observations to graph nodes. This enables the modeling of sequential dependencies in symbolic or discretized time series data.[13][14] Multivariate De Bruijn graphs extend this idea by jointly encoding patterns across multiple correlated variables, allowing for the representation of complex inter-variable temporal dynamics in multivariate time series.[15]

See also

References

Template:Reflist

External links

  1. Cite error: Invalid <ref> tag; no text was provided for refs named Bruijn1946
  2. Cite error: Invalid <ref> tag; no text was provided for refs named Good1946
  3. Cite error: Invalid <ref> tag; no text was provided for refs named Flye1894
  4. Cite error: Invalid <ref> tag; no text was provided for refs named Zhang1987
  5. Cite error: Invalid <ref> tag; no text was provided for refs named Leroux2002
  6. Cite error: Invalid <ref> tag; no text was provided for refs named HeathRosenberg
  7. Cite error: Invalid <ref> tag; no text was provided for refs named Obrenic
  8. Cite error: Invalid <ref> tag; no text was provided for refs named Pevzner2001a
  9. Cite error: Invalid <ref> tag; no text was provided for refs named Pevzner2001b
  10. Cite error: Invalid <ref> tag; no text was provided for refs named zerbino2008
  11. Cite error: Invalid <ref> tag; no text was provided for refs named chikhi2014representation
  12. Cite error: Invalid <ref> tag; no text was provided for refs named ukmss-37901
  13. Cite error: Invalid <ref> tag; no text was provided for refs named cakiroglu2024
  14. Cite error: Invalid <ref> tag; no text was provided for refs named cakiroglu2024hypo
  15. Cite error: Invalid <ref> tag; no text was provided for refs named cakiroglu2025