Journal of Combinatorics
Volume 8 (2017)
Guest Editor: Steve Butler (Iowa State University)
Multi de Bruijn sequences
Pages: 439 – 474
We generalize the notion of a de Bruijn sequence to a “multi de Bruijn sequence”: a cyclic or linear sequence that contains every $k$-mer over an alphabet of size $q$ exactly $m$ times. For example, over the binary alphabet $\lbrace 0, 1 \rbrace$, the cyclic sequence $(00010111)$ and the linear sequence $000101110$ each contain two instances of each $2$-mer $00, 01, 10, 11$. We derive formulas for the number of such sequences. The formulas and derivation generalize classical de Bruijn sequences (the case $m = 1$). We also determine the number of multisets of aperiodic cyclic sequences containing every $k$-mer exactly $m$ times; for example, the pair of cyclic sequences $(00011)(011)$ contains two instances of each $2$-mer listed above. This uses an extension of the Burrows–Wheeler Transform due to Mantaci et al., and generalizes a result by Higgins for the case $m = 1$.
multi de Bruijn sequences, de Bruijn graphs, Extended Burrows–Wheeler Transform, Lyndon words, Eulerian graphs
2010 Mathematics Subject Classification
Primary 05C30, 68R15. Secondary 05C38, 05C45, 05C81, 68P30.
This work was supported in part by National Science Foundation grant CCF-1115206.
Paper received on 16 August 2016.