Journal of Combinatorics

Volume 8 (2017)

Number 3

Guest Editor: Steve Butler (Iowa State University)

Multi de Bruijn sequences

Pages: 439 – 474

DOI: http://dx.doi.org/10.4310/JOC.2017.v8.n3.a3

Author

Glenn Tesler (Department of Mathematics, University of California at San Diego)

Abstract

We generalize the notion of a de Bruijn sequence to a “multi de Bruijn sequence”: a cyclic or linear sequence that contains every $k$-mer over an alphabet of size $q$ exactly $m$ times. For example, over the binary alphabet $\lbrace 0, 1 \rbrace$, the cyclic sequence $(00010111)$ and the linear sequence $000101110$ each contain two instances of each $2$-mer $00, 01, 10, 11$. We derive formulas for the number of such sequences. The formulas and derivation generalize classical de Bruijn sequences (the case $m = 1$). We also determine the number of multisets of aperiodic cyclic sequences containing every $k$-mer exactly $m$ times; for example, the pair of cyclic sequences $(00011)(011)$ contains two instances of each $2$-mer listed above. This uses an extension of the Burrows–Wheeler Transform due to Mantaci et al., and generalizes a result by Higgins for the case $m = 1$.

Keywords

multi de Bruijn sequences, de Bruijn graphs, Extended Burrows–Wheeler Transform, Lyndon words, Eulerian graphs

2010 Mathematics Subject Classification

Primary 05C30, 68R15. Secondary 05C38, 05C45, 05C81, 68P30.

Full Text (PDF format)

This work was supported in part by National Science Foundation grant CCF-1115206.

Paper received on 16 August 2016.