Reconstruction of Strings from their Substrings Spectrum
Sagi Marcovich, M.Sc. Thesis Seminar
Wednesday, 24.3.2021, 16:00
Zoom Lecture: 92990701982
For password to lecture, please contact:
Advisors: Prof. Eitan Yaakobi and Prof. Tuvi Etzion
The idea of using DNA molecules as a data storage medium was first introduced in the 1960s by Richard Feynman. Later, in 1990, the Human Genome Project led to significant progress in sequencing and assembly methods. As a result, interest in storage solutions based on DNA molecules increased. DNA storage enjoys major advantages over magnetic and optical storage solutions, notably in information density and long-term durability. Motivated by emerging technologies for DNA sequencing, this work studies the reconstruction of strings from their substrings spectrum. Under this paradigm, it is assumed that all substrings of some fixed length are received, and the goal is to reconstruct the original string. While many existing works assume that the substrings are received error free, this work follows the noisy setup of the problem, which was first studied by Gabrys and Milenkovic. The goal of the study is twofold. First, we study the setup in which not all substrings in the multispectrum (the multiset of all substrings of the given length) are received. Then, we focus on the case where the read substrings are not error free. In each case we provide explicit constructions of codes whose strings are guaranteed to be reconstructible even in the presence of failures under the respective model. We present efficient encoding and decoding maps and analyze the cardinality of the code constructions.
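To make the error-free version of the paradigm concrete, the following is a minimal illustrative sketch in Python. It is not the construction presented in the talk. The function names multispectrum and reconstructions are hypothetical, and the brute-force search is only meant to show when a string is, or is not, uniquely determined by its substrings of a fixed length. It assumes a non-empty spectrum.

```python
from collections import Counter


def multispectrum(s: str, length: int) -> Counter:
    """Multiset of all contiguous substrings of `s` with the given length."""
    return Counter(s[i:i + length] for i in range(len(s) - length + 1))


def reconstructions(spectrum: Counter) -> set:
    """All strings whose multispectrum equals `spectrum` (brute force).

    Illustrative only: the running time is exponential in the worst case.
    Assumes `spectrum` is non-empty and all its substrings share one length.
    """
    length = len(next(iter(spectrum)))
    total = sum(spectrum.values())
    results = set()

    def extend(current: str, remaining: Counter, used: int) -> None:
        if used == total:
            results.add(current)
            return
        # The next substring must overlap the current string in length - 1 symbols.
        suffix = current[len(current) - (length - 1):] if length > 1 else ""
        for sub, count in list(remaining.items()):
            if count > 0 and sub.startswith(suffix):
                remaining[sub] -= 1
                extend(current + sub[-1], remaining, used + 1)
                remaining[sub] += 1

    # Try every distinct substring as the first window of the string.
    for start in list(spectrum):
        remaining = Counter(spectrum)
        remaining[start] -= 1
        extend(start, remaining, 1)
    return results


if __name__ == "__main__":
    unique = multispectrum("ACGTT", 3)
    print(reconstructions(unique))     # a single candidate: the original string

    ambiguous = multispectrum("ACGTAC", 3)
    print(reconstructions(ambiguous))  # several candidates, so recovery is ambiguous
```

In this sketch, "ACGTT" is the only string consistent with its length-3 multispectrum, whereas "ACGTAC" shares its length-3 multispectrum with other strings. Restricting the stored strings to a code avoids such ambiguity, and the noisy setups discussed in the talk additionally allow for missing or erroneous substrings.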
