Events

The Taub Faculty of Computer Science Events and Talks

The Deletion/Insertion Channel and its Application to Coding for DNA Storage

Daniella Bar-Lev (M.Sc. Thesis Seminar)

Monday, 05.04.2021, 17:30

Zoom Lecture: 97428473352
For password to lecture, please contact: daniellalev@cs.technion.ac.il

Advisor: Prof. T. Etzion and Prof. E. Yaakobi

DNA-based storage offers significant advantages over magnetic and optical storage solutions in terms of density, durability and not requiring a constant power supply. Given current trends of cost reduction in DNA synthesis and sequencing, it is now acknowledged that within the next 10 – 15 years DNA-based storage may become a highly competitive archiving technology. The microscopic world in which the DNA molecules reside induces error patterns that are fundamentally different from their digital counterparts. Errors in DNA are typically substitutions, duplications, insertions and deletions. These types of errors result from the specific error behavior in DNA and the method in which DNA strands are stored together. Currently, in most DNA storage systems, the coding solutions used for error correction were conventional ones such as repetition, Reed-Solomon and LDPC codes. Hence, to maintain reliability in reading and writing, new coding schemes must be developed. This work addresses some of the open problems regarding coding for DNA storage and more generally coding for channels with deletions. When the number of deletions is equal to the number of insertions, the Levenshtein metric is a suitable measure to compute the distance between two words of the same length. We present the minimum, maximum, and average size of a ball with radius one, in the Levenshtein metric. The related minimum and maximum size of a maximal anticode with diameter one is also calculated. Beyond the analysis of the sizes of algebraic structures relevant to the study of DNA-based storage, the reconstruction problem is also discussed. That is, to reconstruct a word given several of its noisy copies, which is relevant to several applications. The goal in the DNA reconstruction problem is to minimize the distance between the original sequence and the algorithm's estimation. We consider two variants of the reconstruction problem and constructed an optimal decoder in terms of average error probability for both of them.

[Back to the index of events]