The Taub Faculty of Computer Science Events and Talks

DNA-Correcting Codes: End-to-end Correction in DNA Storage Systems

Avital Boruchovsky (M.Sc. Thesis Seminar)

Sunday, 07.05.2023, 14:30

Taub 601

Existing storage technologies cannot keep up with the modern data explosion, and there is a growing need for alternatives to current data storage solutions. Storage systems based on DNA are an attractive possibility thanks to several unique properties of DNA molecules: DNA is extremely dense (up to about 1 exabyte per cubic millimeter) and durable (a half-life of over 500 years). A typical DNA storage system consists of three main components. The first is DNA synthesis, which produces the oligonucleotides, also called strands, that encode the data. The second is a storage container with compartments that stores the DNA strands, but without any order. The third is retrieval: the DNA is accessed using next-generation sequencing, which yields several noisy copies of the strands, called reads. Retrieving the input information is usually done in three steps as well. The first step partitions the reads into clusters such that the reads in each cluster are all copies of the same information strand. The second step applies a reconstruction algorithm to every cluster to retrieve an approximation of the original input strands. In the last step, an error-correcting code is used to correct the remaining errors and recover the user's information. This work presents a new solution to DNA storage that integrates all three steps of retrieval, namely clustering, reconstruction, and error correction. DNA-correcting codes are presented as a solution to the problem of ensuring that the output of the storage system is unique for any valid set of input strands. To this end, we introduce a novel distance metric that captures the unique behavior of the DNA storage system and provide necessary and sufficient conditions for DNA-correcting codes. The work also includes several upper bounds and constructions of DNA-correcting codes.
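
As a rough illustration of the conventional retrieval pipeline described in the abstract (not the DNA-correcting codes of the talk), the first two steps can be sketched in Python. The sketch makes strong simplifying assumptions that are not taken from the abstract: every strand begins with an error-free index prefix of known length, all reads have equal length, and the noise consists of substitutions only. The function names `cluster_reads`, `reconstruct`, and `retrieve` are hypothetical, and the final error-correction step is omitted.

```python
from collections import Counter, defaultdict


def cluster_reads(reads, index_len):
    """Step 1 (toy version): group reads by their first index_len symbols.

    Assumption: the index prefix of each read is error-free.
    """
    clusters = defaultdict(list)
    for read in reads:
        clusters[read[:index_len]].append(read)
    return dict(clusters)


def reconstruct(cluster):
    """Step 2 (toy version): per-position majority vote over a cluster.

    Assumption: substitution-only noise and equal-length reads.
    """
    return "".join(
        Counter(column).most_common(1)[0][0] for column in zip(*cluster)
    )


def retrieve(reads, index_len):
    """Cluster the unordered reads, then reconstruct one strand per cluster."""
    clusters = cluster_reads(reads, index_len)
    return sorted(reconstruct(c) for c in clusters.values())


# Six noisy reads originating from two stored strands, in arbitrary order.
reads = ["ACGTAC", "TGCATG", "ACGTAA", "TGAATG", "ACCTAC", "TGCATG"]
estimates = retrieve(reads, index_len=2)
```

In a real system the reads also suffer insertions and deletions, and clustering itself can fail, which is part of the motivation for treating the three steps jointly.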
