A Machine Learning Exploration of Relations between Protein Structures and their Genetic Coding
Linor Ackerman-Schraier
Tuesday, 26.07.2022, 10:30
Synonymous codons translate into the same amino acid. Although the identity of synonymous codons is often considered inconsequential to the final protein structure, there is mounting evidence for an association between the two. Protein structure plays an important role in understanding the biological function and mechanism of a protein; understanding the relations between protein structures and their genetic coding is therefore crucial. Our study examined this association using regression and classification models and found that (i) codon sequences predict protein backbone dihedral angles with lower error than amino acid sequences, and (ii) models trained with true dihedral angles classify synonymous codons from structural information better than models trained with random dihedral angles. Using this classification approach, we investigate local codon-codon dependencies and test whether synonymous codon identity can be predicted more accurately from codon context than from amino acid context and, more specifically, which codon context position carries the most predictive power.
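To make the notions of synonymous codons and of codon context versus amino acid context concrete, here is a minimal Python sketch. It is an illustration only and not the models used in the study: it builds the standard genetic code, groups codons by the amino acid they encode, and pairs each target codon with the neighbor at a chosen offset, expressed once as a codon and once as an amino acid. The names `context_examples`, `split_codons`, and the `offset` parameter are hypothetical and introduced here for illustration.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code in the usual compact form: codons are enumerated
# with bases in the order T, C, A, G, and "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TO_AA = {
    "".join(bases): aa
    for bases, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Synonymous codons are the codons that share an amino acid.
SYNONYMOUS = defaultdict(list)
for codon, aa in CODON_TO_AA.items():
    SYNONYMOUS[aa].append(codon)


def split_codons(seq):
    """Split an in-frame DNA coding sequence into a list of codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def context_examples(seq, offset=1):
    """Pair each target codon with the neighbor at position i + offset.

    Hypothetical helper: the same neighbor is given as a codon
    ("codon_context") and as an amino acid ("aa_context"), so the two
    kinds of context can be compared as inputs to a classifier.
    """
    codons = split_codons(seq)
    aas = [CODON_TO_AA[c] for c in codons]
    examples = []
    for i, target in enumerate(codons):
        j = i + offset
        if 0 <= j < len(codons):
            examples.append({
                "target_codon": target,
                "target_aa": aas[i],
                "codon_context": codons[j],
                "aa_context": aas[j],
            })
    return examples


if __name__ == "__main__":
    print(sorted(SYNONYMOUS["L"]))  # the six leucine codons
    for example in context_examples("ATGGCTAAA", offset=1):
        print(example)
```

Varying `offset` over negative and positive values is one simple way to ask which context position is most informative, under the assumption that a separate classifier is fit per offset.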