ShaoPeng Wang, Jiarui Li, Fei Yuan, Tao Huang and Yu-Dong Cai* Pages 1-10 (10)
Background: Post-translational modifications (PTMs) on the side chains of conserved lysine (Lys) residues play important roles in many cellular processes, such as histone modification, protein degradation, and regulation of DNA damage responses. To date, several computational methods have been developed to identify different PTMs on Lys residues. However, most of these methods focus on identifying one particular PTM without considering the other types.
Objective: In this study, we first conducted a computational investigation of three types of PTMs (acetylation, sumoylation, and ubiquitination) simultaneously by analyzing the protein structure and sequence factors surrounding the substrate Lys residues of each type.
Method: To fully extract the structural and sequence information around the Lys residues, six types of features were used to encode the peptide segments containing the substrates. Next, a feature selection method, maximum relevance minimum redundancy (mRMR), was applied to obtain two feature lists: the MaxRel feature list and the mRMR feature list. The mRMR feature list was used to extract the optimal features for a random forest classifier that distinguishes the three types of PTMs.
Results: An optimal classification model with an overall accuracy of 0.989 was built. From the MaxRel feature list, we investigated the top-ranked features to uncover the site preference and residue preference of Lys residues.
Conclusion: The results suggest that disordered structure and the preference of flanking residues are the most important attributes for distinguishing the three types of PTMs, which is consistent with the results reported in previous studies.
Keywords: post-translational modification, acetylation, sumoylation, ubiquitination, feature selection, random forest
College of Life Science, Shanghai University, Shanghai; College of Life Science, Shanghai University, Shanghai 200444; Department of Science & Technology, Binzhou Medical University Hospital, Binzhou 256603, Shandong; Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031; College of Life Science, Shanghai University, Shanghai 200444
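The abstract's Method paragraph mentions two rankings produced by mRMR: a MaxRel list (features ordered by relevance to the class label alone) and an mRMR list (features chosen greedily by relevance minus redundancy with those already chosen). The following is a minimal sketch of that idea, not the authors' implementation. It assumes discrete features, the difference form of the mRMR criterion, and a toy dataset; the feature names (`disorder`, `disorder_dup`, `flank_sumo`, `noise`), the class labels, and all function names are hypothetical. The random forest step is not covered.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum of p(x,y) * log2( p(x,y) / (p(x) * p(y)) ) over observed pairs.
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def maxrel_ranking(features, labels):
    """MaxRel list: features sorted by I(feature; label), ties broken by name."""
    rel = {name: mutual_information(col, labels) for name, col in features.items()}
    return sorted(rel, key=lambda name: (-rel[name], name))


def mrmr_ranking(features, labels):
    """mRMR list: greedily pick the feature maximizing relevance minus mean redundancy."""
    rel = {name: mutual_information(col, labels) for name, col in features.items()}
    selected, remaining = [], sorted(features)

    def score(name):
        if not selected:
            return rel[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return rel[name] - redundancy

    while remaining:
        best = max(remaining, key=score)  # first maximum wins on ties
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed for illustration): 8 peptide segments, 3 PTM classes.
labels = ["ace"] * 4 + ["sumo"] * 2 + ["ubi"] * 2
features = {
    "disorder":     [1, 1, 1, 1, 0, 0, 0, 0],  # separates "ace" from the rest
    "disorder_dup": [1, 1, 1, 1, 0, 0, 0, 0],  # exact copy, fully redundant
    "flank_sumo":   [0, 0, 0, 0, 1, 1, 0, 0],  # separates "sumo" from the rest
    "noise":        [0, 1, 0, 1, 0, 1, 0, 1],  # independent of the label
}

maxrel_list = maxrel_ranking(features, labels)
mrmr_list = mrmr_ranking(features, labels)
print("MaxRel:", maxrel_list)
print("mRMR:  ", mrmr_list)
```

On this toy data the MaxRel list places the duplicated feature second because it is as relevant as the original, whereas the mRMR list demotes it below `flank_sumo`, which carries complementary information. This is the distinction that makes the MaxRel list suitable for interpreting individual features and the mRMR list suitable for building a compact classifier input.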