Lei Chen*, ShaoPeng Wang, Yu-Hang Zhang, Lai Wei, XianLing Xu, Tao Huang* and Yu-Dong Cai* Pages 393-402 (10)
Background: Accurately recognizing nitrated tyrosine residues from protein sequences would pave the way for understanding the mechanism of nitration and for screening the tyrosine residues in sequences.
Results: In this study, we proposed a prediction model that used the extreme learning machine (ELM) algorithm as the prediction engine to identify nitrated tyrosine residues. To encode each tyrosine residue, a sliding window technique was adopted to extract a peptide segment centered on it, from which a number of features were extracted. These features were analyzed with a popular feature selection method, Minimum Redundancy Maximum Relevance (mRMR), producing a list in which all features were ranked. Then, the Incremental Feature Selection (IFS) method was used to identify the optimal features, on which the optimal ELM-based prediction model was built. This model produced satisfactory results on the training dataset, with a Matthews correlation coefficient of 0.757. The model was also evaluated on an independent test dataset that contained only positive samples, yielding a sensitivity of 0.938.
Conclusion: Compared to other prediction models that use classic machine learning algorithms as prediction engines on the same datasets with their own optimal features, the optimal ELM-based prediction model produced much better results, indicating the superiority of the proposed model for the identification of nitrated tyrosine residues from protein sequences.
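The sliding-window encoding and the two evaluation measures named in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window half-width, and the padding character `X` are all assumptions made for illustration, since the abstract does not state them. The formulas for sensitivity and the Matthews correlation coefficient are the standard definitions from confusion-matrix counts.

```python
from math import sqrt


def tyrosine_windows(sequence, half_width=4, pad="X"):
    """Return (1-based position, segment) for every tyrosine (Y) in a sequence.

    half_width and pad are illustrative assumptions, not values from the paper.
    """
    # Pad both ends so residues near the termini still yield full-length segments.
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "Y":
            # Index i in the sequence sits at i + half_width in the padded string,
            # so this slice is centered on the tyrosine.
            segment = padded[i : i + 2 * half_width + 1]
            windows.append((i + 1, segment))
    return windows


def sensitivity(tp, fn):
    """Fraction of true positive sites that were predicted positive."""
    total = tp + fn
    return tp / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Toy sequence, not taken from the study's datasets.
    for position, segment in tyrosine_windows("MKYAAYG", half_width=2):
        print(position, segment)
```

On a test set that contains only positive samples, as in the independent evaluation described above, there are no true negatives or false positives, so the MCC denominator is zero and only sensitivity is informative.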
Keywords: Post-translational modification, nitrated tyrosine, extreme learning machine, minimum redundancy maximum relevance, incremental feature selection.
College of Information Engineering, Shanghai Maritime University, Shanghai 201306; School of Life Sciences, Shanghai University, Shanghai 200444; Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200025; College of Information Engineering, Shanghai Maritime University, Shanghai 201306; Department of Computer Science, Guangdong AIB Polytechnic, Guangzhou 510507, Guangdong; Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200025; School of Life Sciences, Shanghai University, Shanghai 200444