Nphos: Database and Predictor of Protein N-phosphorylation.
Date
2024-04-10
Publisher
Oxford University Press
Rights
(c) 2024 The Author/s
CC BY 4.0
Abstract
Protein N-phosphorylation is widely present in nature and participates in various biological processes. However, current knowledge of N-phosphorylation is extremely limited compared with that of O-phosphorylation. In this study, we collected 11,710 experimentally verified N-phosphosites in 7,344 proteins from 39 species and constructed the database Nphos to share up-to-date information on protein N-phosphorylation. Using these data, we characterized the sequence and structural features of protein N-phosphorylation. Moreover, after comparing hundreds of learning models, we selected and optimized gradient boosting decision tree (GBDT) models to predict three types of human N-phosphorylation, achieving mean area under the receiver operating characteristic curve (AUC) values of 90.56%, 91.24%, and 92.01% for pHis, pLys, and pArg, respectively. Meanwhile, we discovered 488,825 distinct N-phosphosites in the human proteome. The models were also deployed in Nphos for interactive N-phosphosite prediction. In summary, this work provides new insights and starting points for both flexible and focused investigations of N-phosphorylation. By providing a data and technical foundation, it will also facilitate a deeper and more systematic understanding of protein N-phosphorylation. Nphos is freely available at http://www.bio-add.org/Nphos/ and http://ppodd.org.cn/Nphos/.
Keywords
N-phosphorylation, Benchmark dataset, Database, Machine learning, Post-translational modification, Phosphorylation, Databases, Protein, Humans, Phosphoproteins, Proteome
Citation
Zhao M-X, Ding R-F, Chen Q, Meng J, Li F, Fu S, Huang B, Liu Y, Ji Z-L, Zhao Y. (2024). Nphos: Database and Predictor of Protein N-phosphorylation. Genomics Proteomics Bioinformatics, 22(3), qzae032.