
Development of an experiment-split method for benchmarking the generalization of a PTM site predictor: Lysine methylome as an example

  • Qingdao University

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

Many computational classifiers have been developed to predict different types of post-translational modification (PTM) sites. Their performance is usually measured using cross-validation or an independent test, in which experimental data from different sources are mixed and randomly split into training and test sets. However, the self-reported performance of most classifiers measured this way is generally higher than their performance when applied to new experimental data. This suggests that the cross-validation method overestimates the generalization ability of a classifier. Here, we propose a generalization estimate method, dubbed the experiment-split test, in which the experimental sources for the training set differ from those for the test set, so that the test set simulates data derived from a new experiment. We took the prediction of the lysine methylome (Kme) as an example and developed a deep learning-based Kme site predictor (called DeepKme) with outstanding performance. We assessed the experiment-split test by comparing it with the cross-validation method and found that the performance measured using the experiment-split test is lower than that measured by cross-validation. Because the test data of the experiment-split method are derived from an independent experimental source, this method can reflect the generalization of the predictor. Therefore, we believe that the experiment-split method can be applied to benchmark the practical performance of a given PTM model. DeepKme is freely accessible via https://github.com/guoyangzou/DeepKme.
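
The contrast between a random split and an experiment-split can be illustrated with a minimal Python sketch. It is not the DeepKme implementation: the function names, the `(features, label, source_id)` tuple layout, and the `test_fraction` parameter are all assumptions made here for illustration. The only idea taken from the abstract is that every sample from a given experimental source must fall entirely in either the training set or the test set.

```python
import random
from collections import defaultdict


def experiment_split(samples, test_fraction=0.2, seed=0):
    """Split samples so that no experimental source appears in both sets.

    `samples` is a list of (features, label, source_id) tuples. This layout
    and the name `source_id` are hypothetical, chosen for this sketch.
    """
    # Group the samples by the experiment they came from.
    by_source = defaultdict(list)
    for sample in samples:
        by_source[sample[2]].append(sample)

    sources = sorted(by_source)
    if len(sources) < 2:
        raise ValueError("an experiment-split needs at least two sources")
    random.Random(seed).shuffle(sources)

    # Hold out whole sources until the test set reaches the target size,
    # always leaving at least one source for training.
    target = test_fraction * len(samples)
    train, test = [], []
    for i, source in enumerate(sources):
        if len(test) < target and i < len(sources) - 1:
            test.extend(by_source[source])
        else:
            train.extend(by_source[source])
    return train, test


def random_split(samples, test_fraction=0.2, seed=0):
    """Conventional split: sources are mixed, then samples are cut at random."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(test_fraction * len(shuffled)))
    return shuffled[cut:], shuffled[:cut]
```

With `random_split`, sites reported by the same experiment can land on both sides of the split, which is the situation the abstract argues leads to overestimated generalization. With `experiment_split`, the test set contains only sources the model never saw during training.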

Original language: English
Article number: e1009682
Journal: PLoS Computational Biology
Volume: 17
Issue: 12
DOI
Publication status: Published - Dec 2021
Externally published: Yes
