Motivation: In any macromolecular polyprotic system (for example protein, DNA or RNA), the isoelectric point, commonly referred to as the pI, can be defined as the point of singularity in a titration curve, corresponding to the solution pH value at which the net overall surface charge (and thus the electrophoretic mobility) of the ampholyte sums to zero. Different modern analytical biochemistry and proteomics methods depend on the isoelectric point as a principal feature for protein and peptide characterization. Protein separation by isoelectric point is a critical part of 2-D gel electrophoresis, a key precursor of proteomics, where discrete spots can be digested in-gel, and proteins subsequently identified by analytical mass spectrometry. Peptide fractionation according to pI is also widely used in current proteomics sample preparation procedures prior to LC-MS/MS analysis. Accurate theoretical prediction of pI would therefore expedite such analysis. While such pI calculation is widely used, it remains largely untested, motivating our efforts to benchmark pI prediction methods.

Results: Using data from the database PIP-DB and one publicly available dataset as our reference gold standard, we have undertaken the benchmarking of pI calculation methods. We find that methods vary in their accuracy and are highly sensitive to the choice of basis set. The machine-learning algorithms, especially the SVM-based algorithm, showed superior performance when studying peptide mixtures. In general, learning-based pI prediction methods (such as Cofactor, SVM and Branca) require a large training dataset, and their resulting performance depends strongly on the quality of that data. In contrast with iterative methods, machine-learning algorithms have the advantage of being able to add new features to improve the accuracy of prediction.

Availability and Implementation: The software and data are freely available at.

Supplementary information: Supplementary data are available at Bioinformatics online.

In a titration curve, the isoelectric point (pI) is the pH value at which the overall net surface charge of a macromolecular polyprotic species equals zero. Protein pI values are amongst the most widely determined and widely reported quantities in all of biochemistry and proteomics. The pI is obtained as essentially incidental information during isoelectric focusing (IEF) experiments, free flow electrophoresis (FFE), capillary electrophoresis, and in-gel electrophoresis experiments using IPG strips (Audain et al., 2014; Ramos et al., 2008). Electrophoresis-based separation of proteins and peptides in both free-flow and gel systems has been adapted to a wide variety of proteomics platforms in order to reduce the complexity of the studied proteome (Ramos et al., 2008, 2011). In addition to the resolution and dynamic range of the fractionation technique, combining the electrophoretic separation of proteins with mass spectrometry analysis provides an orthogonal analytical method for improving protein identification in different workflows (Perez-Riverol et al., 2013).

Assuming a protein to be denatured, theoretical calculation of the pI is typically rapid, requiring only the sequence as input (Cargile et al., 2004). Most techniques exploit tabulated pKa values for the different ionizable amino acid residues; such values are assumed to be constant regardless of structural context (Maldonado et al., 2010). Many authors have reported different pKa values for protein side chains. Most of these are derived from measurements of side chains in isolated amino acids or in model compounds, while others are derived from ionizable side chains in situ (Bjellqvist et al., 1993; Lengqvist et al., 2011). As many such alternative theoretical methods have been proposed, the calculation of protein pI values is in urgent need of benchmarking, since its accuracy remains largely untested. Existing comparisons have been sparse: they have used very small datasets (Patrickios and Yamasaki, 1995), considered peptides rather than proteins (Cargile et al., 2004; Lengqvist et al., 2011), or reported poor accuracy (Henriksson et al., 1995; Patrickios and Yamasaki, 1995). We have previously described the database PIP-DB (Bunkute et al., 2015), a collection of proteins with associated experimentally determined pI values, collated from the literature. In this paper, we use PIP-DB as a gold standard reference for comparison, and describe the benchmarking of protein pI prediction.
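To make the sequence-only calculation concrete, here is a minimal Python sketch of an iterative pI estimate: it sums Henderson-Hasselbalch charges over the ionizable groups and bisects for the pH at which the net charge is zero. The pKa table, the function names and the tolerance are assumptions chosen for illustration. They are not one of the pKa sets or tools benchmarked in the paper, and a different table would give different pI values.

```python
# Illustrative pKa values (an assumption for this sketch, not a benchmarked set).
# Each pKa is treated as constant regardless of structural context.
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5, "N_TERM": 8.6}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1, "C_TERM": 3.6}


def net_charge(sequence, ph):
    """Net charge of a denatured chain at the given pH (Henderson-Hasselbalch)."""
    seq = sequence.upper()
    charge = 0.0
    # Basic groups carry +1 when protonated; each terminus occurs exactly once.
    for group, pka in PKA_POSITIVE.items():
        count = 1 if group == "N_TERM" else seq.count(group)
        charge += count / (1.0 + 10.0 ** (ph - pka))
    # Acidic groups carry -1 when deprotonated.
    for group, pka in PKA_NEGATIVE.items():
        count = 1 if group == "C_TERM" else seq.count(group)
        charge -= count / (1.0 + 10.0 ** (pka - ph))
    return charge


def isoelectric_point(sequence, tol=1e-4):
    """Bisect on pH in [0, 14]; net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: the pI lies at higher pH
        else:
            high = mid  # zero or negative charge: the pI lies at lower pH
    return (low + high) / 2.0


if __name__ == "__main__":
    # Hypothetical input sequence, used only to exercise the functions.
    peptide = "ACDEFGHIKLMNPQRSTVWY"
    print(f"estimated pI: {isoelectric_point(peptide):.2f}")
```

Because the result depends entirely on the tabulated pKa values, swapping the two dictionaries for another published set is a simple way to see the sensitivity to the choice of basis set.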