TranCEP: Predicting the substrate class of transmembrane transport proteins using compositional, evolutionary, and positional information
Authors:
Munira Alballa aff001; Faizah Aplop aff003; Gregory Butler aff001
Author affiliations:
Department of Computer Science and Software Engineering, Concordia University, Montréal, Québec, Canada aff001;
College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia aff002;
School of Informatics and Applied Mathematics, Universiti Malaysia Terengganu, Malaysia aff003;
Centre for Structural and Functional Genomics, Concordia University, Montréal, Québec, Canada aff004
Published in:
PLoS ONE 15(1)
Category:
Research Article
doi:
https://doi.org/10.1371/journal.pone.0227683
Abstract
Transporters mediate the movement of compounds across the membranes that separate the cell from its environment and across the inner membranes surrounding cellular compartments. An estimated one third of a proteome consists of membrane proteins, and many of these are transport proteins. As the number of sequenced genomes grows, there is a need for computational tools that predict the substrates transported by transmembrane transport proteins. In this paper, we present TranCEP, a predictor of the type of substrate transported by a transmembrane transport protein. TranCEP combines three kinds of information: the amino acid composition of the protein, as traditionally used; evolutionary information captured in a multiple sequence alignment (MSA); and a restriction to the alignment positions that play a role in determining the specificity of the protein. Our experimental results show that TranCEP significantly outperforms the state-of-the-art predictors, and they quantify the contribution made by each type of information used.
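To make the three kinds of information concrete, the following minimal sketch computes an amino acid composition vector from an MSA restricted to a chosen set of columns. It is an illustration only and is not the TranCEP pipeline: the function names `composition` and `filtered_msa_composition`, the toy alignment, the choice of retained columns, and the averaging over aligned sequences are all assumptions introduced here.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order that defines the feature vector.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(residues):
    """Return the 20-dimensional relative frequency of each amino acid."""
    counts = Counter(r for r in residues if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]


def filtered_msa_composition(msa, keep_columns):
    """Average the composition over aligned sequences (hypothetical helper).

    Only the alignment columns listed in keep_columns are used, and gap
    characters ('-') are skipped. How columns are selected is left open;
    here they are simply given as input.
    """
    vectors = []
    for row in msa:
        residues = [row[i] for i in keep_columns if row[i] != "-"]
        vectors.append(composition(residues))
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(AMINO_ACIDS))]


# Toy alignment of three sequences (assumed data, not from the paper).
msa = ["AC-DA", "ACGDA", "SC-DA"]
keep_columns = [0, 1, 3]  # assumed "important" positions
features = filtered_msa_composition(msa, keep_columns)
```

The resulting `features` vector could then be passed to any standard classifier. Dropping `keep_columns` (using every column) or replacing `msa` with a single sequence gives the composition-only and unfiltered variants, which is one way to compare the contribution of each information type.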
Keywords:
Anions – Cations – Membrane proteins – Multiple alignment calculation – Protein sequencing – Sequence alignment – Sequence databases – Transmembrane transport proteins
Article published in
PLOS One
2020 Issue 1