Deep2Full: Evaluating strategies for selecting the minimal mutational experiments for optimal computational predictions of deep mutational scan outcomes
Authors:
C. K. Sruthi; Meher Prakash
Author affiliation:
Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore, India
Published in:
PLoS ONE 15(1)
Category:
Research Article
doi:
https://doi.org/10.1371/journal.pone.0227621
Abstract
Performing a complete deep mutational scan with all single point mutations may not be practical, and may not even be required, especially if predictive computational models can be developed. Computational models are, however, naive to the cellular response under the myriad assay conditions. In a realistic paradigm of assay context-aware predictive hybrid models that combine minimal experimental data from deep mutational scans with structure, sequence information and computational models, we define and evaluate different strategies for choosing this minimal set. We evaluated the trivial strategy of systematically reducing the number of mutational studies from 85% to 15%, along with several others concerning the choice of mutation types, such as random versus site-directed, at the same 15% data completeness. Interestingly, the predictive capabilities obtained by training on a random set of mutations and by training on a systematic substitution of all amino acids to alanine, asparagine and histidine (ANH) were comparable. Another strategy we explored, augmenting the training data with measurements of the same mutants at multiple assay conditions, did not improve the prediction quality. For the six proteins we analyzed, the bin-wise error in prediction is optimal when 50-100 mutations per bin are used in training the computational model, suggesting that good prediction quality may be achieved with a library of 500-1000 mutations.
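The two training-set strategies compared in the abstract, a random subset and a systematic ANH scan, can be illustrated with a minimal sketch. This is not the authors' code: the toy sequence, the function names, and the mutation tuple format `(position, wild_type, mutant)` are all assumptions made for illustration. It shows why the two sets are of comparable size: each position has 19 possible substitutions, and an ANH scan covers at most 3 of them (about 15.8%), slightly fewer where the wild-type residue is already A, N, or H.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_single_mutants(seq):
    """Enumerate every single point mutation as (position, wild_type, mutant)."""
    return [
        (pos, wt, mt)
        for pos, wt in enumerate(seq, start=1)
        for mt in AMINO_ACIDS
        if mt != wt
    ]


def anh_scan(seq, targets="ANH"):
    """Systematic scan: mutate every position to A, N and H (skipping identity)."""
    return [
        (pos, wt, mt)
        for pos, wt in enumerate(seq, start=1)
        for mt in targets
        if mt != wt
    ]


def random_subset(mutants, k, seed=0):
    """Random strategy: draw k mutations without replacement (seeded for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(mutants, k)


# Hypothetical toy sequence, used only to exercise the functions.
seq = "MKTAYIAKQR"

full = all_single_mutants(seq)
anh = anh_scan(seq)
# Match the random set to the ANH set size so both have the same completeness.
rand = random_subset(full, k=len(anh))

completeness = len(anh) / len(full)
print(f"full scan: {len(full)} mutants")
print(f"ANH scan:  {len(anh)} mutants ({completeness:.1%} of full scan)")
print(f"random:    {len(rand)} mutants")
```

Either list could then serve as the training set for a predictive model, with the remaining mutations held out for evaluation; the model itself is outside the scope of this sketch.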
Keywords:
Alanine – Amino acid substitution – Human mobility – Mutation detection – Neural networks – Point mutation – Protein sequencing – Substitution mutation
Article published in
PLOS One
2020, Issue 1