Population genetic models of GERP scores suggest pervasive turnover of constrained sites across mammalian evolution
Authors:
Christian D. Huber aff001; Bernard Y. Kim aff002; Kirk E. Lohmueller aff003
Author affiliations:
aff001: School of Biological Sciences, University of Adelaide, Adelaide, South Australia, Australia
aff002: Department of Biology, Stanford University, Stanford, California, United States of America
aff003: Department of Ecology and Evolutionary Biology, University of California, Los Angeles, California, United States of America
aff004: Interdepartmental Program in Bioinformatics, University of California, Los Angeles, California, United States of America
aff005: Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California, United States of America
Published in:
Population genetic models of GERP scores suggest pervasive turnover of constrained sites across mammalian evolution. PLoS Genet 16(5): e1008827. doi:10.1371/journal.pgen.1008827
Category:
Research Article
doi:
https://doi.org/10.1371/journal.pgen.1008827
Abstract
Comparative genomic approaches have been used to identify sites where mutations are under purifying selection and of functional consequence by searching for sequences that are conserved across distantly related species. However, the performance of these approaches has not been rigorously evaluated under population genetic models. Further, short-lived functional elements may not leave a footprint of sequence conservation across many species. We use simulations to study how one measure of conservation, the Genomic Evolutionary Rate Profiling (GERP) score, relates to the strength of selection (N_e s). We show that the GERP score is related to the strength of purifying selection. However, changes in selection coefficients or functional elements over time (i.e., functional turnover) can strongly affect the GERP distribution, leading to unexpected relationships between GERP and N_e s. Further, we show that for functional elements that have a high turnover rate, adding more species to the analysis does not necessarily increase statistical power. Finally, we use the distribution of GERP scores across the human genome to compare models with and without turnover of sites where mutations are under purifying selection. We show that mutations in 4.51% of the noncoding human genome are under purifying selection and that most of this sequence has likely experienced changes in selection coefficients throughout mammalian evolution. Our work reveals limitations to using comparative genomic approaches to identify deleterious mutations. Commonly used GERP score thresholds miss over half of the noncoding sites in the human genome where mutations are under purifying selection.
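To make the stated link between GERP and the strength of purifying selection concrete, the following is a minimal sketch and not the authors' simulation. It assumes a diploid, additive model with constant selection and no turnover, in which the substitution rate relative to neutral follows the standard diffusion result gamma / (1 - exp(-gamma)) with gamma = 4 × Ne × s. It also treats the GERP "rejected substitutions" score as the neutral expectation minus the expected number of substitutions. The function names and the neutral tree length of 6.0 substitutions per site are hypothetical choices made for illustration.

```python
import math


def relative_substitution_rate(gamma: float) -> float:
    """Substitution rate relative to neutral for a scaled selection
    coefficient gamma (assumed here to equal 4 * Ne * s)."""
    if gamma == 0.0:
        return 1.0  # neutral limit of gamma / (1 - exp(-gamma))
    if gamma < 0.0:
        # Algebraically identical form that avoids overflow when
        # selection against the mutation is very strong.
        return gamma * math.exp(gamma) / math.expm1(gamma)
    return gamma / -math.expm1(-gamma)


def expected_rejected_substitutions(ne_s: float, neutral_tree_length: float) -> float:
    """Idealized expected GERP score: neutral substitutions expected on
    the tree minus those expected under constant selection."""
    omega = relative_substitution_rate(4.0 * ne_s)
    return neutral_tree_length * (1.0 - omega)


if __name__ == "__main__":
    tree_length = 6.0  # hypothetical neutral tree length, substitutions per site
    for ne_s in (0.0, -0.1, -0.5, -1.0, -5.0):
        rs = expected_rejected_substitutions(ne_s, tree_length)
        print(f"Ne*s = {ne_s:5.1f}   expected rejected substitutions = {rs:.2f}")
```

Under these assumptions the expected score rises steeply for weak selection and then saturates near the neutral tree length. Functional turnover, which this sketch deliberately omits, breaks that simple monotonic picture.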
Keywords:
Comparative genomics – Deletion mutation – Genome evolution – Human genomics – Natural selection – Phylogenetic analysis – Sequence alignment – Substitution mutation
Article published in the journal
PLOS Genetics
2020, Issue 5