Leveraging correlations between variants in polygenic risk scores to detect heterogeneity in GWAS cohorts
Authors:
Jie Yuan aff001; Henry Xing aff001; Alexandre Louis Lamy aff001; aff001; Todd Lencz aff002; Itsik Pe’er aff001
Author affiliations:
aff001: Department of Computer Science, Columbia University, New York, United States of America
aff002: The Center for Psychiatric Neuroscience, Feinstein Institutes for Medical Research, New York, United States of America
Published in:
Leveraging correlations between variants in polygenic risk scores to detect heterogeneity in GWAS cohorts. PLoS Genet 16(9): e1009015. doi:10.1371/journal.pgen.1009015
Category:
Research Article
doi:
https://doi.org/10.1371/journal.pgen.1009015
Abstract
Evidence from both GWAS and clinical observation has suggested that certain psychiatric, metabolic, and autoimmune diseases are heterogeneous, comprising multiple subtypes with distinct genomic etiologies and Polygenic Risk Scores (PRS). However, the presence of subtypes within many phenotypes is frequently unknown. We present CLiP (Correlated Liability Predictors), a method to detect heterogeneity in single GWAS cohorts. CLiP calculates a weighted sum of correlations between SNPs contributing to a PRS on the case/control liability scale. We demonstrate mathematically and through simulation that among i.i.d. homogeneous cases generated by a liability threshold model, significant anti-correlations are expected between otherwise independent predictors due to ascertainment on the hidden liability score. In the presence of heterogeneity from distinct etiologies, confounding by covariates, or mislabeling, these correlation patterns are altered predictably. We further extend our method to two additional association study designs: CLiP-X for quantitative predictors in applications such as transcriptome-wide association, and CLiP-Y for quantitative phenotypes, where there is no clear distinction between cases and controls. Through simulations, we demonstrate that CLiP and its extensions reliably distinguish between homogeneous and heterogeneous cohorts when the PRS explains as little as 3% of variance on the liability scale and cohorts comprise 50,000 to 100,000 samples, an increasingly practical size for modern GWAS. We apply CLiP to heterogeneity detection in schizophrenia cohorts totaling more than 50,000 cases and controls collected by the Psychiatric Genomics Consortium (PGC). We observe significant heterogeneity in mega-analysis of the combined PGC data (p-value 8.54 × 10⁻⁴), as well as in individual cohorts meta-analyzed using Fisher’s method (p-value 0.03), based on significantly associated variants.
We also apply CLiP-Y to neuroticism in over 10,000 individuals from the UK Biobank and detect heterogeneity with a p-value of 1.68 × 10⁻⁹. Scores were not significantly reduced when partitioning by known subclusters (“Depression” and “Worry”), suggesting that these factors are not the primary source of the observed heterogeneity.
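The anti-correlation effect described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes independent SNPs with equal effect sizes, a standard liability threshold model, and pairwise weights equal to the product of the two effect sizes. The function names, parameter values, and the weighting scheme are all hypothetical choices made for illustration; the weights actually used by CLiP are not specified in the abstract.

```python
import numpy as np


def simulate_cohort(n_pop, n_snps, h2, prevalence, maf=0.3, seed=0):
    """Simulate a population under a liability threshold model (illustrative).

    Returns standardized genotypes of cases and controls plus the effect sizes.
    All parameter names and defaults are assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    # Independent SNPs: genotype counts 0/1/2, standardized to mean 0, variance 1.
    geno = rng.binomial(2, maf, size=(n_pop, n_snps)).astype(float)
    geno = (geno - 2 * maf) / np.sqrt(2 * maf * (1 - maf))
    # Equal effects chosen so the PRS explains a fraction h2 of liability variance.
    beta = np.full(n_snps, np.sqrt(h2 / n_snps))
    liability = geno @ beta + rng.normal(0.0, np.sqrt(1 - h2), size=n_pop)
    # Cases are the individuals whose liability exceeds the prevalence threshold.
    threshold = np.quantile(liability, 1 - prevalence)
    is_case = liability > threshold
    return geno[is_case], geno[~is_case], beta


def weighted_correlation_score(geno, beta):
    """Weighted sum of pairwise SNP correlations (weights beta_i * beta_j, assumed)."""
    corr = np.corrcoef(geno, rowvar=False)
    upper = np.triu_indices_from(corr, k=1)  # each SNP pair counted once
    weights = np.outer(beta, beta)[upper]
    return float(np.sum(weights * corr[upper]))


if __name__ == "__main__":
    cases, controls, beta = simulate_cohort(
        n_pop=200_000, n_snps=20, h2=0.3, prevalence=0.05
    )
    # Homogeneous cases: ascertainment on liability induces negative correlations.
    print("homogeneous cases:", weighted_correlation_score(cases, beta))
    # Heterogeneous "cases": half are mislabeled controls, which shifts the score upward.
    mixed = np.vstack([cases, controls[: len(cases)]])
    print("cases + mislabeled controls:", weighted_correlation_score(mixed, beta))
```

In this sketch the score for homogeneous cases is expected to be negative, while mixing in mislabeled controls pushes it upward, which mirrors the qualitative behavior the abstract describes. Assessing significance would additionally require a null distribution for the score, which is not attempted here.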
Keywords:
Gene expression – Genome-wide association studies – Genomics – Medical risk factors – Normal distribution – Polynomials – Schizophrenia – Single nucleotide polymorphisms
Article published in the journal
PLOS Genetics
2020, Issue 9