eQTL mapping of rare variant associations using RNA-seq data: An evaluation of approaches
Authors:
Sharon Marie Lutz aff001; Annie Thwing aff003; Tasha Fingerlin aff003
Author affiliations:
aff001: Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care, Boston, MA, United States of America
aff002: Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
aff003: Department of Biostatistics and Informatics, University of Colorado, Anschutz Medical Campus, Aurora, CO, United States of America
aff004: Center for Genes, Environment, and Health, National Jewish Health, Denver, CO, United States of America
Published in:
PLoS ONE 14(10)
Category:
Research Article
doi:
https://doi.org/10.1371/journal.pone.0223273
Abstract
Expression quantitative trait loci (eQTL) provide insight into transcriptional regulation and illuminate the molecular basis of phenotypic outcomes. High-throughput RNA sequencing (RNA-seq) is becoming a popular technique for measuring gene expression abundance. Traditional eQTL mapping methods for microarray expression data often assume that the expression data follow a normal distribution. Consequently, for RNA-seq data, total read counts can be normalized with a normal quantile transformation so that the data can be fit with linear regression. Other approaches model the total read counts directly with negative binomial regression. While these methods work well for common variants (minor allele frequency > 5% or 1%), existing methodology must be extended to accommodate a collection of rare variants in RNA-seq data. Here, we examine two approaches that are direct applications of existing methodology and apply them to RNA-seq studies: 1) collapsing the rare variants in the region and using either negative binomial regression or Poisson regression, and 2) using the normalized read counts with the Sequence Kernel Association Test (SKAT), the burden test for SKAT (SKAT-Burden), or an optimal combination of these two tests (SKAT-O). We evaluated these approaches in simulation studies under numerous scenarios and applied them to data from the 1,000 Genomes Project.
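The two preprocessing steps named in the abstract, normal quantile transformation of read counts and collapsing of rare variants in a region, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the rank offset of 0.5, the genotype coding (minor-allele counts 0, 1, 2), and the toy data are all hypothetical, and the downstream regression or SKAT test is deliberately left out.

```python
from statistics import NormalDist


def inverse_normal_transform(counts, offset=0.5):
    """Rank-based normal quantile transform of read counts for one gene.

    Tied values receive their average rank. Rank r is mapped to the
    standard normal quantile at (r - offset) / n; the offset of 0.5 is
    an assumption of this sketch, not a value taken from the paper.
    """
    n = len(counts)
    order = sorted(range(n), key=lambda i: counts[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and counts[order[j + 1]] == counts[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - offset) / n) for r in ranks]


def collapse_rare_variants(genotypes, maf_threshold=0.05, indicator=False):
    """Collapse the rare variants in a region into one score per sample.

    genotypes: one row per sample, one column per variant, each entry a
    minor-allele count (0, 1, or 2). Variants with sample minor allele
    frequency strictly between 0 and maf_threshold are treated as rare.
    Returns the per-sample count of rare alleles, or a 0/1 carrier
    indicator when indicator=True.
    """
    n_samples = len(genotypes)
    n_variants = len(genotypes[0])
    mafs = [
        sum(row[v] for row in genotypes) / (2 * n_samples)
        for v in range(n_variants)
    ]
    rare = [v for v in range(n_variants) if 0 < mafs[v] < maf_threshold]
    scores = [sum(row[v] for v in rare) for row in genotypes]
    if indicator:
        scores = [1 if s > 0 else 0 for s in scores]
    return scores


# Toy data (hypothetical): 12 samples, 3 variants in one region.
read_counts = [12, 0, 57, 33, 13, 140, 8, 21, 95, 4, 60, 17]
genotypes = [
    [0, 1, 0], [0, 0, 0], [1, 2, 0], [0, 1, 0],
    [0, 0, 0], [0, 1, 0], [0, 2, 0], [0, 0, 1],
    [0, 1, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0],
]

normalized = inverse_normal_transform(read_counts)  # input for SKAT-type tests
burden = collapse_rare_variants(genotypes)          # covariate for count regression
```

In this sketch, `burden` would serve as the single predictor in a Poisson or negative binomial regression of the raw read counts (approach 1), while `normalized` would be the outcome passed to SKAT, SKAT-Burden, or SKAT-O together with the uncollapsed rare-variant genotypes (approach 2).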
Keywords:
Europe – Gene expression – Microarrays – Normal distribution – Phenotypes – Research errors – RNA sequencing
Article published in the journal
PLOS One
2019, Issue 10