Transcriptomic stratification of late-onset Alzheimer's cases reveals novel genetic modifiers of disease pathology

Authors: Nikhil Milind ^aff001; Christoph Preuss ^aff001; Annat Haber ^aff001; Guruprasad Ananda ^aff001; Shubhabrata Mukherjee ^aff003; Cai John ^aff001; Sarah Shapley ^aff001; Benjamin A. Logsdon ^aff005; Paul K. Crane ^aff003; Gregory W. Carter ^aff001
Authors place of work: The Jackson Laboratory, Bar Harbor, Maine, United States of America ^aff001; Program in Genetics, Department of Biological Sciences, North Carolina State University, Raleigh, North Carolina, United States of America ^aff002; Department of Medicine, School of Medicine, University of Washington, Seattle, Washington, United States of America ^aff003; Program in Neuroscience, Department of Biology and Geology, Baldwin Wallace University, Berea, Ohio, United States of America ^aff004; Sage Bionetworks, Seattle, Washington, United States of America ^aff005
Published in the journal: Transcriptomic stratification of late-onset Alzheimer's cases reveals novel genetic modifiers of disease pathology. PLoS Genet 16(6): e1008775. doi:10.1371/journal.pgen.1008775
Category: Research Article
doi: https://doi.org/10.1371/journal.pgen.1008775

Summary

Late-Onset Alzheimer’s disease (LOAD) is a common, complex genetic disorder well known for its heterogeneous pathology. The genetic heterogeneity underlying common, complex diseases poses a major challenge for targeted therapies and the identification of novel disease-associated variants. Case-control approaches are often limited to examining a specific outcome in a group of heterogeneous patients with different clinical characteristics. Here, we developed a novel approach to define relevant transcriptomic endophenotypes and stratify decedents based on molecular profiles in three independent human LOAD cohorts. By integrating post-mortem brain gene co-expression data from 2114 human samples with LOAD, we developed a novel quantitative, composite phenotype that can better account for the heterogeneity in genetic architecture underlying the disease. We used iterative weighted gene co-expression network analysis (WGCNA) to reduce data dimensionality and to isolate gene sets that are highly co-expressed within disease subtypes and represent specific molecular pathways. We then performed single-variant association testing using whole-genome sequencing data for the novel composite phenotype in order to identify genetic loci that contribute to disease heterogeneity. Distinct LOAD subtypes were identified for all three study cohorts (two in ROSMAP, three in Mayo Clinic, and two in Mount Sinai Brain Bank). Single-variant association analysis identified a genome-wide significant variant in TMEM106B (p-value < 5×10⁻⁸, rs1990620^G) in the ROSMAP cohort that confers protection from the inflammatory LOAD subtype. Taken together, our novel approach can be used to stratify LOAD into distinct molecular subtypes based on affected disease pathways.

Keywords:

Gene expression – Genetic loci – Gene mapping – Cognitive impairment – Alzheimer's disease – Transcriptome analysis – Genetics of disease – RNA sequencing

Introduction

Late-onset Alzheimer’s disease (LOAD) is the most common form of dementia in the elderly. The clinical features associated with LOAD are an amnesic type of memory impairment, deterioration of language, and visuospatial deficits. In the later stages of the disease, symptoms may include motor and sensory abnormalities, gait disturbances, and seizures. Without advances in therapy, the number of symptomatic cases in the United States is predicted to rise to 13.2 million by 2050 [1].

Many common, complex diseases such as LOAD present with heterogeneous phenotypes due to interactions between genetic and environmental factors affecting a range of pathways and processes. LOAD has no simple form of inheritance and is governed by a common set of risk alleles across multiple genes that, in combination, have a substantial effect on disease predisposition and age of onset [2]. Genome-Wide Association Studies (GWAS) have become an important tool for identifying variants in complex diseases [3,4]. GWAS for LOAD have identified variants in over 500 genes as potential risk factors with the ε4 variant in APOE as the strongest contributor to overall disease risk [2,5]. LOAD has a strong polygenic component and an estimated heritability of up to 80% [6]. It has been challenging to transition from the identification of associated genetic variants to the molecular mechanisms that lead to the accumulation of amyloid plaques and helical tau filaments [7]. Furthermore, there is mounting evidence that the observed heterogeneity in LOAD is associated with multiple distinct subtypes [8,9].

Gene co-expression modules tend to consist of genes that belong to the same cellular pathways or programs and help explain the global properties of the transcriptome as it relates to disease risk [10]. Network-based co-expression module approaches have been used to identify causal variants in late-onset Alzheimer's disease [7,11]. However, such studies have not accounted for the heterogeneity of mechanisms that lead to complex diseases. Here, we analyze whole-genome sequencing (WGS) and whole-transcriptome data from three independent human cohorts from the Accelerating Medicines Partnership-Alzheimer's Disease (AMP-AD) Consortium. We use gene co-expression modules to develop quantitative phenotypes that account for the complex genetic architecture and heterogeneity of LOAD to more effectively map associated variants using genome-wide association. Furthermore, the method presented in this paper can be used to identify variants in other complex diseases.

Results

Description of post-mortem transcriptome study populations

To define novel quantitative phenotypes for LOAD, we obtained 26 post-mortem brain co-expression modules (doi.org/10.7303/syn11932957.1) that were harmonized from three independent cohorts of the AMP-AD consortium (Fig 1A, S1 Fig). This included post-mortem brain samples of 623 decedents from the ROSMAP cohort for the dorsolateral prefrontal cortex (DLPFC) brain region, 271 decedents from the Mayo cohort for the temporal cortex (TCX) brain region, and 364 decedents from the MSBB cohort for the frontopolar prefrontal cortex (FP), inferior frontal gyrus (IFG), parahippocampal gyrus (PHG) and superior temporal gyrus (STG) brain regions (Fig 1A). Approximately one-third of the patients were diagnosed with LOAD, while two-thirds were considered controls. The control group included elderly individuals with normal cognition, as well as those with mild cognitive impairment or other forms of dementia. An overview of the post-mortem brain samples used in our analysis pipeline is provided in S1 and S2 Tables. Details on post-mortem brain sample collection, tissue and RNA preparation, sequencing, and sample quality control can be found in published work related to each cohort [12–14]. Multiple variables, such as sex, age at death, and sequencing batch, were used as covariates in the normalization process to remove possible confounding factors across cohorts (Methods) [15].

**Fig. 1. Method used in this study to map genetic drivers of LOAD pathology in the ROSMAP cohort.**

Refinement of 26 human co-expression modules identifies disease-associated transcriptomic signals

We performed an iterative gene list pruning process using the iterativeWGCNA approach [16] to refine the 26 human co-expression modules from the AMP-AD consortium (S3 Table). Each of these 26 modules contains several thousand co-expressed genes, implicated in multiple disease processes across multiple cell types. Therefore, it is often difficult to assign a cell-type-specific role for expression modules linked to a certain brain region. Our approach resulted in 68 distinct subsets, or submodules, of highly correlated genes that were exclusive to each module (Fig 1B, S4 Table). Genes that were not highly correlated to any submodule were removed since they are less likely to contribute to the overall signal of the submodule and more likely to introduce noise. We then annotated the 68 co-expression submodules to identify molecular pathways and processes that are significantly enriched within submodules across the six brain regions from the three independent LOAD cohorts (S2 Fig). Pathway enrichment analysis was performed using GO terms, KEGG pathways, and Reactome pathway data sets to highlight the biological specificity of co-expression signals captured by the different submodules (S5 Table). We identified multiple functional consensus clusters across the 68 submodules, which showed a significant overlap in functional enrichment for similar biological pathways and processes across the six brain regions (S2 Fig). These functional consensus clusters associated with the 68 submodules revealed gene sets for specific biological pathways, including tau-protein kinase activity, neuroinflammation, myelination, and cytoskeletal reorganization (S2 Fig). Furthermore, incorporating information from previously defined cell-type-specific markers derived from bulk and single cell RNA Sequencing (RNA-Seq) [17] showed that refining the 26 co-expression modules into 68 submodules resulted in multiple novel submodules enriched for cell-type-specific markers (Fig 2, S3 Fig).

**Fig. 2. Cell-type specificity of modules is refined in submodules.**

Single-variant association mapping of submodule eigengenes across cohorts

To map the genetic drivers of biological disease-associated signals resolved by submodules, we performed single-variant association mapping of submodule eigengenes (Fig 1C) using whole-genome sequencing data from the AMP-AD knowledge portal (doi.org/10.7303/syn10901601). We then applied a variance component linear mixed model implemented in the EMMAX software to identify novel genetic loci associated with submodule eigengenes (Methods) [18]. Eigengenes were defined as the first principal component of the gene expression data associated with each of the 68 submodules. They capture most of the variation in gene co-expression and reduce noise associated with the transcriptomic data. We included eigengene expression data from four of the brain regions (TCX, PHG, FP, DLPFC), focusing on tissues from the frontal cortex, temporal cortex, and hippocampus due to their relevance to LOAD neuropathology [19]. QQ plots indicate minimal effects of genomic inflation, and consequently population substructure, on the analyses (S4 Fig).
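
The QQ-plot check for genomic inflation mentioned above is often summarized with a single number, the genomic inflation factor. The following is a minimal sketch of that calculation, not the code used in this study: it assumes two-sided p-values from 1-degree-of-freedom tests, and the function name and the evenly spaced example p-values are hypothetical.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(p_values):
    """Return lambda_GC: median observed 1-df chi-square statistic over its null median."""
    norm = NormalDist()
    # A two-sided p-value p corresponds to chi2 = z**2 with z = Phi^-1(1 - p/2).
    chi2 = [norm.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN

# Hypothetical p-values spread evenly over (0, 1), i.e. the null expectation.
null_p = [(i + 0.5) / 1000 for i in range(1000)]
lambda_gc = genomic_inflation(null_p)
print(round(lambda_gc, 3))
```

Values close to 1 are consistent with little inflation; values well above 1 would suggest unmodeled population structure or relatedness.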

Genome-wide suggestive and significant loci were detected for submodules in the four brain regions (Fig 3, S6–S9 Tables). We identified multiple loci that were replicated across the three cohorts at a genome-wide significant level. For instance, rs1990620 is a known LOAD-associated variant in TMEM106B that was identified as genome-wide significant in the DLPFC region from the ROSMAP cohort and was replicated (p < 5×10⁻²) in the remaining three brain regions from the Mayo and MSBB cohorts. This highlights the usefulness of our newly derived quantitative phenotypes to identify genetic variants associated with specific co-expression submodules driving disease pathology.

**Fig. 3. Manhattan plots of single-variant association of select submodule eigengenes in ROSMAP.**

Stratification of LOAD cases based on 68 AMP-AD co-expression submodules

We next assessed whether submodule composite phenotypes could be used to better account for the observed heterogeneity in the genetic architecture of LOAD by clustering patients based on their co-expression profiles (Fig 1C). Clustering was performed to determine subtypes of LOAD cases for four brain regions (TCX, PHG, FP, DLPFC). The NbClust R package was used to identify the optimal number of clusters for different clustering methods by polling with the majority rule across 30 indices [20]. The NbClust package identified two subtype clusters for the ROSMAP (DLPFC region) and MSBB cohorts (FP, PHG regions), while three clusters were observed for the Mayo cohort (TCX region). An example for the ROSMAP cohort is shown in Fig 4. The number of cases in each identified subtype cluster was balanced across all three cohorts (S10 Table). A comparison of the different methods in terms of cluster assignment indicates that results are not affected substantially by the choice of clustering method in any of the cohorts (S11 Table). Notably, our newly defined molecular subtypes were not enriched for common LOAD-associated covariates, such as sex, APOEε4 genotype, or years of education (Fig 4C, S5 Fig). Furthermore, eigengene expression profiles for each subtype were used to assess the association of each subtype with molecular and biological pathways associated with submodules (Fig 4D). We observed no significant enrichment of cognitive or neuropathological measures between the subtypes for the DLPFC region (S5 Fig).

**Fig. 4. Clustering on eigengene expression in ROSMAP data generates two subtypes.**

Single-variant association mapping of subtype specificity metric in ROSMAP

In order to determine genetic variants associated with subtype classification, we used the Euclidean distances of each patient from the centroid of each subtype as a quantitative trait for genetic mapping. We performed genome-wide mapping for LOAD subtype association using the 623 patients of the ROSMAP cohort due to the larger sample size when compared to the MSBB and Mayo cohorts.
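
The subtype specificity metric described above can be sketched as follows. This is an illustrative example only: it assumes the distances are computed in submodule eigengene space, and the decedent identifiers, eigengene values, and subtype assignment are hypothetical.

```python
import math

def centroid(rows):
    """Mean vector of a list of equal-length eigengene profiles."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def distance_to_centroid(profile, center):
    """Euclidean distance between one decedent's profile and a subtype centroid."""
    return math.sqrt(sum((x - c) ** 2 for x, c in zip(profile, center)))

# Hypothetical eigengene profiles (decedent -> values for two submodules).
profiles = {
    "case1": [0.0, 0.0],
    "case2": [2.0, 0.0],
    "case3": [4.0, 3.0],
}
# Hypothetical subtype assignment from a prior clustering step.
subtype_a = ["case1", "case2"]

center_a = centroid([profiles[c] for c in subtype_a])
# One quantitative trait value per decedent: distance to the Subtype A centroid.
trait_a = {c: distance_to_centroid(p, center_a) for c, p in profiles.items()}
print(trait_a)
```

Repeating the calculation with each subtype's centroid yields one quantitative trait per subtype, each of which can then be passed to the association model.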

Genome-wide association mapping revealed various significant variants across subtypes in ROSMAP decedents (S6 Fig, S9 Table). Several variants in TMEM106B reached our genome-wide significance threshold after multiple testing correction (p < 5×10⁻⁸). TMEM106B is a known modifier of neurodegenerative disease and cognitive aging, which has been previously linked with cognitive performance [21]. Among the variants we identified in TMEM106B, one genome-wide suggestive allele was identified for LOAD Subtype B (p < 4×10⁻⁶, rs1990620^G) in ROSMAP. This association with the protective rs1990620^G variant reached a genome-wide significant level with three of our previously mapped co-expression submodules from the ROSMAP cohort (Fig 3): DLPFCbrown_2 (p = 3.72×10⁻⁷), DLPFCbrown_1 (p = 8.91×10⁻¹¹), and DLPFCyellow_2 (p = 5.88×10⁻¹⁴). The DLPFCbrown_1 submodule is enriched for genes related to myelination and lysosomal activity (KEGG pathways hsa00600 and hsa04142), while DLPFCyellow_2 is enriched for genes related to endocytosis and potassium channel activity (KEGG pathway hsa04144 and Reactome pathway R-HSA-1296071). We replicated the subtype-specific association of the protective rs1990620^G variant in the TCX brain region from the Mayo cohort (Subtype B, p = 0.041), while we did not observe a significant association with variants in TMEM106B in the FP and PHG brain regions in the MSBB cohort. To our knowledge, this is the first report associating protective TMEM106B variants with molecular LOAD endophenotypes that link disruption of lysosomal and myelination pathways to disease subtypes. This is in line with results from a study in mice which showed that loss of TMEM106B function rescued lysosomal phenotypes related to frontotemporal dementia [22]. Furthermore, the identified protective allele rs1990620^G disrupts a known CCCTC-binding factor (CTCF) site, which has been shown to modify the inflammatory response in the course of aging [23].
Differential expression analysis of haplotype carriers of the protective rs1990620^G variant in TMEM106B showed an up-regulation of neuroactive ligand receptor interactions, while decedents carrying the risk variant showed significant up-regulation for pathways related to neuroinflammation (KEGG pathway hsa04380) (S7 Fig). Besides the association with TMEM106B in Subtype B, protective variants near MTUS2 were identified which are in close vicinity to HMGB1, a locus that has been previously implicated in brain atrophy [24]. In order to provide a better overview of the associated loci, we generated a directed network to visualize loci that were associated with different modules, submodules, subtypes, and diagnostic criteria for the ROSMAP cohort (Methods). Interestingly, we observed that while certain loci were uniquely associated with single modules or submodules, a community of shared loci was associated with modules and submodules annotated for microglia, endothelial cells, astrocytes, and oligodendrocytes (Fig 5). A separate community of loci was associated with modules and submodules annotated for proteostasis (Fig 5). Many of the loci associated with diagnostic criterion in the ROSMAP cohort were independent from these two communities (Fig 5). Only one locus was identified which showed a suggestive association with both Braak stage and Subtype B in our analysis.

**Fig. 5. Network of endophenotypes and associated loci.**

Suggestive SNPs in ROSMAP are replicated in the Mayo and MSBB cohorts

To assess the validity of our genetic findings in ROSMAP, we aimed to replicate our results across three brain regions from the Mayo (TCX brain region) and MSBB (PHG, FP brain regions) cohorts (Fig 1C). In addition, we compared our results to a catalog of recently published GWAS results in order to evaluate the novelty of our findings.

A total of 1326 unique variants representing 163 loci reached a genome-wide suggestive or significant p-value (p < 1×10⁻⁵) in the DLPFC region when pooled from all 11 DLPFC eigengenes and two subtype-specific variant mapping analyses (S12 Table). Of these, 645 SNPs were replicated in the PHG analyses, 762 SNPs were replicated in the FP analysis, and 482 SNPs were replicated in the TCX analyses (replication threshold, p < 1×10⁻²). Overlapping co-expression submodules across brain regions (S2 Fig) were associated with similar loci.
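
The two-stage thresholding described above (suggestive in the discovery region, nominal in a replication region) can be sketched as a simple filter. The variant identifiers and p-values below are hypothetical and are not results from this study; the function name is likewise an assumption for illustration.

```python
DISCOVERY_P = 1e-5    # suggestive threshold applied to the DLPFC analyses
REPLICATION_P = 1e-2  # replication threshold applied to the other regions

def replicated_variants(discovery, replication):
    """Return variants that pass the discovery threshold and the replication threshold."""
    hits = {v for v, p in discovery.items() if p < DISCOVERY_P}
    # Variants missing from the replication data are treated as not replicated.
    return sorted(v for v in hits if replication.get(v, 1.0) < REPLICATION_P)

# Hypothetical p-values keyed by variant ID.
dlpfc = {"rsA": 3e-9, "rsB": 4e-6, "rsC": 2e-4}
tcx = {"rsA": 0.004, "rsB": 0.2, "rsC": 0.001}

replicated = replicated_variants(dlpfc, tcx)
print(replicated)  # ['rsA']
```

Here rsC is excluded even though it is nominally significant in the replication region, because it never passed the discovery threshold.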

Of the 1326 variants identified in ROSMAP, 29 variants have also been previously reported in the NHGRI-EBI catalog (S13 Table). In each case, the most significant SNP from a prior study was a suggestive SNP in the DLPFC region. Fifteen of these 29 previously reported variants were suggestive SNPs at the TMEM106B locus in the DLPFC region. These 15 variants were previously reported for association with traits such as depression [25–27], neuroticism [26,28–31], coronary artery disease [32], and frontotemporal dementia [33]. The TMEM106B variant associated with dementia, rs1990620, was replicated with submodule eigengene expression in three out of four brain regions (DLPFC, TCX, PHG) in the AMP-AD cohorts (S12 and S13 Tables). An ITGA2B variant (rs5910), previously associated with Parkinson’s disease [34], was replicated in the TCX and PHG regions (S12 and S13 Tables). Three suggestive ROSMAP variants at the LMX1B locus were previously reported for association with glaucoma [35–37], and replicated in the TCX and FP regions (S12 and S13 Tables). Taken together, however, a significant number of the 163 loci detected in the ROSMAP cohort implicated novel variants in LOAD processes, many of which were replicated in brain regions from the Mayo and MSBB cohorts.

Molecular LOAD subtypes differ in their inflammatory response

To better understand the underlying molecular differences across the novel LOAD subtypes, we performed differential expression analysis for each subtype against a set of controls in the ROSMAP cohort (Fig 6A, S8 Fig). The set of controls included 471 decedents who were either cognitively normal or had mild cognitive impairment. Performing this analysis without the mild cognitive impairment cases yielded essentially the same results (S9 Fig). The Venn diagram in Fig 6B depicts the comparison across the two subtypes. Interestingly, we found that cases associated with Subtype A showed a stronger transcriptional response with 127 differentially expressed genes (adjusted p < 0.05, absolute log fold change > 0.5) when compared with controls. Among the most significantly down-regulated genes associated with Subtype A cases was the stress-response mediator corticotropin-releasing hormone (CRH; Fig 6A). Overactive CRH signaling has been implicated in inflammatory disorders and LOAD, where it has been proposed as a therapeutic target to reduce the negative effects of chronic stress related to memory function and amyloid beta (Aβ) production [38]. Cases associated with Subtype B had 40 differentially expressed genes (FDR adjusted p < 0.05, absolute log fold change > 0.5), 39 of which were down-regulated when compared to controls. Notably, we found that two key pro-inflammatory mediators of amyloid deposition (S100A8, S100A9) were among the most significantly down-regulated genes in Subtype B decedents when compared to controls (Fig 6A). Both genes, which are established inflammatory biomarkers, are part of a complex that serves as a critical link between the amyloid cascade and inflammatory events in LOAD [39]. Furthermore, we identified multiple pathways linked to S100A8/9 activation, including IL-10 signaling and complement activation, that were enriched across down-regulated genes in Subtype B but not in Subtype A (Fig 6C).
In addition, molecular pathways linked to microglia activation, the immune response, and the stress response were found among the most significant pathways and gene sets that differ across subtypes (S8 Fig, S14 Table). This highlights that our LOAD subtypes differ in their inflammatory response and that known LOAD biomarkers might be used to stratify patients based upon their inflammatory response to the observed disease state. The same analysis in the Mayo and MSBB cohorts revealed that the corresponding subtypes can also be distinguished based on their inflammatory response (S10 Fig). However, the signal derived from the molecular pathway expression profiles in both the Mayo and MSBB cohorts is not as strong as in the ROSMAP cohort, which is likely due to the smaller sample size and differences in population structure across both cohorts. Although inflammatory markers were the most differentially expressed, the subtypes are characterized by diametric associations with the eigengenes of multiple submodules annotated for different pathological processes (Fig 4D). Thus, the detection of differentially expressed inflammatory markers between subtypes is likely because inflammation is the strongest post-mortem signal present in transcriptomic data.
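
The selection criteria used for differentially expressed genes (adjusted p < 0.05 and absolute log fold change > 0.5) can be illustrated with a short sketch. The Benjamini-Hochberg procedure is assumed here as the FDR adjustment, which the text does not specify, and the gene names, p-values, and fold changes are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def differentially_expressed(genes, p_values, log_fc, alpha=0.05, min_abs_lfc=0.5):
    """Keep genes passing both the adjusted p-value and the fold-change cutoffs."""
    adj = benjamini_hochberg(p_values)
    return [g for g, q, lfc in zip(genes, adj, log_fc)
            if q < alpha and abs(lfc) > min_abs_lfc]

# Hypothetical test results for four genes.
genes = ["geneA", "geneB", "geneC", "geneD"]
p_values = [0.001, 0.02, 0.03, 0.8]
log_fc = [-1.2, 0.3, -0.7, 0.9]

de_genes = differentially_expressed(genes, p_values, log_fc)
print(de_genes)  # ['geneA', 'geneC']
```

In this toy input, geneB passes the significance cutoff but not the fold-change cutoff, and geneD shows the reverse pattern, so neither is retained.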

**Fig. 6. Differential expression analysis of ROSMAP subtypes reveals heterogeneity in inflammatory response in LOAD cases.**

Discussion

Common, complex diseases such as LOAD are characterized by phenotypic heterogeneity and the presence of multiple common variants affecting disease risk. In this study, we present an analysis that uses transcriptomic co-expression data and whole-genome sequencing from multiple cohorts to dissect phenotypic heterogeneity and identify potential genetic drivers of complex trait pathology in LOAD.

Here, we used an iterative pruning approach based on 26 human post-mortem co-expression modules to generate 68 novel submodules that contained genes associated with LOAD-specific biological pathways and molecular processes. Indeed, we observed that genes in the novel submodules are enriched for functional terms that were specific to pathways associated with LOAD, such as lipid modification, the TREM2/TYROBP signaling axis, and tau-protein kinase activity. Furthermore, submodules from distinct brain regions clustered independently, suggesting that the genes captured in each submodule represented signals that were associated with LOAD pathology rather than cohort- or tissue-specific factors. Notably, our pruning approach identified submodules which were much more specific for markers of different brain cell types when compared to the initial co-expression modules. This is in line with recent studies showing that different cell types in the brain play specific roles at different stages in the pathogenesis of LOAD [40]. Taken together, our results demonstrate that the novel human co-expression submodules identified in this study capture cell-type-specific pathways associated with LOAD pathogenesis in the brain.

Mapping the eigengene expression for individual submodules represents a pathway- or process-level alternative to expression quantitative trait locus (eQTL) mapping for each individual transcript. Since the human co-expression submodules represented pathological, cell-type-specific pathways in LOAD brain tissue, mapping eigengene expression for decedents was expected to identify genetic drivers of LOAD pathology. RNA-Seq data from post-mortem brain tissue in human cohorts contains a strong immune signal, as evidenced by repeated identification of genetic loci related to microglial response in meta-analyses with increasingly large cohorts [5,41]. Using submodule eigengenes as quantitative traits for single-variant association provided an opportunity to identify genetic drivers of biological processes that are known to be drivers of early LOAD pathogenesis, such as astrogliosis, neuronal plasticity, myelination, and vascular blood brain barrier interactions [40]. Suggestive variants identified were unique to subsets of submodules. For instance, the TMEM106B locus was associated at a genome-wide significant level with the DLPFCbrown_1 and DLPFCyellow_2 eigengenes (Fig 3), representing processes related to oligodendrocytic myelination, lysosomal activity, endocytosis, and potassium channel activity. This novel association between protective variants in TMEM106B with molecular LOAD endophenotypes linked to lysosomal and myelination dysfunction is potentially of great interest. The TMEM106B locus has been implicated in cognitive aging, with functional consequences in frontotemporal dementia related to lysosomal activity [21–23]. A recent transcriptome study implicated protective TMEM106B variants in differences in neuronal proportions across LOAD patients, supporting the idea that impaired lysosomal function may lead to a toxic buildup of waste in the cell, a common process among many neurodegenerative disorders [42].
Therefore, the presence of TMEM106B variants in combination with other risk factors might alter the course and severity of neurodegeneration across patient subtypes. Furthermore, we identified multiple loci associated with a module linked to microglia function (DLPFCblue_3). This co-expression module contains members of the TREM2/TYROBP signaling pathway, an important mediator of neuroinflammation. Variants in the FAM110A (rs1014897), the CNTNAP5 (rs76854344) and NTM (rs1040103) genes associated with this inflammatory module have been previously linked to posterior cortical atrophy, LOAD [43], and white blood cell count [44]. Taken together, we show that quantitative trait mapping using submodule eigengene expression can identify novel genetic variants impacting relevant disease pathways.

Eigengenes represent a dimensional reduction of transcriptomic data onto axes of pathological relevance. Thus, we expected that clustering on the eigengene expression of LOAD cases would generate pathway-level profiles of putative molecular LOAD subtypes based on case heterogeneity. We observed that average eigengene expression of different subtypes was enriched for different submodules in the four brain regions for which subtype analysis was performed, an example of which is presented for the DLPFC region in Fig 4. Similar diametric enrichment patterns were identified in the remaining brain regions (S10 Fig). These results suggest that the biological programs identified by submodules in this study align themselves along the heterogeneity of transcriptomic data present in LOAD cases across multiple cohorts rather than differentiating solely based on cases and controls. Furthermore, the stratification of patients based on submodule expression profiles demonstrated that there is significant variation in immune response in post-mortem brain tissue (Fig 6, S7 Fig), a process that is considered a hallmark of LOAD pathogenesis. Variants associated with the subtype specificity metric overlapped with the variants associated with individual submodule eigengenes (Fig 5). This suggests that the genetic factors influencing subtypes can be dissected into loci driving specific submodules. Such dissection of genetic loci can provide the basis for more targeted treatment of dysfunctional pathways that contribute to different subtypes of LOAD.

Our subtypes in the DLPFC brain region of the ROSMAP cohort represent differences in transcriptomic profiles of LOAD cases derived from post-mortem RNA-Seq data. A lack of temporal data makes it challenging to decisively interpret these profiles derived from post-mortem brain samples. The identified subtypes may represent distinct LOAD endpoints, differences in disease severity, environmental effects, or phases of molecular pathology. Neither of our novel subtypes was associated with cognitive or neuropathological outcome (S5 Fig). Furthermore, covariates such as sex, APOE genotype, and years of education were not significantly enriched in any given subtype (Fig 4C). This suggests that the transcriptomic profiles do not represent transitions in disease severity and that there are overall risk factors not reflected in transcriptomic subtypes. Furthermore, both subtypes are associated with unique loci that belong to the same community of loci detected by submodule mapping (Fig 5), indicating that the subtypes capture various combinations of genetic elements that lead to LOAD pathology. While suggestive, these transcriptomic LOAD subtypes will require further validation in cohorts that adequately account for disease progression.

The methodology presented in this study is not limited to RNA-Seq data and can be performed on other omics datasets, such as proteomics or metabolomics. As such data become available for the decedents in these cohorts, this analysis can be expanded across these additional informative dimensions.

Methods

Whole genome sequencing and RNA sequencing data

We obtained whole-genome sequencing and RNA-Seq data from Synapse (https://www.synapse.org/) for three cohorts from the AMP-AD consortium: the Mayo Clinic, Mount Sinai Brain Bank, and Rush University. The Mayo Clinic (Mayo) cohort consists of 276 temporal cortex (TCX) samples from 312 North American Caucasian subjects consisting of cases characterized with LOAD, pathological aging (PA), progressive supranuclear palsy (PSP), or elderly controls [13] (doi.org/10.7303/syn5550404). The Mount Sinai Brain Bank (MSBB) cohort consists of 214 frontopolar prefrontal cortex (FP), 187 inferior frontal gyrus (IFG), 160 parahippocampal gyrus (PHG), and 187 superior temporal gyrus (STG) samples characterized with LOAD, elderly control, or mild cognitive impairment (MCI) (doi.org/10.7303/syn3159438). Rush University's Religious Orders Study and Memory and Aging Project (ROSMAP) cohort consists of 623 dorsolateral prefrontal cortex (DLPFC) samples of individuals from 40 groups of religious orders from across the United States (ROS) and older adults in retirement communities in the Chicago area (MAP), characterized with LOAD, elderly control, or MCI [7,45] (doi.org/10.7303/syn3219045). A summary of samples from each of the cohorts is provided in S1 and S2 Tables. Sex, age at death, and batch were used as covariates for normalization in the ROSMAP and Mayo data. Sex, age at death, race, and batch were used as covariates for normalization in the MSBB data.

Co-expression modules and iterativeWGCNA

Data on human AMP-AD co-expression modules were obtained for the three cohorts from Synapse (doi.org/10.7303/syn11932957.1). A recent study identified these modules as part of a transcriptome-wide LOAD meta-analysis [15]. In brief, a modified procedure using five different co-expression analysis protocols followed by graph clustering methods was performed to obtain 30 modules across all three cohorts (doi.org/10.7303/syn2580853), 26 of which corresponded to the six tissue regions used in this study. A summary of these modules is provided in S3 Table. We focused on tissues from the frontal cortex, temporal cortex, and hippocampus due to their relevance to LOAD neuropathology [19]. These modules are generally large, containing thousands of genes that represent multiple functions [15]. In order to construct more functionally specific submodules from these AMP-AD co-expression modules, we subjected them to a repeated pruning process called iterativeWGCNA [16], which performs WGCNA on each AMP-AD co-expression module independently. The gene sets produced by this process were then pruned, by evaluating the connectivity of each gene to the gene set eigengene, to ensure that only highly correlated genes remained. The resulting gene sets, containing highly correlated genes, were combined and the process was repeated until the gene sets converged. The algorithm then attempted to reclassify genes from the residual gene set into submodules. We specified a soft-threshold power of six, a minimum eigengene connectivity of 0.6, and a required module size of 100 to promote the generation of submodules that capture pathway-level signals. The final set of 68 submodules consisted of highly correlated and cell-type-specific genes. The submodules were mutually exclusive for a given cohort but overlapped with submodules from other cohorts. A summary of these submodules is provided in S4 Table. An eigengene for a given submodule is defined as the first principal component of gene expression data within each submodule.
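
As an illustration of the eigengene definition above, the following minimal Python sketch computes the first principal component of a standardized gene-by-sample matrix. It is not the WGCNA or iterativeWGCNA implementation: the function name `eigengene`, the input layout, and the use of power iteration in place of a full singular value decomposition are assumptions made for this example.

```python
import math


def eigengene(expr, n_iter=1000, tol=1e-10):
    """Return the first principal component across samples (unit length).

    expr: list of genes, each a list of expression values per sample.
    Assumes every gene varies across samples (non-zero standard deviation).
    """
    n_samples = len(expr[0])

    # Standardize each gene to zero mean and unit variance across samples.
    z = []
    for gene in expr:
        mean = sum(gene) / n_samples
        sd = math.sqrt(sum((x - mean) ** 2 for x in gene) / (n_samples - 1))
        z.append([(x - mean) / sd for x in gene])

    # Start from the average standardized profile (assumed to be non-zero).
    v = [sum(row[j] for row in z) / len(z) for j in range(n_samples)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]

    # Power iteration on Z^T Z: converges to the leading right singular
    # vector of Z, i.e. the first principal component over samples.
    for _ in range(n_iter):
        u = [sum(a * b for a, b in zip(row, v)) for row in z]  # Z v
        w = [sum(z[i][j] * u[i] for i in range(len(z))) for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        delta = max(abs(a - b) for a, b in zip(w, v))
        v = w
        if delta < tol:
            break

    # Because Z^T Z is positive semi-definite and the start vector is the
    # average profile, the result keeps a non-negative projection on it.
    return v
```

The returned vector has one value per sample, so each decedent receives a single summary score per submodule. In practice a standard SVD routine would be the more usual choice; power iteration is used here only to keep the sketch free of dependencies.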

Stratification of LOAD cases based on clustering of human co-expression submodules

Eigengene expression data for the TCX, PHG, FP, and DLPFC regions were used to stratify LOAD cases into subtypes based on each brain region separately. We used the NbClust R package to determine the optimal number of clusters across different clustering methods by polling with the majority rule across 30 indices [20]. We tested agglomerative hierarchical approaches (Ward, UPGMA, WPGMA) and a reallocation approach (K-means) on the eigengene expression data and evaluated the within-cluster similarity of cases using silhouettes. The silhouette score of a given object (data point) simultaneously assesses how similar the object is to its own cluster and how different it is from all other clusters [46]. A simulation study suggests that no single clustering method consistently outperforms the others and that mean silhouette widths can be used to pick the ideal clustering method and to compare clusterings across datasets [47]. The silhouette plots revealed that different methods were required for the different regions to generate clusters with the largest average silhouette widths. After analyzing silhouette plots of the clusters generated by each method for each region, we determined that K-means was an optimal approach for DLPFC, Ward was optimal for PHG and TCX, and UPGMA was optimal for FP. An example of silhouettes used to determine the ideal clustering method for the DLPFC region is shown in Fig 4. A summary of the clusters for each brain region, considered case subtypes, is provided in S10 Table. In the subtypes generated for the DLPFC region from the ROSMAP cohort, we assessed each subtype for enrichment of cognitive and pathological measures. We used Braak stages as a measure of neurofibrillary tangle burden and CERAD scores as a measure of neuritic plaque burden [48,49]. We also assessed the rate of decline in memory, executive function, visuospatial function, and language across the subtypes. Definitions, collection, and standardization of these decline measures can be found in previously published work [50].
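
To make the selection criterion concrete, the sketch below computes the mean silhouette width of a clustering and picks, among several candidate clusterings of the same cases, the one with the largest value. It is an illustration only and not the NbClust-based R workflow used in the study: the function names, the use of Euclidean distance, and the input layout are assumptions.

```python
import math


def mean_silhouette_width(points, labels):
    """Average silhouette over all points (Euclidean distance).

    points: list of coordinate sequences (e.g. eigengene profiles).
    labels: cluster label per point; at least two clusters are assumed.
    """
    clusters = set(labels)
    total = 0.0
    for i, p in enumerate(points):
        dist_sum = {c: 0.0 for c in clusters}
        count = {c: 0 for c in clusters}
        for j, q in enumerate(points):
            if j != i:
                dist_sum[labels[j]] += math.dist(p, q)
                count[labels[j]] += 1
        own = labels[i]
        if count[own] == 0:
            continue  # a singleton cluster contributes a silhouette of 0
        a = dist_sum[own] / count[own]  # mean distance to own cluster
        b = min(dist_sum[c] / count[c] for c in clusters if c != own)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def pick_best_clustering(points, candidates):
    """candidates: dict mapping a method name to its list of labels."""
    return max(candidates, key=lambda m: mean_silhouette_width(points, candidates[m]))
```

A call such as `pick_best_clustering(profiles, {"kmeans": k_labels, "ward": w_labels})` would return the name of the method with the widest average silhouette, mirroring the per-region comparison described above.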

Differential expression analysis of case subtypes

For differential expression analysis, control decedents were defined as cognitively normal and MCI decedents for PHG, FP, and DLPFC. In the case of TCX, control decedents included cognitively normal, PSP, and PA decedents. For each of the regions used to stratify LOAD cases (TCX, PHG, FP, and DLPFC), we performed differential expression analysis to compare gene expression in LOAD subtypes with control decedents as described above. We repeated this analysis excluding MCI, PSP, and PA decedents from the control group and obtained essentially the same results (S9 Fig). We used the limma R package to perform the differential expression analysis between subtype and control decedents [51]. We used the clusterProfiler R package to perform KEGG and Reactome pathway analysis on differentially expressed genes to determine the signal captured by clustering on eigengene expression data [52].

Single-variant association of eigengene expression and subtype specificity

We used EMMAX [18], a variance component linear mixed model, to perform single-variant association, using each submodule eigengene as a quantitative trait. In addition, we developed a subtype specificity metric for each brain region by calculating the Euclidean distance between the eigengene expression profile of each decedent and the centroid of each subtype cluster. This resulted in a vector of scores for each subtype that was mapped as a separate trait. All quantitative trait mapping results had a genomic inflation factor near one, indicating that there was no significant population substructure effect on the mapping. QQ plot analysis on the p-values showed no evidence of population substructure or confounding effects (S4 Fig).
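
The subtype specificity metric described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the function names and the dictionary-based input layout are hypothetical, centroids are computed from the cases assigned to each subtype, and the raw Euclidean distance is reported without any further transformation.

```python
import math


def subtype_centroids(profiles, labels):
    """Mean eigengene profile of the cases assigned to each subtype.

    profiles: dict decedent id -> list of submodule eigengene values.
    labels:   dict decedent id -> subtype label (clustered cases only).
    """
    sums, counts = {}, {}
    for decedent, subtype in labels.items():
        vec = profiles[decedent]
        acc = sums.setdefault(subtype, [0.0] * len(vec))
        for k, value in enumerate(vec):
            acc[k] += value
        counts[subtype] = counts.get(subtype, 0) + 1
    return {s: [x / counts[s] for x in acc] for s, acc in sums.items()}


def subtype_specificity(profiles, labels):
    """Return dict subtype -> dict decedent id -> distance to that centroid.

    Each inner dict is one score vector, i.e. one quantitative trait.
    Smaller distances mean the decedent lies closer to the subtype.
    """
    centroids = subtype_centroids(profiles, labels)
    return {
        subtype: {d: math.dist(vec, centroid) for d, vec in profiles.items()}
        for subtype, centroid in centroids.items()
    }
```

Each subtype yields one score per decedent, which is what allows the metric to be mapped as a separate quantitative trait per subtype.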

Replication of suggestive and significant SNPs in other cohorts

The ROSMAP cohort represented the most adequately powered cohort in the study and was therefore used as our baseline, while the other cohorts were used to assess replication of suggestive and significant SNPs. SNPs were considered suggestive with a p-value smaller than 1×10⁻⁵ and genome-wide significant with a p-value smaller than 5×10⁻⁸, which are standard cutoffs for GWAS. Suggestive and significant SNPs from the DLPFC region in ROSMAP were considered replicated in the TCX, FP, and PHG regions if the SNPs were associated with the submodule eigengenes or subtype specificity metric of the given region at a p-value smaller than 0.05. In addition, we compared the ROSMAP loci to prior association studies using summary statistics obtained from the NHGRI-EBI catalog [53]. Loci were considered replicated in this case if suggestive and significant SNPs from the ROSMAP cohort were reported in these studies at a p-value smaller than 5×10⁻⁸ (Fig 1C and phase 3 in S1 Fig).
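
The thresholds above translate into a simple filter, sketched here in Python. The function names and the dictionary layout are assumptions for illustration, and the replication cutoff is applied as a strict inequality (p smaller than 0.05).

```python
SUGGESTIVE_P = 1e-5    # suggestive cutoff stated in the text
GENOME_WIDE_P = 5e-8   # genome-wide significance cutoff stated in the text
REPLICATION_P = 0.05   # nominal cutoff used for replication


def classify_snp(p_value):
    """Label a discovery p-value according to the two GWAS cutoffs."""
    if p_value < GENOME_WIDE_P:
        return "significant"
    if p_value < SUGGESTIVE_P:
        return "suggestive"
    return "not associated"


def replicated_regions(discovery, replication):
    """List, per discovery SNP, the regions in which it replicates.

    discovery:   dict SNP id -> p-value in the baseline cohort.
    replication: dict region -> dict SNP id -> p-value in that region.
    SNPs missing from a region are treated as not replicated there.
    """
    result = {}
    for snp, p_value in discovery.items():
        if classify_snp(p_value) == "not associated":
            continue
        result[snp] = sorted(
            region
            for region, pvals in replication.items()
            if pvals.get(snp, 1.0) < REPLICATION_P
        )
    return result
```

Only SNPs that pass the suggestive or genome-wide cutoff in the baseline cohort are carried forward, so the nominal 0.05 threshold is applied to a small, pre-selected set rather than genome-wide.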

Network of loci and associated quantitative phenotypes in ROSMAP

We built a directed network of quantitative phenotypes and associated loci to better visualize the communities of loci that were associated with our newly derived quantitative phenotypes in the ROSMAP cohort (the subtype specificity metric and submodule eigengenes), including both suggestive and significant loci. We also included in this network the single-variant association results for other relevant traits, including module eigengenes and diagnostic criteria (Braak stage, CERAD scores, cognitive diagnosis, and case-control diagnosis), mapped with EMMAX [18] as for the other traits. The network was built in Cytoscape version 3.7 (https://cytoscape.org/) [54] and the nodes were organized using the “Circular Layout” option. Edge color was used to distinguish the type of association (red for diagnostic criteria, blue for subtype, yellow for module, and green for submodule).
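
One way to prepare such a network for import into a graph tool is to write an edge table, as in the sketch below. This is an assumption-laden illustration rather than the procedure used in the study: the column names, the locus-to-phenotype edge direction, the trait type keys, and the default threshold (the suggestive cutoff of 1×10⁻⁵) are choices made for this example; only the color legend comes from the text.

```python
import csv
import io

# Edge colors by association type, following the legend in the text.
EDGE_COLORS = {
    "diagnostic": "red",
    "subtype": "blue",
    "module": "yellow",
    "submodule": "green",
}


def build_edges(associations, threshold=1e-5):
    """Keep associations below the threshold as locus -> phenotype edges.

    associations: iterable of (locus, phenotype, trait_type, p_value).
    """
    edges = []
    for locus, phenotype, trait_type, p_value in associations:
        if p_value < threshold:
            edges.append(
                {
                    "source": locus,
                    "target": phenotype,
                    "color": EDGE_COLORS[trait_type],
                    "p_value": p_value,
                }
            )
    return edges


def edges_to_csv(edges):
    """Serialize the edge list as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["source", "target", "color", "p_value"])
    writer.writeheader()
    writer.writerows(edges)
    return buffer.getvalue()
```

Keeping the association type as an edge attribute, rather than building one network per trait type, lets a single layout show which loci are shared between diagnostic, subtype, module, and submodule traits.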

Supporting information

S1 Fig [tif]
The complete analysis carried out in this study is divided into three phases.

S2 Fig [a]
Clusters of modules and submodules based on gene overlap reveal cell-type and functional signatures.

S3 Fig [tif]
Functional consensus clusters demonstrate cell-type specificity.

S4 Fig [tif]
QQ Plots of observed p-values from single-variant association in the ROSMAP cohort.

S5 Fig [tif]
Subtypes demonstrate no significant enrichment of cognitive or pathological measures.

S6 Fig [tif]
Manhattan plots of single-variant association of the subtype specificity metric in ROSMAP.

S7 Fig [a]
Pathway enrichment analysis for up- and downregulated KEGG pathways among rs1990620 haplotype carriers.

S8 Fig [tif]
Pathway enrichment analysis of differentially expressed genes in subtypes from the ROSMAP cohort.

S9 Fig [tif]
Comparison of differentially expressed genes from the ROSMAP cohort with and without MCI cases.

S10 Fig [a]
Pathway enrichment for subtypes from the Mayo and MSBB cohorts.

S1 Table [xlsx]
Summary of cohorts.

S2 Table [ad]
Summary of cohorts by diagnosis and sex.

S3 Table [xlsx]
Summary of modules.

S4 Table [xlsx]
Summary of submodules.

S5 Table [xlsx]
GO, KEGG, and Reactome enrichment of submodules.

S6 Table [xlsx]
Significant SNP associations from TCX region analyses.

S7 Table [xlsx]
Significant SNP associations from PHG region analyses.

S8 Table [xlsx]
Significant SNP associations from FP region analyses.

S9 Table [xlsx]
Significant SNP associations from DLPFC region analyses.

S10 Table [xlsx]
LOAD case subtypes for selected brain regions.

S11 Table [xlsx]
Comparison of clustering algorithms in ROSMAP, MSBB, and Mayo.

S12 Table [xlsx]
Genome-wide suggestive SNPs in DLPFC replicated in TCX, FP, and PHG.

S13 Table [xlsx]
Genome-wide suggestive SNPs in DLPFC replicated in the NHGRI-EBI Catalog.

S14 Table [xlsx]
KEGG and Reactome pathway annotations of differentially expressed genes in ROSMAP subtypes.

References

1. Cummings JL. Alzheimer’s Disease. Wood AJJ, editor. N Engl J Med. 2004;351 : 56–67. doi: 10.1056/NEJMra040223 15229308

2. Bertram L, Tanzi RE. Thirty years of Alzheimer’s disease genetics: The implications of systematic meta-analyses. Nat Rev Neurosci. 2008;9 : 768–778. doi: 10.1038/nrn2494 18802446

3. Kilpinen H, Barrett JC. How next-generation sequencing is transforming complex disease genetics. Trends in Genetics. 2013. doi: 10.1016/j.tig.2012.10.001 23103023

4. Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science. 1996. doi: 10.1126/science.273.5281.1516 8801636

5. Jansen IE, Savage JE, Watanabe K, Bryois J, Williams DM, Steinberg S, et al. Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer’s disease risk. Nat Genet. 2019;51 : 404–413. doi: 10.1038/s41588-018-0311-9 30617256

6. Verheijen J, Sleegers K. Understanding Alzheimer Disease at the Interface between Genetics and Transcriptomics. Trends Genet. 2018;34 : 434–447. doi: 10.1016/j.tig.2018.02.007 29573818

7. Mostafavi S, Gaiteri C, Sullivan SE, White CC, Tasaki S, Xu J, et al. A molecular network of the aging human brain provides insights into the pathology and cognitive decline of Alzheimer’s disease. Nat Neurosci. 2018;21. doi: 10.1038/s41593-018-0154-9 29802388

8. Mukherjee S, Mez J, Trittschuh E, Saykin AJ, Gibbons LE, Fardo DW, et al. Genetic data and cognitively-defined late-onset Alzheimer’s disease subgroups. Mol Psychiatry. 2018; 1–10. doi: 10.1101/367615

9. Ferreira D, Verhagen C, Hernández-Cabrera JA, Cavallin L, Guo CJ, Ekman U, et al. Distinct subtypes of Alzheimer’s disease based on patterns of brain atrophy: longitudinal trajectories and clinical applications. Sci Rep. 2017;7 : 1–13. doi: 10.1038/s41598-016-0028-x 28127051

10. Langfelder P, Horvath S. Eigengene networks for studying the relationships between co-expression modules. BMC Syst Biol. 2007; doi: 10.1186/1752-0509-1-54 18031580

11. Zhang B, Gaiteri C, Bodea LG, Wang Z, McElwee J, Podtelezhnikov AA, et al. Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer’s disease. Cell. 2013; doi: 10.1016/j.cell.2013.03.030 23622250

12. De Jager PL, Ma Y, McCabe C, Xu J, Vardarajan BN, Felsky D, et al. A multi-omic atlas of the human frontal cortex for aging and Alzheimer’s disease research. Sci Data. 2018;5 : 180142. Available: doi: 10.1038/sdata.2018.142 30084846

13. Allen M, Carrasquillo MM, Funk C, Heavner BD, Zou F, Younkin CS, et al. Human whole genome genotype and transcriptome data for Alzheimer’s and other neurodegenerative diseases. Sci Data. 2016;3 : 1–10. doi: 10.1038/sdata.2016.89 27727239

14. Wang M, Beckmann ND, Roussos P, Wang E, Zhou X, Wang Q, et al. The Mount Sinai cohort of large-scale genomic, transcriptomic and proteomic data in Alzheimer’s disease. Sci Data. 2018;5 : 1–16. doi: 10.1038/s41597-018-0002-5 30482902

15. Logsdon BA, Perumal TM, Swarup V, Wang M, Funk C, Gaiteri C, et al. Meta-analysis of the human brain transcriptome identifies heterogeneity across human AD coexpression modules robust to sample collection and methodological approach. bioRxiv. 2019; doi: 10.7303/syn17114455

16. Greenfest-Allen E, Cartailler J-P, Magnuson MA, Stoeckert CJ. iterativeWGCNA: iterative refinement to improve module detection from WGCNA co-expression networks. bioRxiv. 2017; 234062. doi: 10.1101/234062

17. McKenzie AT, Wang M, Hauberg ME, Fullard JF, Kozlenkov A, Keenan A, et al. Brain Cell Type Specific Gene Expression and Co-expression Network Architectures. Sci Rep. 2018;8 : 1–19. doi: 10.1038/s41598-017-17765-5 29311619

18. Kang HM, Sul JH, Service SK, Zaitlen NA, Kong S-Y, Freimer NB, et al. Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 2010;42 : 348–354. doi: 10.1038/ng.548 20208533

19. DeTure MA, Dickson DW. The neuropathological diagnosis of Alzheimer disease. Mol Neurodegener. 2019;14 : 1–18. doi: 10.1186/s13024-018-0301-5 30630532

20. Charrad M, Ghazzali N, Boiteau V, Niknafs A. NbClust: An R Package for Determining the Relevant Number of Clusters in a Data Set. J Stat Softw. 2014;61 : 1–36. doi: 10.18637/jss.v061.i06

21. White CC, Yang HS, Yu L, Chibnik LB, Dawe RJ, Yang J, et al. Identification of genes associated with dissociation of cognitive performance and neuropathological burden: Multistep analysis of genetic, epigenetic, and transcriptional data. PLoS Med. 2017;14 : 1–23. doi: 10.1371/journal.pmed.1002287 28441426

22. Klein ZA, Takahashi H, Ma M, Stagi M, Zhou M, Lam TKT, et al. Loss of TMEM106B Ameliorates Lysosomal and Frontotemporal Dementia-Related Phenotypes in Progranulin-Deficient Mice. Neuron. 2017;95 : 281–296.e6. doi: 10.1016/j.neuron.2017.06.026 28728022

23. Gallagher MD, Posavi M, Huang P, Unger TL, Berlyand Y, Gruenewald AL, et al. A Dementia-Associated Risk Variant near TMEM106B Alters Chromatin Architecture and Gene Expression. Am J Hum Genet. 2017;101 : 643–663. doi: 10.1016/j.ajhg.2017.09.004 29056226

24. Furney SJ, Simmons A, Breen G, Pedroso I, Lunnon K, Proitsi P, et al. Genome-wide association with MRI atrophy measures as a quantitative trait locus for Alzheimer’s disease. Mol Psychiatry. 2011;16 : 1130–1138. doi: 10.1038/mp.2010.123 21116278

25. Wray NR, Ripke S, Mattheisen M, Trzaskowski M, Byrne EM, Abdellaoui A, et al. Genome-wide association analyses identify 44 risk variants and refine the genetic architecture of major depression. Nat Genet. 2018; doi: 10.1038/s41588-018-0090-3 29700475

26. Nagel M, Jansen PR, Stringer S, Watanabe K, De Leeuw CA, Bryois J, et al. Meta-analysis of genome-wide association studies for neuroticism in 449,484 individuals identifies novel genetic loci and pathways. Nat Genet. 2018; doi: 10.1038/s41588-018-0151-7 29942085

27. Howard DM, Adams MJ, Clarke TK, Hafferty JD, Gibson J, Shirali M, et al. Genome-wide meta-analysis of depression identifies 102 independent variants and highlights the importance of the prefrontal brain regions. Nat Neurosci. 2019; doi: 10.1038/s41593-018-0326-7 30718901

28. Nagel M, Watanabe K, Stringer S, Posthuma D, Van Der Sluis S. Item-level analyses reveal genetic heterogeneity in neuroticism. Nat Commun. 2018;9. doi: 10.1038/s41467-018-03242-8 29500382

29. Luciano M, Hagenaars SP, Davies G, Hill WD, Clarke TK, Shirali M, et al. Association analysis in over 329,000 individuals identifies 116 independent variants influencing neuroticism. Nat Genet. 2018; doi: 10.1038/s41588-017-0013-8 29255261

30. Baselmans BML, Jansen R, Ip HF, van Dongen J, Abdellaoui A, van de Weijer MP, et al. Multivariate genome-wide analyses of the well-being spectrum. Nat Genet. 2019; doi: 10.1038/s41588-018-0320-8 30643256

31. Hill WD, Weiss A, Liewald DC, Davies G, Porteous DJ, Hayward C, et al. Genetic contributions to two special factors of neuroticism are associated with affluence, higher intelligence, better health, and longer life. Mol Psychiatry. 2019; doi: 10.1038/s41380-019-0387-3 30867560

32. Van Der Harst P, Verweij N. Identification of 64 novel genetic loci provides an expanded view on the genetic architecture of coronary artery disease. Circ Res. 2018; doi: 10.1161/CIRCRESAHA.117.312086 29212778

33. Pottier C, Zhou X, Perkerson RB, Baker M, Jenkins GD, Serie DJ, et al. Potential genetic modifiers of disease risk and age at onset in patients with frontotemporal lobar degeneration and GRN mutations: a genome-wide association study. Lancet Neurol. 2018; doi: 10.1016/S1474-4422(18)30126–1

34. Chang D, Nalls MA, Hallgrímsdóttir IB, Hunkapiller J, Brug M van der, Cai F, et al. A meta-analysis of genome-wide association studies identifies 17 new Parkinson’s disease risk loci. Nat Genet. 2017; doi: 10.1038/ng.3955 28892059

35. MacGregor S, Ong JS, An J, Han X, Zhou T, Siggs OM, et al. Genome-wide association study of intraocular pressure uncovers new pathways to glaucoma. Nature Genetics. 2018. doi: 10.1038/s41588-018-0176-y 30054594

36. Gharahkhani P, Burdon KP, Cooke Bailey JN, Hewitt AW, Law MH, Pasquale LR, et al. Analysis combining correlated glaucoma traits identifies five new risk loci for open-angle glaucoma. Sci Rep. 2018; doi: 10.1038/s41598-018-20435-9 29449654

37. Choquet H, Paylakhi S, Kneeland SC, Thai KK, Hoffmann TJ, Yin J, et al. A multiethnic genome-wide association study of primary open-angle glaucoma identifies novel risk loci. Nat Commun. 2018; doi: 10.1038/s41467-018-04555-4 29891935

38. Futch HS, Croft CL, Truong VQ, Krause EG, Golde TE. Targeting psychologic stress signaling pathways in Alzheimer’s disease. Mol Neurodegener. 2017;12 : 49. doi: 10.1186/s13024-017-0190-z 28633663

39. Vogl T, Gharibyan AL, Morozova-Roche LA. Pro-Inflammatory S100A8 and S100A9 Proteins: Self-Assembly into Multifunctional Native and Amyloid Complexes. Int J Mol Sci. 2012;13 : 2893. doi: 10.3390/ijms13032893 22489132

40. De Strooper B, Karran E. The Cellular Phase of Alzheimer’s Disease. Cell. 2016;164 : 603–615. doi: 10.1016/j.cell.2015.12.056 26871627

41. Lambert JC, Ibrahim-Verbaas CA, Harold D, Naj AC, Sims R, Bellenguez C, et al. Meta-analysis of 74,046 individuals identifies 11 new susceptibility loci for Alzheimer’s disease. Nat Genet. 2013;45 : 1452–1458. doi: 10.1038/ng.2802 24162737

42. Li Z, Farias FG, Dube U, Del-Aguila JL, Mihindukulasuriya KA, Fernandez MV, et al. The TMEM106B rs1990621 protective variant is also associated with increased neuronal proportion. bioRxiv. 2019; doi: 10.1101/583286

43. Schott JM, Crutch SJ, Carrasquillo MM, Uphill J, Shakespeare TJ, Ryan NS, et al. Genetic risk factors for the posterior cortical atrophy variant of Alzheimer’s disease. Alzheimer’s Dement. 2016;12 : 862–871. doi: 10.1016/j.jalz.2016.01.010 26993346

44. Kichaev G, Bhatia G, Loh P-R, Gazal S, Burch K, Freund MK, et al. Leveraging Polygenic Functional Enrichment to Improve GWAS Power. Am J Hum Genet. 2019;104 : 65–75. doi: 10.1016/j.ajhg.2018.11.008 30595370

45. Chibnik LB, White CC, Mukherjee S, Raj T, Yu L, Larson EB, et al. Susceptibility to neurofibrillary tangles: role of the PTPRD locus and limited pleiotropy with other neuropathologies. Mol Psychiatry. 2018;23 : 1521–1529. doi: 10.1038/mp.2017.20 28322283

46. Rousseeuw PJ. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J Comput Appl Math. 1987;20 : 53–65. doi: 10.1016/0377-0427(87)90125-7

47. Clifford H, Wessely F, Pendurthi S, Emes RD. Comparison of clustering methods for investigation of genome-wide methylation array data. Front Genet. 2011;2 : 1–11. doi: 10.3389/fgene.2011.00001 22303300

48. Braak H, Thal DR, Ghebremedhin E, Del Tredici K. Stages of the Pathologic Process in Alzheimer Disease. J Neuropathol Exp Neurol. 2011;70 : 960–969. doi: 10.1097/NEN.0b013e318232a379 22002422

49. Wilson RS, Arnold SE, Schneider JA, Li Y, Bennett DA. Chronic Distress, Age-Related Neuropathology, and Late-Life Dementia. Psychosom Med. 2007;69. Available: https://journals.lww.com/psychosomaticmedicine/Fulltext/2007/01000/Chronic_Distress,_Age_Related_Neuropathology,_and.9.aspx

50. Mukherjee S, Mez J, Trittschuh EH, Saykin AJ, Gibbons LE, Fardo DW, et al. Genetic data and cognitively defined late-onset Alzheimer’s disease subgroups. Mol Psychiatry. 2018; doi: 10.1038/s41380-018-0298-8 30514930

51. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43: e47–e47. doi: 10.1093/nar/gkv007 25605792

52. Yu G, Wang L-G, Han Y, He Q-Y. clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters. Omi A J Integr Biol. 2012;16 : 284–287. doi: 10.1089/omi.2011.0118 22455463

53. Buniello A, Macarthur JAL, Cerezo M, Harris LW, Hayhurst J, Malangone C, et al. The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019. Nucleic Acids Res. 2019;47: D1005–D1012. doi: 10.1093/nar/gky1120 30445434

54. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: A Software Environment for Integrated Models. Genome Res. 2003;13 : 2498–2504. doi: 10.1101/gr.1239303 14597658
