Determining the Control Circuitry of Redox Metabolism at the Genome-Scale

Download PDF České info

All heterotrophic organisms must balance the deployment of consumed carbon compounds between growth and the generation of energy. These two competing objectives have been shown, both computationally and experimentally, to exist as the principal dimensions of the function of metabolic networks. Each of these dimensions can also be thought of as the familiar metabolic functions of catabolism, anabolism, and generation of energy. Here we detail how two global transcription factors (TFs), ArcA and Fnr of Escherichia coli that sense redox ratios, act on a genome-wide basis to coordinately regulate these global metabolic functions through transcriptional control of enzyme and transporter levels in changing environments. A model results from the study that shows how global transcription factors regulate global dimensions of metabolism and form a regulatory hierarchy that reflects the structural hierarchy of the metabolic network.

Published in the journal: . PLoS Genet 10(4): e32767. doi:10.1371/journal.pgen.1004264
Category: Research Article
doi: https://doi.org/10.1371/journal.pgen.1004264

Summary

Introduction

Regulation of metabolism in response to shifting availability of electron acceptors is a fundamental process in all of biology and is a critical subject for the understanding of pathogenesis, cancer metabolism, and industrial biotechnology. However, even in the model organism Escherichia coli, the regulatory network for this fundamental metabolic function has not been fully elucidated. It has long been known that facultative anaerobes will hierarchically utilize external electron acceptors relative to the free energy change provided by each [1], [2]. Oxygen exists at the top of the hierarchy, electron acceptors like NO₃ in the middle, and lactate or acetate or other fermentation products are at the bottom [3]–[5]. Many detailed studies have determined that the transcription factors (TFs) ArcA and Fnr are the key players in managing this hierarchy through the activation or repression of the electron transport chain (ETC) machinery specific to an available electron acceptor [6]–[11]. It is also largely understood how ArcA senses redox via the flow of reducing equivalents through the ETC, and how Fnr directly senses levels of dissolved O₂ [1], [12], [13] and glutathione [14], [15]. However, it is not clear how these two TFs work together and more importantly why they regulate hundreds of gene products that lie outside of the ETC and energy metabolism [3], [5]?

Even though many biochemical details of redox regulation have been elucidated [6], [8], [16], systems level principles for the global regulatory response throughout the anaerobic shift remain elusive. An important missing piece is a clear framework, or design principle, that elucidates how hundreds of transcriptionally regulated gene products are coordinately regulated to produce the necessary quantitative shifts in metabolic flux states. On the purely metabolic side, certain design principles have emerged through the analysis of stoichiometric models that identified growth and energy generation as the two principal dimensions of metabolic network function [17]–[19]. It was further shown that linear combinations of these two dimensions could account for observed flux patterns throughout nutrient limitations and the anaerobic shift [18], [20]. A question now becomes, what are the corresponding global TFs and how do they coordinately regulate all the gene products which enable the metabolic flux map to shift from one optimal state to another?

Here we show how the global TFs ArcA and Fnr coordinately regulate the primary metabolic dimensions of growth and energy generation. We integrated polyomic data sets and used genome-scale metabolic models to enable a mechanistic understanding of hundreds of simultaneous and individual regulatory events. This analysis subsequently provides a link between global regulatory circuits and global optimality in microbial metabolism.

Results

Genome-scale identification of TF regulatory events

We first identified individual TF regulatory events at the genome-scale. Side-by-side measurements of RNA transcript abundance and TF binding were carried out to determine the structure and causality in E. coli's transcriptional regulatory network (TRN). ChIP-chip assays for ArcA and Fnr were performed under both fermentative and nitrate respiratory conditions (Figure 1A). Gene expression measurements were then used to determine causality of activation or repression for each ArcA or Fnr binding site under these same two conditions (as detailed in the later heatmap figure legend, Figure S1). We found 102, and 86 (and 143 and 132) binding regions and 58 and 54 (and 95 and 55) causal regulatory events for ArcA and Fnr under fermentation (and nitrate respiration) conditions, respectively (Figure 1A, Tables S1, S2, S3, S4). We then compiled the set of genomic sequences underlying these binding regions for each of the TFs and used the MEME program [21] to recover previously identified binding motifs [22], [23] (Figure 1B, Tables S5, S6). We confirmed 180 of 216 (83%) previously known regulatory events [24] and discovered 132 new binding regions relative to RegulonDB (Figure 1A), representing an increase of 74% over current knowledge of the regulatory functions of these two TFs. We further performed a detailed comparison of our results to recently published works [16], [25] to determine a 78% overlap in ArcA binding sites and a 50% overlap in Fnr binding sites under fermentative conditions (Figures S5, S6, S7). In addition, we report 88 novel binding sites for ArcA and 52 novel binding sites for Fnr under nitrate respiratory conditions highlighting plasticity of the network throughout shifting external electron acceptors.

**Fig. 1. ChIP-chip reveals hundreds of new binding regions and regulatory mechanisms.**

We then integrated transcription start sites (TSS) [26] with TF binding regions to identify promoter architectures [27]. The location of TF binding motifs within experimentally determined binding regions were used to prepare histograms of the frequency of TF binding relative to the TSS (Figure 1B). This analysis showed that ArcA spans the TSS or −35 box region and represses transcription while Fnr spans the −41.5 or alpha carboxy terminal domain and activates transcription [27]. While each of these regulatory strategies have been shown previously, here can we show that each strategy is ubiquitous at the genome-scale.

Discovery of transcription factor mediated bidirectional transcription

Novel cases of divergent transcriptional regulation were found in this data. The integration of binding regions with gene expression data revealed 42 regions where two divergent transcriptional units (TUs) were simultaneously regulated by a single binding event. Divergent transcriptional regulation has been observed previously [28] and is known to be mediated by transcription factors in certain cases. However, systematic regulation by global TFs has only been observed in limited cases [29]. We observe a total of 19 inverse, 16 dual activation, and 13 dual repression events for a total of 48 events spread across the 42 regions as some recur under different experimental conditions.

Two examples (Figure 1C) highlight this ‘hard coupling’ of the transcriptional regulation of seemingly unrelated but contextually dependent pathways. The acs-nrfABCDE system represents a lowest common denominator coupling between acetyl-coA synthetase (acs) acetate scavenging to acetyl-coA and usage of acetyl-coA via the TCA cycle and nrfABCDE nitrite reductase. Similarly the aroP-pdhR system couples the transport of aromatic amino acids to the regulation of pyruvate that acts as their principal precursor molecule.

The link between the acs and nrfABCD systems has been inferred/suggested in previous work which attempted to understand how E. coli could survive on acetate as a sole carbon source under anaerobic conditions [30]. In particular, E. coli cannot utilize acetate under fully anaerobic conditions because acetate must be scavenged into acetyl-coA via acs and then utilized by the TCA cycle. Anaerobically the TCA cycle cannot be used unless there is an electron acceptor in the ETC to enable oxidative phosphorylation. Thus, some usage of the TCA cycle via an alternative electron acceptor such as nitrite or nitrate is necessary for E. coli to utilize acetate and acetyl-coA anaerobically. This metabolic feature is physiologically crucial in the gut environment that is rich in fatty acids that cannot be used if E.coli does not utilize alternative electron acceptors like nitrite. Hence, the direct coupling of acs and nrfABCD through bidirectional transcriptional regulation is consistent with the necessity of a flux through the nrfABCD system in order for the acetyl-coA formed by acs to be utilized. The transcriptional coupling acts as bidirectional gate controlled by ArcA and the redox state of the cell to coordinate this evolutionarily crucial metabolic capability.

Similarly the aroP-pdhR system couples the transport of aromatic amino acids to the regulation of pyruvate that acts as their principal precursor molecule through the action of Fnr. To understand the network level connection between the aromatic amino acid transporter (aroP) and the pyruvate dehydrogenase repressor TF (pdhR) one can examine Figure 2, which shows the connection between catabolic biomass precursors and biosynthetic pathways. Tyrosine and tryptophan are both made directly from PEP that is rapidly dephosphorylated into pyruvate. The corresponding activation of aroP and repression of pdhR is consistent with an increased need for amino acid transport when the precursors for biosynthesis (PEP) are critical to maintain cellular energy levels. This characteristic is supported by a dampening of the switch upon the transition to nitrate respiration, resulting in decreased transporter expression when less pyruvate is needed for fermentation and can thus be shuttled to amino acid biosynthesis. In general, pdhR acts as a classic repressor that “pops off” of its binding site in the presence of pyruvate and hence allows expression of pyruvate dehydrogenase and other oxidative enzymes. Anaerobically pyruvate dehydrogenase (aceEF-lpd) is repressed regardless of pdhR by ArcA and Fnr and given that there is also a higher concentration of pyruvate it would presumably not be active. Thus, while this switch is highlighted anaerobically in that full repression of pdhR is concomitant with aroP activation its physiological significance is more prevalent under nitrate or even fully aerobic conditions in which it can function to directly couple and balance the catabolic and anabolic demands around pyruvate which acts as a critical second messenger in the aerobic-anaerobic shift [6]. It is very insightful to view such a switch as it is ramped fully up under anaerobic conditions and then turned down under nitrate respiration to maintain a physiologically crucial metabolic balance.

**Fig. 2. ArcA and Fnr ubiquitously regulate the three branches of metabolism.**

Ubiquitous regulation of the principal dimensions of metabolism by ArcA and Fnr

Previous work has identified biomass production and energy production as the two principal dimensions characterizing the overall function of metabolic networks [17]–[19]. This duality in function is conceptually equivalent to considering heterotrophic metabolism as the standard combustion equation (Figure 2) in which an electron donor (glucose) is broken apart with an electron acceptor (oxygen, nitrate, etc.) to form biomass, energy, waste and heat. Here we use the terms catabolism to describe oxidation of the electron donor, anabolism to describe biomass formation, and chemiosmosis to describe energy generation. The genes in each of these categories were determined by a manual curation of the E. coli metabolic model [31] and associated literature sources [4], [26]. Catabolic genes correspond to nutrient transporters, recycling machinery, and central catabolic machinery. Anabolic genes correspond to biosynthetic and macromolecular synthesis pathways. Chemiosmotic genes correspond to the electron transport chain (ETC), fermentation pathways, and ion pumps (Figure 3).

**Fig. 3. Integration of ChIP-chip, gene expression, and biological context.**

From the data sets described above, the regulation of these three classes of genes by ArcA and Fnr can be analyzed using their metabolic functions as context. ArcA and Fnr directly regulate a total of 127 catabolic genes including 49 transporter genes, 38 recycling or secondary catabolic enzymes, 33 central metabolic genes, and 7 associated TFs (Figures 2,3). In particular, recovery of all of the classic targets of ArcA and Fnr is complemented by the simultaneous discovery of transporter genes and recycling enzymes like peptidases and proteases (Figure 3). It can also be recognized that there existed many classically unknown glycolytic targets along with generally unrecognized activation of the glucose transporter ptsG. Activation of ptsG by Fnr is consistent with the fact that cells nearly double their uptake of carbon during fermentative growth compared with aerobic growth.

In anabolism, ArcA and Fnr directly regulate 54 genes including 34 metabolite synthesis genes, 14 macromolecular synthesis genes, and 6 TFs. Broad trends of nucleotide biosynthesis activation and amino acid biosynthetic activation of nucleotide precursors is consistent with redox related demands. However, perhaps the most important of these findings is the regulation of both transhydrogenases (sthA, pntAB) in E. coli. Previous work has shown that a large portion of the NADPH used for biosynthetic reactions comes from the membrane bound transhydrogenase PntAB [32] and that the soluble SthA is used for re-oxidation of NADPH under aerobic growth with excess glucose. Our data shows that ArcA activates pntAB and represses sthA in a redox-dependent fashion consistent with an increased need for NADPH under nitrate respiration relative to fermentation (Figure 3). This regulatory shuttling of reduction equivalents thus plays a critical role in maintaining the balance between growth and energy generation by increasing growth only once when energy demands are satisfied.

In the chemiosmotic category we observe regulation of 120 genes including 83 genes of the ETC, 6 for fermentation, 21 for ion pumps, 2 for motility, and 8 TFs. Nearly all of the regulation can be shown to coincide with redox related demands including regulation of ion pumps which coincides with an increased need to maintain a positive electrical gradient across the inner membrane to make up for the diminished proton gradient. We also observed strong regulation of the flhDC, gadW, and appY transcription factors. The flhDC system is a master regulator for the motility and flagellum apparatus of the cell that feeds off the chemiosmotic gradient in search of nutrients. appY and gadW are key regulators of cytochromes and acidic tolerance, respectively. After including regulation through appY we can conclude that ArcA and Fnr exhibit control either directly or indirectly over 15 out of the 16 known dehydrogenase and oxidoreductase reactions in E. coli [4] (Figures 2,3).

High-level architecture of the metabolic-regulatory network

Enumerating regulatory events is informative, but how do they all together form a coherent regulatory logic that produces meaningful physiological states? Network analysis of these regulatory interactions reveals a qualitative feedforward and feedback flow-based model of the primary metabolic dimensions (Figure 4A). The model input is the total set of catabolites (glucose or electron donor) available to the cell that are oxidized based on the availability of an electron acceptor into a ratio of reduced to oxidized components. These components (primarily NADH/NAD and NADPH/NADP) are then used by the anabolic machinery to generate biomass, or by the chemiosmotic machinery to generate energy as outputs. The ratio of reduced-to-oxidized components is sensed by ArcA and Fnr [1], and they can feedback and feedforward regulate the catabolic, anabolic, and chemiosmotic processes in a coordinated fashion to maintain the ratio. Consistent with this schema, it has been shown that TFs are ideal flux sensors [33].

**Fig. 4. Flow based model of the metabolic-regulatory network explains regulation throughout the anaerobic shift.**

Feedforward with feedback-trim architecture regulates the anaerobic shift

Analyzing the regulatory events within the context of the qualitative flow-based model reveals a feedforward with feedback-trim architecture of the overall regulatory logic. Counting the number of genes that are activated or repressed (Figure 3) provides a measure of the extent of feedforward or feedback regulation exerted (Figure 4B). Under fermentation ArcA represses 70 catabolic genes and Fnr activates 75 chemiosmotic genes. Under nitrate respiration ArcA represses 73 catabolic genes and Fnr activates 61 chemiosmotic output genes. A similar trend is observed for regulation of the anabolic circuitry in which Fnr activates 14 and 11 genes under fermentation and nitrate respiration. This circuitry is consistent with fast sensing of oxygen by Fnr and slow but continuous sensing of redox flow through the ETC by ArcA [34].

The regulatory architecture revealed by this qualitative model is comprehensive and novel, but primarily topological. To more quantitatively assess the functions of the observed transcriptional regulatory architecture on the metabolic network that it regulates we sampled all allowable network flux states of a highly curated genome-scale metabolic model of E. coli metabolism [31] under both fermentative and nitrate respiratory conditions. This sampling of allowable flux states of the metabolic network was then integrated with the experimentally determined regulatory architecture to discern the amount of total flux (sum of flux loads across all reactions) regulated by ArcA and Fnr under each of the conditions studied. This calculation revealed that 60% and 57% (and 88% and 80%) of all metabolic flux is directly (and indirectly) controlled by ArcA and Fnr under fermentative and nitrate respiratory conditions respectively (Tables S7, S8). We further show that 69% and 62% of the catabolic fluxes producing each of the redox molecules and biomass precursors along with 71% and 69% of the downstream anabolic and chemiosmotic fluxes are directly regulated under fermentative and nitrate respiratory conditions respectively (Figure S3, Table S9, S10). From a gene level we find that 246 genes are differentially expressed (fdr<.05, fold change >2) between fermentative and nitrate respiratory conditions and that 236/246 or ∼96% of the genes are directly (73) or indirectly (163) regulated by ArcA or Fnr (Table S12). Taken together, these measurements quantify the global metabolic regulation of flux by ArcA and Fnr and provide further evidence towards the proposed feedforward with feedback-trim regulatory architecture.

To provide more validation for the feedforward with feedback-trim architecture at the genome-scale we first assessed the set of 91 reactions that significantly differed (flux cutoff of 0.25 mmol/gDW-1 -h-1) between fermentation and nitrate respiration; gDW is denotes grams dry weight. We were then able to show that 89 of the 91 reactions were regulated directly (40 reactions) or indirectly (49 reactions) by ArcA or by Fnr (Table S11). We then calculated the change in flux for each of these 89 reactions between the two conditions along with the change in regulatory strength for the genes encoding these 89 reactions across the same conditions (Table S11). We plotted the change in flux versus the change in regulation (Figure 5A) and calculated an r² correlation value of 0.71 (p<1e-6) for the directly regulated genes. This correlation provides quantitative evidence for the logic of the regulatory circuit in the transition from fermentation to nitrate respiration. The linear positive slope shows not only that the reactions responsible for the redox shift are regulated, but also that these reactions are quantitatively regulated to help minimize the redox ratio in concert with the quantitative model predictions. Most of the ArcA regulated reactions are de-repressed, as indicated by the lightening shade of blue under nitrate respiration (Figure 5B). Most of the Fnr regulated reactions are de-activated as highlighted by the lightening shade of yellow under nitrate respiration (Figure 5B). The broad repression of crucial catabolic genes by ArcA and activation of chemiosmotic genes by Fnr is also shown through analysis of C-13 MFA data generated between wild type and Δfnr or ΔarcA strains (Figure S8). This trend of redox ratio minimization was so strong that the only outliers resulted in identification of new biology in the form of transport-coupled redox balancing for allosterically regulated amino acid biosynthetic reactions (Figure S4, Text S1).

**Fig. 5. Quantitative correlation of shifts in regulatory strength between experimental conditions with shifts in flux through regulated enzymes.**

We then sought to show that this quantitative regulatory model was truly redox dependent and not just fermentative/nitrate respiration specific. We thus took C-13 measured flux data [35] for E. coli grown aerobically in batch under either fully respiratory galactose conditions or partially fermentative glucose conditions. Even though both conditions are aerobic, we hypothesized that a similar shift in the redox ratio as observed between fully fermentative and nitrate respiration would occur given the comparison between a partially fermentative and fully respiratory condition. We made the same plot (Figure 5C) as in Figure 5a and even used regulatory strengths taken from the fermentative/nitrate shift. Only 16 flux measurements could be mapped of which only 9 showed any difference between glucose and galactose conditions. Of those 9 fluxes we were able to see a clear correlation for 7 and an overall weak but significant r² correlation value of .26 (p = .079). This plot again shows genes regulated by ArcA being de-repressed and genes regulated by Fnr being de-activated upon the shift to more oxidative conditions (Figure 5D).

Hierarchy of the joint metabolic-regulatory network

An expansion of the top-level of the flow-based model contextualizes the function of the hundreds of individual gene products and provides a window into the structure of the full metabolic-regulatory network (Figure 6A). Each different type of catabolite (Figure 3, Figure 4A, Figure 6A) is maintained via production fluxes (transport or recycling) and consumption fluxes (secondary catabolism or central catabolism). The catabolism specific production set consists of genes for amino acid, carbohydrate, lipid, and nucleic acid transport and recycling. The same expansion can be performed for anabolism and chemiosmosis. For anabolism, the total biomass is a result of the sum of the rate of metabolite biosynthesis plus the rate of macromolecular synthesis [36] minus the rate of dilution and recycling. For chemiosmosis, the total gradient is a sum of protons pumped across the inner membrane via the ETC, proton equivalents pumped across the inner membrane via fermentation, and ions translocated across the inner membrane minus the usage of the gradient for ATP production, nutrient transport, and motility [37].

**Fig. 6. Topological organization of the joint metabolic-regulatory network.**

This expansion also accounts for the classically observed hierarchy [38] of the TRN via sensing of lower level metabolites and subsequent regulatory control of the TFs themselves or of the production or consumption pathways for sensed metabolites (Figure 6B). A full tracing of the TRN to explain the effects of the global TF deletion is consistent with 69% of observed differential expression (Figure S2).

Discussion

This work presents a systems level and genome-scale mechanism for the coordinate action of global transcription factors throughout an electron acceptor shift. Our mechanism accounts for the previously unexplained genes regulated by ArcA and Fnr, it predicts changes in flux patterns, and perhaps most importantly shows that the classically observed hierarchy of transcriptional regulation mirrors the hierarchy of dimensions in the metabolic network. By basing our work off of the extensive body of detailed biological literature and the more recent work of principal dimensionality in metabolic networks we are able to present a systematic and remarkably consistent genome-scale mechanism.

At the local level, we first greatly expanded the number of cases of promoter architectures [39]. This validates and highlights the importance of understanding initiation mechanisms, as they may be extendable to a systems level in future development of computational models. We were then able to make the novel discovery that 42 regions across the genome contained divergently transcribed TUs controlled by a single global TF binding region. We recognize that due to ChIP-chip resolution it is possible (and even likely) that multiple binding sites exist under the larger ChIP peak, however the local proximity still affords the same hard-coupling within the regulon. This hard coupling suggests switch like mechanisms in which sets of seemingly unrelated genes are jointly regulated to obey non-obvious systems level constraints. We identify two such cases of this in the acs-nrfABCDE operon and the aroP-pdhR operon.

To understand systems level mechanisms of transcriptional regulation we turned to previous work that showed the principal dimensions of a metabolic space were biomass and energy generation. We hypothesized that global regulators must play a role in regulating globally decisive metabolic dimensionality. This hypothesis is supported by broad regulation across all of these main categories and the abilities of ArcA and Fnr to sense the molecules that govern the branch point between the two dimensions.

Although we were able to make an unbiased characterization of the genes in each of the categories using the iJO1366 model we were still unsatisfied with such a coarse grained approach and sought to understand the composition of each of the categories. This led us to a hierarchical expansion and classification of pathways around key metabolic intermediates. Going on in this fashion led us to realize that the global transcriptional regulatory hierarchy plays out not only on the level of TF-TF regulation, but perhaps more importantly at the level of global TFs regulating the production or consumption fluxes of lower level metabolites which are correspondingly sensed by other intermediate regulators. In essence, the regulatory network is shaped by the underlying metabolite pools and vice versa.

After determining the broad circuitry of the metabolic-regulatory network we mapped our data onto it and discovered that a strong feedforward with feedback trim architecture dominates at the genome scale. This occurs via ArcA's strong repression of input catabolic circuits coupled with Fnr's strong activation of downstream chemiosmotic and anabolic circuitry. This circuit is corroborated by Fnr's ability to sense oxygen [13] which will diffuse quickly whereas ArcA will more continuously sense the flow of reducing equivalents through the ETC by sensing of the ratio of reduced to oxidized quinones [12]. This pattern of a fast component feeding forward for downstream “planning” coupled with a slower but continuous feedback sensor is a common pattern in basic process control schemes [40]. If coupled with other common process control patterns such as hierarchical and PID control one can envision a process control based model for the entire joint metabolic-regulatory network.

This work presents a formal integration and reconstruction of over 50 years of research on E. coli metabolism and its transcriptional regulation. The result is a detailed and coherent hierarchical view of the regulation of the principal dimensions of metabolism through a critical environmental shift. We find that the mathematical notions of optimality in metabolic functions are in line with our observations of global regulation. TRNs are not just TF-gene networks but rather TF-gene-enzyme-reaction flux networks, that are tightly integrated as levels or ratios of metabolites can drive TF activity [41], [42]. The full elucidation of an electron acceptor response in the important model organism, E. coli, may have implications for similar metabolic responses in other organisms. For cancer, recent focus has shifted towards an understanding of the metabolic drivers and Warburg effect, where the hypoxia inducible factor (HIF) [43] senses the redox ratio and feedforward or feedback regulates genes producing or consuming reduction potential.

Taken together, we are able to show how the two principal dimensions of metabolism are controlled in a shifting environment by global TFs through the use of polyomic data sets and genome-scale metabolic models. This study is likely to be useful as a guide for similar studies in other organisms where the same tools for experimentation and analysis are available.

Methods

Bacterial strains and growth conditions

All strains used in this study were E. coli K-12 MG1655 and its derivatives. The E. coli strains harboring Fnr-8myc and ArcA-8myc were generated as described previously [44]. The deletion mutants (Δfnr and ΔarcA) were constructed by a λ red and FLP-mediated site-specific recombination method. Glycerol stocks of E. coli strains were inoculated into M9 minimal medium containing 0.2% (w/v) carbon source (glucose) and 0.1% (w/v) nitrogen source (NH4Cl), and cultured overnight at 37°C with constant agitation. The cultures were diluted 1∶100 into fresh minimal medium and then cultured at 37°C to an appropriate cell density with constant agitation. For the anaerobic cultures, the minimal medium were flushed with nitrogen and then continuously monitored using a polarographic-dissolved oxygen probe (Cole-Parmer Instruments) to ensure anaerobicity. For nitrate respiration 20 mmol potassium nitrate was added.

ChIP-chip

To identify Fnr and ArcA binding regions in vivo, we used the ChIP-chip approach as described previously [44], [45]. Briefly, cells at appropriate cells density were cross-linked by 1% formaldehyde at ∼20°C for 25 min. Following the quenching of the unused formaldehyde with a final concentration of 125 mM glycine at ∼20°C for 5 min, the cross-linked cells were harvested and washed three times with 50 ml of ice-cold Trisbuffered saline. The washed cells were resuspended in 0.5 ml lysis buffer composed of 50 mM Tris-HCl (pH 7.5), 100 mM NaCl, 1 mM EDTA, 1 µg/ml RNaseA, protease inhibitor cocktail (Sigma) and 1 kU Ready-Lyse lysozyme Epicentre). The cells were incubated at 37°C for 30 min and then treated with 0.5 ml of 2 Å∼IP buffer with the protease inhibitor cocktail. The lysate was then sonicated four times for 20 s each in an ice bath to fragment the chromatin complexes using a Misonix sonicator 3000 (output level, 2.5). The range of the DNA size resulting from the sonication procedure was 300–1,000 base pairs (bp). The specific antibodies that specifically recognizes myc tag (9E10, Santa Cruz Biotech) were used to immunoprecipitate each chromatin complex, respectively. For the control (mock-IP), 2 µg of normal mouse IgG (Upstate) was added into the cell extract. The remaining ChIP-chip procedures were performed as described previously [44], [45]. The high-density oligonucleotide tiling arrays used to perform ChIP-chip analysis consisted of 371,034 oligonucleotide probes spaced 25 bp apart (25 bp overlap between two probes) across the E. coli genome (Roche NimbleGen). After hybridization and washing steps, the arrays were scanned on an Axon GenePix 4000B scanner and features were extracted as a pair format by using NimbleScan 2.4 software (RocheNimbleGen).

qPCR

To monitor the enrichment of promoter regions, 1 µL immunoprecipitated DNA was used to carry out gene-specific qPCR. The quantitative real-time PCR of each sample was performed in triplicate using iCycler (Bio-Rad Laboratories) and SYBR green mix (Qiagen). The real-time qPCR conditions were as follows: 25 µL SYBR mix (Qiagen), 1 µL of each primer (10 pM), 1 µL of immunoprecipitated or mock-immunoprecipitated 3DNA and 22 µL of ddH2O. All real-time qPCR reactions were done in triplicates. The samples were cycled to 94°C for 15 s, 52°C for 30 s and 72°C for 30 s (total 40 cycles) on a LightCycler (Bio-Rad). The threshold cycle values were calculated automatically by the iCycler iQ optical system software (Bio-Rad Laboratories). Any primer sequences used were described previously [44].

Transcriptome analysis

Samples for transcriptome analysis were taken from exponentially growing cells. From the cells treated by RNAprotect Bacteria Reagent (Qiagen), total RNA samples were isolated using RNeasy columns (Qiagen) in accordance with manufacturer's instruction. Total RNA yields were measured using a spectrophotometer (A260), and quality was checked by visualization on agarose gels and by measuring the sample A260/A280 ratio (>1.8). Affymetrix GeneChip E. coli Genome 2.0 arrays were used for genome-scale transcriptional analyses. cDNA synthesis, fragmentation, end-terminus biotin labeling, and array hybridization were performed as recommended by Affymetrix standard protocol. Raw CEL files were analyzed using robust multi-array average for normalization and calculation of probe intensities. The processed probe signals derived from each microarray were averaged for both the wild type and deletion mutant strains.

ChIP-chip and expression data analysis

To identify TF-binding regions, we used the peak finding algorithm built into the NimbleScan software. Processing of ChIP-chip data was performed in three steps: normalization, IP/mock-IP ratio computation (log base 2), and enriched region identification. The log2 ratios of each spot in the microarray were calculated from the raw signals obtained from both Cy5 and Cy3 channels, and then the values were scaled by Tukey bi-weight mean. The log2 ratio of Cy5 (IP DNA) to Cy3 (mock-IP DNA) for each point was calculated from the scanned signals. Then, the bi-weight mean of this log2 ratio was subtracted from each point. Each log ratio dataset from duplicate samples was used to identify TF-binding regions using the software (width of sliding window = 300 bp). Our approach to identify the TF-binding regions was to first determine binding locations from each data set and then combine the binding locations from at least five of six datasets to define a binding region using the MetaScope software (http://sbrg.ucsd.edu/Downloads/MetaScope). Raw gene expression CEL files were normalized using background corrected robust multi-array average implemented in the R affy package. To detect differential expression between the wild type and TF deletion strains we applied a two-tailed unpaired students t-test between the experimental triplicates for the wild type and gene deletion strains. This was followed by a false discovery rate adjustment. Before performing the FDR correction we removed all genes that exhibited an expression level below the background across all experiments. The background level was calculated as the average expression level across all intergenic probes. We then only considered genes meeting a 5% FDR (false discovery rate)-adjusted P-value cut-off to be differentially expressed. ChIP binding tracks for Figure 1a and the heatmap for Figure 3 were generated using D3 [46]. Related code is available at http://nbviewer.ipython.org/gist/steve-federowicz/7cceedba73982c0ae995. All raw and processed data have been deposited in NCBI/GEO under accession number GSE55367 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE55367).

Motif searching

The ArcA and Fnr binding motif analysis was completed using the MEME and FIMO tools from the MEME software suite [21]. We first determined the proper binding motif and then scanned the full genome for its presence. The elicitation of the motif was done using the MEME program on the set of sequences defined by the ArcA and Fnr binding regions respectively. Using default settings the previously determined ArcA and Fnr motifs were recovered and then tailored to the correct size by setting the width parameter to 18-bp and 16-bp respectively. We then used these motifs and the PSPM (position specific probability matrix) generated for each by MEME to rescan the entire genome with the FIMO program.

Promoter architecture determination

We integrated transcription start sites (TSS) with our TF binding regions to identify promoter architectures genome wide [27], [47]. We first determined the location of motif binding sites within experimentally determined binding regions. We then calculated the distance between motif center position and previously determined TSS locations [26]. Finally, we prepared a histogram of the number of motifs that occur at varying distances relative to the TSS (Figure 1B) and included the gene expression data to determine the regulatory outcome of each binding event. The results showed that ArcA spans the TSS or −35 box region and represses transcription while Fnr spans the −41.5 or alpha carboxy terminal domain [47] and activates transcription. The histograms also reveal the previously reported trend [48] of motif frequency oscillation at a roughly 10.5 bp interval consistent with helical phasing of the DNA strand.

Genome-scale metabolic sampling

To perform sampling we first generated pFBA [49] constrained models of the iJO1366 [31] metabolic model corresponding to fermentative and nitrate respiratory conditions. Fermentative conditions were simulated by setting the lower bound of the oxygen exchange reaction (EX_o2) to zero. Nitrate respiratory conditions were simulated by setting the lower bound for nitrate uptake (EX_no3) to −20 mmol gDW⁻¹ h⁻¹ (mirroring experimental addition of 20 mmol KNO₃) along with the lower bound of EX_o2 set to zero. pFBA constrained models were generated by first using the convertToIrreversible() function of the COBRA toolbox [50] followed by a standard FBA for growth rate. This growth rate was then imposed as a constraint in a subsequent optimization that found the minimum sum of flux able to achieve that growth rate. Finally, using the gpSampler() [50] method we sampled each of the pFBA constrained models. All sampling runs were for a full 24 hours to ensure a mixing fraction below .55. After sampling was performed we took the average across the 7046 sampling points (2n where n = 3,523 reactions in the metabolic model). Sampling results were then interfaced with the regulatory network and metabolic model via the COBRApy project (http://opencobra.sourceforge.net/openCOBRA), iPython notebook [51], and in-house databases.

Supporting Information

Zdroje

1. GreenJ, PagetMS (2004) Bacterial redox sensors. Nat Rev Micro 2 : 954–966.

2. IuchiS, LinEC (1991) Adaptation of Escherichia coli to respiratory conditions: regulation of gene expression. Cell 66 : 5–7.

3. RolfeMD, BeekAT, GrahamAI, TrotterEW, AsifHMS, et al. (2011) Transcript Profiling and Inference of Escherichia coli K-12 ArcA Activity across the Range of Physiologically Relevant Oxygen Concentrations. Journal of Biological Chemistry 286 : 10147–10154 doi:10.1074/jbc.M110.211144

4. UndenG, DünnwaldP (2008) The Aerobic and Anaerobic Respiratory Chain of Escherichia coli and Salmonella enterica: Enzymes and Energetics. Ecosal 79 : 4218–4226.

5. ConstantinidouC, HobmanJ, GriffithsL, PatelM, PennC, et al. (2006) A reassessment of the FNR regulon and transcriptomic analysis of the effects of nitrate, nitrite, NarXL, and NarQP as Escherichia coli K12 adapts from aerobic to nitrate respiration. The Journal of biological chemistry 281 : 4802–4815.

6. TrotterEW, RolfeMD, HounslowAM, CravenCJ, WilliamsonMP, et al. (2011) Reprogramming of Escherichia coli K-12 Metabolism during the Initial Phase of Transition from an Anaerobic to a Micro-Aerobic Environment. PLoS ONE 6: e25501 doi:10.1371/journal.pone.0025501.t003

7. IuchiS, LinECC (1993) Adaptation of Escherichia coli to redox environments by gene expression. Mol Microbiol 9 : 9–15 doi:10.1111/j.1365-2958.1993.tb01664.x

8. IuchiS, WeinerL (1996) Cellular and molecular physiology of Escherichia coli in the adaptation to aerobic environments. J Biochem 120 : 1055–1063.

9. AlexeevaS, HellingwerfK (2003) Requirement of ArcA for Redox Regulation in Escherichia coli under Microaerobic but Not Anaerobic or Aerobic Conditions. Journal of Bacteriology 185 (1) 204–9.

10. Shalel LevanonS, SanK-Y, BennettGN (2005) Effect of oxygen on the Escherichia coli ArcA and FNR regulation systems and metabolic responses. Biotechnol Bioeng 89 : 556–564 doi:10.1002/bit.20381

11. PartridgeJD, ScottC, TangY, PooleRK, GreenJ (2006) Escherichia coli transcriptome dynamics during the transition from anaerobic to aerobic conditions. The Journal of biological chemistry 281 : 27806–27815 doi:10.1074/jbc.M603450200

12. MalpicaR, FrancoB, RodriguezC, KwonO, GeorgellisD (2004) Identification of a quinone-sensitive redox switch in the ArcB sensor kinase. Proceedings of the National Academy of Sciences 101 : 13318–13323 doi:10.1073/pnas.0403064101

13. KileyP, BeinertH (1998) Oxygen sensing by the global regulator, FNR: the role of the iron-sulfur cluster. FEMS Microbiology Reviews

14. UndenG, AchebachS, HolighausG (2002) Control of FNR Function of Escherichia coli by O2 and Reducing Conditions. Journal of molecular microbiolog 4 (3) 263–8.

15. AchebachS, SelmerT, UndenG (2005) Properties and significance of apoFNR as a second form of air-inactivated [4Fe-4S]·FNR of Escherichia coli. FEBS Journal 272 : 4260–4269 doi:10.1111/j.1742-4658.2005.04840.x

16. ParkDM, AkhtarMS, AnsariAZ, LandickR, KileyPJ (2013) The Bacterial Response Regulator ArcA Uses a Diverse Binding Site Architecture to Regulate Carbon Oxidation Globally. PLoS Genet 9: e1003839 doi:10.1371/journal.pgen.1003839.s017

17. CarlsonR, SriencF (2003) Fundamental Escherichia coli biochemical pathways for biomass and energy production: Identification of reactions. Biotechnol Bioeng 85 : 1–19 doi:10.1002/bit.10812

18. SchuetzR, ZamboniN, ZampieriM, HeinemannM, SauerU (2012) Multidimensional Optimality of Microbial Metabolism. Science 336 : 597–601 doi:10.1126/science.1216283

19. NoorE, EdenE, MiloR, AlonU (2010) Central Carbon Metabolism as a Minimal Biochemical Walk between Precursors for Biomass and Energy. Molecular Cell 39 : 809–820 doi:10.1016/j.molcel.2010.08.031

20. CarlsonR, SriencF (2004) Fundamental Escherichia coli biochemical pathways for biomass and energy production: creation of overall flux states. Biotechnol Bioeng 86 : 149–162 doi:10.1002/bit.20044

21. BaileyTL, BodenM, BuskeFA, FrithM, GrantCE, et al. (2009) MEME SUITE: tools for motif discovery and searching. Nucleic acids research 37: W202–W208 doi:10.1093/nar/gkp335

22. RobisonK, McGuireAM, ChurchGM (1998) A comprehensive library of DNA-binding site matrices for 55 proteins applied to the complete Escherichia coli K-12 genome. J Mol Biol 284 : 241–254 doi:10.1006/jmbi.1998.2160

23. McGuireAM, De WulfP, ChurchGM, LinECC (1999) A weight matrix for binding recognition by the redox-response regulator ArcA-P of Escherichia coli. Mol Microbiol 32 : 219–221 doi:10.1046/j.1365-2958.1999.01347.x

24. Gama-CastroS, SalgadoH, Peralta-GilM, Santos-ZavaletaA, Muniz-RascadoL, et al. (2010) RegulonDB version 7.0: transcriptional regulation of Escherichia coli K-12 integrated within genetic sensory response units (Gensor Units). Nucleic acids research 39: D98–D105 Available: http://nar.oxfordjournals.org/content/39/suppl_1/D98.short.

25. MyersKS, YanH, OngIM, ChungD, LiangK, et al. (2013) Genome-scale Analysis of Escherichia coli FNR Reveals Complex Features of Transcription Factor Binding. PLoS Genet 9: e1003565 doi:10.1371/journal.pgen.1003565.s021

26. KeselerIM, Collado-VidesJ, Santos-ZavaletaA, Peralta-GilM, Gama-CastroS, et al. (2010) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic acids research 39: D583–D590 doi:10.1093/nar/gkq1143

27. EstremS, RossW, GaalT, ChenZ, NiuW, et al. (1999) Bacterial promoter architecture: subsite structure of UP elements and interactions with the carboxy-terminal domain of the RNA polymerase α subunit. Genes & Development 13 : 2134–2147.

28. BeckCF, WarrenAJ (1988) Divergent Promoters, a Common Form of Gene Organization. Microbiological reviews 52 : 318–326.

29. ChumsakulO, TakahashiH, OshimaT, HishimotoT, KanayaS, et al. (2011) Genome-wide binding profiles of the Bacillus subtilis transition state regulator AbrB and its homolog Abh reveals their interactive role in transcriptional regulation. Nucleic acids Research 39 (2) 414–28.

30. ClarkDP, CronanJEJr (2005) Two-carbon compounds and fatty acids as carbon sources. Escherichia coli and Salmonella: cellular and molecular biochemistry

31. OrthJD, ConradTM, NaJ, LermanJA, NamH, et al. (2011) A comprehensive genome-scale reconstruction of Escherichia coli metabolism—2011. Mol Syst Biol 7 Available: http://www.nature.com/msb/journal/v7/n1/full/msb201165.html.

32. SauerU (2003) The Soluble and Membrane-bound Transhydrogenases UdhA and PntAB Have Divergent Functions in NADPH Metabolism of Escherichia coli. Journal of Biological Chemistry 279 : 6613–6619 doi:10.1074/jbc.M311657200

33. KochanowskiK, VolkmerB, GerosaL, van RijsewijkBRH, SchmidtA, et al. (2013) Functioning of a metabolic flux sensor in Escherichia coli. Proceedings of the National Academy of Sciences 110 : 1130–1135.

34. PartridgeJD, SanguinettiG, DibdenDP, RobertsRE, PooleRK, et al. (2007) Transition of Escherichia coli from Aerobic to Micro-aerobic Conditions Involves Fast and Slow Reacting Regulatory Components. Journal of Biological Chemistry 282 : 11230–11237 doi:10.1074/jbc.M700728200

35. van Rijsewijk BartRBH, NanchenA, NalletS, KleijnRJ, SauerU (2011) Large-scale 13C-flux analysis reveals distinct transcriptional control of respiratory and fermentative metabolism in Escherichia coli. Mol Syst Biol 7 : 1–12 doi:10.1038/msb.2011.9

36. LermanJA, HydukeDR, LatifH, PortnoyVA, LewisNE, et al. (2012) In silico method for modelling metabolism and gene product expression at genome scale. Nat Comms 3 : 929 doi:10.1038/ncomms1928

37. KrulwichTA, SachsG, PadanE (2011) Molecular aspects of bacterial pH sensing and homeostasis. Nat Rev Micro 9 : 330–343 doi:10.1038/nrmicro2549

38. Cosentino LagomarsinoM, JonaP, BassettiB, IsambertH (2007) Hierarchy and feedback in the evolution of the Escherichia coli transcription network. Proceedings of the National Academy of Sciences 104 : 5516–5520 doi:10.1073/pnas.0609023104

39. BrowningDF, BusbySJW (2004) The regulation of bacterial transcription initiation. Nat Rev Micro 2 : 57–65 doi:10.1038/nrmicro787

40. Seborg DE, Mellichamp DA, Edgar TF, Francis J Doyle I (2010) Process Dynamics and Control. Wiley.

41. PatilK, NielsenH (2005) Uncovering transcriptional regulation of metabolism by using metabolic network topology. Proceedings of the National Academy of Sciences 102 : 2685–2689.

42. GruningN-M, LehrachH, RalserM (2010) Regulatory crosstalk of the metabolic network. Trends in Biochemical Sciences 35 : 220–227 doi:10.1016/j.tibs.2009.12.001

43. SemenzaG (2012) Hypoxia-Inducible Factors in Physiology and Medicine. Cell 148 : 399–408.

44. ChoB-K, KnightEM, PalssonBØ (2006) PCR-based tandem epitope tagging system for Escherichia coli genome engineering. Biotechniques 40 (1) 67–72.

45. ChoBK, BarrettCL, KnightEM, ParkYS, PalssonBO (2008) Genome-scale reconstruction of the Lrp regulatory network in Escherichia coli. Proceedings of the National Academy of Sciences 105 : 19462–19467 doi:10.1073/pnas.0807227105

46. BostockM, OgievetskyV, HeerJ (2011) Data-Driven Documents 1–9.

47. GourseR, RossW (2000) UPs and downs in bacterial transcription initiation: the role of the alpha subunit of RNA polymerase in promoter recognition. Mol Microbiol 37 (4) 687–95.

48. SharonE, KalmaY, SharpA, Raveh-SadkaT, LevoM, et al. (2012) Inferring gene regulatory logic from high-throughput measurements of thousands of systematically designed promoters. Nat Biotechnol 30 : 521–530 doi:10.1038/nbt.2205

49. LewisNE, HixsonKK, ConradTM, LermanJA, CharusantiP, et al. (2010) Omic data from evolved E. coli are consistent with computed optimal growth from genome-scale models. Mol Syst Biol 6 : 390 doi:10.1038/msb.2010.47

50. SchellenbergerJ, QueR, FlemingRMT, ThieleI, OrthJD, et al. (2011) Quantitative prediction of cellular metabolism with constraint-based models: the COBRA Toolbox v2.0. Nat Protoc 6 : 1290–1307 doi:10.1038/nprot.2011.308

51. PerezF, GrangerBE (2007) IPython: A System for Interactive Scientific Computing. Comput Sci Eng 9 : 21–29 doi:10.1109/MCSE.2007.53