
Analysis of GATA3 ChIP-seq in the Four Cells


Disclaimer: This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

Any opinions, findings, conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of UK Essays.

Analysis of GATA3 ChIP-seq data in A549, MCF-7, T-47D and SK-N-SH cells

Highlights:

1. There were 4,839 common GATA3 target genes among the four cell lines.

2. The common GATA3 target genes were mainly enriched in cancer-related pathways.

3. Four representative GATA3 target genes were identified: ER1, FOXA1, AKR1C1 and LGALS1.

4. The “GAT” motif was significantly enriched in the GATA3 binding-site motifs of all four cell lines.

Abstract:

Objective: To investigate the regulatory mechanism of GATA3 in different tumor cell lines.

Methods: We downloaded FASTQ files of GATA3 ChIP-Seq data from the ENCODE project. The data came from human lung adenocarcinoma epithelial cells (A549), human hormone-dependent breast cancer cells (T-47D and MCF-7), and bone marrow-derived neuroblastoma cells (SK-N-SH). The GATA3 ChIP-Seq data were then compared pairwise and used to identify GATA3 binding sites. The GATA3 target genes in the four cell lines were further subjected to pathway enrichment analysis, Gene Ontology enrichment analysis and motif identification.

Results: The GATA3 ChIP-Seq data were strongly similar between MCF-7 and T-47D cells, and between T-47D and A549 cells. We identified 21,507 peaks in A549, 120,057 peaks in MCF-7, 141,332 peaks in SK-N-SH and 69,153 peaks in T-47D cells. There were 4,839 common GATA3 target genes among the four cell lines. In addition, four GATA3 target genes were examined in detail: the common target gene ER1, the gene FOXA1 shared by MCF-7 and T-47D, the A549-specific gene AKR1C1, and the SK-N-SH-specific gene LGALS1. Motif analysis showed significant enrichment of the “GAT” motif in the binding-site motifs of all four cell lines, with varying bases flanking the “GAT” core.

Conclusion: We identified several genes that may play an important role in the regulatory action of GATA3 in tumor models, including ER1, FOXA1, AKR1C1 and LGALS1. These results may provide a basis for further study of the regulatory mechanism of GATA3 in tumor models.

Key words: GATA3, ChIP-seq data, Human lung adenocarcinoma epithelial cells (A549), Human hormone-dependent breast cancer cells (T-47D and MCF-7), Bone marrow cells in neuroblastoma patients (SK-N-SH)

Introduction

The GATA family of zinc-finger transcription factors contains six members, which bind the DNA sequence (A/T)GATA(A/G) via a DNA-binding domain containing one or two zinc fingers [1]. GATA3 (GATA binding protein 3), one member of this family, plays an important role in promoting and directing cell proliferation, development, and differentiation in many tissues and cell types [2, 3], such as the mammary gland [4], luminal cells of the mammary gland [1], T lymphocytes [5] and thymocytes [6].

Many researchers have reported over-expression of GATA3 in breast carcinomas [7-9]. Low GATA3 expression has also been suggested to correlate with poor prognosis in breast cancer [2]. GATA3 expression is highly correlated with estrogen receptor α (ERα) in human breast tumors [10, 11]. In addition, one study showed that GATA3 can inhibit breast cancer metastasis by reversing the epithelial-mesenchymal transition [12]. Furthermore, aberrant expression of GATA3 has been reported in cancers other than breast cancer, such as pancreatic cancer [13], prostatic carcinoma [14] and neuroblastoma [15]. However, the mechanism of GATA3 in tumor progression remains unclear and needs further research.

In this study, to investigate the regulatory mechanism of GATA3 in different tumor cell lines, we used GATA3 ChIP-Seq data for human lung adenocarcinoma epithelial cells (A549), human hormone-dependent breast cancer cells (T-47D and MCF-7), and bone marrow-derived neuroblastoma cells (SK-N-SH). We then performed pairwise comparison of the GATA3 ChIP-Seq data and identified GATA3 binding sites. The GATA3 target genes in the four cell lines were further subjected to pathway enrichment analysis, Gene Ontology (GO) enrichment analysis and motif identification.

Material and methods

GATA3 ChIP-Seq data

GATA3 ChIP-Seq data for A549, T-47D, MCF-7 and SK-N-SH cells were used in this study. Although both T-47D and MCF-7 are human hormone-dependent breast cancer cell lines, they have different proteomic profiles [16]. The FASTQ files of the ChIP-Seq data were downloaded from the ENCODE (Encyclopedia of DNA Elements) project (https://www.encodeproject.org/). The ENCODE project aims to identify all functional elements in the human genome sequence.

Reads preprocessing

Before read mapping, low-quality sequences must be removed from the reads by multiple rounds of trimming and cleaning. Initial cleaning and trimming were performed with Trimmomatic (version 0.22) [17] using the default values.
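The idea behind quality trimming can be illustrated with a minimal sketch. This is not Trimmomatic's implementation; it only shows 3'-end trimming on Phred+33 quality strings, and the function names and thresholds (min_quality=3, min_length=36) are assumptions chosen for illustration, not settings reported in this study.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string (Phred+33 encoding) into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def trim_trailing(sequence, quality_string, min_quality=3):
    """Drop bases from the 3' end while their quality is below min_quality.

    min_quality is an illustrative threshold, not a value taken from the study.
    """
    scores = phred33_scores(quality_string)
    end = len(sequence)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


def keep_read(sequence, min_length=36):
    """Keep a trimmed read only if it is still long enough to map reliably."""
    return len(sequence) >= min_length
```

For example, trimming the read "ACGTAC" with qualities "IIII##" removes the last two bases, because "#" encodes a Phred score of 2.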

Reads Mapping

Reads were mapped to the hg19 human genome with Bowtie2 (version 2.0.1) [18]. Bowtie2 allows mismatches and gaps, and performs alignment quickly and with low memory use.
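A mapping step of this kind could be scripted roughly as follows. The sketch only assembles a Bowtie2 command line; it assumes single-end reads, and the index prefix, file names and thread count are hypothetical placeholders rather than values from this study.

```python
def bowtie2_command(index_prefix, fastq_path, sam_path, threads=4):
    """Build a Bowtie2 command for single-end reads (assumed read layout).

    -x: prefix of the genome index (e.g. a prebuilt hg19 index)
    -U: FASTQ file with unpaired reads
    -S: output SAM file
    -p: number of alignment threads
    """
    return [
        "bowtie2",
        "-p", str(threads),
        "-x", index_prefix,
        "-U", fastq_path,
        "-S", sam_path,
    ]


# Hypothetical paths, for illustration only.
command = bowtie2_command("hg19", "A549_GATA3.trimmed.fastq", "A549_GATA3.sam")
```

The resulting list can be passed to subprocess.run(command, check=True) on a machine where Bowtie2 and the index are installed.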

Pairwise comparison of ChIP-Seq data in the four cells

We performed pairwise comparison of the GATA3 ChIP-Seq data in the four cell lines by calculating the Pearson correlation coefficient of the overlapping GATA3 ChIP-Seq data. The Pearson correlation coefficients were normalized for peak width and height, the total number of peaks and the number of peaks in each of the four cell lines. A higher Pearson correlation coefficient indicates that the overlapping ChIP-Seq data are more nearly coincident between the two cell lines.
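As a minimal sketch of the pairwise comparison, the Pearson correlation can be computed between per-bin signal vectors for every pair of cell lines. The representation of each data set as a list of signal values over shared genomic bins is an assumption; the text does not specify how the signal was binned or normalized, so that step is left out.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(signal_by_cell):
    """Correlate every pair of cell lines; keys are (cell_a, cell_b) tuples."""
    return {
        (a, b): pearson(signal_by_cell[a], signal_by_cell[b])
        for a, b in combinations(sorted(signal_by_cell), 2)
    }
```

With four cell lines this yields six coefficients, one per pair, which is the layout a correlation table such as Table 1 would need.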

Identification of GATA3 binding sites

MACS (Model-based Analysis of ChIP-Seq) [19] and CisGenome [20] are two different peak detection algorithms for predicting transcription factor binding sites from ChIP-seq data. We first used MACS (version 2.0.10.20130712) to find GATA3 ChIP peaks, with a bandwidth of 200 and default values for the other parameters. Peaks with FDR (false discovery rate) < 0.05 were retained. CisGenome was then used to identify regions containing peaks (-1000, 2000), and regions with FDR < 0.05 were retained.
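The MACS settings described above (bandwidth 200, FDR cutoff 0.05) correspond roughly to the following command line. This is a sketch under assumptions: BAM input, the human genome size preset, and the file and experiment names are all hypothetical, and the use of a control sample is not stated in the text, so it is optional here.

```python
def macs2_callpeak_command(treatment_bam, name, control_bam=None,
                           bandwidth=200, qvalue=0.05):
    """Build a `macs2 callpeak` command.

    --bw sets the bandwidth used when building the shifting model and
    -q sets the minimum FDR (q-value) cutoff. The BAM format and the
    "hs" genome size preset are assumptions for this illustration.
    """
    command = [
        "macs2", "callpeak",
        "-t", treatment_bam,
        "-f", "BAM",
        "-g", "hs",
        "-n", name,
        "--bw", str(bandwidth),
        "-q", str(qvalue),
    ]
    if control_bam is not None:
        command += ["-c", control_bam]
    return command


# Hypothetical file and experiment names.
command = macs2_callpeak_command("MCF7_GATA3.bam", "MCF7_GATA3")
```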

Pathway enrichment analysis

To identify the pathways associated with the GATA3 target genes in the four cell lines, we used g:Profiler [21], a web-based toolset for functional interpretation of gene lists, for pathway annotation and enrichment analysis with a P value < 0.01.
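Enrichment tools of this kind are generally built on an over-representation test. The sketch below shows a plain one-sided hypergeometric test; it illustrates the general technique only and does not reproduce g:Profiler's own multiple-testing procedure. All parameter names are chosen for this example.

```python
from math import comb


def hypergeom_pvalue(population, pathway_size, sample_size, overlap):
    """P(X >= overlap) for a hypergeometric draw.

    population:   number of annotated background genes
    pathway_size: background genes that belong to the pathway
    sample_size:  number of genes in the target list
    overlap:      target genes that belong to the pathway
    """
    upper = min(pathway_size, sample_size)
    total = comb(population, sample_size)
    tail = sum(
        comb(pathway_size, k) * comb(population - pathway_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total
```

A pathway would be reported as enriched when this P value, after correction for the number of pathways tested, falls below the chosen cutoff (0.01 in this study).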

Gene ontology enrichment analysis of the GATA3 target-gene

DAVID (Database for Annotation, Visualization, and Integrated Discovery) [22] is an analytical tool for systematically extracting biological meaning from large gene or protein lists. We used DAVID to analyze the functions of the GATA3 target genes in the four cell lines, and defined significant functional enrichment of these genes in GO categories as a P value < 0.01.

Motif identification

Among highly expressed transcripts, the top 100 peaks with the lowest FDR in each of the four cell lines were selected for motif search. The highest point of each peak was used and extended by 50 bp on both sides. To identify binding-site motifs, the MEME-ChIP (Multiple EM for Motif Elicitation) software [23] was used with the default values, with random genomic sequences as background. Motifs meeting a P-value threshold of 0.001 were called enriched. The motifs identified in the four cell lines were then compared with existing motifs using the database of human motifs [24].
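The window selection described above can be sketched as follows. The Peak record and its field names are assumptions for this illustration; windows are returned as half-open (chromosome, start, end) intervals of up to 101 bp, ready for sequence extraction before motif search.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    chrom: str
    summit: int   # 0-based position of the highest point in the peak
    fdr: float


def top_summit_windows(peaks, n_top=100, flank=50):
    """Take the n_top peaks with the lowest FDR and return summit +/- flank windows.

    Windows are half-open intervals and are clipped at position 0.
    """
    ranked = sorted(peaks, key=lambda peak: peak.fdr)[:n_top]
    return [
        (peak.chrom, max(0, peak.summit - flank), peak.summit + flank + 1)
        for peak in ranked
    ]
```

Raising n_top to 200 reproduces the fallback used for A549 cells when the top 100 peaks yield no enriched motif.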

Results

Comparison of GATA3 ChIP-seq data in the four cells

As shown in Table 1, we observed a strong similarity between the GATA3 ChIP-seq data in MCF-7 and T-47D cells (Pearson correlation = 0.61). The GATA3 ChIP-seq data in T-47D cells were also strongly similar to those in A549 cells (Pearson correlation = 0.72). There was no similarity between the GATA3 ChIP-seq data in SK-N-SH cells and those in the other three cell lines.

Identification of GATA3 binding site

The numbers of peaks and the gene annotation information are shown in Table 2. We identified 21,507 peaks in A549, 120,057 peaks in MCF-7, 141,332 peaks in SK-N-SH and 69,153 peaks in T-47D cells, with FDR < 0.05. In addition, the lowest number of peaks was detected in T-47D cells, and the lowest number of annotated genes was in A549 cells.

As shown in Figure 1, 41.7% to 47.96% of peaks were located within intron regions, and 41.39% to 48.12% of peaks (< 2kb or within genes) were not annotated. Most of the peaks in exon regions were located in the first exon. For peaks near a transcriptional start site (TSS), the distance between most peaks and the TSS was less than 500 kb. The distribution of peaks across gene regions did not differ appreciably among the four cell lines. As shown in Figure 3, there were 4,839 common GATA3 target genes among the four cell lines.
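The common and cell-specific target genes can be derived with plain set operations, as in the sketch below. The gene lists are a toy example built from the four genes named in this essay plus two placeholder names; they are not the real target lists.

```python
def common_targets(targets_by_cell):
    """Genes that appear in the target list of every cell line."""
    gene_sets = [set(genes) for genes in targets_by_cell.values()]
    if not gene_sets:
        return set()
    return set.intersection(*gene_sets)


def specific_targets(targets_by_cell, cell):
    """Genes that are targets in `cell` and in no other cell line."""
    others = set()
    for name, genes in targets_by_cell.items():
        if name != cell:
            others |= set(genes)
    return set(targets_by_cell[cell]) - others


# Toy data for illustration; GENE_X and GENE_Y are placeholders.
toy_targets = {
    "A549": ["ER1", "AKR1C1", "GENE_X"],
    "MCF-7": ["ER1", "FOXA1", "GENE_X"],
    "T-47D": ["ER1", "FOXA1", "GENE_Y"],
    "SK-N-SH": ["ER1", "LGALS1", "GENE_Y"],
}
```

On this toy input, common_targets returns only ER1, while specific_targets picks out AKR1C1 for A549 and LGALS1 for SK-N-SH.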

Pathway enrichment analysis of the common GATA3 target genes in the four tumor cells

To explore whether the 4,839 common GATA3 target genes share specific pathways, we performed pathway enrichment analysis using g:Profiler. As shown in Table 3, 15 KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways were associated with the common GATA3 target genes. Six of the 15 KEGG pathways were related to cancer. Two other KEGG pathways were involved in focal adhesion and adherens junction, respectively. In addition, the common GATA3 target genes were also enriched in other pathways, including ubiquitin-mediated proteolysis. The ubiquitin-mediated proteolysis pathway is relevant to the cell cycle [25], and the remaining three pathways are related to cell migration [26-28].

Gene ontology enrichment analysis of the GATA3 target gene

Table 4 and Table 5 show the specific GO categories and pathways in each cell line, respectively. As shown in Table 5, only the GATA3 target genes in A549 cells showed specific pathway enrichment. The GATA3 target genes UGT1A9, UGT2A3 and UGT2B28 in A549 cells are uridine glucuronyl transferases, and were enriched in metabolic process and hormone biosynthesis. The GATA3 target genes in MCF-7 cells were mainly enriched in processes relevant to cell differentiation. The GATA3 target genes in SK-N-SH cells were mainly enriched in processes relevant to positive regulation of the I-kappaB kinase/NF-kappaB cascade. The GATA3 target genes in T-47D cells were mainly enriched in processes relevant to cell adhesion.

Distribution of GATA3 binding sites in relation to genome annotation

To investigate the distribution of GATA3 binding sites in relation to genome annotation in the four cell lines, we examined the binding signal in the region of the known GATA3 target gene encoding estrogen receptor alpha (ER1). As shown in Figure 4(A), GATA3 bound the ER1 region in all four cell lines, but the binding sites differed greatly. There were 34, 21, 2 and 10 peaks in MCF-7, T-47D, A549 and SK-N-SH cells, respectively. Furthermore, GATA3 in MCF-7 and T-47D cells bound many parts of the ER1 region, including promoter and intron regions, whereas GATA3 in A549 and SK-N-SH cells bound only the 3’-intron region.
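Per-gene peak counts such as those reported for the ER1 region can be obtained by counting peaks that overlap a genomic interval. The sketch below assumes peaks are given as half-open (chromosome, start, end) tuples; the coordinates in the usage example are hypothetical and are not the real ER1 locus.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def count_peaks_in_region(peaks, chrom, start, end):
    """Count peaks on `chrom` that overlap the half-open region [start, end)."""
    return sum(
        1
        for peak_chrom, peak_start, peak_end in peaks
        if peak_chrom == chrom and overlaps(peak_start, peak_end, start, end)
    )


# Hypothetical peaks and region, for illustration only.
toy_peaks = [("chr6", 100, 200), ("chr6", 500, 600), ("chr1", 100, 200), ("chr6", 190, 210)]
n_in_region = count_peaks_in_region(toy_peaks, "chr6", 150, 520)
```

Running the same count on each cell line's peak list gives one number per cell line for the chosen gene region.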

Next, to investigate the distribution of GATA3 binding sites, we examined the binding signal of three tumor-specific GATA3 target genes: FOXA1 (forkhead box A1), a target shared by MCF-7 and T-47D cells; the A549-specific gene AKR1C1 (human aldo-keto reductase); and the SK-N-SH-specific gene LGALS1 (lactose-binding lectin 1). As shown in Figure 4(B), a credible binding signal was detected at the 5’-upstream region of FOXA1 in MCF-7 and T-47D cells, and no peaks were detected in A549 and SK-N-SH cells. However, the binding-site peaks differed slightly between MCF-7 and T-47D cells. In T-47D cells, an apparent peak was observed at the FOXA1 3’UTR region, but no peak was observed at the same position in MCF-7 cells. In addition, because AKR1C1 and LGALS1 are specific GATA3 target genes for A549 and SK-N-SH cells, respectively, peaks for these genes could only be detected in the corresponding cells. As shown in Figure 4(C) and 4(D), apparent peaks were detected for AKR1C1 in A549 cells and for LGALS1 in SK-N-SH cells, and no apparent peak was observed in the other cells.

Motif analysis

The top 100 peaks with the lowest FDR were used for motif search, and the resulting motifs were compared with known motifs. However, no motif met the P-value threshold of 0.001 in the top 100 peaks in A549 cells. Thus, the top 200 peaks in A549 cells were used for motif search, and motifs meeting the threshold were obtained. As shown in Figure 5, the GATA3 binding-site motifs identified in the four cell lines differed only slightly and were in good agreement with the known GATA3 binding-site motif. In addition, we found significant enrichment of the “GAT” motif in the four GATA3 binding-site motifs, with varying bases flanking the “GAT” core.
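A simple way to check sequences for the GATA consensus (A/T)GATA(A/G) quoted in the Introduction is a regular-expression scan on both strands. This sketch is not how MEME-ChIP discovers motifs; it only locates exact consensus matches, and the function names are chosen for this example.

```python
import re

# Lookahead so that overlapping matches are all reported.
GATA_CONSENSUS = re.compile(r"(?=([AT]GATA[AG]))")
MOTIF_LENGTH = 6

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of an upper-case DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


def find_gata_sites(sequence):
    """Return sorted (start, strand, site) hits; start is on the forward strand."""
    sequence = sequence.upper()
    hits = []
    for match in GATA_CONSENSUS.finditer(sequence):
        hits.append((match.start(), "+", match.group(1)))
    for match in GATA_CONSENSUS.finditer(reverse_complement(sequence)):
        forward_start = len(sequence) - match.start() - MOTIF_LENGTH
        hits.append((forward_start, "-", match.group(1)))
    return sorted(hits)
```

Tallying the first and last base of each reported site across the summit windows would show how the bases flanking the “GAT” core vary between cell lines.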

Discussion

In this study, to explore the regulatory mechanism of GATA3 in different tumor cell lines, we compared the GATA3 ChIP-Seq data in four cell lines (A549, T-47D, MCF-7 and SK-N-SH). A total of 21,507 peaks were identified in A549, 120,057 peaks in MCF-7, 141,332 peaks in SK-N-SH and 69,153 peaks in T-47D cells, and most of the peaks in the four cell lines were located in intron regions. We also identified 4,839 common GATA3 target genes in the four cell lines. Pathway enrichment analysis revealed that the common GATA3 target genes were enriched in pathways related to cancer, the cell cycle and cell migration, which indicates that these common target genes might be core nodes of GATA3-related tumor progression. In addition, the common target gene ER1 and three tumor-specific target genes were identified: FOXA1, shared by MCF-7 and T-47D cells; the A549-specific gene AKR1C1; and the SK-N-SH-specific gene LGALS1. Motif analysis revealed significant enrichment of the “GAT” motif in GATA3 binding sites in all four cell lines.

ESR1 encodes oestrogen receptor alpha, a transcription factor that enhances the response to diverse stimuli, including oestrogen and growth factors, in various tissue types [29]. Researchers have reported that the GATA3 transcription factor stimulates ESR1 transcription through multiple binding sites in the ESR1 gene promoter [11]. In this study, we found that the GATA3 binding sites on ESR1 differed greatly among the four cell lines. We therefore suggest that GATA3 may affect ESR1 transcriptional activity by using different binding sites on ESR1 in different cells.

FOXA1 encodes forkhead box A1, a forkhead family transcription factor that can enhance the interaction of ERα with DNA by interacting with cis-regulatory regions of heterochromatin [30]. Many studies have shown that FOXA1 is essential for estrogen signaling in mammary cells and is required for the direct interaction of ER with chromatin sites, and that depletion of FOXA1 abolishes ESR1-binding capacity and transcriptional activity [1, 31]. In this study, GATA3 bound the 5’-upstream region of FOXA1 only in MCF-7 and T-47D cells, in agreement with the previous study [1]. In addition, our results showed that GATA3 in MCF-7 and T-47D cells bound the 5’-upstream and intron regions of FOXA1 and ESR1, whereas GATA3 in A549 and SK-N-SH cells bound only the 3’-intron region. Thus, GATA3 may enhance ESR1 transcriptional activity by acting upstream of FOXA1 in mammary cells.

AKR1C1 encodes human aldo-keto reductase-1, an enzyme that catalyzes metabolic reduction and either activates or inactivates several xenobiotics [32]. Over-expression of AKR1C1 has been reported in A549 cells [32-34]. Moreover, AKR1C1 expression is reported to be higher in A549 stem cells than in A549 non-stem cells [32]. In this study, pathway enrichment analysis and GO enrichment analysis revealed that AKR1C1 was enriched in processes related to xenobiotic metabolism. Thus, AKR1C1 might be positively correlated with cancer stem cell properties through xenobiotic metabolism and might promote cell differentiation. In addition, we detected a GATA3 binding signal at AKR1C1 with three apparent peaks in A549 cells. We therefore suggest that AKR1C1 might enhance the regulatory effect of GATA3 on A549 cells. Further research is needed to verify the relationship between AKR1C1 and GATA3 in A549 cells.

The LGALS1 protein belongs to the galactoside-binding protein family, which can influence tumor progression by modulating interactions between tumor, endothelial, stromal, and immune cells [35]. Researchers have reported expression and secretion of LGALS1 in human and mouse neuroblastoma cells, and LGALS1 can induce T cell apoptosis and inhibit dendritic cell maturation [36]. Furthermore, LGALS1 has received considerable attention because it plays an important role in cell proliferation, cell apoptosis and cell migration [37-42]. In this study, GATA3 bound LGALS1 in SK-N-SH cells with three apparent peaks. Thus, in neuroblastoma cells, GATA3 might exert a regulatory effect that is associated with LGALS1. Further experimental studies are needed to verify the association between GATA3 and LGALS1.

Motif analysis revealed that the motifs obtained in the four cell lines were in good agreement with the known GATA3 motif. In addition, significant enrichment of the “GAT” motif was identified in all four motifs, with slight differences in the sequence flanking the “GAT” core. One possible reason is that GATA3 binds to different regions in the four cell lines. The significant enrichment of the “GAT” motif also indicates that “GAT” might play an important role in the recognition and binding of DNA by GATA3.

In conclusion, we compared and analyzed the GATA3 ChIP-Seq data in four tumor cell lines (A549, T-47D, MCF-7 and SK-N-SH). The results include many genes and pathways that are directly or indirectly correlated with the regulatory effect of GATA3 on tumors. We suggest that ER1, FOXA1, AKR1C1 and LGALS1 might play an important role in the regulatory action of GATA3 on tumor cells. However, because of the large number of cell-specific GATA3 target genes, we could not determine the specific regulatory mechanism in each of the four cell lines. In addition, the specific regulation by GATA3 in the four cell lines needs to be demonstrated in combination with experimental data.

