High-altitude environments provide natural laboratories for the study of evolution and adaptation because the low ambient oxygen exposes organisms to the well-defined and sustained stress of severe hypoxia. We undertook genome-wide and candidate gene approaches to search for evidence of natural selection on native highlanders of the Tibetan Plateau. Through a genome-wide allelic differentiation scan comparing highland Tibetans at 3200-3500m with low-altitude Han of common ancestry, we detected a significant divergence at eight SNP sites near EPAS1, which encodes a transcription factor involved in regulating hemoglobin concentration. Among a separate cohort of Tibetans residing at 4200m, the most frequent alleles at 31 EPAS1 SNP sites were associated with an average of 0.8 gm/dL lower hemoglobin concentration. These findings were replicated in a third cohort of Tibetans residing at 4300m. The alleles associated with lower hemoglobin concentrations, all in high linkage disequilibrium, were observed at much higher frequencies in the Tibetan cohorts than in the HapMap Han. In conclusion, these studies of three independent samples of Tibetan highlanders demonstrate natural selection favoring a specific human gene locus. We suggest that the associated trait - lower hemoglobin concentration at high altitude - protects against the excessive red blood cell production that is the cardinal feature of chronic mountain sickness.
The high plateaus of Central Asia and the Andes were among the last areas occupied as Homo sapiens spread across the globe during the past 100,000-200,000 years. In the case of the Tibetan plateau, early visitors appeared more than 30,000 years ago, and colonists and their descendants have been present for the last ~10,000 years or more (1-2). The Tibetan plateau's low oxygen levels, resulting from extreme altitude, would have presented a formidable biological challenge. Colonists struggle to reproduce at these altitudes (3-4) and suffer from a number of diseases specifically related to high altitude (5). The classic disease associated with long-term residence is chronic mountain sickness (Monge's disease). The cardinal feature of this disease is excessive red blood cell production, or erythrocytosis, induced by the hypoxia of high altitude. Tibetans are particularly resistant to developing chronic mountain sickness (6-7). Indeed, Tibetans typically exhibit little or no increase in hemoglobin concentration at altitudes up to 4000m (13,200'), and comparatively little response at higher altitudes (8-9). Overall, Tibetans average about 1 gm/dL (10-15%) lower hemoglobin concentration than their Andean counterparts or acclimatized lowlanders at the same altitude.
The induction of erythrocytosis by hypoxia involves the transcription factors known as hypoxia-inducible factors (HIF), in particular EPAS1 (or HIF2α) (10-11). Here, we report the results of three independent studies identifying a key role for EPAS1 in Tibetans' adaptation to hypoxia. In the first, a genome-wide allelic differentiation scan was performed to compare SNP frequencies of a Yunnan Tibetan population residing at 3200-3500m and the HapMap Han sample, a model of their common ancestral population (1). From this study, a signal of selection close to EPAS1 was identified at a genome-wide level of significance. The second study was a candidate gene analysis of EPAS1 in a separate sample from 4200m on the Tibetan plateau that identified a significant association between genotype and hemoglobin concentration, with the major (most frequent) alleles associated with a lower level of hemoglobin. These alleles were present at low frequency in the HapMap Han. The third study replicated that association in an independent sample of Tibetans from 4300m.
Genome-wide Allelic Differentiation Scan. A genome-wide allelic differentiation scan (GWADS) was used to compare a cohort of Tibetan residents (n=35) sampled from four townships at altitudes of 3,200-3,500m in Yunnan Province, China with HapMap Phase III Han individuals (n=84), representing a model for the ancestral population not selected for high-altitude adaptation (1). The scan identified eight SNP sites with genome-wide significance (p values ranging from 2.81 × 10⁻⁷ to 1.49 × 10⁻⁹) located within 235 kb on chromosome 2 (Fig. 1, Fig. S4 and Table S1).
All eight GWADS significant SNPs were in high pairwise linkage disequilibrium in the Yunnan sample (0.23 < r² < 0.82), forming an extended haplotype with a frequency of 46% in the Yunnan sample but only 2% in the Han sample (estimated via an expectation-maximization algorithm using Haploview software) (12-13). The SNPs lie between 366 bp and 235 kb downstream of EPAS1, but as we show below, the region of high linkage disequilibrium extends into EPAS1 itself. In addition to this genome-wide significant finding linked to EPAS1, regions of sub-genome-wide significance were in close proximity to other genes of the HIF pathway and present intriguing targets for follow-up studies (see supporting information for further details).
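The pairwise r² values quoted above measure how strongly the alleles at two SNP sites co-occur. As a minimal sketch, assuming haplotype and allele frequencies are already known (Haploview estimates them from unphased genotypes by expectation-maximization, which is not reproduced here), r² can be computed as follows. The function name and the example frequencies are hypothetical and are not taken from the study.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two biallelic sites.

    p_ab: frequency of the haplotype carrying allele A at site 1 and allele B at site 2
    p_a:  frequency of allele A at site 1
    p_b:  frequency of allele B at site 2
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a site is monomorphic")
    return d * d / denom


# Hypothetical frequencies for illustration only
print(round(ld_r_squared(0.40, 0.50, 0.45), 3))
```

An r² of 1 means the two sites carry the same information; values near 0 mean the alleles are close to independent.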
Candidate Gene Study for EPAS1. Independent of the GWADS study, a candidate gene study (based on the pathway linking hypoxia, EPAS1 and erythropoietin) addressed the functional consequence of EPAS1 variants by testing for association with hemoglobin concentration in a sample of 70 Tibetans residing at 4200m in Mag Xiang, Xigatse Prefecture in the Tibet Autonomous Region. One hundred and three non-coding SNPs across the EPAS1 gene were selected for genotyping. Of these, 49 had a minor allele frequency ≥5% and were thus amenable to association analysis, which identified 31 SNP sites significantly associated with hemoglobin concentration (Fig. 2 and Table S3). The major (most frequent) allele of every significant SNP was associated with lower sex-adjusted mean hemoglobin concentration (Fig. 2). The genotypic mean differences in sex-adjusted hemoglobin concentration averaged 0.8 ± 0.15 (SD) gm/dL, with a range from 0.3 to 1.0 gm/dL. Conditional logistic regression analyses showed that once the most significant SNP (rs4953354) was included, no additional significant gain in association was obtained by adding any other SNP, consistent with a single-causal-variant model. Many of the SNP sites were in high linkage disequilibrium (Fig. 2). Genotypes for the eight GWADS significant SNPs identified in the Yunnan Tibetan population were available on a subset of the Mag Xiang cohort (n=29). They were highly correlated with the SNPs associated with hemoglobin concentration (0.54 < r² < 1; Table S4). Thus, the genome-wide and the candidate-gene analyses together indicate a signal of selection in this area of the genome.
Replication of Candidate Gene Study for EPAS1. We confirmed the association of EPAS1 SNP site variants and hemoglobin concentration in another sample of 91 Tibetans residing at 4300m in Zhaxizong Xiang, Xigatse Prefecture. Forty-eight of the 49 SNPs in Mag Xiang with a minor allele frequency ≥5% were successfully genotyped in the Zhaxizong Xiang cohort. Of these, 45 sites had a minor allele frequency ≥5% and 32 sites were significantly associated with hemoglobin concentration. The mean difference averaged 1.0 ± 0.14 (SD) gm/dL with a range from 0.5 to 1.2 gm/dL (Fig. 3 and Table S3). Twenty-six SNPs were associated with hemoglobin concentration in both samples and the direction of the effect was the same. Conditional logistic regression again confirmed that, after including the most significant SNP (rs13419896), no further SNPs were significant. Genotypes for the eight GWADS significant SNPs were available on 89 samples from the Zhaxizong Xiang cohort. Three of these SNPs correlated significantly with hemoglobin concentration (Table S4) and supported the evidence for a signal of selection in this area of the genome.
The size of the effect of genotype on hemoglobin concentration was large - equivalent to 53% of one standard deviation in the Mag Xiang sample and 50% in the Zhaxizong Xiang sample. Linkage disequilibrium (LD) among these 26 SNP sites was elevated in the two Tibetan cohorts compared to the HapMap Han (Fig. S5). Finally, we note that the largest allele frequency differences between the two Tibetan samples and the HapMap Han sample selectively occur at the EPAS1 SNP sites that are associated with low hemoglobin concentration (Fig. 4).
In summary, the analyses of three independent samples all point towards a signal of selection in the EPAS1 locus and adjacent area of the genome. From the combination of the GWADS and candidate gene approaches, we conclude that directional natural selection has increased the frequency of alleles associated with less vigorous EPAS1-mediated hematopoietic (and possibly other) response(s) to hypoxia.
The alternative hypothesis of neutral evolution, such as isolation by distance, exchange of migrants, or migration, is inconsistent with our findings (14). Our results indicate that the area of the genome near the EPAS1 locus in Tibetan residents of Yunnan province is uniquely differentiated from the Han sample. Neutral evolution is expected to result in a widespread pattern of differentiation throughout the genome rather than at one isolated region: this was not found. We observe an EPAS1-specific signal that is significantly above the level of background genome-wide differentiation (Fig. 1). Furthermore, a comparison between the HapMap Han and Andean highlanders - who have a very vigorous erythropoietic response - did not identify differences in allelic frequency at the EPAS1 locus (15).
Reducing the erythropoietin-mediated hematopoietic response to hypoxia at high altitude is likely to be advantageous. Although increased hemoglobin concentration at altitude is a normal physiological response among lowlanders that may have benefits in the short term, the evidence suggests that long-term or exaggerated responses may be harmful (16-18). A reduced EPAS1 induction of the erythropoietin system appears to explain why Tibetans have lower hemoglobin concentrations than other populations at matched high altitudes and why they are typically resistant to chronic mountain sickness. Functional studies will be required to identify how the variants work to restrain the hematopoietic response.
Rare, deleterious gain-of-function variants in EPAS1 have been reported to cause pathological elevation of hemoglobin concentration (19-21), but those reported here are the first to be associated with advantageously lower hemoglobin levels in a healthy sample. As EPAS1 has been associated with the transcriptional regulation of more than one hundred genes (22), hemoglobin concentration may be among multiple phenotypes under selection at high altitude. For example, the gain-of-function mutations have been associated with excessive pulmonary hypertension as well as with excessive production of red blood cells (21), both of which are often features of chronic mountain sickness (5). Allelic variation within EPAS1 has also been associated with particular exercise phenotypes at low altitude (23). Tibetans typically have higher exercise capacities than acclimatized lowlanders (24), and it is therefore possible that this phenotype may also be related in some way to EPAS1.
In conclusion, these studies of three independent samples of Tibetan highlanders provide a clear demonstration of natural selection favoring a specific human gene locus - here associated with a trait that enhances human adaptation to high-altitude hypoxia.
Materials and Methods
Human Volunteers: Ethics and Consent. This study was approved by the ethics committees of the Yunnan Population and Family Planning Institute (Kunming, China); the Beijing Genomics Institute at Shenzhen; the Beijing Institute of Genomics, Chinese Academy of Sciences; and Case Western Reserve University (Cleveland, Ohio). All work was conducted in accordance with the principles of the Declaration of Helsinki. All participants were recruited after obtaining informed consent.
Sample Collection. Sampling was conducted in three geographic regions of China approximately 1500 miles apart. They were: i) the North Western region of Yunnan province (28°26'N 98°52'E); ii) Mag Xiang, Xigatse Prefecture, Tibet Autonomous Region (29°15'N 88°53'E); and iii) Zhaxizong Xiang, Xigatse Prefecture, Tibet Autonomous Region (28°34'N 86°38'E). Genotypic data from the HapMap Han population were also included in this analysis. A fuller description is given in the Supporting Information.
Genotyping. All genotyping was conducted at the Beijing Institute of Genomics. The whole genome genotyping was conducted using the Illumina Veracode platform and 610-Quad high throughput genotyping chips. Genotyping within EPAS1 was conducted using a custom-designed Illumina GoldenGate assay (384 SNP plex) for all samples from Mag Xiang, and some of the samples from Zhaxizong Xiang. The remainder of the samples from Zhaxizong Xiang were genotyped using MassARRAY assays. Further details of these and the quality control procedures are given in the Supporting Information.
Statistical Analysis: GWADS. In order to identify variation between the Yunnan Tibetan and the HapMap Han populations, we calculated SNP-by-SNP chi-squared statistics for allele frequencies and corrected for background population stratification through a genomic control procedure (13). This approach allows genome-wide significant signals of allele frequency differentiation to be readily declared by examining genomic distributions of chi-squared values in the sample of ~500,000 SNPs. A full description of the method, including a simulation for two populations with a degree of genomic divergence equal to that between the Yunnan and HapMap Han populations, is given in the Supporting Information.
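The scan described above can be illustrated with a short Python sketch. It is not the authors' pipeline; the full method is in their Supporting Information. The sketch assumes a 1-degree-of-freedom Pearson chi-squared test on a 2x2 table of allele counts per SNP, and a genomic control inflation factor equal to the median observed statistic divided by the median of the 1-df chi-squared distribution (about 0.455). Constraining that factor to be at least 1 is a common convention and an assumption here. All function names are hypothetical.

```python
import math
from statistics import median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-squared distribution with 1 df


def allele_chi2(a1: int, a2: int, b1: int, b2: int) -> float:
    """Pearson chi-squared (1 df) for allele counts in two populations.

    a1, a2: counts of allele 1 and allele 2 in population A
    b1, b2: counts of allele 1 and allele 2 in population B
    """
    n = a1 + a2 + b1 + b2
    row_a, row_b = a1 + a2, b1 + b2
    col_1, col_2 = a1 + b1, a2 + b2
    if 0 in (row_a, row_b, col_1, col_2):
        return 0.0  # monomorphic site or empty sample: no differentiation signal
    return n * (a1 * b2 - a2 * b1) ** 2 / (row_a * row_b * col_1 * col_2)


def genomic_control(chi2_values):
    """Return (lambda, corrected statistics) using the median-based estimator."""
    lam = max(1.0, median(chi2_values) / CHI2_1DF_MEDIAN)
    return lam, [x / lam for x in chi2_values]


def chi2_1df_pvalue(x: float) -> float:
    """Upper-tail p value of a 1-df chi-squared statistic."""
    return math.erfc(math.sqrt(x / 2.0))


# Hypothetical allele counts (allele 1, allele 2) per SNP: population A, population B
snps = [(30, 40, 10, 158), (35, 35, 80, 88), (20, 50, 50, 118), (33, 37, 85, 83)]
raw = [allele_chi2(*counts) for counts in snps]
lam, corrected = genomic_control(raw)
pvalues = [chi2_1df_pvalue(x) for x in corrected]
```

With only four toy SNPs the inflation factor is not meaningful; in a real scan it is estimated from the genome-wide distribution of statistics, which is what lets a single locus stand out against background differentiation.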
Statistical Analysis: Candidate Gene Studies. Candidate gene association analysis of EPAS1 SNP genotype with hemoglobin concentration phenotype was performed separately in the two Tibet Autonomous Region samples. Mean characteristics for these populations are given in Table S2. For each SNP, a linear model was fitted with hemoglobin as the response variable, the SNP as a predictor under an additive genetic model, and with sex as a covariate. An adjustment for multiple comparisons was implemented by controlling the false discovery rate at less than 0.05 across the EPAS1 gene. The R language and environment (R Project for Statistical Computing http://www.r-project.org ) was used for all related analysis and graphics.
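The per-SNP model above can be sketched in Python for illustration (the authors used R). The sketch assumes genotypes coded as 0, 1 or 2 copies of the counted allele and sex coded as 0 or 1, and fits ordinary least squares through the normal equations. The text does not name the false discovery rate procedure, so the Benjamini-Hochberg step-up rule is used here as an assumption. All function names and data are hypothetical, and p values are taken as given rather than computed.

```python
def ols_fit(rows, y):
    """Least-squares coefficients for y ~ rows (rows include the intercept column)."""
    k = len(rows[0])
    # Augmented normal equations: (X'X | X'y)
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for other in range(k):
            if other != col:
                f = aug[other][col] / aug[col][col]
                aug[other] = [a - f * b for a, b in zip(aug[other], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def additive_effect(genotypes, sex, hemoglobin):
    """Estimated change in hemoglobin per allele copy, adjusted for sex."""
    rows = [[1.0, float(g), float(s)] for g, s in zip(genotypes, sex)]
    return ols_fit(rows, hemoglobin)[1]


def benjamini_hochberg(pvalues, q=0.05):
    """Flag tests declared significant while controlling the FDR at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= q * rank / m:
            cutoff = rank  # step-up: keep the largest rank that passes
    significant = [False] * m
    for i in order[:cutoff]:
        significant[i] = True
    return significant


# Hypothetical data: hemoglobin falls 0.8 gm/dL per allele copy, males 1.5 higher
genotypes = [0, 1, 2, 0, 1, 2]
sex = [0, 0, 0, 1, 1, 1]
hemoglobin = [16.0 - 0.8 * g + 1.5 * s for g, s in zip(genotypes, sex)]
effect = additive_effect(genotypes, sex, hemoglobin)
```

Under the additive model a single coefficient summarizes the genotype effect, so one test is carried out per SNP and the multiple-comparison adjustment is applied across all SNPs tested in the gene.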
Fig. 1. A genome-wide allelic differentiation scan that compares Tibetan residents at 3,200-3,500m in Yunnan Province, China with HapMap Han samples. Eight SNP sites near one another and EPAS1 have genome-wide significance. The horizontal axis is genomic position with colours indicating chromosomes. The vertical axis is the negative log of SNP-by-SNP p values generated from the Yunnan Tibetan vs HapMap Han comparison. The red line indicates the threshold for genome-wide significance used (p = 5 × 10⁻⁷). Values are shown after correction for background population stratification using Genomic Control.
Fig. 2. EPAS1 SNP site variants that associate with an average of 0.8 gm/dL lower hemoglobin concentration in a Tibetan sample from Mag Xiang (4,200m), Tibet Autonomous Region. The top panel shows the results of testing variants at 49 SNP sites with a minor allele frequency ≥5% for genotypic association with sex-adjusted hemoglobin concentration. The middle panel displays the estimated hemoglobin concentration difference (mean ± 95% confidence interval) between genotypes at each SNP site. Filled circles represent SNP sites detected as having a significant association with hemoglobin concentration while controlling the false discovery rate < 0.05 across the EPAS1 locus. Open diamonds represent SNP sites without significant association. The bottom panel illustrates the pairwise linkage disequilibrium measured as r² between SNPs.
Fig. 3. EPAS1 SNP site variants that associate with an average of 1.0 gm/dL lower hemoglobin concentration in a Tibetan sample from Zhaxizong Xiang (4,300m), Tibet Autonomous Region. The top panel shows the results of testing variants at 45 SNP sites with a minor allele frequency ≥5% for genotypic association with sex-adjusted hemoglobin concentration. The middle panel displays the estimated hemoglobin concentration difference (mean ± 95% confidence interval) between genotypes at each SNP site. Filled circles represent SNP sites detected as having a significant association with hemoglobin concentration while controlling the false discovery rate < 0.05 across the EPAS1 locus. Open diamonds represent SNP sites without significant association. The bottom panel illustrates the pairwise linkage disequilibrium measured as r² between SNPs.
Fig. 4. Differences in allelic frequency at SNP sites within EPAS1 between the HapMap Han, Mag Xiang and Zhaxizong Xiang cohorts. The horizontal axis is SNP position according to build 36.1. The vertical axis is allelic frequency, with the allele selected for display as the one occurring most frequently in the Mag Xiang cohort. Blue squares denote data for HapMap Han; red circles denote data for Mag Xiang Tibetans; green triangles denote data for Zhaxizong Xiang Tibetans. Filled symbols denote those SNP sites where there were significant associations with hemoglobin in both Mag Xiang and Zhaxizong Xiang cohorts; open symbols denote those SNPs without both such associations.