Aniridia is a rare human congenital condition affecting the eye and is principally characterized by the marked underdevelopment or complete absence of the iris (1). It typically manifests in both eyes and is associated with inadequate development of the retina, thus impairing the normal development of vision (1). Even though the condition does not always lead to complete blindness, it does lead to multiple complications during eye development. Abnormalities usually associated with aniridia are cataracts, glaucoma, corneal neovascularization, sclerocornea, foveal hypoplasia and ectopia lentis (Figure 1), resulting in either severe visual impairment or blindness (2).
Figure 1: Aniridia Symptoms and Complications A:Cataracts; B:Glaucoma; C: Corneal Neo-Vascularization; D:Sclerocornea; E:Foveal Hypoplasia; F: Ptosis and Microcornea; G:Ectopia Lentis
It has been estimated that approximately one third of aniridia cases are sporadic. These are either associated with WAGR syndrome, in which aniridia is accompanied by a predisposition to Wilms' tumor, genitourinary irregularities and mental retardation, or present as isolated aniridia, where no effects other than those manifesting in the eyes are observed (3). The majority of cases (approximately two thirds) are hereditary and autosomal dominant, with a high degree of penetrance but enormous phenotypic variability (1-4).
Nearly a decade ago, classical positional cloning approaches and the production of a very detailed physical map spanning the WAGR region defined a sub-region believed to contain the aniridia gene. Further mapping and positional cloning led to the discovery of a gene responsible for aniridia within the WAGR region, on chromosome 11p13 (5). This gene is called PAX6 and codes for a highly conserved transcription factor of the homeodomain superfamily (5-6). Concurrently, genetic studies in the mouse mapped the naturally occurring mutation known as small eye (Sey) to mouse chromosome 2, in a region orthologous to human chromosome 11. This positional genetic mapping led to the suggestion that Sey could be the mouse model for aniridia, despite the fact that the phenotypes were initially considered to be somewhat dissimilar. The Sey phenotype was subsequently shown to be caused by a mutation in the Pax6 gene as well (1, 6).
The phenotypic differences between the human condition and the Sey mouse model primarily rested on the fact that, as the name suggests, mice have a reduced overall eye size, whereas in human patients the size of the eye is normal. Apart from this difference, the two phenotypes appear to be highly comparable, and both are inherited in an autosomal dominant manner (5, 6).
The human PAX6 gene is approximately 22 kb long and is composed of 14 exons. The translation start codon is found in exon 4, while the termination codon lies in exon 13. The protein itself is 422 amino acids long. A second isoform arises through alternative splicing of exon 5a, which adds 14 amino acids. Deletions affecting both the PAX6 gene and the WT1 gene cause WAGR syndrome (1, 7). The most common PAX6 mutations are nonsense mutations, amounting to 38.9% of recorded cases (59% of these are cytosine-to-thymine changes in exons 8, 9, 10 and 11). Frameshift, splice and missense mutations have also been recorded (25.3%, 13.2% and 11.7% respectively), whereas in-frame insertions or deletions and run-on mutations are the least common (6.2% and 4.7% respectively) (8). These mutations result in either the premature termination of translation or the inability to produce a functional protein, which ultimately causes haploinsufficiency: too little PAX6 protein is produced to sustain its intended functions (8).
Aniridia has an incidence of 1/50,000 to 1/100,000 people and is therefore an extremely rare condition. When studying new cases, and before embarking on mutation identification, it is essential to establish that the patient in question is likely to carry a mutation in the PAX6 gene. This is achieved by establishing a high likelihood that the causative mutation maps very close to the region containing Pax6, i.e. that the disease is in linkage disequilibrium with loci close to the Pax6 gene.
Linkage disequilibrium (LD) is the non-random association of alleles at closely linked loci: a particular set of alleles co-segregates (i.e. is inherited together) with the disease locus more often than expected by chance. In contrast, genetic loci situated on different chromosomes are not regarded as genetically linked, since they segregate independently during meiosis (9, 10).
Even when loci are located on the same chromosome, and hence are physically linked, they can still behave as unlinked loci if they lie far apart from each other. The farther apart two genetic loci are on the same chromosome, the higher the probability that they will be separated by a recombination event. The maximum recombination fraction between two loci is θ = 0.5 (9, 10; Figure 2).
Figure 2: Independent Assortment-Recombination events occurring during the Meiosis I stage results in a novel arrangement of alleles coming from either the maternal or paternal chromosomes.
Over small distances, the recombination frequency is approximately proportional to the distance separating two genetic loci. The unit used for this distance is the genetic map unit (m.u.) or centimorgan (cM), defined as the distance between two loci for which one recombination event is expected in every 100 meioses. Therefore, 1 m.u. is equivalent to a 1% recombination frequency (θ = 0.01) (9, 10).
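As a rough illustration of the relationship described above, the following sketch converts a map distance in cM into a recombination fraction using the simple proportional rule (1 cM corresponds to θ = 0.01), capped at the maximum of θ = 0.5. The function name is hypothetical, and the proportional rule is an assumption that is only reasonable over short distances.

```python
def recombination_fraction(distance_cm: float) -> float:
    """Approximate recombination fraction (theta) for a map distance in cM.

    Assumes the simple proportional rule, 1 cM = 1% recombination, which only
    holds for short distances; theta is capped at 0.5 (independent assortment).
    """
    if distance_cm < 0:
        raise ValueError("map distance cannot be negative")
    return min(distance_cm / 100.0, 0.5)


print(recombination_fraction(1))    # 0.01
print(recombination_fraction(80))   # 0.5, i.e. the loci behave as unlinked
```

Markers lying within a few cM of a gene therefore have a very small chance of being separated from it in any single meiosis, which is why closely flanking markers are useful for tracing a disease allele.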
When studying a disease locus of unknown location it is necessary to apply linkage mapping analysis. Linkage mapping calculates the likelihood that a certain set of alleles is linked. It requires a large panel of polymorphic markers located at evenly spaced positions on each chromosome. For unlinked or distant genetic loci, all combinations of marker alleles should be found at the frequencies expected from the individual allele frequencies. For example, if alleles A and a occur at frequencies of 75% and 25% respectively, and B and b occur at 45% and 55%, then the expected haplotype frequencies are AB: 33.75%, Ab: 41.25%, aB: 11.25% and ab: 13.75% (each is simply the product of the two allele frequencies, because the loci are inherited independently). However, if a mutation occurred close to a genetic marker, then it is very likely that the mutation will be co-inherited with a specific allele of that marker. Linkage mapping techniques have been invaluable in locating disease-associated genes such as BRCA1 and BRCA2 (breast cancer), the LDL receptor gene (heart disease) and APC (colon cancer).
Geneticists take advantage of LD when compiling haplotypes (series of co-inherited alleles, usually microsatellite alleles) that co-segregate with the disease phenotype. Haplotype associations are generally studied within pedigrees, which allows the relationship between marker alleles and the disease to be traced through a family. Figure 3 exemplifies the use of genetic markers to compile a haplotype map. In this case the family was analyzed for 8 markers at a certain locus, and their genetic profile was assembled from the results.
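The worked example above can be reproduced with a few lines of Python. This is only a sketch: the function name is hypothetical, and the "observed" AB frequency used to illustrate a departure from the expected value is an invented number, not data from any study.

```python
from itertools import product


def expected_haplotype_frequencies(locus1, locus2):
    """Expected two-locus haplotype frequencies if the loci are independent."""
    return {
        a + b: fa * fb
        for (a, fa), (b, fb) in product(locus1.items(), locus2.items())
    }


locus_a = {"A": 0.75, "a": 0.25}
locus_b = {"B": 0.45, "b": 0.55}
expected = expected_haplotype_frequencies(locus_a, locus_b)
for haplotype, freq in expected.items():
    print(f"{haplotype}: {freq:.2%}")

# Hypothetical observed frequency of AB, used only to illustrate the
# disequilibrium coefficient D = f(AB) - f(A) * f(B); D = 0 means the
# alleles are associated no more often than expected by chance.
observed_ab = 0.40
d = observed_ab - locus_a["A"] * locus_b["B"]
print(f"D = {d:.4f}")
```

A value of D clearly different from zero would suggest that the two alleles are co-inherited more (or less) often than independent assortment predicts.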
Figure 3: An example of a haplotype map, depicting the arrangement of markers from the parental alleles for a certain genetic locus and their distribution to their offspring (37).
In Cyprus, aniridia accounts for about 1.2% of the recorded blind population (POT, 2006). These patients have never been examined at the molecular level, and no study of the molecular basis of this disorder in Cyprus has been undertaken to date. It is therefore necessary to set up a screening strategy to identify those patients who carry mutations in the Pax6 gene. The resulting information may also be compared with international databases to determine whether Pax6-related aniridia cases in Cyprus are caused by de novo mutations or by mutations introduced from neighboring countries (founder effect).
Although aniridia is mainly caused by Pax6 mutations, it is also a feature of other conditions such as Baraitser-Winter syndrome (BaWS). BaWS is a very rare autosomal recessive multiple congenital abnormality of unknown etiology, since the genes involved in the disease are not known. The syndrome is characterized by structural eye malformations, droopy eyelids and mental retardation, and aniridia is listed as one of its symptoms (11).
Another such condition is chromosome 11 ring syndrome, an extremely rare chromosomal disorder caused by deletions at the ends of chromosome 11 followed by their circularization into a ring-like structure. Both the severity and the symptoms depend on the amount of genetic information that has been lost or compromised through the formation of the ring. In addition to aniridia, the listed symptoms include cleft palate, small jaw, misshapen ears, clubfoot, absent thumbs, and speech and mental retardation, amongst some 75 further listed symptoms (12).
The worldwide incidence of aniridia has been reported to range from 1/50,000 to 1/100,000 and, as mentioned above, the condition is not limited to malformations of the iris but is instead a panocular disorder (13, 14). In recent years many countries, including Sweden, Norway, Ireland, England, Japan, China, Taiwan, India and Mexico, have released epidemiological and case studies investigating the kinds of mutations affecting the PAX6 gene, mainly concentrating on identifying the disease-causing mutations (14, 16-20). Efforts have also focused on improving patients' quality of life and on relieving the complications caused by the disorder.
In order to achieve the overall objective, the following technological objectives have been identified:
1. Find as many 2-generation families as possible from the whole of the Cypriot population using a network of clinical geneticists and eye specialists.
2. Develop a panel of at least six highly informative markers that span at least 500 kb either side of the Pax6 gene (three markers either side).
3. Develop haplotypes of the families in question to identify families with a high possibility of carrying a mutation in the Pax6 gene.
4. Develop a fast and reliable molecular screening methodology to scan all Pax6 exons for possible mutations.
5. Develop appropriate methodologies to evaluate if any mutations found are likely to cause the phenotype.
Knowledge concerning the phenotypic outcome of the various PAX6 mutations has greatly increased through understanding how the different mutations affect protein function. Analysis of the mutations found in the PAX6 locus has shown that premature termination codons (PTCs) are usually found within the coding portion of the gene (caused by nonsense mutations, frameshift insertions or deletions, and most splice mutations) (2). A survey carried out in 2005 by Tzoulaki et al. showed that 99% of PAX6 mutations resulting in a PTC give rise to classical cases of aniridia (15).
The closely related phenotypes of PTC mutations and entire gene deletions indicate that PTC mutations yield a gene product that essentially lacks physiological function, as if the protein were completely absent (2). PTC mutations were previously thought to give rise to truncated proteins that either exert a dominant negative effect (antimorphic mutations, acting antagonistically to the normal gene product) or retain some limited function within the cell (2, 15). This view, however, does not account for the fact that mutations occurring at different positions within the gene's open reading frame still seem to have comparable effects. It has more recently been shown that nonsense-mediated decay (NMD) may explain this phenomenon. NMD is a surveillance mechanism that identifies mRNAs containing nonsense mutations and degrades transcripts of multi-exon genes carrying PTCs, thus preventing the expression of truncated or erroneous gene products (2, 21, 22).
Missense mutations, which substitute one amino acid for another at a particular position in the protein, are associated with a wide array of phenotypes, including aniridia as well as microphthalmia, corectopia and Peters anomaly (2, 15). Studies carried out on Sey mice have shown that these mRNAs are translated into full-length proteins but, depending on the nature and location of the substitution, the protein may be completely non-functional, or its function may be diminished or even augmented (15).
Mutations that produce an elongated protein (C-terminal extension, or CTE, mutations, in which the open reading frame is extended into the 3' untranslated region) are not counteracted by the NMD mechanism, so the proteins are successfully produced. Such mutations have occasionally been reported in PAX6; however, their molecular mechanism of action is for the most part unknown, even though they are generally considered to amount to a null allele.
As recently indicated by Hingorani et al., who carried out an extensive analysis comparing causative mutations with the resulting phenotypes of aniridia patients in the UK, every group of mutations exhibited enormous phenotypic variability. It was reported, though, that missense mutations generate the mildest phenotypes, while PTC and CTE mutations produce the most severe. It is nevertheless clear that different mechanisms are involved in each class of mutation, since PTC and gene deletion mutations eliminate the protein, whereas CTE mutations produce proteins that extend into the 3' untranslated region and are thus longer. Furthermore, as noted in the same study, PAX6 is not only involved in eye development: several of the patients were also observed to have anomalies in both brain function and structure.
Since the Cypriot population, at least in its present form, is rather homogeneous, it is very likely that the mutations causing aniridia in Cyprus will be very few (most likely one). It is also very likely that this mutation is unique to this region. Therefore, investigating the molecular basis of aniridia by searching for new mutations in the Pax6 gene will yield important information about the function of this protein. Such information will also mean that diagnosis no longer rests on phenotypic identification alone but can be combined with molecular studies. This matters particularly because phenotypic diagnosis may not be the most reliable means, as aniridia may be only a symptom of another condition rather than the main syndrome itself. Additionally, some families would be able to take advantage of molecular genetic prenatal or pre-implantation diagnosis for this condition.
The implementation of this project will be broken down into several tasks:
Task 1: Identification of families
With the help of clinical geneticists at Makarios Hospital and through contacting all eye specialists in Cyprus, we will collect family information for all patients who display non-syndromic aniridia. Priority will be given to those patients for whom a family history is available and suggests an autosomal dominant mode of inheritance of the disease.
Task 2: Screening of patients for linkage to the Pax6 locus
Unrelated patients with a family history will be selected first, in order to identify those whose condition is likely to be caused by mutations in the Pax6 gene. To achieve this, LD analysis will be applied to identify the families with a high likelihood of linkage to the Pax6 locus.
Linkage Disequilibrium Analysis:
Despite the fact that mutations in the PAX6 locus have been shown to cause aniridia there have been 4 cases of aniridia reported recently which could not be traced to mutations in the PAX6 gene. Thus, LD analysis near the PAX6 gene is essential in order to initially establish a relationship between the disease and the gene in question.
As mentioned above, microsatellites or short tandem repeats (STRs) are very widely used as molecular markers to determine whether regions are in LD. STRs are small, selectively neutral repeating units 1-6 nucleotides long. Single Nucleotide Polymorphisms (SNPs) are also becoming increasingly prevalent in LD analyses; however, because their allelic variation is limited (a maximum of four alleles), more SNPs generally need to be examined before informative conclusions can be drawn (23).
Irrespective of whether STRs or SNPs are used, their informativeness needs to be considered: the heterozygosity index (HI) for the former and the minor allele frequency (MAF) for the latter. The heterozygosity index in this context is the proportion of individuals in the general population expected to be heterozygous at that genetic locus. Values greater than 0.5 indicate that a randomly chosen individual is more likely than not to possess two different alleles, and hence to be informative (9). By analyzing several highly polymorphic STRs it will be possible to establish whether the observed haplotypes are the result of close physical linkage between the Pax6 gene and the STRs in question. In this case, two STRs on the 5' side and two STRs on the 3' side of the Pax6 gene have been chosen. The 3' STRs have known heterozygosity indexes, whereas the 5' ones have not yet been studied. Two additional STRs have been chosen in case some STRs are not very informative (Table 1). Length variation will be determined using fluorescently labeled primers and capillary electrophoresis (fragment analysis) to distinguish the sizes of the amplicons obtained.
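To make the notion of marker informativeness concrete, the sketch below computes the expected heterozygosity of a marker from its allele frequencies using the standard formula H = 1 - Σ p_i², which assumes random mating. The allele frequencies are invented for illustration and do not describe the markers proposed in this study; the function name is likewise hypothetical.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2), assuming random mating."""
    total = sum(allele_freqs)
    if abs(total - 1.0) > 1e-6:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


# Hypothetical markers (frequencies invented for illustration only).
str_marker = [0.30, 0.25, 0.20, 0.15, 0.10]  # five-allele STR
snp_marker = [0.80, 0.20]                    # biallelic SNP, MAF = 0.20

print(round(expected_heterozygosity(str_marker), 3))
print(round(expected_heterozygosity(snp_marker), 3))
```

Under these assumed frequencies the multi-allelic STR comfortably exceeds the 0.5 threshold while the SNP does not, which illustrates why more SNPs than STRs are generally needed to reach the same level of information.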
Table 1: STR markers selected for the LD analysis, with expected amplicon sizes (bp).
The amplicons obtained by polymerase chain reaction (PCR) for each patient will be compared against a size ladder so that the analyzer's software can automatically estimate the allele lengths. The data will be more informative if the patients are heterozygous rather than homozygous, since this makes it possible to construct a haplotype map of the parents and their offspring and thus recognize the disease-associated haplotype. An example of amplicon analysis is shown in Figure 5 and the respective family is shown in Figure 6.
Figure 5: Example of electrophoresis profiles of a family for 2 of the STRs to be analyzed. (The actual output of the analyzer will be a fragment-analysis peak profile showing the relative sizes; a gel electrophoresis profile is illustrated here for simplicity.)
Consider the example shown in Figure 5, which relates to a hypothetical family (Figure 6) in which the father and one of the offspring have been diagnosed with aniridia. The electrophoresis profiles for two of the markers (D11S1389 at the 5' end of the gene and D11S2001 at the 3' end) can be used to pinpoint which combination of marker alleles is associated with the mutated allele. From this information, haplotype maps of the family may be compiled as shown in Figure 6.
Figure 6: Haplotype association map assembled using the electrophoresis profiles shown in Figure 5. The marker values associated with the disease-causing mutated allele are shown in the red box. (C: Control, F: Father, M: Mother, OS1: Offspring 1, OS2: Offspring 2)
A similar analysis will be carried out on the results obtained from the individuals in this study. Results from the affected parent will be compared with those of the affected offspring. What we will be looking for is a specific set of alleles that always segregates with the disease locus and that must NOT be present in any unaffected sibling. This analysis presupposes that the parents are (ideally) heterozygous for alleles of different lengths, so that the paternal and maternal alleles can be traced. If an affected individual in a family is found, in a four-locus analysis, to carry the same haplotype as the affected parent on the affected chromosome, then this proband is a very strong candidate for carrying a mutation in the Pax6 locus.
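The segregation rule described above (present in every affected member, absent from every unaffected one) can be sketched as a simple set operation. This is a minimal illustration, assuming that haplotypes have already been phased; the data structure, the function name and the allele sizes are all invented and do not come from this study.

```python
def candidate_disease_haplotypes(family):
    """Return haplotypes carried by every affected member and by no unaffected one.

    `family` maps a member label to (affected, [haplotype, haplotype]), where a
    haplotype is a tuple of allele sizes, one per marker, already phased.
    """
    affected = [set(haps) for is_aff, haps in family.values() if is_aff]
    unaffected = [set(haps) for is_aff, haps in family.values() if not is_aff]
    if not affected:
        return set()
    shared = set.intersection(*affected)
    for haps in unaffected:
        shared -= haps
    return shared


# Invented allele sizes (bp) for two markers; labels follow Figure 6.
family = {
    "F":   (True,  [(152, 210), (148, 214)]),
    "M":   (False, [(150, 212), (154, 208)]),
    "OS1": (True,  [(152, 210), (150, 212)]),
    "OS2": (False, [(148, 214), (154, 208)]),
}
print(candidate_disease_haplotypes(family))
```

In this invented family only the haplotype (152, 210) is shared by the affected father and affected offspring while being absent from the unaffected members, so it would be the candidate disease-associated haplotype.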
Task 3: Exon searching for putative mutations
As has been reported, PAX6 mutations causing aniridia have been found in almost all of the 13 exons. Therefore, in order to pinpoint the mutations which caused the disease all exons need to be scanned separately. The primers to be used are listed in table 2 (24, 25).
Table 2: Primers suitable for the amplification of each exon, with amplicon sizes in bp (24, 25).
Having established through the LD analysis a possible correlation between Pax6 and the disease within each family, only one affected individual from each family needs to be analyzed (not considering additional de novo mutations). Following PCR amplification, mutation detection needs to be carried out. One way is to sequence all amplicons of the proband and both parents and compare the sequences; this, however, is laborious and costly. Another way is to search for Single-Strand Conformation Polymorphisms (SSCP). In this case non-denaturing gels are used to analyze the amplicons and find those that contain two different DNA species, as expected from a heterozygous individual. This is also laborious and may miss mutations that do not change the conformation of the DNA.
A third and very fast method is to scan the amplicons for heteroduplex formation. This is possible using the plant-derived endonuclease CEL I, which is isolated from celery and is able to cut one of the two strands of the DNA at the 3' side of DNA distortions and mismatches. A commercial CEL nuclease is now available under the name Surveyor™ nuclease (Transgenomic, Gaithersburg, MD, USA). It belongs to the CEL family of nucleases and has been found to be equivalent to the enzyme CEL II. The enzyme cleaves both strands of the DNA at the site of a mismatch and consequently yields at least two cleavage products (it is able to recognize multiple mismatch sites within the same DNA molecule). Furthermore, Surveyor™ nuclease will accurately detect and cut at sites of base substitutions, insertions and deletions, even within mixed DNA populations where the mismatched species comprises only a small proportion of the total (Figure 7) (26).
Figure 7: Agarose gel electrophoresis profile of 632 bp DNA fragments after Surveyor™ nuclease digestion (lanes numbered 1-12 from left to right). Lanes 1-4 show digestion products of mixtures of homoduplex and heteroduplex DNA with the following mismatches: GG and CC, GA and TC, GT and AC, and TT and AA, respectively. Lanes 5-10 show heteroduplex mixtures with insertions (from 1 to 12 nucleotides), and lane 11 shows homoduplex DNA (26).
As discussed above, each exon first needs to be amplified (along with known wild-type controls). Samples will then be treated with Surveyor™ nuclease (reactions are incubated at 42°C for 20 minutes and terminated by adding 0.5 M EDTA) (26). The reaction mixtures will then be analyzed on 3:1 high-resolution agarose gels (2-3% density) and stained with ethidium bromide. The appearance of new cleavage products indicates the presence of a mutation. If an individual is homozygous at every base of an exon, the amplicon will resist cleavage by the enzyme and will appear as a homoduplex (lane 11, Figure 7). If, however, one copy of the amplified exon is mutated, the amplicon will contain two species, and the resulting heteroduplex will be cleaved by the enzyme. The sizes of the resulting fragments also indicate the likely location of the mutation relative to the ends of the amplicon. In Figure 7, the DNA modifications were introduced 415 bp from the 5' end of the 632 bp fragment, thus yielding two bands, one at 415 bp and one at 217 bp.
Even though mutation detection using the Surveyor™ nuclease is a fairly straightforward and convenient technique, it is highly dependent on the quality of the amplicon used. Since the amplicon serves as the substrate for the enzyme, the amplification needs to be optimized to ensure the highest possible purity in adequate amounts. If the PCR product contains high amounts of artifacts, such as smaller products caused by misannealed primers and primer dimers, so that the proportion of the actual amplicon within the solution is not high enough, then the high background will complicate the analysis of the results. Furthermore, low PCR yields will disrupt the enzyme-substrate balance required for the reaction, which may also result in high background. Finally, it is imperative to use a high-fidelity DNA polymerase with proofreading ability (such as the Phusion® Hot Start High-Fidelity DNA Polymerase) in order to reduce errors during the copying of the template and also to reduce background resulting from the digestion of PCR artifacts.
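The relationship between the mismatch position and the band sizes can be expressed as a short calculation. The sketch below is an idealised model with a hypothetical function name: it assumes complete cleavage at every mismatch, whereas on a real gel some uncut full-length amplicon would normally remain visible as well.

```python
def surveyor_fragments(amplicon_length, mismatch_positions):
    """Expected fragment sizes (bp) after cleavage at each mismatch position.

    Positions are counted in bp from the 5' end of the amplicon. An empty list
    models a homoduplex, which stays uncut.
    """
    cuts = sorted(set(mismatch_positions))
    if any(p <= 0 or p >= amplicon_length for p in cuts):
        raise ValueError("mismatch positions must lie inside the amplicon")
    edges = [0] + cuts + [amplicon_length]
    return [end - start for start, end in zip(edges, edges[1:])]


print(surveyor_fragments(632, [415]))  # [415, 217], as in Figure 7
print(surveyor_fragments(632, []))     # [632], homoduplex control
```

Read in reverse, the same arithmetic lets the two observed band sizes narrow down where in the exon the mutation lies, although the gel alone cannot tell which end of the amplicon the shorter fragment came from.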
Task 4: Sequencing of candidate exons
Samples that yield heteroduplexes are candidates for carrying mutations, and sequencing will be applied only to the exons that showed heteroduplex formation during the Surveyor analysis, in order to identify the mutations. Modern automatic capillary analyzers make use of four different fluorescent tags (one for each nucleotide), so that only a single reaction is required per sample instead of four separate reactions, one per nucleotide (27).
The general concept behind Sanger sequencing, or chain-termination sequencing, is that the template DNA (in this case the PCR product) is copied by a DNA polymerase from an appropriate primer, and the elongation products are analyzed to produce a reading of that template. The difference between a standard PCR and the sequencing reaction is that, along with the deoxynucleotides, a smaller amount of chain-terminating dideoxynucleotides is included in the mix. Dideoxynucleotides are referred to as chain-termination nucleotides because they lack the 3'-hydroxyl group on the sugar. Without that hydroxyl group, no condensation reaction can occur between the 3' position of the incorporated dideoxynucleotide and the 5' phosphate of the next nucleotide to be added. Their inability to form another phosphodiester bond therefore terminates the chain at the dideoxynucleotide. As a result, a fragment of a different length is formed each time a dideoxynucleotide is incorporated, and together the fragments represent the entire sequence of the template DNA in increments of one nucleotide (as in the original Sanger technique). These fragments are subsequently separated according to size as they pass through a narrow glass capillary filled with a viscous polymer (27, 28).
The sequencing method used here labels the termination nucleotides with different fluorescent tags (dye-terminator sequencing), which has the advantage that the entire set of reactions is carried out in one sample instead of four (28). The sequencing reactions of each exon suspected to carry a mutation will be analyzed using the automatic analyzer (ABI), which uses a laser to excite the four different tags. The tags emit at different wavelengths, so that the analyzer can identify each terminal nucleotide as the fragments pass the detector. The resulting colored profile (chromatogram) is in turn interpreted as a nucleotide sequence which represents the sequence of the template DNA.
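The logic of dye-terminator sequencing, in which fragments are sorted by size and the label on each terminal base is read in order, can be illustrated with a toy simulation. This is a deliberately idealised sketch with hypothetical function names: it assumes exactly one terminated fragment per position and ignores the primer, signal noise and base-calling quality. The short template is invented.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def terminated_fragments(template_3to5):
    """Return (length, terminal base) for every possible chain-terminated fragment.

    The template is given 3'->5', so the new strand grows 5'->3' as its
    complement. Idealised: one fragment per position, no primer, no noise.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_sequence(fragments):
    """Order fragments by size (as the capillary does) and read the end labels."""
    return "".join(base for _, base in sorted(fragments))


fragments = terminated_fragments("TACGGT")
random.shuffle(fragments)        # all fragments are mixed in a single reaction
print(read_sequence(fragments))  # ATGCCA
```

Because each fragment carries both its length and the identity of its final base, shuffling the mixture loses no information, which is why a single reaction tube suffices.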
Task 5: Bioinformatic Analysis
Following sequencing of the exons suspected to carry mutations, the results need to be compared with the published consensus sequence in order to identify any variations. If any are found, it is important to verify whether the variant is present in other affected members of the family or in other affected individuals analyzed in this study. If the variant is a nonsense mutation (introduction of a premature STOP codon), then it is very likely to be disease-causing. If the variant is a silent (synonymous) substitution that does not lead to any change in the amino acid sequence, then it is most likely a non-causative polymorphism.
A third scenario is a missense mutation, which changes the amino acid sequence and might be pathological. To determine whether this is likely, the mutation needs to be looked for in a sample of the healthy population (usually 50-200 samples are required). If the mutation is shared amongst affected members of the family but is absent from healthy individuals, this would indicate that it may indeed be pathogenic. If a mutation is found to be shared not only amongst members of the same family but also by other affected individuals in the population, a founder effect may be at work, meaning that the mutation was introduced into the community at some point in history by a common ancestor and has since spread into the wider population.
Furthermore, the functional significance of an amino acid substitution or deletion needs to be investigated across other species to evaluate evolutionary conservation. The PAX6 protein (Figure 8) possesses two highly conserved domains: the paired box domain (part of the PAX superfamily conserved domains), which is the DNA binding site, and the homeodomain region, which is crucial in transcription regulation (29). Regions that are preserved across species were evidently important enough to be maintained, which suggests that alterations within them may hinder the protein's function or its ability to interact with other proteins.
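The three scenarios discussed above (nonsense, silent and missense changes) can be distinguished automatically once the reference and variant codons are known. The sketch below uses the standard genetic code and a hypothetical function name; the example codons are generic illustrations and are not taken from the PAX6 sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order; "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(ref_codon, alt_codon):
    """Classify a single-codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        return "run-on (stop lost)"
    return "missense"


print(classify_substitution("CGA", "TGA"))  # nonsense: Arg -> stop, a C-to-T change
print(classify_substitution("CTG", "CTA"))  # synonymous: Leu -> Leu
print(classify_substitution("GGA", "AGA"))  # missense: Gly -> Arg
```

Such a classification only sorts variants into categories; whether a missense change is actually pathogenic still has to be assessed through segregation in the family, absence from healthy controls and evolutionary conservation.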
Figure 8: Conserved domains within the PAX6 protein sequence (Pubmed-Conserved Domains Query).
Predicted protein sequences from many species have been used to run a multiple alignment, in order to identify regions within the protein that have been conserved across taxa. The PAX6 protein shows great conservation across species, as can be seen in Table 3, with the greatest variation found in Caenorhabditis elegans (the transparent nematode), where the organization of the gene itself seems to vary greatly and the regulatory elements differ. Nevertheless, in both Drosophila melanogaster and Caenorhabditis elegans the aforementioned conserved regions are fully conserved.
Protein Length (aa)
C. lupus familiaris