Kabuki syndrome (also known as Kabuki make-up syndrome or Niikawa-Kuroki syndrome) was first described nearly 30 years ago, but the first Kabuki gene was only described last month (September 2010). What do we now know about this disorder, and why did it take so long?
Kabuki syndrome (KS) is a rare congenital disorder characterized by distinctive dysmorphic facial features and multiple physical malformations. Most of the cases reported to date are sporadic, but instances of familial transmission have been documented with a likely autosomal dominant mode of inheritance. Recently, exome sequencing revealed that mutations in the MLL2 gene underlie a significant proportion of KS cases. However, this discovery was made nearly 30 years after KS was initially described. Reasons for this delay include the rare and sporadic nature of KS and limitations of genotyping techniques available. Exome sequencing has proven to be a worthy method of disease gene discovery, but it is not without its limitations, evident during the study of KS. The discovery of MLL2 mutations is likely to improve our understanding of KS, ultimately translating into better treatment and management of affected individuals. Additionally, this discovery is likely to impact upon our approach towards the study of other rare Mendelian disorders.
Kabuki syndrome (KS; MIM 147920), also known as Kabuki make-up syndrome or Niikawa-Kuroki syndrome, is a rare disorder characterized by distinctive dysmorphic facial features and multiple physical malformations. Affected individuals usually present with signs of mental retardation coupled with skeletal, cardiac, and other visceral abnormalities [1, 2]. The incidence of KS is estimated at 1 in 32,000, and approximately 400 cases have been reported in the literature to date. Although the majority of reported cases are sporadic, more than a dozen instances of parent-to-child transmission have been documented. Analysis of these affected families suggests an autosomal dominant mode of inheritance [4, 5]. Considerable effort has been put into identifying the genetic basis of KS since its initial description in 1981. However, it was only in September 2010 that mutations in the MLL2 gene were reported to underlie KS. This discovery was made using exome sequencing, a relatively new sequencing approach that targets only the protein-coding regions of the genome.
This essay aims to provide a summary of our knowledge of KS prior to the discovery of MLL2 as a causal gene, the process of gene discovery using exome sequencing, and how this may affect our understanding of the condition. Additionally, this essay will examine possible reasons for the lag between the initial description of KS and the discovery of the first causal gene, how exome sequencing has overcome these limitations, and the likely impact that exome sequencing will have on the study of rare Mendelian disorders.
Kabuki syndrome: Pre-2010
KS was first described independently by two Japanese groups in 1981 [1, 2]. They noted a group of patients with five key features consisting of distinctly abnormal facies, mental retardation, dermatoglyphic abnormalities, short stature and skeletal defects. The characteristic facial features include long palpebral fissures with everted lateral lower eyelids, arched eyebrows which are sparse at the lateral one-thirds, a depressed nasal tip and large prominent ears (fig 1). These features are akin to a Japanese Kabuki actor's makeup, hence the suggestion of the term "Kabuki make-up syndrome", although this has since been abbreviated to KS due to potential confusion or offence caused by the former term.
Individuals with KS can present with a spectrum of phenotypes. In 1988, Niikawa et al. published clinical descriptions of 62 affected individuals. Through careful observation, they derived a list of the five most frequently observed symptoms, which they termed the "cardinal manifestations" of KS. The first of these was the presence of peculiar facies, which was noted in all of their patients. The second was skeletal abnormalities including spinal defects and short 5th digits, affecting 92% of patients. The remaining three symptoms involved fingertip pad (dermatoglyphic) abnormalities, mental retardation and postnatal growth deficiency in 93%, 92% and 83% of patients respectively. Additional features have been reported in KS individuals since then, albeit at varying frequencies (table 1). Notable ones include congenital cardiovascular defects, gastrointestinal abnormalities and severe immunodeficiencies, which have the potential to develop into life-threatening complications. However, these are relatively rare. With proper treatment and management, the prognosis for KS individuals is generally positive. Additionally, the condition is not known to adversely affect the fertility of KS individuals, as most female individuals with KS menstruate regularly. Reports of familial cases of KS further indicate that the condition does not necessarily have any immediate effect on reproductive ability.
Generally, the wide spectrum of clinical features observed in individuals with KS suggests that it is a heterogeneous condition with variable expressivity. It has been noted that the phenotypic manifestations of KS vary according to the ethnicity of affected individuals. For instance, features such as hypotonia and joint laxity were more frequently observed in individuals of non-Japanese origin as compared to Japanese patients [8, 9]. Currently, diagnosis of KS is made solely on the basis of clinical observations, although no consensus exists on the minimum diagnostic criteria.
Although the majority of KS cases reported are sporadic, more than a dozen cases of familial transmission involving a single parent have been reported. The ratio of affected males to females is largely constant at 1.16 to 1, suggesting that KS is unlikely to be a sex-linked disorder. Also, separate reports involving maternal and paternal transmission of KS indicate that genomic imprinting does not underlie the development of KS. In light of the above, it was proposed that KS be classified as an autosomal dominantly inherited disorder [4, 11]. The considerable effort put into elucidating the genetic basis of KS is summarized in table 2. This essay will summarize the key illustrative studies carried out to date.
In 1993, Jardine et al. reported a case of KS involving a partial monosomy on chromosome 6q and partial trisomy on chromosome 12q . Since then, additional cases of KS involving cytogenetic abnormalities have been described (fig 2). A notable example would involve tandem duplications in chromosomal region 8p22-p23.1, first identified in a cohort of six KS individuals using comparative genomic hybridization (CGH) and fluorescence in situ hybridization (FISH) . However, multiple follow-up studies failed to uncover this defect in other KS individuals, suggesting that it is not causal for KS [14-18]. Review of reported chromosomal aberrations suggests that defects involving autosomal chromosomes are unlikely to cause KS directly as they all involve distinct chromosomal break points. It was proposed that defects involving the sex chromosomes were more likely to underlie KS. This was supported by the observation that certain individuals with KS present with clinical features similar to those of Turner syndrome, a disorder in which individuals inherit only a single copy of the X chromosome (45X) [3, 19]. The similar incidence of KS among males and females implies that any defects involving the sex chromosomes are likely to be present on pseudo-autosomal regions, or regions of high sequence homology between the X and Y chromosomes. However, it was also noted that individuals with a small ring X chromosome tend to suffer from unusually severe manifestations of KS, and are therefore etiologically distinct from most cases of KS .
Microdeletions involving multiple genes were also hypothesized to cause KS, indicated by the multiple phenotypic manifestations associated with the condition. The DiGeorge/ velocardiofacial chromosomal region within chromosome 22q11.2 was identified as a candidate region due to the presence of congenital cardiac abnormalities in KS individuals . A separate region at chromosome 1q32-q41 was also investigated due to overlapping features with van der Woude syndrome, which results from deletions in that region . However, both studies failed to yield any association between deletions at these sites and the development of KS.
It was also suggested that mutations in genes involved in the RAS-mitogen activated protein kinase (RAS-MAPK) signal transduction pathway might underlie KS. This was thought to be likely as mutations in the same genes were found to cause multiple congenital anomaly syndromes including Noonan and cardio-facio-cutaneous syndromes [23, 24]. However, Sanger sequencing of the relevant coding exons and splice junctions failed to reveal any pathogenic mutations in a cohort of 30 patients, indicating that mutations in these genes were not causal for KS.
In 2009, Kuniba et al. used high resolution molecular karyotyping to identify copy number variations (CNVs) in a cohort of 17 individuals with KS, followed by Sanger sequencing of candidate regions in an additional 41 affected individuals. The GeneChip 250k NspI oligonucleotide array afforded a resolution of 30-100kb, significantly higher than that of previous platforms, which typically resolve up to 1.2Mb. Seven CNVs were detected in total, of which only one involved a protein-coding region. However, this was present in only a single individual and was therefore considered unlikely to be causal for KS.
Essentially, apart from the possible autosomal dominant mode of inheritance observed, minimal progress was made in identifying the genetic basis of KS prior to 2010. Methods such as array CGH and high resolution molecular karyotyping have uncovered numerous chromosomal abnormalities in KS individuals but these have failed to be replicated in follow-up studies. Investigations of microdeletions involving multiple genes or deletions in candidate genes have also failed to provide any insight into the genetic basis of KS.
2010: Exome sequencing reveals MLL2 mutations as the cause of KS
With the advent of higher throughput "next generation" methods of sequencing coupled with novel target enrichment strategies, exome sequencing was proposed to be a potential means of characterizing the genetic basis of rare monogenic diseases. The first proof-of-principle study demonstrating the robustness of exome sequencing was carried out in late 2009, where the technique was used to identify mutations in a previously described gene which causes Freeman-Sheldon syndrome . Shortly after, the same method was used to identify mutations in the DHODH gene as the genetic basis of Miller syndrome . In light of the proven utility of exome sequencing, it was hypothesized that the same method could be used to elucidate the genetic basis of KS.
Despite the rapidly decreasing cost, whole exome sequencing is still not considered to be a feasible method for routine investigation. This is primarily due to the disproportionate amount of time and money required for a single run, and secondly, the lack of suitable bioinformatics resources to accommodate the sheer amount of data generated . As such, various methods have been designed to capture and enrich candidate regions prior to sequencing. Exome sequencing entails capturing and enriching the entire protein-coding region (exome) of the human genome. Although the exome is 30Mb in size, it represents only 5% of the genotyping required for whole genome sequencing . This is considered to be adequate with regards to rare Mendelian disorders such as KS for three key reasons. First, the vast majority of genetic defects underlying Mendelian disorders are known to affect protein-coding regions . Splice site mutations are also known to have significant effects on protein synthesis and are therefore captured and enriched as well. Second, traditional cloning studies targeting protein-coding regions of the genome have demonstrated satisfactory outcomes in identifying variants for monogenic disorders . Third, variants in non-coding sequences are traditionally known to have neutral or insignificant effects on phenotypic presentation. This is largely true even in well conserved regions of non-coding sequences . This is in contrast to non-synonymous variants in coding regions which are mostly deleterious .
Exome sequencing involves three main steps, namely exome capture and enrichment, massively parallel sequencing, and filtering. There are many methods available for exome capture and enrichment, each with its own associated pros and cons. These differ in sensitivity, specificity, uniformity of coverage, cost and the amount of starting sample required. A detailed comparison between these methods is beyond the scope of this essay and can be found in reference X. Similarly, a variety of methods for massively parallel sequencing are available, corresponding to the various commercial platforms on the market. One of the most frequently used platforms is the Illumina Genome Analyzer, which uses the "sequencing by synthesis" method (fig x). Again, a detailed comparison of the various methods available is beyond the scope of this essay and can be found in reference X.
In the case of the KS study, Ng et al. defined their study cohort as ten unrelated individuals from a variety of ethnicities, each with a previous clinical diagnosis of KS. Exome capture was carried out using a hybrid capture based technique involving hybridization of shotgun fragment libraries containing paired-end adaptors to probes on a custom-made microarray surface defined by the RefSeq database. Next, massively parallel sequencing was carried out using an Illumina Genome Analyzer II. This achieved an average of 40X coverage of the targeted exome, generating 6.4Gb of sequence reads per sample. Filtering for rare variants was then carried out against existing databases including the 1000 Genomes Project, dbSNP129, and other curated resources. Assuming a dominant mode of inheritance as previously described, filtering yielded only a single hit common to all ten exomes, the MUC16 gene. This was considered a false positive due to its unusually large size, possibly resulting in limited variant calls recorded in existing databases. Subsequently, less stringent parameters were applied when filtering to allow for reasonable genetic heterogeneity. Analyzing candidate genes shared by subsets of individuals revealed 17, 6 and 3 genes shared among 7, 8 and 9 individuals respectively. These candidate genes were ranked using a variety of methods, including scoring that took into account the presenting phenotypes of affected individuals and the functional consequence of mutations. Eventually, MLL2 emerged as the most likely candidate gene, with seven of the ten individuals with KS harboring either a nonsense substitution or a frameshift insertion or deletion mutation (indel). Filtering results from this study are reproduced in table 3. Subsequent Sanger sequencing of the MLL2 gene in all ten individuals confirmed the seven loss-of-function mutations, and uncovered additional frameshift indels in two of the three individuals in which massively parallel sequencing revealed no mutations.
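The filtering logic described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the KS study: the variant records, the `known` set standing in for databases such as dbSNP, and the gene names `GENE_A`, `GENE_B` and `GENE_C` are all hypothetical. It shows only the two ideas in the text: discard variants already catalogued, then count how many affected individuals carry at least one remaining variant in each gene, relaxing the threshold to allow for genetic heterogeneity.

```python
from collections import defaultdict

def filter_novel(variants, known):
    """Keep variants whose (chrom, pos, alt) key is absent from the known set."""
    return [v for v in variants if (v["chrom"], v["pos"], v["alt"]) not in known]

def carriers_by_gene(per_individual, known):
    """Map each gene to the set of individuals with >= 1 novel variant in it."""
    carriers = defaultdict(set)
    for person, variants in per_individual.items():
        for v in filter_novel(variants, known):
            carriers[v["gene"]].add(person)
    return carriers

def candidates(carriers, min_individuals):
    """Genes carrying novel variants in at least min_individuals people."""
    return sorted(g for g, people in carriers.items() if len(people) >= min_individuals)

# Hypothetical toy data: one catalogued variant shared by everyone,
# plus distinct private variants in GENE_A and GENE_B.
known = {("1", 100, "T")}
common = {"chrom": "1", "pos": 100, "alt": "T", "gene": "GENE_C"}
exomes = {
    "P1": [common,
           {"chrom": "2", "pos": 200, "alt": "A", "gene": "GENE_A"},
           {"chrom": "3", "pos": 300, "alt": "G", "gene": "GENE_B"}],
    "P2": [common,
           {"chrom": "2", "pos": 250, "alt": "C", "gene": "GENE_A"},
           {"chrom": "3", "pos": 310, "alt": "T", "gene": "GENE_B"}],
    "P3": [common,
           {"chrom": "2", "pos": 275, "alt": "-", "gene": "GENE_A"}],
}

carriers = carriers_by_gene(exomes, known)
strict = candidates(carriers, 3)   # gene must be hit in every individual
relaxed = candidates(carriers, 2)  # allow one individual without a hit
print(strict, relaxed)
```

Note that sharing is counted per gene rather than per variant, so individuals with different mutations in the same gene still support the same candidate, which is the situation reported for MLL2.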
Follow-up studies revealed loss-of-function MLL2 mutations in 26 out of 43 additional cases of KS. None of these mutations were present in 190 controls matched for geographical ancestry, further indicating that mutations in the MLL2 gene do indeed underlie a significant portion of KS cases.
MLL2 gene mutations in KS: potential implications
Having identified loss-of-function mutations in MLL2 as the genetic basis for KS in a significant portion of cases, the next step would be to elucidate the role MLL2 plays in human physiology and pathology, specific to KS.
MLL2, or myeloid/lymphoid or mixed lineage leukemia 2, has been previously mapped to chromosome 12q12-q14. It encodes a large protein, Mll2, which is part of the MLL family of proteins. The Mll2 protein is predicted to contain many evolutionarily conserved domains including a cluster of PHD (plant homeodomain) zinc fingers at the N terminal end, and a SET domain at the C terminal end (fig 2). The SET domain is of functional significance as it is known to possess methyltransferase activity specific for lysine residue 4 on histone H3. Mll2 is known to be a crucial regulator of embryonic development and epigenetic control of chromatin states. Mouse models deficient in Mll2 displayed decreased growth, increased apoptosis and delayed development, ultimately resulting in premature embryonic lethality. However, mice with a single normal MLL2 allele displayed no phenotypic abnormalities. Additionally, aberrant MLL2 gene expression has been identified in cancer cell lines and tumours derived from human colon and breast tumour samples.
The role of Mll2 in KS pathogenesis is as yet unclear. Review of the spectrum of MLL2 mutations reported in KS reveals that the majority of mutations result in premature truncation of the transcript prior to the SET domain (fig 2). Also, nearly all the reported missense mutations are located proximal to the C terminus, suggesting that this is a crucial region for Mll2 function. It has been proposed that haploinsufficiency is the likely basis for KS rather than gain of function although this is yet to be verified . Interestingly, a single case involving a chromosomal deletion encompassing regions 12q12-q13.2 has been reported in a child with Noonan syndrome . However, the distal breakpoint in this individual was located approximately 700kb proximal to MLL2.
Nevertheless, it is hoped that this initial discovery of MLL2 mutations in a subset of individuals with KS will serve as a stimulus for functional studies into the role of Mll2 in human physiology and pathology. Additionally, this association will impact upon the management of individuals with KS in the short to medium term. Diagnosis of KS has traditionally been made solely on the basis of clinical features, a potential challenge due to the phenotypic heterogeneity associated with this condition. Tentative diagnoses can now be confirmed by screening for mutations in the MLL2 gene. With regard to familial cases of KS, prenatal testing for known MLL2 mutations can be initiated to establish whether subsequent children are affected by this condition. As more cases of MLL2 mutations associated with KS are reported, it is likely that any existing genotype-phenotype correlations for the condition can be further refined, providing a valuable resource for genetic counseling for affected individuals. In the long term, it is hoped that this discovery will aid in the development of novel gene therapies for the treatment of individuals with KS.
MLL2 gene mutations in KS: why did it require 30 years?
In retrospect, nearly 30 years elapsed between the initial description of KS in 1981 and the discovery of MLL2 mutations. This can be partly attributed to the nature of KS, being a rare and heterogeneous condition. Also, conventional methods for gene discovery including linkage analysis and homozygosity mapping were not suited to studying KS, because the majority of cases arise spontaneously and only a small number of familial cases have been reported. However, these constraints were overcome with the advent of exome sequencing. It is thus appropriate to examine the advantages of exome sequencing, and why it was suitable for investigating the genetic basis of KS.
Advantages of exome sequencing
In addition to the time and cost effectiveness described earlier, there are several other merits associated with exome sequencing that were integral to the discovery of MLL2 mutations in KS. In contrast to linkage analysis and homozygosity mapping, prior linkage information is not essential for exome sequencing. The paucity of linkage information proved to be a major hindrance in identifying the genetic basis of KS, resulting from the fact that the majority of KS cases are sporadic in nature. Indeed, there were no established candidate gene regions described for KS prior to the discovery of MLL2 mutations.
Also, exome sequencing only requires a relatively small number of affected individuals. This is in contrast to the huge number of affected individuals and pedigrees required in linkage analysis for sufficiently powered statistical analyses. KS is a rare condition with a low incidence of familial transmission. It is therefore conceivable that there would have been significant difficulty in obtaining sufficient samples from affected individuals, and in familial instances, family members. Tellingly, only ten affected individuals were required to identify MLL2 mutations in KS using exome sequencing .
Exome sequencing also affords a high resolution of sequencing, down to a single base. This is in contrast to conventional methods of investigation such as FISH, which are prone to ambiguous results causing potential misinterpretation. Specific to KS, the paracentric inversion at chromosomal region 8p22-p23.1 originally reported in an affected individual was subsequently found to be false after re-examination of the original data . This arose because of cross hybridization resulting from a genomic segmental duplication present in the FISH probe used. This example underscores the importance of accurate and unambiguous results afforded by exome sequencing. It has been recognized that there is a potential error rate associated with the massively parallel sequencing involved . However, with sufficient coverage and application of quality control checks, the chance of such errors arising can be minimized.
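The relationship between coverage and the chance of missing a variant can be made concrete with a simple binomial model. The sketch below is an illustration under stated assumptions, not the variant-calling method used in the KS study: it assumes a heterozygous site where each read independently carries the variant allele with probability 0.5, ignores sequencing and alignment error, and uses a hypothetical calling rule requiring at least three supporting reads.

```python
from math import comb

def prob_at_least(k, depth, p=0.5):
    """P(X >= k) for X ~ Binomial(depth, p): the chance that at least k of
    `depth` reads carry the variant allele when each does so with probability p."""
    return sum(comb(depth, i) * p**i * (1 - p)**(depth - i)
               for i in range(k, depth + 1))

MIN_ALT_READS = 3  # hypothetical calling threshold, chosen for illustration only

for depth in (8, 20, 40):
    detect = prob_at_least(MIN_ALT_READS, depth)
    print(f"depth {depth:>2}: P(>= {MIN_ALT_READS} variant reads) = {detect:.6f}")
```

Under these assumptions the probability of observing enough supporting reads rises quickly with depth. Because coverage is uneven across the exome, an average of 40X can still leave individual regions at low depth, where a real variant may go undetected.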
Another advantage associated with exome sequencing is that it enables relatively hypothesis-free genotyping in contrast to conventional methods of disease gene detection. It may be argued that genotyping only the exome relies on the assumption that the causal variant is located in a protein-coding region. However, the 30Mb genotyped is still considerably more comprehensive compared to the thousands of bases genotyped in candidate gene sequencing. Conventional gene sequencing relies on the identification of candidate regions or genes based on prior linkage data, or hypotheses built upon phenotypic features resembling other known conditions. In the study of KS, chromosomal regions 22q11.2 and 1q32-q41 were identified as candidate regions due to the presence of congenital cardiac abnormalities and features reminiscent of van der Woude syndrome respectively [20, 21]. Genes involved in the RAS-mitogen activated protein kinase (RAS-MAPK) signal transduction pathway were also identified due to their prior association with other congenital anomaly syndromes . Eventually, it was found that none of the above postulated genes were implicated in the pathogenesis of KS. Conversely, the MLL2 gene had no prior relation to any congenital abnormalities or diseases resembling KS, and was only discovered with the use of high throughput exome sequencing.
Limitations associated with exome sequencing
Exome sequencing is clearly an appropriate method for finding genes underlying rare Mendelian disorders when compared to conventional methods. However, it is not without its limitations and some of these were evident during the KS study. It is imperative that these constraints are factored into consideration when interpreting any data generated.
Exome sequencing rests on a few assumptions. In this case, these include KS being a monogenic condition, and any pathogenic variants being located within coding regions of the genome. Despite the identification of MLL2 mutations in a subset of affected individuals, results from the study indicate that neither assumption necessarily holds. Exome sequencing failed to reveal any pathogenic MLL2 mutations in three of the ten affected individuals with KS, and Sanger sequencing demonstrated that 17 of the additional 43 KS cases also did not harbor any MLL2 mutations. This implies that there are additional variants, possibly in non-coding regions of the genome, that underlie cases of KS. Indeed, contrary to prior understanding, there is an increasing pool of evidence indicating that variants in non-coding regions can be implicated in human disease.
Also, there are technical limitations inherent in current methods of exome sequencing. Firstly, the scope of sequencing is restricted by the definition of the exome used in the design of probes on the microarray surface for capture purposes. Any genetic variants located outside of the pre-defined exome regions will therefore be undetectable. Most initial studies used the consensus coding sequence (CCDS), a database of highly conserved protein-coding genes, to define the protein-coding region of the genome. However, this was expanded to include the RefSeq database when investigating KS. This proved to be a fortuitous decision as MLL2 was not included in the CCDS database, and would therefore not have been identified as a causal gene for KS using the previous exome definition [6, 38]. Secondly, the existing technology used in massively parallel sequencing is associated with an inherently high error rate of base calling. Misalignments can also adversely affect the accuracy of sequence reads. Hence there is a need for adequate coverage combined with consideration of specific error profiles. In the KS study, exome sequencing failed to reveal MLL2 mutations in three of the ten affected individuals. Sanger sequencing eventually revealed frameshift indel mutations in two of these three individuals, demonstrating the limited accuracy of exome sequencing. Thirdly, the filtering strategies used may fail to exclude false-positive candidate genes. These may arise for a variety of reasons including the existence of pseudogenes affecting sequence alignment, misinterpretation of pathogenic variants as polymorphisms, or simply systematic errors involved in variant calling. For instance, the initial round of filtering in the KS study revealed MUC16 as the only candidate gene common to all ten affected individuals. Upon manual inspection, this was discovered to be a false positive due to its large size, possibly resulting in a high number of variants detected in all affected individuals.
Having identified MLL2 mutations as the genetic basis of KS in a significant proportion of individuals, it is hoped that this will prove to be the catalyst for future work into understanding the pathogenesis of this condition. The establishment of genotype-phenotype correlation will also be possible as greater numbers of mutations are reported, although this is not necessarily straightforward due to the potential presence of genetic or epigenetic modifiers. As gene therapy gradually becomes closer to reality, it is anticipated there will eventually be more effective treatments available for afflictions associated with this condition.
The discovery of MLL2 mutations using exome sequencing is also likely to have implications beyond KS. It is hoped that the application of exome sequencing will accelerate the rate of gene discovery for many Mendelian disorders of as yet unknown etiology. Also, beyond the field of rare Mendelian disorders, exome sequencing can potentially be applied to the study of cancer genomics.
It is important to acknowledge that MLL2 mutations only underlie 66% of the studied cases of KS. The genetic basis for the remaining 34% of KS cases is as yet unknown. It is possible that the causal gene was missed due to one of the limitations discussed earlier. Also, it is plausible that the causal variant is situated in a non-coding region of the genome. If the second scenario is indeed the case, it is unlikely that any of the routinely used sequencing or linkage methods will be able to detect this. Whole genome sequencing is still not a mainstream method of investigation, mainly due to the high cost and time required per run. However, as the cost involved decreases and new strategies for interpreting data are implemented, it is conceivable that whole genome sequencing will eventually replace the need for exome capture and enrichment altogether. It remains to be seen if this will indeed be the case, and if it will uncover the causal variants underlying the remainder of KS cases.
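The 66% figure can be reproduced from the counts given earlier in this essay, assuming it pools the ten exome-sequenced individuals (seven mutations found by exome sequencing plus two more by Sanger sequencing) with the 43 follow-up cases (26 mutations). The short calculation below makes that arithmetic explicit; the variable names are illustrative only.

```python
# Counts as stated in this essay.
exome_cohort_total = 10
exome_cohort_positive = 7 + 2  # 7 by exome sequencing, 2 more by Sanger sequencing
followup_total = 43
followup_positive = 26

positive = exome_cohort_positive + followup_positive
total = exome_cohort_total + followup_total
fraction = positive / total

print(f"{positive}/{total} cases with MLL2 mutations = {fraction:.1%}")
print(f"{total - positive}/{total} cases unexplained = {1 - fraction:.1%}")
```

Pooling the two cohorts gives 35 of 53 cases, which rounds to 66%, leaving 18 cases (about 34%) without an identified MLL2 mutation.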
The discovery of MLL2 as a causal gene for KS is significant at two levels. First, this represents a huge improvement in our understanding of KS as a genetic condition. Since its initial description in 1981, considerable effort has been put into detailing the various phenotypic manifestations present in affected individuals across different ethnicities. However, this is of limited use in terms of making diagnoses of KS. Apart from the characteristic facial presentation, few other features are helpful in making a definite diagnosis of this heterogeneous condition. Additional phenotypic manifestations including mental retardation and other physical abnormalities differ in prevalence according to ethnicity and are thus not as reliable. With the discovery of MLL2 gene mutations, diagnoses of the disease can be made by sequencing the gene for variants. Also, prenatal testing for KS can be carried out for familial cases of the condition. Information regarding the nature of MLL2 mutations can also be useful in establishing genotype-phenotype correlation for this condition. This will be helpful in predicting the severity of the disorder and making prognoses for affected individuals. It is also possible that insight into some of the disease pathways associated with KS can be obtained by studying the role of Mll2 in human physiology and pathology. Routine gene therapy is still a distant reality, but it is hoped that this can one day be applied in the treatment of KS. The ultimate goal would be to translate this discovery into methods of treatment which benefit individuals affected with KS.
Second, this discovery is significant as it represents a paradigm shift in the way we approach the study of rare Mendelian disorders. The study of KS provides a good illustration of how methods of disease gene detection have evolved. Conventional methods such as FISH analysis, array CGH and candidate gene sequencing were all of little utility in identifying the genetic basis of KS. This can be partly attributed to the low incidence and sporadic nature of KS. Also, these techniques are error prone, as exemplified by the misinterpretation of a paracentric chromosomal inversion at 8p22-p23.1 using FISH. Exome sequencing revealed MLL2 mutations as a cause of KS merely 12 months after the first proof-of-principle study was published. Furthermore, samples from only ten affected individuals were required. However, the KS study also highlighted the limitations of exome sequencing, and it is important that these are addressed. While it is conceivable that whole genome sequencing will eventually replace exome sequencing, it is hoped that exome sequencing will nevertheless prove to be a useful tool in elucidating the genetic basis of other unsolved disorders and diseases. The identification of MLL2 gene mutations in KS demonstrates that exome sequencing has the potential to answer these questions.