Cryptic Loci Regulator 4 (Clr4) Protein Review



Cryptic loci regulator 4 (Clr4) is a SET-domain histone lysine methyltransferase. It is responsible for methylation of lysine 9 (K9) of histone H3 in the fission yeast Schizosaccharomyces pombe. Methylation of H3K9 is an important component of gene silencing, as methylated histones bind DNA more tightly and so inhibit transcription. This review discusses the structure of Clr4 and how that structure was solved.

The first steps in solving the structure of a protein are the cloning, expression and purification of the protein. The limiting step in determining the three-dimensional structure of a protein is obtaining crystals that are suitable for structure determination: the protein needs to be of high quality and purity, and the crystals need to be single and large. The ultimate goal is to produce an electron density map, which is then used to build an atomic model of the molecule being studied. The likely mode of function of the protein can then be inferred from the structure.

Expression and Purification

Min, Zhang, Cheng, Grewal, and Xu (2002) expressed recombinant Clr4 proteins in XL1 Blue E. coli cells using a pQE30 Xa vector from Qiagen. Recombinant constructs were produced by placing a 6xHis tag at the N-terminus of the protein of interest. A polyhistidine tag (6xHis) is an amino acid motif that consists of a run of histidine (His) residues, typically six, usually at the N- or C-terminus of the protein. The 6xHis tag binds the protein to immobilized nickel ions during affinity chromatography (Mohanty & Wiener, 2004).

Figure 1

Vector map for pQE30 Xa

The pQE-30 Xa vector encodes a Factor Xa Protease recognition site, which is bracketed by the 6xHis-tag coding region on the 5' side and the multiple cloning site on the 3' side. 5'-end cloning using the blunt-end StuI restriction site allows the gene of interest to be inserted directly after the Factor Xa Protease recognition site, without any intervening amino acid codons. Factor Xa Protease cleaves off the 6xHis-tag peptide after the arginine residue of the protease recognition site (Gorlach & Schmid, 1996).
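The cleavage step described above can be sketched in a few lines of Python. This is only an illustration: it assumes the commonly quoted Factor Xa recognition sequence Ile-Glu-Gly-Arg (IEGR), and both the tag leader and the target sequence are made-up placeholders, not the real pQE-30 Xa or Clr4 sequences.

```python
from typing import Tuple

# Assumption: Factor Xa recognition site written as IEGR, cut after the Arg.
FACTOR_XA_SITE = "IEGR"


def cleave_after_factor_xa(fusion: str, site: str = FACTOR_XA_SITE) -> Tuple[str, str]:
    """Split a fusion protein immediately after the first recognition site."""
    idx = fusion.find(site)
    if idx == -1:
        raise ValueError("no Factor Xa recognition site found")
    cut = idx + len(site)  # position just after the arginine
    return fusion[:cut], fusion[cut:]


# Hypothetical N-terminal leader carrying a 6xHis tag (illustrative only).
tag_leader = "MRGS" + "H" * 6 + "GS"
# Placeholder target sequence; NOT the real Clr4 sequence.
target = "MSPKQEEYEV"

fusion = tag_leader + FACTOR_XA_SITE + target
tag_peptide, released = cleave_after_factor_xa(fusion)

print("tag peptide:", tag_peptide)
print("released protein:", released)
```

Because the gene is cloned directly after the recognition site, the released protein in this sketch starts at its own first residue, with no extra amino acids left over from the tag.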

The XL1 Blue E. coli strain allows blue/white colour screening for recombinant plasmids. Blue/white screening is a microbiological technique for detecting successful ligations in vector-based gene cloning. If ligation is successful, bacterial colonies will be white; if not, colonies will be blue. XL1 Blue E. coli cells are endonuclease deficient, which greatly improves the quality of miniprep DNA. They are also recombination deficient, which improves insert stability, and they are a slow-growing strain that yields high-quality DNA (Tu et al., 2005).

Under native conditions the cells can be lysed by sonication or homogenization after treatment with lysozyme. Sonication uses high-frequency sound waves to shear cell membranes. Lysozyme is an enzyme that attacks peptidoglycans, hydrolyzing the glycosidic bonds found in the cell walls of bacteria (QIAGEN, 2003).

Clr4 was purified by Ni-NTA (nickel-nitrilotriacetic acid) metal affinity chromatography. NTA, which has four chelation sites for nickel ions, binds nickel more tightly than metal-chelating purification systems that only have three sites available for interaction with metal ions. The extra chelation site prevents nickel-ion leaching and results in a greater binding capacity and protein preparations with higher purity (Crowe, Masone, & Ribbe, 1996; QIAGEN, 2003).

Clr4 was also purified by Hi-trap Q and Superdex-75 column chromatography. Hi-trap Q is an ion exchange chromatography column, which separates proteins according to differences in ionic charge. Superdex-75 is a gel filtration chromatography column, which separates molecules according to their size or hydrodynamic volume. The purity of the protein can be assessed by SDS-PAGE; a single band should be seen if the protein is pure (QIAGEN, 2003).

Crystallization, Diffraction and Solving the Phase Problem

The Clr4 proteins were concentrated to about 10 mg/ml for crystallization. Crystals were grown using the hanging drop vapour diffusion method, with the well solution containing 0.7-1.5 M ammonium sulfate (Min et al., 2002). In hanging drop vapour diffusion, a droplet containing purified protein, buffer and precipitant is allowed to equilibrate with a larger reservoir (Figure 2). The benefit of vapour diffusion is that it gradually increases the protein and precipitant concentrations in the drop, because water vapour leaves the hanging drop until it reaches equilibrium with the reservoir. The well is sealed, often by fixing the cover slip in place with vacuum grease, so that water is not lost by evaporation to the outside, although controlled evaporation is sometimes used to encourage crystal growth.

Figure 2: Hanging drop set-up

The vapour diffusion method has produced more crystallized macromolecules than any other method and is firmly established as the most widely used in protein crystallization (McPherson, 2004).
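The concentrating effect of vapour diffusion can be illustrated with a rough calculation. The sketch below rests on several assumptions that are not stated in Min et al. (2002): the drop is made by mixing equal volumes of protein solution and well solution, the protein stock contains no precipitant, only water moves between drop and reservoir, and the reservoir is large enough that its concentration does not change.

```python
def equilibrate_drop(protein_mg_ml, reservoir_molar,
                     protein_volume_ul=1.0, well_volume_ul=1.0):
    """Estimate drop composition before and after vapour equilibration.

    Assumes water leaves the drop until its precipitant concentration
    matches that of the (effectively unchanged) reservoir.
    """
    start_volume = protein_volume_ul + well_volume_ul
    protein_start = protein_mg_ml * protein_volume_ul / start_volume
    precipitant_start = reservoir_molar * well_volume_ul / start_volume

    # Solute amounts are conserved, so the volume shrinks in proportion
    # to the rise in precipitant concentration.
    final_volume = start_volume * precipitant_start / reservoir_molar
    factor = start_volume / final_volume

    return {
        "protein_start": protein_start,
        "precipitant_start": precipitant_start,
        "final_volume_ul": final_volume,
        "protein_final": protein_start * factor,
        "precipitant_final": precipitant_start * factor,
    }


# 10 mg/ml protein as quoted in the text; 1.0 M is an arbitrary value
# picked from within the quoted 0.7-1.5 M ammonium sulfate range.
result = equilibrate_drop(protein_mg_ml=10.0, reservoir_molar=1.0)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions a 1:1 drop halves in volume, so both the protein and the precipitant end up at twice their starting concentration in the drop.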

X-ray diffraction data were collected at 95-100 K using CCD detectors at the National Synchrotron Light Source at the Brookhaven National Laboratory. Crystals were small, less than 100 microns thick, and diffracted only to 3.0 Å (Min et al., 2002).

X-ray crystallography is a method of determining the arrangement of atoms within a crystal, in which a beam of X-rays strikes a crystal and diffracts into many specific directions. From the angles and intensities of these diffracted beams, a crystallographer can produce a three-dimensional picture of the density of electrons within the crystal. From this electron density, the mean positions of the atoms in the crystal can be determined, as well as their chemical bonds, their disorder and various other information (Ilari & Savino, 2008).
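The link between diffraction angle and the spacing that can be resolved is given by Bragg's law, nλ = 2d sin θ. The short sketch below applies it to the 3.0 Å resolution limit quoted above. The X-ray wavelength of 1.0 Å is an assumed, illustrative value; the wavelength actually used is not given in this review.

```python
import math


def bragg_angle_deg(d_spacing_angstrom, wavelength_angstrom, order=1):
    """Return the Bragg angle theta in degrees from n*lambda = 2*d*sin(theta)."""
    ratio = order * wavelength_angstrom / (2.0 * d_spacing_angstrom)
    if not 0.0 < ratio <= 1.0:
        raise ValueError("no diffraction possible for this spacing and wavelength")
    return math.degrees(math.asin(ratio))


ASSUMED_WAVELENGTH = 1.0  # in angstroms; hypothetical, not from the source
RESOLUTION_LIMIT = 3.0    # in angstroms; from the text

theta = bragg_angle_deg(RESOLUTION_LIMIT, ASSUMED_WAVELENGTH)
print(f"theta = {theta:.2f} degrees, scattering angle 2*theta = {2 * theta:.2f} degrees")
```

Smaller spacings diffract to larger angles, so a crystal that diffracts only weakly at high angles gives a lower-resolution map.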

The crystal structure was first solved using multi-wavelength anomalous dispersion (MAD) data from the three endogenous zinc atoms. This technique can be used if the structure contains one or more atoms, for example metal ions, that cause significant anomalous scattering of incoming X-rays at the wavelength used for the diffraction experiment (Merritt, 2001). SOLVE was used to locate the positions of the three zinc atoms and to phase the data to 3.0 Å resolution. SOLVE is a program that can automatically carry out all the steps of macromolecular structure determination, from scaling data to calculating an electron density map (Terwilliger, 2006). RESOLVE was then used to improve the electron density map. RESOLVE uses a statistical approach to combine experimental X-ray diffraction information with knowledge about the expected characteristics of an electron density map of a macromolecule (Resolve, 2006).

Structure and Function

Histone lysine methylation is functionally complex, as it can either activate or repress transcription, depending on which lysine in the histone is methylated and on the methylation state of the ε-amino group of the target lysine. The nucleosome, built from two copies each of four core histone proteins (H2A, H2B, H3, and H4), is the primary building block of chromatin. Histones have been shown to be dynamic proteins that undergo multiple types of post-translational modification. One such modification, methylation of lysine residues, is a major determinant of the formation of inactive regions of the genome, because methylated histones bind DNA more tightly and so inhibit transcription. Histone methyltransferases catalyze the transfer of a methyl group from the cofactor S-adenosyl-L-methionine (AdoMet) to a substrate lysine (Qian & Zhou, 2006).

Clr4 is a SET-domain histone lysine methyltransferase, which includes a conserved SET domain flanked by cysteine-rich pre-SET and post-SET domains. This arrangement is unique to the Suv39/Clr4 family of histone methyltransferases (HMTases), as no other SET-domain proteins have pre-SET domains (Min et al., 2002).

The catalytic domain of Clr4 consists of residues 192-490 and is 299 residues long. Figure 3 below shows the amino acid sequence (primary structure) of the Clr4 protein. Dark blue to light blue lines above the sequence indicate the N-terminal and pre-SET domains. The yellow lines indicate the SET domain and the red lines indicate the post-SET domain.

Figure 3

Left: The primary amino acid sequence of the Clr4 protein

The sequence is from the RCSB Protein Data Bank
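The length quoted for the catalytic domain follows from counting both end residues of the range. A minimal check, using only the residue numbers given in the text (the function name is a hypothetical helper introduced here):

```python
def inclusive_length(first_residue, last_residue):
    """Number of residues in a range that includes both end points."""
    if last_residue < first_residue:
        raise ValueError("last residue must not precede first residue")
    return last_residue - first_residue + 1


catalytic_length = inclusive_length(192, 490)
print("catalytic domain length:", catalytic_length)
```

The same helper gives the size of any other stretch quoted by residue number, such as a conserved motif.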


Kouzarides (2002) suggests that the pre-SET domain may provide the specificity necessary for the SET domain to methylate lysine 9 of histone H3 rather than any other lysine, because three other members of this family (in addition to Suv39, the human equivalent protein) have been identified as histone H3 lysine 9 methyltransferases. The regions flanking the SET domain are not strictly conserved, although they are required for methyltransferase activity (Min et al., 2002). The N- and C-terminal ends of the primary sequence each contain three to four short β-strands, a short helix and several loops that connect these secondary structure elements (Marmorstein, 2003).

The post-SET domain forms an unusual knot-like structure in which a strand threads through a loop region of the SET domain (Figure 4). The knot-like structure brings together two conserved regions of the protein, the active site and the C-terminus (Cheng, Collins, & Zhang, 2005). This structural arrangement differs from a true protein knot, known as a 'topological knot', in that the 'pseudo-knot' arises from the restraints imposed by hydrogen bonds between two segments of the protein chain instead of covalent links (Taylor & Lin, 2003; Xiao, Wilson, & Gamblin, 2003).

Figure 4

Left: A strand diagram of the Clr4 protein showing the knot-like structure, where a strand of the C-terminal (blue) threads under the active site (green) in the SET-domain

A zinc-binding motif is located in the pre-SET domain. It contains nine cysteine residues that are grouped into two segments separated by a region of 28 residues (Cheng et al., 2005). These nine cysteines coordinate three zinc ions to form a triangular cluster, in which each zinc ion is coordinated by four cysteines and the zinc ions share bridging cysteine residues (Figure 5) (Min et al., 2002).

Figure 5

Left: The zinc binding motif located in the pre-SET domain. The three zinc ions are shown in red and the nine cysteine residues in magenta.
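The numbers quoted for the zinc cluster can be checked with simple bookkeeping: three zinc ions with four cysteine ligands each need twelve zinc-sulfur contacts, but only nine cysteines are available, so some cysteines must be shared. The sketch below uses hypothetical labels (C1 to C6 for cysteines bound to one zinc, B12, B13 and B23 for bridging cysteines) and assumes that each shared cysteine bridges exactly two zinc ions; it does not reproduce the real residue numbers.

```python
from collections import Counter

# Hypothetical ligand assignment for a triangular three-zinc cluster.
coordination = {
    "Zn1": ["C1", "C2", "B12", "B13"],
    "Zn2": ["C3", "C4", "B12", "B23"],
    "Zn3": ["C5", "C6", "B13", "B23"],
}

# Count how many zinc ions each cysteine contacts.
usage = Counter(cys for ligands in coordination.values() for cys in ligands)

total_contacts = sum(len(ligands) for ligands in coordination.values())
unique_cysteines = len(usage)
bridging = sorted(cys for cys, count in usage.items() if count == 2)

print("zinc-cysteine contacts:", total_contacts)
print("distinct cysteines:", unique_cysteines)
print("bridging cysteines:", bridging)
```

With this arrangement every pair of zinc ions shares one bridging cysteine, which is consistent with the triangular geometry described in the text.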

Mutagenesis studies (in vivo) have shown that the SET domain and the cysteine-rich pre and post-SET domains in methyltransferase proteins are required for enzymatic activity (Zhang & Reinberg, 2001).

Residues from the SET domain form the core of the protein. The SET domain contains eight β-strands, which form a β-barrel, and two short helices. The β-strands form several small β-sheets surrounding the 'pseudo-knot' structure (Figure 6) (Dillon, Zhang, Trievel, & Cheng, 2005). Within the SET domain there is a large conserved region where the HMTase active site lies. This active site includes residues 406-412 (Arg-Phe-Phe-Asn-His-Ser-Cys) (Figure 7). It has been shown that mutations of the histidine 410 and cysteine 412 residues (in vivo) in the human equivalent protein (Suv39) result in an inactive protein (Min et al., 2002).

Figure 6

Left: Clr4 Strand diagram showing the β-sheets in the SET-domain as red cartoon arrows.

Figure 7

Left: Clr4 strand diagram showing the active site. Green residues are located in a helix, blue residues in a β-strand and black residues in a loop.
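The residue numbers quoted for the active-site motif can be cross-checked against its sequence. The sketch below simply numbers the seven residues given in the text starting from 406, and confirms which positions hold the histidine and cysteine named in the mutation studies. The variable names are illustrative.

```python
MOTIF_START = 406
MOTIF = ["Arg", "Phe", "Phe", "Asn", "His", "Ser", "Cys"]

# Map each residue number in the motif to its amino acid.
numbered_motif = {MOTIF_START + offset: residue
                  for offset, residue in enumerate(MOTIF)}

for number, residue in numbered_motif.items():
    print(number, residue)
```

Numbered this way, position 410 is histidine and position 412 is cysteine, matching the residues whose mutation is reported to inactivate the protein.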

There is a small conserved region containing residues 319-325 on the opposite side of the protein to the active site (Figure 8). These residues appear to stabilize the positioning of the zinc cluster in the pre-SET region. This stabilization is important, because mutation of Arg320 to histidine (in vivo) in S. pombe has been shown to result in an inactive protein (Min et al., 2002).

Figure 8

Left: A strand diagram of Clr4 with the conserved residues in green, in proximity to the zinc ions in red

An open cleft is found adjacent to the active site. Several residues located here are important for efficient enzymatic reactions, suggesting that the cleft may be involved in cofactor or substrate binding (Min et al., 2002). The active site and the cleft are open and solvent exposed, whereas methyl transfer reactions do not usually take place in open, solvent-exposed grooves. Because the post-SET domain is proximal to the active site, it could fold back and provide a 'cover' above the groove to form a solvent-secluded active site (Min et al., 2002).

Figure 9

Left: Clr4 CPK diagram showing the cleft indicated by the arrow

In other SET-domain methyltransferase proteins, aromatic and hydrophobic residues form a channel in the open cleft that bridges two conserved regions. The target lysine is inserted into this channel, and the aromatic residues make van der Waals contacts with the methylene part of the target lysine (Cheng et al., 2005). I hypothesize that the channel in Clr4 could connect the active site residues near the open cleft with the conserved residues that stabilize the zinc binding motif on the opposite side of the protein. The β-barrel could itself act as this channel (Figure 10).

Figure 10

Left: Zinc atoms shown in light blue, conserved residues 319-325 in pink, active site residues in green and the eight β-strands forming the β-barrel linking the two conserved regions in Clr4 shown as dark blue strands


From the structure presented in this review it has been established that the Clr4 protein acts as a SET-domain histone lysine methyltransferase protein. With the increasing interest in histone methylation as a mechanism for gene regulation, we will undoubtedly discover other HMTase proteins and get a better understanding of the role they play in gene silencing.