Insulin Gene At The Short Arm Biology Essay



The INS gene in humans encodes the precursor of insulin, one of the most intensively studied hormones in the field of complex endocrine disorders. This peptide hormone plays a central role in the hormonal response to changes in blood sugar levels (as a result of normal consumption of dietary carbohydrates). Through indirect activation and inhibition of key enzymes in the glycolysis and glycogenesis pathways, insulin regulates glucose uptake, utilization and storage in specific cells of the body (especially in liver, muscle and adipose tissue). A clear understanding of the synthesis of insulin, and of the biochemical pathways that regulate it positively and negatively, can help us better understand insulin-related disorders (metabolic syndrome and insulin-dependent diabetes mellitus) as well as mutations in the gene sequence.


Gene structure: The location and copy number of the INS gene vary between eukaryotic species; the genome of Mus musculus (mouse) has 2 non-allelic copies of the insulin gene on separate chromosomes (chromosomes 19 and 7), whereas the human genome has a single copy on chromosome 11. The INS gene in humans lies within the IDDM2 locus at 11p15.5-p15.1 [1] (on the short arm of chromosome 11, band 15, sub-bands 5 to 1), as seen in fig.1. The position of INS on chromosome 11 is roughly 2144-2148 kilobases (NCBI).
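The band notation used above can be unpacked mechanically. The following is a minimal sketch in Python, assuming the simple "chromosome, arm, band, sub-band" reading given in the text; the function name parse_band and the returned field names are illustrative choices, not part of any standard library.

```python
import re

# Simple cytogenetic label such as "11p15.5":
# chromosome (1-22, X or Y), arm (p = short, q = long), band, optional sub-band.
BAND_RE = re.compile(
    r"^(?P<chrom>\d{1,2}|X|Y)(?P<arm>[pq])(?P<band>\d+)(?:\.(?P<sub>\d+))?$"
)

def parse_band(label):
    """Split a band label like '11p15.5' into its parts (hypothetical helper)."""
    m = BAND_RE.match(label)
    if m is None:
        raise ValueError(f"not a simple band label: {label!r}")
    return {
        "chromosome": m.group("chrom"),
        "arm": "short" if m.group("arm") == "p" else "long",
        "band": m.group("band"),
        "sub_band": m.group("sub"),
    }

# The two ends of the range quoted for the IDDM2 locus.
print(parse_band("11p15.5"))
print(parse_band("11p15.1"))
```

Both labels resolve to the short arm of chromosome 11, band 15, differing only in the sub-band (5 and 1).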


Figure 1: Insulin gene on the short arm (p) of chromosome 11 (yellow arrow), position 15.5 [2]

Expression: The INS sequence codes for preproinsulin, the first translated product, which undergoes enzymatic modifications before it becomes active insulin [3]. Human INS pre-mRNA has three exons and two introns (non-coding regions), similar to the rat insulin gene 2 [4]. The gene codes for a peptide with three segments: the A chain, the B chain and the connecting (C) peptide. All three are translated as one continuous polypeptide and are later separated by specific enzymes. Exon 1 codes for a signal peptide that facilitates intracellular translocation into the endoplasmic reticulum. Exon 2 codes for the B chain and the front portion of the C chain, and exon 3 codes for the remainder of the C chain and the A chain [5]. The peptide that remains once the signal peptide is removed is proinsulin, which is still an inactive form of insulin with the C chain between the B and A chains. The 3 exons make up about 2200 bp, and the order of the single translated peptide is NH2-Signal-B-C-A-COOH.

IDDM2 locus: Another gene present in the IDDM2 locus is IGF2 (insulin-like growth factor 2), which codes for a growth factor with several functions such as growth and development (evident in Wilms tumor, where IGF2 expression is elevated [6]) and promotion of insulin activity. IGF2 flanks the INS gene downstream [7], and the two are separated by a small non-coding intergenic DNA region 12.6 kb long. INS and IGF2 sometimes undergo contiguous transcription due to different promoter sites and alternate mRNA splicing [8].


Upstream of INS lie many key sequences involved in enhancement, promotion or repression of the gene. These upstream elements enforce the expression and tissue specificity of insulin in organisms. Positive and negative regulation of insulin expression depend on a set of 'boxes' and on negative regulatory elements (NRE) respectively, and both involve combinatorial control to produce a single effect: up- or down-regulation of insulin transcription.

Figure 2: Diagram showing the post-translational processing of preproinsulin: entry into the endoplasmic reticulum via the Sec61 membrane protein translocator, transport through the Golgi apparatus, packaging into secretory vesicles and cleavage by specific proteases, prohormone convertase 1-3 and carboxypeptidase E (cpE) [9]

At the furthest 5'-flanking upstream regulatory sequence of INS (-363) is a highly polymorphic variable number tandem repeat (VNTR), also known as the insulin-linked polymorphic region (ILPR). The 14 bp repeat sequence (ACAGGGGTGTGGGG) has 11 variants, and alleles fall into 3 classes depending on the number of repeats: Class I has the smallest repeat number (26-63 repeats), Class II an intermediate number (~80 repeats) and Class III the largest (141-209 repeats). Population studies of single nucleotide polymorphisms (SNP) of the ILPR across several ethnic groups have shown some correlation between VNTR alleles and predisposition to polygenic disorders like obesity [10]. Class I VNTR alleles have been shown to be statistically prevalent in Type I (autoimmune) diabetic patients, most likely through low thymic insulin expression; thymic expression is crucial for immune self-tolerance to pancreatic insulin [11]. Class III alleles have shown correlation with high fat deposition in infants as well as with high diastolic blood pressure (hypertension). Studies hypothesize that the high number of repeats in the ILPR enhances insulin promoter activity, which ultimately leads to increased uptake and utilization of glucose, lipogenesis (fat deposition), glycolysis and glycogenesis [12].
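The class boundaries quoted above can be expressed as a small lookup. This is a rough Python sketch under two stated assumptions: only exact copies of the 14 bp unit are counted (the 11 sequence variants mentioned in the text would be missed), and every repeat number between the Class I and Class III ranges is treated as Class II, since the text only gives "~80" for that class. The function names are hypothetical.

```python
MOTIF = "ACAGGGGTGTGGGG"  # 14 bp ILPR repeat unit quoted in the text

def count_tandem_repeats(sequence, motif=MOTIF):
    """Longest run of back-to-back exact copies of motif in sequence."""
    best = 0
    for start in range(len(sequence)):
        run = 0
        while sequence.startswith(motif, start + run * len(motif)):
            run += 1
        best = max(best, run)
    return best

def classify_vntr(repeats):
    """Map a repeat number to the VNTR class ranges given in the text."""
    if 26 <= repeats <= 63:
        return "Class I"
    if 63 < repeats < 141:
        # Assumption: the text only says "~80" for Class II.
        return "Class II"
    if 141 <= repeats <= 209:
        return "Class III"
    return "unclassified"

# Synthetic allele: 30 exact repeats with arbitrary flanking bases.
allele = "TT" + MOTIF * 30 + "GG"
n = count_tandem_repeats(allele)
print(n, classify_vntr(n))
```

The synthetic allele is built only to exercise the functions; it is not a real ILPR sequence.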

Read-through Transcription of INS-IGF2: Monk et al. (2006) studied read-through transcription across these two genes using the Expressed Sequence Tags (EST) database and real-time PCR confirmation experiments, and discovered 2 alternative transcripts, or isoforms, of INSIGF mRNA in which INS exons 1 and 2 are spliced to the 5' end of IGF2 exons (giving a short or long isoform depending on how many IGF2 exons are included). Since INS exon 3 is not included in this splice, part of the C peptide, the whole of the A chain and the INS translational stop codon are missing from the sequence. This explains why the open reading frame continues into the flanking IGF2 exons. The function of the INSIGF transcript is still being studied; the longer isoform, with IGF2 exons 2, 3, 7, 8 and 9, is said to be bicistronic for both genes. The two isoforms are being studied for their correlation with cancer and metabolism via genomic imprinting.

Figure 3: Upstream regulatory region of the human INS gene. It contains several cis-acting elements with consensus sequences that either enhance or repress the basal transcription machinery when bound by an appropriate transcription factor (TF). Diagram from Hay & Docherty (2006): Comparative Analysis of Insulin Gene Promoters.

INS gene regulation: The INS gene, like most eukaryotic genes, is controlled by several transcription factors and regulatory sequences (fig.3). These sequences lie upstream of the coding sequence and work synergistically to vary the effective amount of gene product (preproinsulin). The regulation of INS ensures expression specific to pancreatic beta cells. Peptide signals (insulin/insulin-like growth factors) trigger various secondary messengers depending on the type of receptor. Steroid hormone-receptor complexes (glucocorticoids), which have DNA-binding domains specific to certain response elements found in the upstream regulatory sequences, also regulate INS expression. INS gene regulation has been studied extensively in humans as well as in animal models, most commonly rats (insulin I and II). Human INS and the two rat INS genes have very different DNA sequences, and some proteins that bind to DNA in rats do not bind identically, or are not even present, in humans (for example, humans have the ILPR variable sequences and a negative regulatory element, the Z box).

Positive regulation: The INS gene in humans has three main types of positive regulatory sequence: the E boxes, the A boxes and the cyclic adenosine monophosphate (cAMP) response element (CRE). There are two E box sequences (CANNTG) in the human INS gene: E1 and an E2-like sequence (which resembles the E2 box in the rat genome).
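The E box consensus CANNTG (N standing for any base) lends itself to a simple pattern search. Below is a minimal sketch in Python; the demo sequence is invented for illustration and is not taken from the human INS promoter, and find_e_boxes is a hypothetical helper name.

```python
import re

# CANNTG with N = any of A, C, G, T. The lookahead lets overlapping
# matches be reported as well.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_e_boxes(sequence):
    """Return (0-based position, matched hexamer) for every E box consensus hit."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(sequence)]

demo = "GGCCATCTGCCTTACAGCTGAA"  # made-up sequence
for position, site in find_e_boxes(demo):
    print(position, site)
```

A consensus match only marks a candidate site; whether a given hexamer is actually bound by bHLH proteins cannot be decided from the sequence alone.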

E Box: The E1 box, located upstream at up to -170, binds basic helix-loop-helix (bHLH) transcriptional activator proteins. These tissue-specific proteins form heterodimers and bind to the E1 region. The three bHLH proteins specific to pancreatic cells are NeuroD (BETA2), NeuroD1 and HEB (TCF12), which are also crucial for early pancreatic development. The co-activator p300 is required for these transcription factors to be active: p300 binds to BETA2/NeuroD and HEB, which then promotes the basal transcription machinery. The E2 box-like sequence in humans exclusively binds other proteins: upstream stimulatory factors (USF) 1 and 2. The overall effect of these proteins binding to their respective E boxes is tissue-specific activation of basal transcription, since NeuroD is only expressed in pancreatic and neuroendocrine cells.

A boxes: Multiple A boxes, which contain AT-rich sequences and a core TAAT consensus sequence [13] (except A2), exist upstream of the single INS gene. They bind different transcription factors that are activated by different intracellular cascades or steroid-receptor messengers, but all of them activate the same gene, to different degrees depending on the synergy of the stimuli. The most prominent factor studied is the pancreatic duodenal homeobox-1 (PDX-1) transcription factor, which binds to A boxes [14] (A1 at -82, A3 at -216 and A5 at -319). The A3 box is the most conserved sequence among mammalian insulin promoters. PDX-1 is also a transactivator of islet-specific genes that assist in glucose uptake (GLUT2 transporter) [15]. Activation of the GLUT2 (hepatic/pancreatic glucose transporter 2) gene allows more intracellular glucose to be metabolized and stimulates more INS gene expression, as shown by the end products of mitochondrial oxidative phosphorylation, which drive transcriptional activity [16]. PDX-1 binds to A boxes as a monomer with the p300 co-activator and promotes the basal transcriptional machinery of INS (for example by increasing RNA polymerase binding to the promoter and histone/protein acetylation) [17]. Experiments in gene-disrupted mice, which grew up with an undeveloped pancreas, and in mammalian cell culture (Cre-loxP system) yielded fewer insulin-positive cells and reduced INS and GLUT2 expression [18].

Cyclic AMP response element (CRE): As the name suggests, the CRE responds to cAMP, the secondary messenger of several peptide-ligand signaling mechanisms (G protein cascade). The protein that binds directly to the CRE belongs to the basic region leucine zipper (bZIP) family [19]: CRE binding protein (CREB). One ligand that triggers this cascade is glucagon-like peptide 1 (an incretin), secreted from cells of the ileum in the presence of dietary carbohydrates, proteins and lipids. The incretin binds to a G protein-coupled receptor, which undergoes a conformational change and activates the Gs-alpha subunit; this subunit binds to adenylate cyclase, which converts adenosine triphosphate (ATP) to cAMP. The rise in cAMP then activates cAMP-dependent protein kinase A (PKA). PKA phosphorylates inactive CREB into its active form. Phosphorylated CREB requires a co-activator before it can activate transcription at the CRE: CREB binding protein (CBP), which belongs to the p300 family like the E box transcription co-activators, acts as a co-activator for CREB. The entire complex binds specifically to the CRE to promote the basal transcription machinery of INS [20].

Negative Regulation: The expression of insulin is silenced by the negative regulatory element (NRE) located in the Z element (glucose sensing).


The immediate translated product of the INS gene is called preproinsulin (110 amino acids). It is inactive in this state and requires a few biochemical steps before it becomes functional. Active insulin has a half-life of 10 minutes and is usually degraded in the liver or kidneys [21].

Biosynthesis: Mature INS mRNA is translated by ribosomes studded on the rough endoplasmic reticulum (ER) of pancreatic beta cells. The coding sequence of the mRNA is such that the leader/signal peptide sequence (24 amino acids) is translated first, followed by the B, C (connecting) and A chains in a single peptide. The leader sequence is necessary to anchor the preprotein in the ER for further processing. The signal peptide is taken in by a translocon, a transmembrane protein channel called Sec61 [22]. Sec61 undergoes a conformational change when the signal peptide binds to it, shuttling the translated product into the ER while the anchored signal peptide remains in the ER membrane. A cleavage sequence between the signal peptide and the B-C-A chain is recognized by signal peptidases present in the ER, which cleave the transmembrane signal peptide from the proinsulin hormone. Now completely inside the ER, proinsulin is free to form its tertiary structure, including 2 disulphide bridges between the A and B chains and 1 within the A chain itself. There are also short cleavage sites between the chains, targeted by different enzymes: prohormone convertases 1 and 2 cleave proinsulin at the B-C and C-A junctions respectively [23]. Carboxypeptidase (an exoprotease) [24] then excises a few further amino acids from the B chain, which finally produces the active hormone: the B chain (30 amino acids) and A chain (21 amino acids) bound together by disulphide bonds. Active insulin, which is roughly 6000 daltons, has three quaternary states: monomer (active form), dimer and hexamer (three dimers held together by zinc ions) [25]. Insulin is brought into the Golgi apparatus to be packed into secretory vesicles, awaiting a stimulus (blood sugar levels).
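The lengths quoted in the text (110 residues in total, a 24-residue signal peptide, a 30-residue B chain and a 21-residue A chain) are enough to lay out the Signal-B-C-A order as residue ranges. The sketch below, in Python, treats everything left between the B and A chains as the connecting segment, cleavage sites included; that bookkeeping choice and the function name are assumptions made for illustration.

```python
TOTAL = 110                      # preproinsulin length quoted in the text
SIGNAL, B_CHAIN, A_CHAIN = 24, 30, 21

def preproinsulin_layout(total=TOTAL, signal=SIGNAL, b=B_CHAIN, a=A_CHAIN):
    """Return (segment, first residue, last residue) in N- to C-terminal order."""
    c = total - signal - b - a   # connecting segment, by subtraction
    layout = []
    start = 1
    for name, length in (("signal", signal), ("B", b), ("C", c), ("A", a)):
        end = start + length - 1
        layout.append((name, start, end))
        start = end + 1
    return layout

for name, first, last in preproinsulin_layout():
    print(f"{name:6s} {first:3d}-{last:3d} ({last - first + 1} residues)")
```

By subtraction the connecting segment comes to 35 residues, and removing the signal peptide leaves an 86-residue proinsulin.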

Secretion: Insulin release via exocytosis is triggered by a change in the cell's membrane potential, coupled to the metabolism of glucose from dietary carbohydrates that reaches the pancreas through the circulation. Dietary glucose enters the beta cells via GLUT2 transporter proteins and undergoes normal glycolysis and the TCA cycle, which yields adenosine triphosphate (ATP). The increase in the ATP/ADP ratio causes the sulfonylurea receptors [26] to activate and close the potassium channels on the cell membrane. The resulting depolarization of the membrane activates the voltage-gated calcium channels, and the calcium influx in turn causes the insulin vesicles to fuse with the plasma membrane, releasing their contents (active insulin) [27]. With the help of the phospholipase C cascade (triggered by hormones from the pituitary gland), inositol-1,4,5-trisphosphate (IP3) binds to receptors on the ER, forming ligand-receptor complexes that open Ca2+ channels on the ER and release more Ca2+ into the cytosol [28].

Insulin mode of action: The biochemical objective of insulin is to reduce blood sugar levels after a high intake of dietary carbohydrates (fed state). The hormone does this by changing the rates of metabolic pathways: the 3 main types of pathway affected by insulin are carbohydrate, lipid and protein metabolism. Since insulin is a peptide hormone (hydrophilic), it is unable to cross the membranes of cells in target tissues; it must interact with specific receptors that generate secondary messengers to carry out its function.

Insulin receptors: Insulin binds to tyrosine kinase receptors on the membranes of target tissues. The receptor is a transmembrane protein dimer with an alpha and a beta subunit on each monomer. The two alpha subunits face the outside of the cell membrane and bind a compatible ligand; the beta subunits are embedded in the phospholipid bilayer, with tyrosine residues that project into the cytosol. When an insulin-receptor complex forms, it causes a conformational change: the beta subunit becomes active and autophosphorylates its tyrosine residues [29]. The beta subunit then proceeds to phosphorylate other intracellular proteins, which carry the insulin signal onward [30].

Carbohydrate metabolism: All cells in our body use glucose as a staple source of ATP, and glucose is the only ATP source for brain cells (except during chronic starvation, when ketone bodies partially substitute for glucose to feed the brain). Insulin promotes cellular uptake of glucose by promoting the transcription of glucose transporter proteins in muscle and adipose tissue (GLUT4; GLUT1 to 3 are insulin-independent transporter proteins) [31]. GLUT2 in hepatocytes has a high Km for glucose and so responds to high glucose levels [32]; the majority of the dietary glucose (60%) that passes through the hepatic portal system is absorbed by the liver and stored as glycogen. Insulin promotes glucokinase (a liver-exclusive glycolytic enzyme with a high Km for glucose [33]), which is not subject to negative feedback inhibition by its immediate product, glucose-6-phosphate (G6P) (unlike hexokinase in non-hepatic cells, which has a low Km for glucose and is allosterically inhibited by G6P [34]). Hence glycolysis proceeds.
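The contrast between a low-Km and a high-Km enzyme can be made concrete with the Michaelis-Menten rate law, v = Vmax [S] / (Km + [S]). The Python sketch below is illustrative only: the Km values and glucose concentrations are assumed round numbers chosen to show the effect, not figures from this essay, and the simple hyperbolic model ignores G6P inhibition.

```python
def mm_rate(substrate_mM, km_mM, vmax=1.0):
    """Michaelis-Menten rate v = Vmax * [S] / (Km + [S])."""
    return vmax * substrate_mM / (km_mM + substrate_mM)

# Assumed, illustrative constants (not taken from the essay).
KM_HEXOKINASE = 0.1    # mM, stands in for "low Km"
KM_GLUCOKINASE = 10.0  # mM, stands in for "high Km"

for glucose in (5.0, 10.0):  # assumed fasting-like and fed-like levels, mM
    hk = mm_rate(glucose, KM_HEXOKINASE)
    gk = mm_rate(glucose, KM_GLUCOKINASE)
    print(f"glucose {glucose:4.1f} mM: "
          f"hexokinase {hk:.2f} Vmax, glucokinase {gk:.2f} Vmax")
```

With these assumed numbers the low-Km enzyme is already close to saturation at both concentrations, while the high-Km enzyme rises from about one third to one half of Vmax as glucose doubles, which is the sense in which a high-Km enzyme stays responsive to high glucose levels.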

Glycogen: Insulin promotes glycogen synthesis by allosterically activating protein phosphatase-1 (PP1) [35], an enzyme that removes phosphate groups from many enzymes. PP1 dephosphorylates glycogen synthase B (inactive) into glycogen synthase A (active). Glycogen synthase then catalyzes the addition of glucose from UDP-glucose to the glycogen polymer in the liver [36].

Insulin also promotes glycogen synthesis through another pathway, which involves the protein kinase B (PKB) cascade. Insulin triggers its tyrosine kinase receptors, which leads to the phosphorylation of many intracellular proteins; for the PKB cascade the relevant one is insulin receptor substrate-1 (IRS1) [37]. IRS1 recruits phosphatidylinositol-3-kinase (PI3K) to the cell membrane. PI3K phosphorylates phosphatidylinositol-4,5-bisphosphate (PIP2) into phosphatidylinositol-3,4,5-trisphosphate (PIP3) [38]. PIP3 allosterically activates PIP3-dependent protein kinase-1 (PDK1), which phosphorylates PKB into its active state. PKB phosphorylates glycogen synthase kinase-3 (GSK-3) into its inactive state [39]. Hence glycogen synthase remains mostly unphosphorylated (active), all the more so in combination with PP1 activity, resulting overall in glycogen synthesis and deposition in liver cells.

Protein metabolism: Insulin promotes protein synthesis by increasing active transport of amino acids into cells as well as by inhibiting protein degradation (for gluconeogenesis). Gluconeogenesis: Insulin signals through G proteins as well, but it does not cause a cAMP cascade via the Gs (stimulatory) subunit; instead it activates the Gi (inhibitory) subunit, which inhibits adenylate cyclase and hence reduces the cAMP level in the cell [40]. Low cAMP levels mean low activity of cAMP-dependent protein kinase A, which means less inactivation of phosphofructo-2-kinase (PFK2), the enzyme that synthesizes fructose-2,6-bisphosphate (F26P) [41]. F26P is an allosteric inhibitor of fructose-1,6-bisphosphatase (FBPase-1) and an activator of phosphofructo-1-kinase (PFK1), key enzymes of the gluconeogenesis and glycolysis pathways respectively. This means that when insulin elevates the level of F26P by keeping PFK2 active, glycolysis is promoted (activated PFK1) and gluconeogenesis is repressed (inhibited FBPase-1). Insulin also inhibits the activities of some gluconeogenic enzymes such as pyruvate carboxylase (which converts pyruvate to oxaloacetate (OAA)) and phosphoenolpyruvate carboxykinase (PEPCK) (which converts OAA to phosphoenolpyruvate (PEP)) [42].
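The chain of activations and inhibitions described above can be checked by multiplying signs along the path: +1 for "activates or raises", -1 for "inhibits or lowers". The following Python sketch encodes only the steps stated in the text (with fructose-1,6-bisphosphatase written as FBPase-1); it is a qualitative bookkeeping aid, not a kinetic model, and the names EDGES and net_effect are hypothetical.

```python
# Signed steps as described in the text: +1 = activates/raises, -1 = inhibits/lowers.
EDGES = {
    ("insulin", "adenylate cyclase"): -1,  # via the inhibitory Gi subunit
    ("adenylate cyclase", "cAMP"): +1,
    ("cAMP", "PKA"): +1,
    ("PKA", "PFK2"): -1,                   # PKA inactivates PFK2
    ("PFK2", "F26P"): +1,
    ("F26P", "PFK1"): +1,                  # glycolysis
    ("F26P", "FBPase-1"): -1,              # gluconeogenesis
}

def net_effect(path):
    """Product of the step signs along a path of consecutive nodes."""
    sign = 1
    for source, target in zip(path, path[1:]):
        sign *= EDGES[(source, target)]
    return sign

core = ["insulin", "adenylate cyclase", "cAMP", "PKA", "PFK2", "F26P"]
print("F26P:", net_effect(core))
print("PFK1:", net_effect(core + ["PFK1"]))
print("FBPase-1:", net_effect(core + ["FBPase-1"]))
```

Two inhibitory steps in series (insulin on adenylate cyclase, PKA on PFK2) cancel out, so the net effect of insulin on F26P and PFK1 is positive and on FBPase-1 negative, matching the conclusion in the text.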

This makes sense from a broad perspective: the role of insulin is to reduce blood glucose levels and prevent hyperglycaemia in the fed state (high carbohydrate diet), so it would be redundant to produce more glucose via gluconeogenesis at the expense of degrading the body's protein.

Lipid metabolism: Insulin also has a role in lipid metabolism. Since there is a high influx of glucose during the fed state, the body must find a more efficient way to store all this glucose, apart from glycogen in the liver. Insulin promotes the conversion of glucose to triacylglycerols (TAG) by increasing the uptake of glucose into adipocytes (via GLUT4, mentioned earlier) and by activating pyruvate dehydrogenase (PD) (which produces acetyl CoA) and acetyl CoA carboxylase (which produces malonyl CoA, a precursor for lipogenesis) [43].

Since glucose is plentiful during the fed state, there is no need to draw on a less economical energy source such as TAGs through lipolysis and beta-oxidation in adipose tissue. Insulin thus inhibits lipolysis by decreasing the activity of hormone-sensitive lipase (HSL) in adipocytes [44].

Clinical aspects

Animal Models: The Akita mouse model [45] was developed by Yoshioka and colleagues. It is a monogenic mouse with pancreatic beta cell dysfunction, which produces an autosomal dominant diabetic disorder similar to maturity onset diabetes of the young (MODY). Wang and colleagues found a missense mutation in the mouse insulin 2 gene (chr 7): an amino acid change from cysteine to tyrosine at codon 96, which disrupts a disulphide bond and therefore the folding of the hormone. The resulting accumulation of misfolded protein was observed to diminish the amount of functionally mature insulin in the pancreatic beta cells of the mouse model [46]. Further studies by Wang et al. also showed a reduction in pancreatic secretory granules and an enlarged lumen of the endoplasmic reticulum.

Diabetes Mellitus: Some mutations in the structural part of the INS gene have been associated with permanent neonatal diabetes mellitus (PNDM). Stoy et al. (2007) sequenced the INS gene in PNDM patients and found 9 heterozygous missense mutations. One of the 9 mutations (C96Y) was studied in the Akita mouse model, which carries the same mutation. Stoy suggested that the C96Y mutation in the structural gene region leads to poor folding of the prohormone molecule, which is then fated to be degraded in the ER. The outcome would be increased stress on the ER, which leads to beta-cell death [47].

The importance of identifying INS gene mutations is that it helps to screen individuals who are genetically predisposed to non-immune diabetes. Mutations such as those found in the study by Stoy et al. are inherited in an autosomal dominant manner, so the disorder is expressed even in heterozygous individuals [48], [49].