Mapping Of Secondary Metabolite Genes Biology Essay


This report was prepared as part of the special course "Mapping of secondary metabolite genes in Aspergillus nidulans" at DTU. It is based on the identification of genes involved in secondary metabolite production in Aspergillus nidulans. The gene mapping was done using the Broad Institute and Aspergillus Genome databases.

I would like to thank my supervisor, Michael Lynge Nielsen, for his valuable guidance in carrying out this work.

Archana Bhanu Kiran



Among filamentous fungi, the genus Aspergillus comprises more than 150 species distributed across different ecological niches. The best-studied of these is Aspergillus nidulans. Its sexual life cycle and the availability of its complete genome sequence make it a model organism for research in eukaryotic biology. These fungi synthesize a range of secondary metabolites, including antibiotics such as penicillin, that are of great benefit to mankind.

This project aims to give an overview of all characterized and annotated secondary metabolite genes of Aspergillus nidulans by mapping them onto their respective chromosomes. In addition, well-characterized individual gene clusters and gene clusters annotated on the basis of GO terms are also mapped.


The genus Aspergillus constitutes a large group of fungi with over 185 well-known species. The name Aspergillus was coined in 1729 by the Italian priest and biologist Pier Antonio Micheli (Bennett 2010), because under the microscope the fungus resembled a holy water sprinkler. Today the name is used for a group of moulds that share a common asexual spore-forming structure, the aspergillum. Around one-third of the species in this genus also have a sexual life cycle. The genus is of profound importance in medicine, agriculture and biology. Its members are distributed worldwide across different ecological niches. Most are saprophytes that grow on substrates ranging from plant and animal waste to pesticides and plasticizers (Joseph Heitman). Some members, however, are potent pathogens of humans and plants. One example is Aspergillus fumigatus, which causes invasive pulmonary infections in humans, animals and birds. Aspergillus species also cause fungal sinusitis, asthma and allergic bronchopulmonary infections in immunocompromised individuals. In healthy individuals, disease generally develops only after long-term exposure (reference).

Secondary metabolites of Aspergillus:

Filamentous fungi exhibit many unique characteristics that make them of great interest in research on eukaryotic molecular biology. One of these is the production of secondary metabolites (N. P. Keller et al., 2006). Secondary metabolites are compounds that are not directly involved in the normal growth, development and reproduction of an organism (reference). Instead, they often play a central role in protection against environmental stresses and competing organisms. In contrast to primary metabolites, the absence of these compounds does not cause death, but it can impair the organism's long-term survival and productivity; in some cases it causes no significant change. In fungi, individual secondary metabolites are often confined to a narrow set of species within a phylogenetic group. The role of these compounds in the producing organism is often obscure (reference), but they are useful to mankind in various ways.

According to the literature, 1500 secondary metabolites were isolated between 1993 and 2001, and more than half of these natural products have antibacterial, antifungal or antitumor activity. The discovery of penicillin in 1928 and its development as a medicine around 1940 (N. D. Fedorova et al., 2010) was a major breakthrough in research on fungal secondary metabolites and drew the attention of scientists to these compounds. Since then, many other promising medicines have been isolated from various fungal species, including immunosuppressants, cholesterol-lowering agents (lovastatin), antiviral drugs and antitumor drugs.

However, fungi also produce harmful mycotoxins such as aflatoxin, fumonisin and trichothecene (reference), which are a major threat to the food industry. Apart from mycotoxins, most fungal secondary metabolites are of great benefit to mankind.

The sequencing of fungal genomes has revealed several interesting features of fungal secondary metabolism. First, the genes involved in the production of a particular secondary metabolite are clustered together. Second, fungi have more secondary metabolic pathways than previously expected, which suggests that not all secondary metabolites are produced under standard laboratory growth conditions.

The secondary metabolites are classified into the following four major groups, based on their biosynthetic origin:

Nonribosomal peptides (NRPs):

These are a diverse group of secondary metabolites with a wide range of biological and pharmacological activities. Examples are cyclosporine A (immunosuppressant), vancomycin (antibiotic) and indigoidine (pigment). One exceptional feature of this group is that its members are not synthesized by the normal protein synthesis pathway, i.e. translation of mRNA into peptides or proteins. Instead, they are synthesized by large multienzyme complexes, the nonribosomal peptide synthetases (NRPS). These enzymes provide both an mRNA-free template and the enzymatic activities needed for the synthesis of NRPs (Nuno Banderia et al.). They are capable of incorporating both proteinogenic and non-proteinogenic amino acids. They are arranged as multiple modules, and each module catalyzes one complete cycle of chain elongation by means of three basic reactions: substrate recognition, activation as an aminoacyl adenylate, and covalent binding as a thioester. These reactions are carried out in distinct domains within the module. Each module consists of an adenylation (A) domain, a peptidyl carrier protein (PCP) domain and a condensation (C) domain. The A domain recognizes the substrate and activates it as an aminoacyl adenylate at the expense of ATP. The activated amino acid is then transferred to the PCP domain, where it forms a thioester bond with the cysteamine thiol group of phosphopantetheine. Finally, the adjacent C domain catalyzes the formation of an amide bond between the amino acid bound to the PCP domain and the amino acid of the previous module (Dirk, H., and Keller, N.P. 2007). The peptide chain undergoes further modification by optional enzymes such as epimerase, cyclase and N-methyltransferase.
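The module cycle described above can be sketched in a few lines of Python. This is only an illustrative model under simplifying assumptions, not a biochemical simulation: the Module class, its fields and the assemble function are names invented for this sketch, and the three-residue example (aminoadipate, cysteine and valine, the building blocks of the penicillin precursor tripeptide) is chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Module:
    """One NRPS module (hypothetical model); `substrate` is what its A domain selects."""
    substrate: str
    has_c_domain: bool = True  # an initiation module has no upstream partner to condense with


def assemble(modules):
    """Walk the modules in order and return the peptide as a list of residues."""
    chain = []
    for index, module in enumerate(modules):
        # A domain: recognise the substrate and activate it (ATP is consumed in the cell).
        activated = module.substrate
        # PCP domain: hold the activated residue as a thioester.
        tethered = activated
        # C domain: join the tethered residue to the chain built by the previous module.
        if index > 0 and not module.has_c_domain:
            raise ValueError(f"module {index} cannot condense without a C domain")
        chain.append(tethered)
    return chain


# Illustrative three-module assembly line.
peptide = assemble([
    Module("aminoadipate", has_c_domain=False),
    Module("cysteine"),
    Module("valine"),
])
```

The point of the sketch is that the order of modules, not an mRNA template, determines the order of residues in the product.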


Polyketides (PKs):

Polyketides are another large family of secondary metabolites with high structural diversity. Many of them are used as pharmaceuticals, while others are mycotoxins. They are synthesized by repetitive condensation of an acyl-coenzyme A starter unit with malonyl-CoA extender units, a reaction known as Claisen condensation (Julia).

This biosynthetic reaction is catalyzed by a group of multifunctional enzymes, the polyketide synthases (PKS). Based on their catalytic activity, PKS are categorized into type I, type II and type III (Hsien-chun Lo). In most fungi, polyketide biosynthesis is catalyzed by type I PKS. These PKS molecules are organized into several domains with distinct functions.

Every PKS molecule contains four essential domains: the ketosynthase (KS) domain, the acyl transferase (AT) domain, the acyl carrier protein (ACP) domain and the thioesterase (TE) domain (dirk).

Polyketide synthesis is initiated when the AT domain takes up the activated substrate. The starter unit is then transferred via the ACP domain to the active site of the KS domain. A malonyl extender unit attached to the ACP domain is decarboxylated by the KS domain and reacts with the thioester group of the KS-bound starter unit. The elongated ketide chain is then transferred to the TE domain, which releases the finished polyketide (dirk).

Further β-keto processing reactions are catalyzed by auxiliary domains: the ketoreductase (KR), dehydratase (DH) and enoyl reductase (ER) domains.

In addition to these auxiliary domains, variation in chain length and derivatization of the polyketide carbon skeleton also account for the structural diversity of polyketides (Julia).
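The chain extension and β-keto processing steps lend themselves to a small worked example. The sketch below assumes an acetyl starter unit (2 carbons) and malonyl extender units (3 carbons, of which one is lost as CO2 on condensation); the function names are invented here, and the mapping from auxiliary domains to the state of the β-carbon follows the standard textbook reduction sequence rather than anything specific to A. nidulans.

```python
def polyketide_backbone_carbons(extension_cycles, starter_carbons=2, extender_carbons=3):
    """Carbons in the linear chain after the given number of Claisen condensations.

    Each condensation decarboxylates the extender, so a malonyl unit (3 C)
    contributes only 2 C to the growing chain.
    """
    if extension_cycles < 0:
        raise ValueError("extension_cycles must be non-negative")
    return starter_carbons + extension_cycles * (extender_carbons - 1)


def beta_carbon_state(auxiliary_domains):
    """State of the beta-carbon after one cycle, given the auxiliary domains that act on it."""
    domains = set(auxiliary_domains)
    if {"KR", "DH", "ER"} <= domains:
        return "methylene"  # fully reduced
    if {"KR", "DH"} <= domains:
        return "alkene"     # reduced, then dehydrated
    if "KR" in domains:
        return "hydroxyl"   # keto group reduced only
    return "ketone"         # no processing


# One starter plus eight extension cycles gives an 18-carbon backbone.
carbons = polyketide_backbone_carbons(8)
```

Varying the number of cycles and the set of auxiliary domains used in each cycle is one simple way to see how a single enzyme architecture can yield many different skeletons.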


Terpenes:

Terpenes are a large group of hydrocarbons that are well known as the primary constituents of essential oils. They are biosynthetically derived from a basic 5-carbon unit, isoprene, which has the molecular formula C5H8. Terpenes are built from multiples of isoprene units, with the general formula (C5H8)n, where n is the number of isoprene units. This is known as the isoprene rule or C5 rule. The isoprene units are linked together in head-to-tail or tail-to-tail fashion. Terpenes are classified as hemi-, mono-, sesqui-, di- and triterpenes according to the number of isoprene units in their backbone. Terpenoids are a subclass of terpenes formed by chemical modifications of the basic terpene skeleton, such as oxidation, cyclization and rearrangement. The biosynthesis of cyclic terpenes is catalyzed by a group of enzymes known as terpene cyclases (Identifying functional domains within terpene cyclases using a domain-swapping strategy).
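The isoprene rule is simple arithmetic, as the following sketch shows. The function and dictionary names are invented for illustration, and the formula returned is the idealized (C5H8)n value; real terpenes can deviate from it by a few hydrogens depending on how the units are joined.

```python
# Number of isoprene units for each class named in the text.
TERPENE_CLASSES = {
    1: "hemiterpene",
    2: "monoterpene",
    3: "sesquiterpene",
    4: "diterpene",
    6: "triterpene",
}


def terpene_formula(isoprene_units):
    """Idealized molecular formula (C5H8)n for n isoprene units."""
    if isoprene_units < 1:
        raise ValueError("at least one isoprene unit is required")
    return f"C{5 * isoprene_units}H{8 * isoprene_units}"


def terpene_class(isoprene_units):
    """Class name for n isoprene units, or a fallback for values not listed above."""
    return TERPENE_CLASSES.get(isoprene_units, "not classified in this sketch")


# A sesquiterpene is built from three isoprene units.
formula = terpene_formula(3)
name = terpene_class(3)
```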

Indole Alkaloids:

Indole alkaloids are another class of secondary metabolites, derived from the amino acid tryptophan; the ergot alkaloids are an example. However, other amino acids are also involved in the biosynthesis of this group.

In addition to these four major groups, A. nidulans also produces hybrid metabolites, including NRPS-PKS hybrids (emericellamide), meroterpenoids (austin and austinol) and terpenoid alkaloids, which account for further structural diversity among its secondary metabolites.

A. nidulans as model organism:

Apart from causing disease, Aspergillus species have been used for fermentation since ancient times. In particular, the industrial workhorses Aspergillus niger and Aspergillus oryzae [2] are used for the production of citric acid and of enzymes for the food and feed industries. These species are mainly exploited by three major fermentation companies [1]: Novozymes (A. oryzae), DSM (A. niger) and Danisco Genencor (both organisms). With the advancement of recombinant DNA technology, these species have been developed as host organisms for the overproduction of enzymes and pharmaceutical proteins [1]. Nevertheless, the best-studied species in this genus is Aspergillus nidulans [2]. It is also called Emericella nidulans because it has a perfect (sexual) life cycle. It is an aerobic filamentous fungus belonging to the class Ascomycetes. It has been used in research since 1953, following the pioneering work of Guido Pontecorvo, and has been widely used to unravel the biology of the cell cycle, pathogenicity, drug resistance, and primary and secondary metabolism in eukaryotes. Because of its phylogenetic similarity to other Ascomycetes, studies on this organism have been very useful for understanding the molecular genetics of pathogenic Aspergillus species. It has 8 chromosomes and a genome size of approximately 30 Mb. The genome sequence of this species was made publicly available in 2003 by the Broad Institute. Further, the availability of useful vectors, DNA libraries and selectable markers, together with the sexual life cycle, makes this species amenable to both classical and molecular genetic analysis.

Genome sequencing of Aspergillus nidulans revealed that this species has 50 gene clusters predicted to synthesize secondary metabolites. Bioinformatic analysis estimates that it has 27 polyketide (PKS), 14 nonribosomal peptide (NRPS), 6 fatty acid, 1 terpene and 2 indole alkaloid gene clusters. To date, only a small number of secondary metabolites have been identified: aspyridones A and B, aspoquinolones A-D, penicillin, asperlin, sterigmatocystin, triacetylfusarinine, ferricrocin, shamixanthone, variecoxanthone, terrequinone A, emericellin, dehydroaustinol, ergosterol, peroxyergosterol and cerevisterol.

Aim of the project:

Aspergillus species are very diverse in many respects, including habitat and secondary metabolite production. As mentioned earlier, A. nidulans produces a plethora of secondary metabolites with a diverse range of activities. Unfortunately, most of the genes involved in secondary metabolite production have not been fully explored, because most of them are not expressed under laboratory growth conditions. However, with advances in bioinformatics and genome sequencing, it is possible to annotate gene functions on the basis of sequence similarity. The main objective of this work is to give an overview of all known secondary metabolite genes of Aspergillus nidulans, including individual gene clusters.

Broad Institute:

To understand the evolutionary relationships between Aspergillus species and their adaptation to various ecological niches, the Broad Institute, a non-profit genomic research organization, developed a comparative database for Aspergillus species. The primary aim of this database is to provide an online genomic resource for scientists working on the genetics and molecular biology of Aspergillus species. In collaboration with Monsanto, they first sequenced the model organism A. nidulans. After that, they provided sequence information for other species, including A. clavatus, A. fumigatus, A. flavus, A. niger, A. oryzae, A. terreus and N. fischeri.

Gene information in this data base:

The database provides an extensively curated dataset of Aspergillus gene, protein and sequence information, together with web-based tools for analyzing and exploring the data. In this database, the genomes are sequenced using 454 whole-genome shotgun methodology, and the genes are annotated by combining manual annotation with output from multiple gene prediction programs. Putative functions are assigned to genes on the basis of sequence homology to genes annotated in other genomes. In addition to gene information, the database provides the following web-based tools for comparative analysis between genomes.

Synteny map: displays the co-localization of genes on chromosomes from different species.

Dot plot: allows two different genomes to be compared.

Gene families: allows genes to be selected across the member genomes.

Genome map: displays the gene map with different parameters.

BLAST: blastp (protein query/protein database), blastx (nucleotide query/protein database), blastn (nucleotide query/nucleotide database), tblastn (protein query/nucleotide database).
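The choice of BLAST program listed above depends only on the type of the query and of the database, which can be captured as a small lookup table. This is an illustrative helper, not part of any database interface; the function name and the type labels "protein" and "nucleotide" are assumptions made for this sketch.

```python
# (query type, database type) -> BLAST program, as listed in the text.
BLAST_PROGRAMS = {
    ("protein", "protein"): "blastp",
    ("nucleotide", "protein"): "blastx",
    ("nucleotide", "nucleotide"): "blastn",
    ("protein", "nucleotide"): "tblastn",
}


def choose_blast_program(query_type, database_type):
    """Return the BLAST program for a query/database type pair."""
    try:
        return BLAST_PROGRAMS[(query_type, database_type)]
    except KeyError:
        raise ValueError(
            f"unsupported combination: {query_type!r} query, {database_type!r} database"
        ) from None


# Searching a genome (nucleotide database) with a known protein sequence.
program = choose_blast_program("protein", "nucleotide")
```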

Each gene has an information page containing details about the gene name, location (contig region), length, sequence and transcript information. Tools such as BLAST and the synteny map can also be accessed from this page for further analysis.

In addition to the Broad comparative database, several other resources are also involved in the genome sequencing of multiple Aspergillus species: The Aspergillus/Aspergillosis Website, the Fungal Genetics Stock Center, The Aspergillus nidulans Linkage Map, The Aspergillus nidulans Physical Map, Aspergillus fumigatus resources at TIGR NIAID Microbial Sequencing Centers, and the Central Aspergillus Data Repository (CADRE). The Broad Institute supplements these databases by updating the sequence information from time to time through iterative comparative analyses.

In the present work, annotated genes were identified using the feature search option in the Broad Institute database. Information about secondary metabolite gene clusters was collected from the literature and verified using the Broad database and another promising resource, the Aspergillus Genome Database.


The identification and characterization of novel secondary metabolites produced by fungi can be a great resource for therapeutic drug discovery. The secondary metabolites produced by A. nidulans are highly diverse in structure and pharmacological activity. These fungi mainly express genes for the four major groups of metabolites mentioned earlier. Most of these genes appear to be located in subtelomeric regions.

In this study, genes are annotated on the basis of their Pfam domains. Genes that contain β-ketoacyl synthase (N-terminal), acyl transferase and β-ketoacyl synthase (C-terminal) domains are annotated as PKS genes, whereas genes that contain phosphopantetheine attachment, AMP-binding enzyme and condensation domains are annotated as NRPS genes. No conserved domains were found in the genes involved in terpene and indole alkaloid production. These genes are annotated on the basis of their sequence similarity to characterized alkaloid and terpene genes in the Broad database. The BLAST results for these genes are shown in Table 1. Genes that contain at least one conserved NRPS domain and at least one conserved PKS domain are identified as NRPS-PKS hybrid genes.
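The domain-based annotation rule can be written down as a short decision function. This is a minimal sketch of the rule as described in this section, not the actual search procedure used with the Broad database: the domain labels are plain strings chosen here (they are not Pfam accessions), the third PKS label is assumed to be the C-terminal synthase domain, and annotate is a hypothetical name.

```python
# Domain sets named in the text; the labels are illustrative strings.
PKS_DOMAINS = {
    "beta-ketoacyl synthase N",
    "acyl transferase",
    "beta-ketoacyl synthase C",
}
NRPS_DOMAINS = {
    "phosphopantetheine attachment",
    "AMP-binding enzyme",
    "condensation",
}


def annotate(domains):
    """Assign a gene class from the set of conserved domains found in a gene."""
    found = set(domains)
    has_pks = bool(found & PKS_DOMAINS)
    has_nrps = bool(found & NRPS_DOMAINS)
    if has_pks and has_nrps:
        return "NRPS-PKS hybrid"   # at least one domain from each set
    if PKS_DOMAINS <= found:
        return "PKS"               # all three PKS domains present
    if NRPS_DOMAINS <= found:
        return "NRPS"              # all three NRPS domains present
    return "unassigned"            # fall back to sequence similarity (e.g. BLAST)


# A gene carrying one domain from each set is called a hybrid.
label = annotate({"condensation", "acyl transferase"})
```

Genes that come back as "unassigned" correspond to the terpene and indole alkaloid candidates, which have to be annotated by sequence similarity instead.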

The gene search described above identified 27 PKS, 14 NRPS, 10 NRPS-PKS, 1 indole alkaloid and 8 terpene genes. In addition, the toxin biosynthesis genes AN9220, AN3661, AN11154 and AN10044 were identified; these are closely related to toxin genes in other fungal species. The BLAST results for these genes against other fungal species are shown in Table 2. Moreover, genes involved in the biosynthesis of rapamycin, ferricrocin, triacetylfusarinine and erythromycin were identified on the basis of GO terms.

To give an overview, all of these annotated genes are mapped onto the chromosomes as shown in Scheme 1. In addition to the 8 chromosomes, the unmapped scaffold was also taken into account. Most of the genes are located in telomeric regions. The literature search for characterized gene clusters yielded information about the penicillin, aspyridone, emericellamide, asperfuranone, monodictyphenone, sterigmatocystin, terrequinone, asperthecin and F9775 gene clusters. Scheme 2 shows the mapping of these clusters with their respective contig regions. These gene maps give an idea of the distribution of secondary metabolite genes.
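The mapping step amounts to grouping genes by chromosome and ordering them by position. The sketch below shows one way to do this; the gene identifiers, coordinates, chromosome length and window size are all hypothetical placeholders, and build_gene_map and is_subtelomeric are names invented for illustration, not the procedure used to draw the schemes.

```python
from collections import defaultdict


def build_gene_map(genes):
    """Group (gene_id, chromosome, start, gene_class) records by chromosome.

    Returns a dict mapping each chromosome label to its genes sorted by
    start position; the unmapped scaffold is just another label.
    """
    gene_map = defaultdict(list)
    for gene_id, chromosome, start, gene_class in genes:
        gene_map[chromosome].append((start, gene_id, gene_class))
    return {chrom: sorted(entries) for chrom, entries in sorted(gene_map.items())}


def is_subtelomeric(position, chromosome_length, window):
    """True if the position lies within `window` bases of either chromosome end."""
    return min(position, chromosome_length - position) <= window


# Hypothetical records: identifiers and coordinates are made up.
example_genes = [
    ("gene_2", "I", 500, "NRPS"),
    ("gene_1", "I", 100, "PKS"),
    ("gene_3", "unmapped", 50, "terpene"),
]
gene_map = build_gene_map(example_genes)
```

With real coordinates, applying is_subtelomeric to each entry would give a quick count of how many genes lie near chromosome ends.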

Table columns: Gene ID; Putative gene function; E-value

Surviving entry (putative gene function): Indole-diterpene biosynthesis protein

Scheme 1: Gene map of secondary metabolite genes of A. nidulans

Chromosomes are represented by solid black bars. PKS genes are highlighted in red, NRPS genes in green, NRPS-PKS genes in violet, terpene genes in pink and alkaloid genes in brown. NRPS-like genes are marked with an asterisk (*). The genes and their exact contig regions are described in the appendix.


The present work dealt with the mapping of genes involved in secondary metabolite production in A. nidulans. Gene annotation was based mainly on the presence of Pfam domains and on sequence homology with other characterized genes in the Broad database. However, terpene cyclases are highly variable in sequence, which makes these genes difficult to annotate by bioinformatic methods. The shamixanthone, asperlin, austinol, fusarinine and aspoquinolone gene clusters were not identified by BLAST search. Nevertheless, the present work identified 36 PKS genes, 15 NRPS genes, 2 terpene genes and 1 indole alkaloid gene, along with 4 novel toxin biosynthesis genes.