Primary receptor for HIV




The CD4 antigen acts as the primary receptor for HIV. To understand the human CD4 gene and its evolution, we characterized this gene in detail in six organisms. We compared nucleotide length, the number and size of introns and exons, and the composition of coding versus non-coding bases to identify similarities and differences in the CD4 gene across all six taxa. Phylogenetic analysis reproduced the pattern found for other genes: Homo sapiens and Pan troglodytes were placed in a single clade, and Rattus norvegicus and Mus musculus in another. We then focused on the two primates and aligned their amino acid sequences. The differences between human and chimpanzee were small, and both differed more from the other organisms.


Comparative analysis of genome sequences is expected to play a major part in improving the health of individuals and society following the completion of the Human Genome Project (Collins et al. 2003). Comparative genomics is the direct comparison of the genomic information of one organism against that of another to gain a better understanding of how species evolved and to determine the function of genes and noncoding regions in genomes (Sivashankari and Shanmughavel 2007). It is considered an important aspect of understanding evolutionary relationships across different taxa. With genomic information available for many organisms, it is now easy to compare them from different angles, which also helps in finding unknown genes and the mutations that can affect their function. Such a study can examine homologous and conserved regions in the sequences of different taxa, together with gene length, the number of introns and exons, GC content, and CpG islands, all of which can help in understanding evolutionary effects on genes.

The CD4 antigen, also known as the T4 antigen, is a 55 kD membrane glycoprotein present on the surface of 65% of human T cells; CD stands for "cluster of differentiation". Its most important property is that it acts as the primary receptor for the AIDS virus (Clark et al. 1987). When HIV attaches to CD4 surface proteins, it diminishes the number of T cells, B cells, natural killer cells, and monocytes in the host's blood. The human immunodeficiency virus (HIV-1) infects T lymphocytes via an interaction between the viral envelope glycoprotein gp120 and the CD4 antigen of T helper cells (Schockmel et al. 1992). CD4 binds to relatively invariant sites on class II major histocompatibility complex (MHC) molecules outside the peptide-binding groove, which interacts with the T-cell receptor (TCR).

The literature reports that T4 RNA is expressed not only in T lymphocytes but also in B cells, macrophages, and granulocytes (Maddon et al. 1987). It is also expressed in a developmentally regulated manner in specific regions of the brain. This makes the CD4 gene very important where AIDS is concerned and calls for a detailed evolutionary genomic understanding. In this project, we therefore used a comparative approach to investigate the CD4 gene across six taxa.

Materials and Methods:

The nucleotide sequences of the CD4 antigen from six taxa were retrieved from the NCBI GenBank database during December 2009: Homo sapiens (human), NM_000616.3; Pan troglodytes (chimp), NM_001009043.1; Canis familiaris (dog), XM_850488.1; Bos taurus (cow), NM_001103225.1; Mus musculus (mouse), NM_013488.2; and Rattus norvegicus (rat), NM_012705.1. Gene length, the number of introns and exons, GC content, coding and non-coding regions, and other such characters were obtained. Different bioinformatics tools and algorithms were used to analyze the nucleotide sequences. The CpG program was used to predict CpG islands, and interspersed repeats (SINEs, LINEs, LTR elements, simple repeats) were identified using RepeatMasker (Smit and Green; unpublished data). The secondary structure of the CD4 antigen was predicted with SOPMA (Geourjon and Deléage 1995), which is accessible through a freely available server, to determine the effect of amino acid changes. For phylogenetic analysis, we considered only the amino acid sequences of the selected species. We used ClustalW (Thompson et al. 1994) with default settings for multiple sequence alignment, and PHYLIP 3.5 (Felsenstein 1981) to construct the neighbor-joining phylogenetic tree.
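As an illustration of the sequence characters listed above, the following is a minimal Python sketch of how GC content and a CpG observed/expected ratio could be computed for a nucleotide sequence. It is not the CpG program used in this study. The function names are hypothetical, and the thresholds (length of at least 200 bp, GC content above 50%, observed/expected ratio above 0.6) are the commonly quoted Gardiner-Garden and Frommer criteria, assumed here for illustration only.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: count(CG) * N / (count(C) * count(G))."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def looks_like_cpg_island(seq: str, min_len: int = 200,
                          min_gc: float = 50.0, min_ratio: float = 0.6) -> bool:
    """Apply the assumed length, GC and obs/exp thresholds to one window."""
    return (len(seq) >= min_len
            and gc_content(seq) > min_gc
            and cpg_obs_exp(seq) > min_ratio)


if __name__ == "__main__":
    window = "CG" * 100  # toy 200 bp window, not a real CD4 sequence
    print(gc_content(window), cpg_obs_exp(window), looks_like_cpg_island(window))
```

A real island finder would slide such a window along the gene and merge overlapping hits; the sketch only scores a single window.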

Results and Discussion:

This study characterized and compared the CD4 antigen gene in six organisms, with a special focus on human and chimpanzee. CD4 is known to be the gene responsible for interaction with the HIV envelope glycoprotein gp120. A schematic representation of the coding versus non-coding content of this gene is shown in Figure 1. The number and size of the gene, exons, and introns varied among taxa, as did the location of the gene (Table 1). In four of the six taxa, including human and chimpanzee, the number of exons was 10, and both primates had 9 introns (Table 1). Previous studies indicate that the common chimpanzee (Pan troglodytes) is the closest evolutionary relative of humans (Goodman 1999). Chimpanzees are therefore studied by many scientists for what they can teach us about humans, in terms of both their similarities to and their differences from humans (Tarjei et al. 2005). The size of the CD4 gene in each taxon and the distribution of its exons and introns are shown in Figure 2. Because introns make up the major portion of the nucleotides of the CD4 gene, the size of the gene appears to depend largely on the size of its introns. The chimpanzee CD4 gene had a genomic size of 31,712 bp, which differed in length by approximately 1.22% from the human CD4 gene at 31,326 bp. The most obvious difference between them is intron size: 28,223 bp in human and 30,159 bp in chimpanzee. A total of 277 transcription factor binding sites (TFBS) were identified in the upstream sequence of human CD4, and these were conserved in chimpanzee. A transcription factor binds a TFBS in the promoter region of its target genes in a sequence-specific way, but this contact can tolerate some degree of sequence variation (Mao and Zheng 2006). Such information can help in predicting and verifying noncoding RNA genes, which is an active topic in computational biology (Rivas and Eddy 2001). The program used for finding TFBS was rVISTA 2.0 (Loots and Ovcharenko 2004). Overall, the total gene size differed little between H. sapiens and P. troglodytes, whereas the other four taxa were clearly different in all aspects.
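The length figures reported above can be checked with a few lines of arithmetic. The sketch below uses only the gene and intron sizes quoted in the text; the variable names are hypothetical. It assumes the 1.22% figure is the length gap expressed relative to the chimpanzee gene, since taking it relative to the human gene gives roughly 1.23% instead. Note that this is a difference in gene length, not a measure of sequence divergence.

```python
# Sizes in base pairs, as quoted in the text for the CD4 gene.
human_gene_bp, chimp_gene_bp = 31_326, 31_712
human_intron_bp, chimp_intron_bp = 28_223, 30_159


def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Absolute length gap between the two genes.
length_gap_bp = chimp_gene_bp - human_gene_bp

# Assumption: the gap is taken relative to the chimpanzee gene length.
gap_pct = percent(length_gap_bp, chimp_gene_bp)

# Share of each gene that is intronic.
human_intron_pct = percent(human_intron_bp, human_gene_bp)
chimp_intron_pct = percent(chimp_intron_bp, chimp_gene_bp)

if __name__ == "__main__":
    print(f"length gap: {length_gap_bp} bp ({gap_pct:.2f}%)")
    print(f"intronic share, human: {human_intron_pct:.1f}%")
    print(f"intronic share, chimpanzee: {chimp_intron_pct:.1f}%")
```

With these numbers, introns account for roughly 90% of the human gene and 95% of the chimpanzee gene, which is in line with the statement that gene size depends largely on intron size.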

To infer the evolutionary position of each taxon, a neighbor-joining phylogenetic tree showing branch distances for the CD4 gene was constructed for all six taxa (Figure 3). The figure shows clearly that P. troglodytes and H. sapiens are closely related at the CD4 gene, as are R. norvegicus and M. musculus. For comparison at the whole-genome level, the mouse genome is about 14% smaller than the human genome (2.5 Gb compared with 2.9 Gb), and approximately 40% of the human genome can be aligned to the mouse genome at the nucleotide level (Waterston et al. 2002). The tree further indicates that Bos taurus and Canis familiaris diverged recently from the H. sapiens-P. troglodytes clade.
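To make the tree-building step more concrete, the sketch below shows the selection criterion at the heart of neighbor joining: at each step the method joins the pair (i, j) that minimizes Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k). This is a simplified single step written for illustration, not the PHYLIP implementation, and the distance matrix is a toy example with hypothetical taxa A to E, not data from this study.

```python
def nj_pick_pair(dist):
    """Return ((i, j), q) for the pair with the smallest neighbor-joining Q value.

    dist is a symmetric matrix (list of lists) of pairwise distances
    with zeros on the diagonal.
    """
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    best_pair, best_q = None, None
    for i in range(n):
        for j in range(i + 1, n):
            q = (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
            if best_q is None or q < best_q:
                best_pair, best_q = (i, j), q
    return best_pair, best_q


# Toy distances between five hypothetical taxa A, B, C, D, E.
toy_dist = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]

if __name__ == "__main__":
    pair, q = nj_pick_pair(toy_dist)
    print(pair, q)
```

In this toy matrix D and E have the smallest raw distance, yet the criterion selects A and B first, because Q also accounts for how far each taxon is from all the others. A full implementation would then replace the joined pair with a new internal node, recompute distances, and repeat until the tree is resolved.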

Given the importance of human CD4 in disease pathogenicity and the evolutionary closeness of the mammalian species included in our study, we aligned the CD4 amino acid sequences of human and chimpanzee to locate variations between them. Amino acid changes in the CD4 polypeptide chain were detected. Molecular evolutionary studies in mammals show that small changes in amino acid composition between species can result in large phenotypic variation.
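A comparison of this kind can be sketched in a few lines once the two sequences have been aligned. The function below is a hypothetical helper that lists the positions where two aligned amino acid sequences differ and reports percent identity over the ungapped columns. The sequences in the usage example are invented for illustration and are not the human or chimpanzee CD4 sequences.

```python
def aligned_differences(seq_a: str, seq_b: str, gap: str = "-"):
    """Compare two aligned sequences of equal length.

    Returns (diffs, identity) where diffs is a list of
    (alignment_position, residue_a, residue_b) tuples (1-based) and
    identity is the percentage of identical residues among the
    columns without a gap in either sequence.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    diffs = []
    compared = 0
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a != b:
            diffs.append((pos, a, b))
    identity = 100.0 * (compared - len(diffs)) / compared if compared else 0.0
    return diffs, identity


if __name__ == "__main__":
    # Invented toy peptides with one substitution at position 4.
    diffs, identity = aligned_differences("MKTAYLQWER", "MKTSYLQWER")
    print(diffs, identity)
```

Skipping gapped columns is one possible convention; counting gaps as differences would give a lower identity for sequences with insertions or deletions.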

Secondary structure was predicted using the SOPMA server (Figure 4). The human protein was predicted to contain 17.25% alpha helix, 8.52% beta turns, and 42.14% coils, showing only a small change relative to chimpanzee (Table 2). The results were approximately the same for the human and chimpanzee sequences. Because there was no large difference between them, the conserved regions appear to maintain the characteristic structure of CD4 in both primates.
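Percentages such as those above are simple tallies over a per-residue prediction. The sketch below assumes the prediction is available as a string with one state letter per residue (for example h for alpha helix, e for extended strand, t for beta turn, and c for coil). This encoding and the function name are assumptions made for illustration, and the example string is invented rather than taken from the SOPMA output for CD4.

```python
from collections import Counter


def state_percentages(states: str) -> dict:
    """Return the percentage of residues assigned to each state letter."""
    counts = Counter(states.lower())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {state: round(100.0 * n / total, 2) for state, n in counts.items()}


if __name__ == "__main__":
    toy_prediction = "hhhhcccctt"  # invented 10-residue example
    print(state_percentages(toy_prediction))
```

Running the same tally on the human and chimpanzee predictions would give the kind of side-by-side composition reported in Table 2.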


In this study we present a comparative genomic analysis of the CD4 gene and its evolutionary relationships among the sequenced genomes of Homo sapiens, Pan troglodytes, Canis familiaris, Bos taurus, Mus musculus, and Rattus norvegicus, with a specific focus on Homo sapiens and Pan troglodytes. The CD4 gene showed small differences between the two primates, but both differed more from the other organisms. The analysis of the CD4 gene in the genomes of all selected taxa constitutes a resource for future functional genomic studies.