Phylogenetic Analysis Of Human Immunodeficiency Virus Biology Essay



The genome and proteins of Human Immunodeficiency Virus (HIV) have been the subject of a large amount of research since the discovery of the virus in 1983, two years after the first major cases of illnesses associated with acquired immune deficiency syndrome (AIDS) were reported.


Figure Structure of HIV

HIV is a retrovirus, but its structure differs somewhat from that of other retroviruses. It is around 120 nm in diameter and roughly spherical. HIV-1 is composed of two copies of single-stranded RNA enclosed by a conical capsid comprising the viral protein p24, typical of lentiviruses. The RNA component is more than 9000 nucleotides long. The capsid is in turn surrounded by an envelope derived from the host-cell plasma membrane.

Human immunodeficiency virus type 1 (HIV-1) is an enveloped RNA virus that belongs to the Lentivirinae of the Retroviridae. Analysis of the env genes of HIV-1 strains from different geographic regions reveals that HIV-1 can be divided into three main groups: M (major), O (outlier), and N (new). Group M has been further subdivided into genetically equidistant clusters of env genes, comprising subtypes A to J.

Figure Structure of HIV genome

HIV has several major genes coding for structural proteins that are found in all retroviruses, and several nonstructural genes that are unique to HIV. The gag gene provides the basic physical infrastructure of the virus, and pol provides the basic mechanism by which retroviruses reproduce, while the others help HIV to enter the host cell and enhance its reproduction. Though they may be altered by mutation, all of these genes except tev exist in all known variants of HIV.

gag (group-specific antigen): codes for the Gag polyprotein, which is processed during maturation to MA (matrix protein, p17); CA (capsid protein, p24); SP1 (spacer peptide 1, p2); NC (nucleocapsid protein, p7); SP2 (spacer peptide 2, p1) and p6.

pol: codes for viral enzymes reverse transcriptase, integrase, and HIV protease.

env (envelope): codes for gp160, the precursor to gp120 and gp41, proteins embedded in the viral envelope which enable the virus to attach to and fuse with target cells.

Transactivators: tat, rev, vpr

Other regulators: vif, nef, vpu

tev: This gene is only present in a few HIV-1 isolates. It is a fusion of parts of the tat, env, and rev genes, and codes for a protein with some of the properties of tat, but little or none of the properties of rev.

Figure Envelope glycoprotein gp120

The envelope protein gp120 and its variable regions

Envelope glycoprotein gp120 is a glycoprotein exposed on the surface of the HIV envelope. The 120 in its name comes from its molecular weight of 120 kilodaltons. The envelope protein gp120 initiates the process of cell entry by interacting with the main receptor CD4 and one of the chemokine receptors CCR5 or CXCR4. It is derived from the polyprotein gp160, which also contains the transmembrane protein gp41. This polyprotein is encoded by the env gene, present in all retroviruses. Inconsistent and inaccurate numbering, caused by frequent insertions and deletions, has been a serious problem in the literature on gp120.

HIV mutates frequently, which helps it stay ahead of the immune system. There is, however, a highly conserved region in the virus genome close to its receptor binding site. The glycoprotein gp120 is anchored to the viral membrane, or envelope, via non-covalent bonds with the transmembrane glycoprotein gp41. It mediates entry into cells by binding to CD4 receptors, particularly those on helper T cells. Binding to CD4 is mainly electrostatic, although van der Waals interactions and hydrogen bonds also contribute.

Several studies have shown the usefulness of phylogenetic methods for confirming known relationships among HIV isolates or determining unknown ones. V3 and V4 are the two variable regions of gp120 on which most research has focused. In nearly all published studies, a region of the env gene containing the V3 loop was used for comparison, either alone or in combination with the p17 region of the gag gene or the RT region of the pol gene. In one study only the p17 region was analysed, in another p17 and RT, and in yet another all three regions.

Phylogenetic Analysis


Table Isolates analysed: HIV Type 1; HIV Type 2; AGM / clone GRI-1

In this study, phylogenetic trees of the HIV env gene and of the protein it encodes, gp120, were reconstructed and analysed. Twenty-four samples were collected from the Pfam database entry for gp120, including 13 strains of HIV type 1. For comparison, 6 strains of HIV type 2 and 5 strains of SIV (simian immunodeficiency virus) were also included. Alignments of the V3 loop and the V4 loop were obtained, and the phylogenetic trees of the env gene and gp120 were then reconstructed from the alignments. The 24 isolates are listed in the table above.


The nucleotide sequences of the 24 samples were aligned with ClustalX. The resulting alignment was adjusted manually and then used for phylogenetic tree analysis. The phylogenetic tree was reconstructed with the Neighbor-Joining method.
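Neighbor-Joining works from a matrix of pairwise distances between the aligned sequences. The essay does not say which distance measure was used, so the sketch below assumes the simplest one, the uncorrected p-distance (the proportion of differing sites, skipping columns with a gap). The isolate names and sequences are toy values chosen for illustration, not data from this study.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites among columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if a != b:
            differing += 1
    return differing / compared if compared else 0.0


def distance_matrix(alignment):
    """Build a symmetric dict-of-dicts distance matrix from {name: aligned sequence}."""
    names = list(alignment)
    dist = {name: {name: 0.0} for name in names}
    for x, y in combinations(names, 2):
        d = p_distance(alignment[x], alignment[y])
        dist[x][y] = d
        dist[y][x] = d
    return dist


# Toy alignment (hypothetical sequences, for illustration only).
toy_alignment = {
    "isolate_1": "ACGTACGT",
    "isolate_2": "ACGTACGA",
    "isolate_3": "AC-TTCGA",
}
matrix = distance_matrix(toy_alignment)
for name, row in matrix.items():
    print(name, {other: round(value, 3) for other, value in row.items()})
```

In practice a corrected distance (one that accounts for multiple substitutions at the same site) is often preferred for divergent sequences; the p-distance is used here only because it is the easiest to follow.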

To obtain an estimate of genetic variation in each domain (V3, V4) of the env gene, an analysis was performed on the complete protein sequence alignment. This analysis counts the number of polymorphic sites in an overlapping window of nine amino acids moved along the protein sequence alignment.
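The window analysis described above can be sketched as follows. This is a minimal illustration, assuming that a site counts as polymorphic when the aligned sequences show more than one distinct character in that column (gaps included) and that the window advances one residue at a time; the essay specifies neither detail. The three short sequences are toy values, not sequences from the 24 samples.

```python
def polymorphic_sites_per_window(sequences, window=9):
    """Count polymorphic columns in each overlapping window of an alignment.

    sequences: list of equal-length aligned protein sequences.
    Returns one count per window start position (step of one residue).
    """
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same aligned length")
    # A column is polymorphic if it contains more than one distinct character.
    polymorphic = [len(set(column)) > 1 for column in zip(*sequences)]
    return [
        sum(polymorphic[start:start + window])
        for start in range(length - window + 1)
    ]


# Toy alignment (hypothetical): columns 4 and 9 vary, all others are invariant.
toy_sequences = [
    "CTRPNNNTRKSI",
    "CTRPNNNTRKSI",
    "CTRPGNNTRRSI",
]
window_counts = polymorphic_sites_per_window(toy_sequences, window=9)
print(window_counts)  # [1, 2, 2, 2]
```

Plotting these counts against the window start position gives a variation profile in which variable loops appear as peaks and conserved stretches as troughs.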

Figure 5 Multiple alignment of V3 loop derived from the 24 samples.

Figure Alignment of the 24 samples. Conserved and variable regions can be seen in the alignment; the V3 and V4 loops are boxed.

The V3 loop of the outer envelope glycoprotein gp120 of HIV-1 is the domain with the highest variation rate; however, it is also relatively conserved, especially at its beginning and end. In HIV-1, HIV-2, and SIV alike, there is an invariant cysteine at both the beginning and the end of the loop, and these two residues can form a disulfide bridge. Differences between HIV-1 on the one hand and HIV-2 and SIV on the other can also be observed.
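The observation about the flanking cysteines can be checked mechanically on an aligned loop region. The sketch below is an illustration only: the function names are hypothetical, and the three short peptides are toy sequences, not V3 loops from the 24 samples.

```python
def has_flanking_cysteines(loop_sequences):
    """True if every aligned loop sequence starts and ends with cysteine (C)."""
    return all(seq[0] == "C" and seq[-1] == "C" for seq in loop_sequences)


def invariant_columns(loop_sequences):
    """Map column index -> residue for columns identical in all sequences (gap-only columns excluded)."""
    return {
        index: column[0]
        for index, column in enumerate(zip(*loop_sequences))
        if len(set(column)) == 1 and column[0] != "-"
    }


# Toy aligned loop sequences (hypothetical, for illustration only).
toy_loops = ["CTRPGNKTC", "CTRPNNKTC", "CMRPGNKVC"]
print(has_flanking_cysteines(toy_loops))
print(invariant_columns(toy_loops))
```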

The cysteines and the disulfide bridge play an important role in the process of cell entry. The V3 loop binds to the cell surface in a conformation-dependent manner, and its N-terminal domain is responsible for the interaction. It has been observed that V3 loop peptides can enhance entry of the HIV strain from which they derive. Pre-treatment of the target cells with V3 peptides followed by removal of the peptides also enhanced infectivity, indicating that the binding of the peptides to the target cells plays a role in this enhancement. The V3 stem is responsible for gp120 binding to the CCR5 N-terminus. Both the V3 crown and stem are required for soluble gp120 binding to cell-surface CCR5. The V3 crown interacts with residues in the EC of CCR5, most likely ECL2. The V3 crown alone is necessary and sufficient to direct exclusive usage of CCR5 or CXCR4. The V3 stem, despite being able to mediate specific binding to CCR5 Nt sulfopeptides, is not the main determinant of coreceptor usage.

Figure Multiple alignment of V4 loop derived from the 24 samples.


The V4 loop, like V3, has an invariant cysteine at both the beginning and the end, and the two form a disulfide bridge. A consistent feature of this analysis is that for most MIPs, higher levels of variation were observed in the C3 and V4 regions than in the V3 loop, and length variation was also more frequently encountered in V4.

Phylogenetic trees of env gene and gp120

The phylogenetic trees of the env gene and gp120 were reconstructed using ClustalW and PhyloDraw with the Neighbor-Joining method.
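For readers unfamiliar with Neighbor-Joining, the following is a minimal from-scratch sketch of the standard algorithm. It is not the ClustalW implementation used for the trees in this essay, and the five-taxon distance matrix is a small textbook-style toy example rather than data from the 24 isolates. The internal node labels (node1, node2, ...) are hypothetical names introduced for illustration.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Return (node, node, branch_length) edges of an unrooted Neighbor-Joining tree.

    names: list of taxon labels.
    dist:  dict of dicts with symmetric pairwise distances, dist[a][b].
    """
    # Work on a copy so the caller's matrix is left untouched.
    d = {a: {b: float(dist[a][b]) for b in names if b != a} for a in names}
    active = list(names)
    edges = []
    counter = 0
    while len(active) > 2:
        n = len(active)
        # Sum of distances from each active node to all other active nodes.
        totals = {a: sum(d[a][b] for b in active if b != a) for a in active}
        # Join the pair that minimises Q(i, j) = (n - 2) * d(i, j) - r_i - r_j.
        i, j = min(
            combinations(active, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - totals[p[0]] - totals[p[1]],
        )
        counter += 1
        u = f"node{counter}"  # new internal node
        len_i = 0.5 * d[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
        len_j = d[i][j] - len_i
        edges.append((i, u, len_i))
        edges.append((j, u, len_j))
        # Distances from the new node to every remaining active node.
        d[u] = {}
        for k in active:
            if k in (i, j):
                continue
            d[u][k] = d[k][u] = 0.5 * (d[i][k] + d[j][k] - d[i][j])
        active = [k for k in active if k not in (i, j)] + [u]
    # Connect the last two nodes directly.
    a, b = active
    edges.append((a, b, d[a][b]))
    return edges


# Toy additive distance matrix for five taxa (illustration only).
taxa = ["a", "b", "c", "d", "e"]
rows = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_dist = {x: {y: rows[r][c] for c, y in enumerate(taxa)} for r, x in enumerate(taxa)}
edges = neighbor_joining(taxa, toy_dist)
for node_a, node_b, length in edges:
    print(f"{node_a} - {node_b}: {length}")
```

Because the toy matrix is additive, the branch lengths along the path between any two taxa in the resulting tree add up to their original distance. Real sequence data rarely behave this cleanly, which is one reason trees are usually assessed with bootstrap support.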

Figure 7 Phylogenetic tree of env gene.

Figure 8 Phylogenetic tree of gp120.

Discussion and conclusion

The V3 loop of the outer envelope glycoprotein gp120 of HIV-1 is the domain with the highest variation rate; however, it is also relatively conserved. Furthermore, the V3 loop is one of the major antigenic epitopes of HIV-1 and is the principal neutralizing determinant (PND). The variation of the same virus strain in different individuals is limited. To some extent, variation in the V3 region does not threaten the survival of HIV, whereas variation beyond what selection pressure permits makes it difficult for the virus to survive. The tetramer at the tip of the V3 loop is the functional and immunogenic domain; even a small change in it can significantly reduce antibody binding.

Since CD4 receptor binding is the most obvious step in HIV infection, gp120 was among the first targets of HIV vaccine research. Efforts to develop HIV vaccines targeting gp120, however, have been hampered by the chemical and structural properties of gp120, which make it difficult for antibodies to bind to it. Because it is only loosely bound to gp41, gp120 can also easily be shed from the surface of the virus and captured by T cells. A conserved region of the gp120 glycoprotein that is involved in the metastable attachment of gp120 to CD4 has now been identified, and this invariant region has been targeted with a broadly neutralising antibody, b12.

Research presented at the 17th International AIDS Conference in Mexico City raised the possibility of a new vaccine based on antibodies that hydrolyze, or cleave apart, the gp120 protein, rendering it incapable of binding to lymphocytes. This binding is the first step in the process of HIV infection. The antibody, IgA, is present in all human beings, but its potential for combating HIV was first recognized in patients with lupus, who exhibited both an abnormal resistance to HIV infection and an abnormally high concentration of IgA. Scientists confirmed that IgA purified from the blood plasma and saliva of HIV-seronegative subjects cleaved gp120 more effectively than the more naturally abundant IgG, which had little or no effect. To combat HIV, IgA could be administered in large doses as a drug to people already infected. Researchers have yet to develop a vaccine that stimulates the body to increase its own production of IgA.

NIH research published in Science reports the discovery of antibodies that bind 91% of HIV-1 strains at the CD4bs region of gp120, potentially offering a therapeutic and vaccine strategy.