The Mechanism Behind The HIV Resistance

A very small proportion of people remain HIV-negative after repeated exposure to HIV-1, a phenomenon known as HIV-1 resistance. Understanding the mechanism of HIV-1 resistance is important for HIV-1 vaccine design and Acquired Immune Deficiency Syndrome (AIDS) therapy. In this study, we analyzed the gene expression profiles in CD4+ T cells of HIV-1 resistant individuals and HIV-1 susceptible individuals. We identified 185 discriminative HIV-1 resistance genes using the Minimum-Redundancy-Maximum-Relevance (mRMR) and Incremental Feature Selection (IFS) methods. Virus protein target enrichment analysis of the 185 HIV-1 resistance genes suggests that the HIV-1 protein nef may play an important role in HIV-1 infection. Moreover, we identified 29 infection information exchanger genes among the 185 HIV-1 resistance genes based on virus-host interaction network analysis. These infection information exchanger genes lie on the shortest paths between virus target proteins and are important for the coordination of virus infection. They may be useful for AIDS prevention or therapy: intervening in them may jam the communication between virus-targeted proteins and sabotage HIV-1 infection.

Introduction

People respond differently to HIV-1 exposure. Only a very small proportion of people are resistant to HIV-1 infection and remain HIV-negative after repeated exposure to the virus [1,2,3]. HIV-1 resistant individuals can shed light on the design of an HIV-1 vaccine, which is crucial for containing the spread of HIV. Microarray technology makes it possible to measure the expression of thousands of genes simultaneously, and accumulated knowledge of protein interactions can help explain the mechanisms behind biological problems [4,5,6]. Together, these two resources may help us understand the mechanisms of HIV-1 infection and HIV-1 resistance.

In this study, we analyzed a published data set that included 85 samples from HIV-1 resistant individuals and 50 samples from HIV low-risk negative individuals [7]. The gene expression profiles in CD4+ T cells were measured using the NIA/NIH Human Focused Immune Array 4600. We identified 185 discriminative genes with the Minimum-Redundancy-Maximum-Relevance (mRMR) principle and Incremental Feature Selection (IFS). The prediction accuracy of the 185-gene signature using the Nearest Neighbor Algorithm (NNA) was 85.2%, evaluated by Leave-One-Out Cross-Validation (LOOCV). To interpret the relevance of the 185 genes to HIV-1 resistance, we investigated a virus-host protein interaction network that integrated the HIV-1, human protein interaction database [8] and the STRING database [9]. We found that the 185 genes were enriched among the targets of the HIV-1 protein nef, which may suggest that nef has an important role in HIV-1 infection. In addition, based on network analysis, we identified 29 of the 185 genes whose perturbation may disturb the communication between virus-targeted proteins. These genes lie on the shortest paths between virus-targeted proteins and are important for exchanging information between them and for coordinating virus invasion. Intervening in such genes may jam the communication between virus-targeted proteins and sabotage the infection. They may serve as drug targets for Acquired Immune Deficiency Syndrome (AIDS) therapy or prevention.

Methods

Microarray Dataset

The microarray data used in this work were from Paul J. McLaren's study [7] of HIV-1 resistant individuals and HIV-1 susceptible individuals. The data are publicly available at GEO (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14279). There are 85 samples from HIV-1 resistant individuals and 50 samples from HIV low-risk negative individuals. The NIA/NIH Human Focused Immune Array 4600 was used to measure the gene expression profiles in CD4+ T cells of those samples. After averaging duplicate probes for each gene and applying quantile normalization, we obtained the expression profiles of 1868 genes in 85 HIV-1 resistant samples and 50 HIV-1 susceptible samples.
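The two preprocessing steps named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the input layout (one row per probe, one value per sample) and the arbitrary handling of tied values are all hypothetical choices.

```python
from collections import defaultdict

def average_duplicate_probes(probe_rows):
    """probe_rows: list of (gene, [value per sample]).
    Returns {gene: [mean value per sample]} (hypothetical layout)."""
    grouped = defaultdict(list)
    for gene, values in probe_rows:
        grouped[gene].append(values)
    return {
        gene: [sum(col) / len(col) for col in zip(*rows)]
        for gene, rows in grouped.items()
    }

def quantile_normalize(matrix):
    """matrix: rows are genes, columns are samples.
    Forces every sample (column) to share the same value distribution."""
    n_genes = len(matrix)
    columns = [list(col) for col in zip(*matrix)]
    sorted_cols = [sorted(col) for col in columns]
    # Mean of the k-th smallest value across all samples.
    rank_means = [
        sum(sc[k] for sc in sorted_cols) / len(sorted_cols)
        for k in range(n_genes)
    ]
    normalized_cols = []
    for col in columns:
        # Indices of genes ordered by expression within this sample.
        order = sorted(range(n_genes), key=lambda i: col[i])
        new_col = [0.0] * n_genes
        for rank, i in enumerate(order):
            new_col[i] = rank_means[rank]  # ties are broken arbitrarily here
        normalized_cols.append(new_col)
    return [list(row) for row in zip(*normalized_cols)]

# Toy data, not taken from GSE14279.
averaged = average_duplicate_probes(
    [("A", [1.0, 3.0]), ("A", [3.0, 5.0]), ("B", [2.0, 2.0])]
)
normalized = quantile_normalize([[5, 4, 3], [2, 1, 4], [3, 4, 6], [4, 2, 8]])
```

After normalization every column contains the same set of values, so differences between samples reflect gene ranks rather than array-wide intensity shifts.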

Minimum-Redundancy-Maximum-Relevance (mRMR) feature selection

Minimum-Redundancy-Maximum-Relevance (mRMR) [10] is a widely used feature selection method designed to select the features that best classify the target variable. The features selected by mRMR are as similar as possible to the classification variable while being as dissimilar as possible to each other.

The goal of mRMR is to select features that have minimum redundancy among themselves and maximum relevance to the target variable. Both redundancy and relevance are measured by mutual information (MI), which is defined as:

$$I(x, y) = \iint p(x, y) \log \frac{p(x, y)}{p(x)\,p(y)} \, dx \, dy \tag{1}$$

where $x$ and $y$ are vectors; $p(x)$ and $p(y)$ are the marginal probability densities; and $p(x, y)$ is the joint probability density.

Let $\Omega$ represent the whole vector set, $\Omega_s$ the selected vector set with $m$ vectors, and $\Omega_t$ the to-be-selected vector set with $n$ vectors. The relevance $D$ of a feature $f$ in $\Omega_t$ with the classification variable $c$ can be calculated by:

$$D = I(f, c) \tag{2}$$

The redundancy $R$ of a feature $f$ in $\Omega_t$ with all the features in $\Omega_s$ can be calculated by:

$$R = \frac{1}{m} \sum_{f_i \in \Omega_s} I(f, f_i) \tag{3}$$

To maximize the relevance and minimize the redundancy, the mRMR function integrates Equation (2) and Equation (3):

$$\max_{f_j \in \Omega_t} \left[ I(f_j, c) - \frac{1}{m} \sum_{f_i \in \Omega_s} I(f_j, f_i) \right] \quad (j = 1, 2, \ldots, n) \tag{4}$$

Based on the mRMR principle, a feature set is selected:

$$S = \{ f'_1, f'_2, \ldots, f'_h, \ldots, f'_N \} \tag{5}$$

in which the index $h$ reflects the importance of each feature. A feature that fits Equation (4) better has a smaller index, so $f'_a$ is considered better than $f'_b$ if $a < b$.
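The greedy ranking described in this section can be sketched in a few lines of Python. The sketch assumes the expression values have already been discretized, so that mutual information can be estimated from simple counts; the function names and the toy data are hypothetical, and the original mRMR program [10] may differ in its discretization and tie handling.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information (in nats) between two discrete sequences,
    estimated from empirical frequencies."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum(
        (c / n) * math.log((c / n) / ((px[a] / n) * (py[b] / n)))
        for (a, b), c in pxy.items()
    )

def mrmr_rank(features, target):
    """features: {name: list of discrete values}. Returns all feature
    names ordered by greedy relevance-minus-redundancy selection."""
    relevance = {
        name: mutual_information(values, target)
        for name, values in features.items()
    }
    selected, remaining = [], list(features)
    while remaining:
        def score(name):
            if not selected:
                return relevance[name]  # first pick: relevance only
            redundancy = sum(
                mutual_information(features[name], features[s])
                for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy example: "a_copy" duplicates "a", so it is redundant once "a" is chosen.
toy_features = {
    "a":      [0, 0, 0, 0, 1, 1, 1, 1],
    "a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "b":      [0, 0, 1, 1, 0, 0, 1, 1],
}
toy_target = [0, 0, 1, 1, 1, 1, 2, 2]
ranking = mrmr_rank(toy_features, toy_target)
```

In this toy case the duplicated feature is ranked last even though it is as relevant as the others, which is the behaviour the redundancy term is meant to produce.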

Nearest Neighbor Algorithm (NNA)

In this study, we used the Nearest Neighbor Algorithm (NNA) [11] to classify the samples into the HIV-1 resistant group and the HIV-1 susceptible group. NNA classifies a new sample by calculating its similarity to samples with known classes. The new sample is assigned to the group of its nearest neighbor, that is, the known sample with which it has the largest cosine similarity. The cosine similarity between two vectors $x$ and $y$ is defined as [12,13,14]:

$$\cos(x, y) = \frac{x \cdot y}{\|x\| \, \|y\|} \tag{6}$$
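As an illustration, a nearest-neighbor classifier based on cosine similarity might look like the sketch below. The function names and toy vectors are hypothetical, and the code assumes that no expression vector is all zeros (otherwise the similarity is undefined).

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def nearest_neighbor_predict(train_vectors, train_labels, query):
    """Return the label of the training sample most similar to query."""
    best = max(
        range(len(train_vectors)),
        key=lambda i: cosine_similarity(train_vectors[i], query),
    )
    return train_labels[best]

# Toy vectors, not real expression profiles.
train_vectors = [[1.0, 0.0], [0.0, 1.0]]
train_labels = ["resistant", "susceptible"]
predicted = nearest_neighbor_predict(train_vectors, train_labels, [0.9, 0.1])
```

Because cosine similarity depends only on the direction of the vectors, two profiles that differ by a constant scaling factor are treated as identical.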

Leave-One-Out Cross-Validation (LOOCV)

Leave-One-Out Cross-Validation (LOOCV) has been widely used to evaluate prediction performance [12,13,14,15,16]. During LOOCV, each sample in the data set is used in turn as the test sample and is predicted by a model trained on all the other samples. The prediction accuracy was used to evaluate the prediction performance:

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$

where TP, TN, FP and FN stand for the number of true positive, true negative, false positive and false negative samples, respectively.
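The LOOCV loop can be sketched generically as below. The `predict` callable, the one-dimensional toy classifier and the toy data are all hypothetical; in the study the classifier would be the nearest neighbor algorithm with cosine similarity. Counting correct predictions and dividing by the number of samples gives the same value as (TP + TN) / (TP + TN + FP + FN).

```python
def loocv_accuracy(samples, labels, predict):
    """predict(train_samples, train_labels, query) -> predicted label.
    Each sample is held out once and predicted from all the others."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

def nearest_1d(train_x, train_y, query):
    """Toy 1-nearest-neighbor on scalar values (illustration only)."""
    best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - query))
    return train_y[best]

# Toy data: "R" = resistant, "S" = susceptible.
toy_samples = [1.0, 1.3, 0.8, 3.0, 3.4, 1.25]
toy_labels = ["R", "R", "R", "S", "S", "S"]
accuracy = loocv_accuracy(toy_samples, toy_labels, nearest_1d)
```

Here the samples 1.3 and 1.25 are each other's nearest neighbors but carry different labels, so both are misclassified and the accuracy is 4 out of 6.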

Incremental Feature Selection (IFS)

mRMR only ranks features according to their importance; it does not tell us how many of the top-ranked features should be selected. In this study, Incremental Feature Selection (IFS) [12,13,14,16] was used to decide the optimal number of features. All candidate sets of top-ranked features are tested, and the set that achieves the highest prediction accuracy is chosen as the optimal feature set. The candidate feature subsets can be expressed as:

$$S_i = \{ f'_1, f'_2, \ldots, f'_i \} \quad (1 \le i \le N) \tag{9}$$

where $f'_i$ is the feature ranked $i$-th by mRMR and $N$ is the number of all features. The leave-one-out test is used to obtain the prediction accuracy of each feature set. To visualize the IFS process, we can plot an IFS curve in which the x-axis is the number of features and the y-axis is the prediction accuracy.
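The IFS procedure can be sketched as a simple loop over the top-ranked prefixes of the feature list. The `evaluate` callable stands in for the LOOCV accuracy of a classifier restricted to the given features; it, the gene names and the accuracy values are hypothetical. Picking the smallest subset when several reach the same accuracy is an assumption of this sketch, not something the text specifies.

```python
def incremental_feature_selection(ranked_features, evaluate):
    """ranked_features: feature names in ranked order, best first.
    evaluate(subset) -> prediction accuracy for that subset.
    Returns (optimal subset, IFS curve as (size, accuracy) points)."""
    curve = []
    for i in range(1, len(ranked_features) + 1):
        curve.append((i, evaluate(ranked_features[:i])))
    # max() keeps the first maximum, i.e. the smallest best subset.
    best_size, _ = max(curve, key=lambda point: point[1])
    return ranked_features[:best_size], curve

# Toy accuracies indexed by subset size (made up for illustration).
toy_accuracy_by_size = {1: 0.6, 2: 0.8, 3: 0.8, 4: 0.7}
optimal, ifs_curve = incremental_feature_selection(
    ["g1", "g2", "g3", "g4"],
    lambda subset: toy_accuracy_by_size[len(subset)],
)
```

The returned curve holds the points that would be plotted as the IFS curve, with subset size on the x-axis and accuracy on the y-axis.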

Communication between virus targeted proteins in weighted interaction network

To investigate the virus-host interaction, we downloaded the HIV-1, human protein interaction network from the National Institute of Allergy and Infectious Diseases (http://www.ncbi.nlm.nih.gov/RefSeq/HIVInteractions/) [8] and the human protein interaction network from STRING (http://string.embl.de/) [9], a large database of known and predicted protein interactions. Since the HIV-1, human protein interaction database uses Entrez Gene IDs and STRING uses Ensembl Peptide IDs, we converted the Entrez Gene IDs in the HIV-1, human protein interaction database into Ensembl Peptide IDs and Gene Symbols using BioMart [17]. The ID-converted HIV-1, human protein interactions are available in Table S1.

Dijkstra's algorithm [18] was applied to find the shortest paths between virus-targeted human proteins in the weighted interaction network. The weight of each edge in the protein interaction network was defined as one minus the confidence score in STRING v8.3. Inspired by Freeman's betweenness [19], which measures information flow through a network, we calculated the modified betweenness $B(v)$ of node $v$ in graph $G$:

$$B(v) = \sum_{s, t \in T,\; s \neq t} \sigma_{st}(v) \tag{10}$$

where $s$ and $t$ are nodes from the set $T$ of virus-targeted proteins in the network, and $\sigma_{st}(v)$ indicates whether the shortest path between node $s$ and node $t$ goes through node $v$ (1 if it does, 0 otherwise). If a node has a high modified betweenness, it may be important for exchanging information between virus-targeted proteins. Intervening in such a node may jam the communication between virus-targeted proteins and sabotage the coordination of virus infection.
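A minimal sketch of this computation is shown below. Several details are assumptions of the sketch rather than statements from the text: each unordered pair of virus targets is counted once, only one shortest path is used when several have equal length, the endpoints themselves are not counted, and the protein names and confidence scores are invented toy values.

```python
import heapq
from itertools import combinations

def build_graph(edges):
    """edges: (protein_a, protein_b, confidence) triples.
    Edge weight is one minus the confidence score; graph is undirected."""
    graph = {}
    for a, b, confidence in edges:
        graph.setdefault(a, {})[b] = 1.0 - confidence
        graph.setdefault(b, {})[a] = 1.0 - confidence
    return graph

def dijkstra_path(graph, source, target):
    """Return one shortest path as a list of nodes, or None if unreachable."""
    dist, prev, done = {source: 0.0}, {}, set()
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == target:
            break
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                prev[neighbor] = node
                heapq.heappush(heap, (candidate, neighbor))
    if target not in dist:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

def modified_betweenness(graph, targets):
    """Count, for each node, the target-to-target shortest paths through it."""
    counts = {}
    for s, t in combinations(targets, 2):
        path = dijkstra_path(graph, s, t)
        if path is None:
            continue
        for node in path[1:-1]:  # intermediate nodes only
            counts[node] = counts.get(node, 0) + 1
    return counts

# Toy network: T1, T2, T3 play the role of virus targets, X is an intermediary.
toy_graph = build_graph(
    [("T1", "X", 0.9), ("X", "T2", 0.9), ("T1", "T2", 0.5), ("X", "T3", 0.8)]
)
toy_counts = modified_betweenness(toy_graph, ["T1", "T2", "T3"])
```

In the toy network the direct T1 to T2 edge has low confidence, so the shortest path between them runs through X, and X ends up on all three target-to-target paths.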

Results

Identifying the HIV-1 resistance genes

To identify the HIV-1 resistance genes, we first applied the mRMR method to the expression profiles of 1868 genes in 85 HIV-1 resistant samples and 50 HIV-1 susceptible samples, which ranked all 1868 genes according to their importance for discrimination. We then used the Incremental Feature Selection (IFS) method on the mRMR-ranked gene list to determine the optimal discriminative gene set. Figure 1 shows the IFS curve for optimal gene set selection. The top 185 genes in the mRMR gene list formed the optimal discriminative gene set, and their prediction accuracy using the nearest neighbor algorithm was 85.2%, evaluated by Leave-One-Out Cross-Validation (LOOCV). Table S2 gives the complete list of these 185 HIV-1 resistance genes.

The biological roles of the HIV-1 resistance genes

To investigate the function of the 185 HIV-1 resistance genes and their relevance to HIV invasion, we performed Gene Ontology (GO) enrichment analysis and HIV protein target analysis. Table S3 shows the Gene Ontology enrichment results for the 185 genes obtained using GATHER [20] (http://gather.genome.duke.edu/) with an adjusted p value smaller than 0.01. The 185 genes were significantly enriched in response to stress, defense response, immune response, cell communication and signal transduction. These Gene Ontology enrichment results are consistent with a previous report that immune responses contribute to protection against HIV-1 infection [21].

There are nine HIV-1 proteins: env, gag, nef, pol, rev, tat, vif, vpr and vpu [8]. We wanted to know which HIV proteins are crucial for HIV invasion or associated with HIV-1 resistance, so we performed a virus target enrichment analysis of the 185 genes using the hypergeometric test [22]. The virus target gene sets were defined as the human protein targets of each HIV-1 protein according to the HIV-1, human protein interactions [8]. The virus target enrichment results for the 185 genes can be found in Table S4; the 185 genes were enriched among the targets of the HIV-1 protein nef. This may suggest that nef has an important role in HIV-1 infection. Indeed, it has been reported that in the early stages of the HIV-1 viral life cycle, the expression of nef ensures two major attributes of HIV infection: T-cell activation and the establishment of a persistent state of infection [23]. It was also observed in Sydney that patients infected with nef-deleted virus took much longer to progress to AIDS [24]. This clinical observation supports an important role for nef in HIV-1 infection. Intervening in the human protein targets of nef may weaken the virulence and infectivity of HIV-1 and strengthen the HIV-1 resistance of uninfected people.
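The hypergeometric enrichment test used here can be sketched as the upper tail of a hypergeometric distribution. The function name, parameter names and numbers below are hypothetical; in this study the background would presumably be the genes measured on the array, which is an assumption rather than something the text states.

```python
from math import comb

def hypergeometric_enrichment_p(population, targets_in_population,
                                selected, targets_in_selected):
    """Probability of drawing at least targets_in_selected target genes
    when selected genes are drawn without replacement from population
    genes, of which targets_in_population are targets."""
    upper = min(targets_in_population, selected)
    total = comb(population, selected)
    tail = sum(
        comb(targets_in_population, k)
        * comb(population - targets_in_population, selected - k)
        for k in range(targets_in_selected, upper + 1)
    )
    return tail / total

# Toy numbers, not from the paper: 20 background genes, 5 of them targets,
# 4 selected genes, 3 of which are targets.
p_value = hypergeometric_enrichment_p(20, 5, 4, 3)
```

A small p value indicates that the selected gene set contains more targets of a given viral protein than would be expected by chance.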

Network analysis of virus-host interaction

To analyze the virus-host interaction, we integrated the HIV-1, human protein interaction network and the STRING human protein interaction network. We calculated the modified betweenness defined in the Methods section and identified the important communication proteins that transmit information from one virus target protein to another. Of the 185 HIV-1 resistance genes, 29 were such infection information exchangers. For instance, CBL was on 19,820 shortest paths between virus target proteins and IL2 was on 12,879. These genes are important for exchanging information between virus-targeted proteins and for maintaining the coordination of virus infection. Table S5 lists the 29 infection information exchanger genes among the 185 genes.

Discussion

In this study, we identified 185 HIV-1 resistance genes that can discriminate between HIV-1 resistant samples and HIV-1 susceptible samples. The prediction accuracy was 85.2%, evaluated by Leave-One-Out Cross-Validation (LOOCV). The virus target enrichment of the 185 genes suggests that the HIV-1 protein nef may play an important role in HIV-1 infection. We proposed a novel method for analyzing virus-host interaction, in which a modified betweenness measures the information exchange between virus target proteins. Of the 185 HIV-1 resistance genes, 29 were infection information exchangers: they lie on the shortest paths between virus-targeted proteins, and perturbing them may disturb the communication between virus targets. They are important for the coordination of virus infection. Intervening in such infection information exchanger genes may jam the communication between virus-targeted proteins and sabotage HIV-1 infection. They may act as drug targets for AIDS therapy or prevention and may be key to understanding the mechanism of HIV-1 infection.
