The Advent Of Molecular Typing Techniques Biology Essay



Tuberculosis (TB) is an infectious disease that affects one third of the global population through latent infection, with India contributing one quarter of the TB burden (Singh et al., 2007b;Sola et al., 2003;Caws et al., 2008). Global estimates of mortality due to TB are 1.1 to 1.7 million deaths among HIV-negative people and an additional 0.45-0.62 million among HIV-positive people (WHO, 2009b;WHO, 2009a). Conditions such as HIV infection (Goldfeld and Ellner, 2007) and diabetes mellitus (Jeon and Murray, 2008) have a negative effect on people with latent infection and can raise the chance of developing active TB from 1 in 10 to 1 in 3 (WHO, 2009a;WHO, 2009b;Stevenson et al., 2007).

The advent of molecular typing techniques has, however, improved our understanding of Mycobacterium tuberculosis, the pathogen that causes the disease. This review looks at the genetic, transmission and geographic description of the ancient Mycobacterium tuberculosis lineages responsible for tuberculosis in the Indian sub-continent and the Far East, on account of their huge contribution to the global tuberculosis burden. More emphasis is placed on strain families other than Beijing, which has already been described in detail in other published reports.

The members of the Mycobacterium tuberculosis Complex (MTBC) are closely related to each other and consist of the species Mycobacterium tuberculosis (Mtb), Mycobacterium africanum, Mycobacterium bovis, Mycobacterium canettii, Mycobacterium microti, Mycobacterium pinnipedii, the dassie bacillus, the oryx bacillus and Mycobacterium caprae. The MTBC members have evolved from a common progenitor (Gutierrez et al., 2006;Helal et al., 2009;Wirth et al., 2008) and share 99.9% similarity in their genomes. This is a result of the organism expanding clonally after an evolutionary bottleneck, first estimated at some 15,000-20,000 years ago, which led to speciation (Sreevatsan et al., 1997). More recent work estimates that speciation occurred 20,000-35,000 years ago from a common ancestor termed M. prototuberculosis (Gutierrez et al., 2005a;Wirth et al., 2008). Following speciation from M. prototuberculosis, different strains of Mtb evolved, as is evident from the different genotyping methods and cluster analyses applied to Mtb (Filliol et al., 2006;Flores et al., 2007b;Brosch et al., 2002b). The bacterium mutates naturally, and a number of mechanisms have been postulated in this regard (Dos Vultos et al., 2008).


A number of genotyping methods have been used in epidemiological, phylogenetic and evolutionary studies to study strain variation and distribution. These methods are based on genomic changes, which can either be single-base changes, that is, synonymous or non-synonymous single nucleotide polymorphisms (SNPs), or large sequence changes involving insertions and deletions. Synonymous SNPs (sSNPs) do not result in amino acid changes, whereas non-synonymous SNPs (nsSNPs) do (Filliol et al., 2006;Sreevatsan et al., 1997). Changes observed in large sequence regions can take the form of insertions, duplications or deletions. Each method based on these genomic changes has its own merits and disadvantages, and a combination of methods is often used in order to increase the amount of information gained (Sola et al., 2003;Gutierrez et al., 2006;Narayanan et al., 2008;Mathuria et al., 2008). The genotyping methods used in TB research include IS6110 insertion element based Restriction Fragment Length Polymorphism (RFLP), SNP analysis, Large Sequence Polymorphism (LSP) analysis, spoligotyping and Mycobacterial Interspersed Repetitive Units (MIRU) typing (van Soolingen et al., 1991;Sreevatsan et al., 1997;Kamerbeek et al., 1997;Flores et al., 2007b;Filliol et al., 2006).
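The distinction between synonymous and non-synonymous SNPs can be illustrated with a short sketch. It assumes the standard genetic code (the codon-to-amino-acid assignments are the same in the bacterial translation table), and the function name classify_codon_change is purely illustrative. The katG and gyrA codons are the ones quoted in this review; the CTG to CTA change is a made-up example.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_codon_change(ref_codon, alt_codon):
    """Return 'sSNP' or 'nsSNP' for two codons that differ at exactly one base."""
    ref, alt = ref_codon.upper(), alt_codon.upper()
    if ref not in CODON_TABLE or alt not in CODON_TABLE:
        raise ValueError("codons must be three letters drawn from A, C, G, T")
    differences = sum(1 for x, y in zip(ref, alt) if x != y)
    if differences != 1:
        raise ValueError("a SNP changes exactly one base of the codon")
    # Same amino acid before and after the change means the SNP is synonymous.
    return "sSNP" if CODON_TABLE[ref] == CODON_TABLE[alt] else "nsSNP"


# katG codon 463 CTG (Leu) -> CGG (Arg) and gyrA codon 95 ACC (Thr) -> AGC (Ser)
print(classify_codon_change("CTG", "CGG"))
print(classify_codon_change("ACC", "AGC"))
# Hypothetical third-base change that leaves leucine unchanged
print(classify_codon_change("CTG", "CTA"))
```

The lookup table is built from a compact string rather than typed out codon by codon, which keeps the sketch short; a real pipeline would more likely rely on an established sequence-analysis library.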


RFLP has been the gold standard for genotyping mycobacteria and is based on the location and number of copies of the insertion sequence IS6110 (van Soolingen et al., 1991;Narayanan et al., 2008). The marker has a relatively short molecular clock and can differentiate isolated strains in epidemiological studies. It does not, however, have good resolution for isolates that have fewer than 5 bands of the sequence (Singh et al., 2007b;Gutierrez et al., 2006).


Spoligotyping is based on analysis of the Direct Repeat (DR) locus, in which spacer sequences are interspersed amongst repeat sequences (Kamerbeek et al., 1997). The spacers are numbered 1-43, and deletion of spacers at specific positions results in an unambiguous 43 digit pattern based on the presence or absence of each spacer. Spoligotype names are based on these patterns, which have been collected in the SpolDB4/SITVIT database (Brudey et al., 2006a;Institut Pasteur de Guadeloupe, 2010). A spoligotype pattern shared by two or more strains in the database constitutes a "shared type (ST)", and each such pattern is designated a number (Brudey et al., 2006a). A group of STs with similar patterns constitutes a clade or strain family. Spoligotyping is particularly useful in identifying the strain families present in a particular geographical setting and has been used in multiple studies, as is evidenced in the SpolDB4/SITVIT database (Ferdinand et al., 2005b;Kibiki et al., 2007;Institut Pasteur de Guadeloupe, 2010). A drawback of the technique is that it can exhibit convergent evolution: because different spacers can be deleted at different times, two different strains can end up showing the same pattern (Filliol et al., 2006;Flores et al., 2007b).
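The 43 digit presence/absence pattern is commonly abbreviated to a 15 digit octal code, which is the form in which the ancestral Beijing patterns are quoted later in this review. The sketch below assumes the usual convention: spacers 1-42 are read as 14 triplets, each written as one octal digit, and spacer 43 is appended as 1 or 0. The function names are illustrative only.

```python
def spoligotype_to_octal(pattern):
    """Convert a 43-character '1'/'0' spacer pattern to a 15-digit octal code."""
    if len(pattern) != 43 or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be 43 characters of '0' or '1'")
    # Spacers 1-42 form 14 triplets; each triplet is one octal digit.
    digits = [str(int(pattern[i:i + 3], 2)) for i in range(0, 42, 3)]
    digits.append(pattern[42])  # spacer 43 is written as-is
    return "".join(digits)


def octal_to_spoligotype(code):
    """Expand a 15-digit octal code back into the 43-character pattern."""
    if len(code) != 15 or set(code[:14]) - set("01234567") or code[14] not in "01":
        raise ValueError("code must be 14 octal digits followed by 0 or 1")
    return "".join(format(int(d), "03b") for d in code[:14]) + code[14]


def missing_spacers(pattern):
    """List the 1-based positions of absent spacers."""
    return [i + 1 for i, c in enumerate(pattern) if c == "0"]


full_set = "1" * 43
print(spoligotype_to_octal(full_set))  # 777777777777771

# Spacers 1-34 absent, the classical Beijing pattern described in this review
print(spoligotype_to_octal("0" * 34 + "1" * 9))  # 000000000003771

# Decoding one of the ancestral Beijing codes shows a single absent spacer
print(missing_spacers(octal_to_spoligotype("777777767777771")))  # [24]
```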


The Mycobacterial Interspersed Repetitive Units/Variable Number Tandem Repeats (MIRU/VNTR) technique is based on variation in the number of minisatellite-like repeat sequences at different loci within the bacterial genome (Frothingham and Meeker-O'Connell, 1998;Supply et al., 2001;Oelemann et al., 2007). These loci are scattered throughout the genome of M. tuberculosis, and different systems have been developed which use different sets and numbers of loci (Supply et al., 2000;Oelemann et al., 2007). The combination of repeat numbers across the loci is descriptive of a strain.
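Because a MIRU/VNTR result is simply a list of repeat counts, one per locus, isolates can be compared by treating each list as a profile. The sketch below groups isolates with identical profiles into clusters. The isolate names, the repeat counts and the use of six loci are invented for illustration and do not come from any study cited here.

```python
from collections import defaultdict


def cluster_by_profile(profiles):
    """Group isolates that share an identical MIRU/VNTR repeat profile.

    profiles maps an isolate name to its repeat counts, listed in a fixed
    locus order. Only groups with two or more isolates are returned.
    """
    groups = defaultdict(list)
    for isolate, repeats in profiles.items():
        groups[tuple(repeats)].append(isolate)
    return {
        profile: sorted(members)
        for profile, members in groups.items()
        if len(members) > 1
    }


# Hypothetical isolates typed at six hypothetical loci
example = {
    "isolate_A": [2, 3, 2, 5, 1, 3],
    "isolate_B": [2, 3, 2, 5, 1, 3],
    "isolate_C": [2, 3, 4, 5, 1, 3],
}
for profile, members in cluster_by_profile(example).items():
    print(profile, members)
```

Exact matching is the strictest possible definition of a cluster; studies may instead tolerate a difference at one locus, which this sketch does not attempt.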


Large variable regions arising from insertions and deletions give rise to regions of difference (RD) among M. tuberculosis strain families (Brosch et al., 2002a). LSP analysis makes use of specific DNA sequences that have been deleted over time, and deletions that coincide with unique events are particularly useful in phylogenetic and evolutionary studies (Flores et al., 2007b). A particular deletion termed "M. tuberculosis specific deletion 1 (TbD1)" is used to classify strains into modern or ancestral types based on whether the deletion has occurred or not (Brosch et al., 2002a).


The sSNPs represent changes that have not resulted from selective pressure and can thus be used as markers to show evolutionary relationships among different strains (Gutacker et al., 2002). From an evolutionary point of view, M. tuberculosis can be grouped into 3 Principal Genetic Groups (PGGs) based on the combination of polymorphisms in the gyrA and katG genes (Sreevatsan et al., 1997). Group 1 has the combination katG codon 463 CTG (Leu) and gyrA codon 95 ACC (Thr); group 2 has katG codon 463 CGG (Arg) and gyrA codon 95 ACC (Thr); and group 3 has katG codon 463 CGG (Arg) and gyrA codon 95 AGC (Ser) (Sreevatsan et al., 1997). PGG1 is the most ancestral, followed by PGG2, with PGG3 being the most recently evolved.
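The three codon combinations above amount to a lookup table, which can be sketched as follows. The function name is illustrative, and returning None for a combination outside the three described groups is an assumption of this sketch rather than part of the published scheme.

```python
# (katG codon 463, gyrA codon 95) -> Principal Genetic Group, as listed above
PGG_BY_CODONS = {
    ("CTG", "ACC"): 1,
    ("CGG", "ACC"): 2,
    ("CGG", "AGC"): 3,
}


def principal_genetic_group(katg_463, gyra_95):
    """Return 1, 2 or 3, or None if the codon pair is not one of the three groups."""
    return PGG_BY_CODONS.get((katg_463.upper(), gyra_95.upper()))


print(principal_genetic_group("CTG", "ACC"))  # 1
print(principal_genetic_group("CGG", "AGC"))  # 3
```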

M. tuberculosis strains have been differentiated using these techniques, and strain lineages based on SNPs, Regions of Difference (RDs), RFLP, MIRU types and spoligotypes have been constructed (Brudey et al., 2006a;Gagneux et al., 2006a;Baker et al., 2004;Filliol et al., 2006).


Through the various stages of evolution, driven by stressful environments and genetic drift among other things, the tubercle bacillus became an obligate parasite of the early hominids of the Horn of Africa (Gutierrez et al., 2005a;Hershberg et al., 2008;Wirth et al., 2008). Contributing factors were the small group sizes of the infected hominids, who survived by hunting and gathering, and the stressful environment that the host presented to the bacillus (Gutierrez et al., 2005b). It can be inferred that the bacillus had a limited host range, as its hosts lived in small, scattered groups. Considering that there is 99.9% similarity in the genomes of MTBC members, with the exception of M. canettii, the members must have come about as a result of clonal expansion from a successful progenitor (Dos Vultos et al., 2008;Gutierrez et al., 2005b). This subsequently led to speciation and the distinct host ranges that the members of the MTBC have (Sreevatsan et al., 1997;Brosch et al., 2002a). It has been hypothesised that early hominids migrated out of Africa into Asia, followed by a migration back into Africa and on to the rest of the world (Wirth et al., 2008;Hershberg et al., 2008). As man dispersed globally and evolved as a result of the different environments he encountered (Campbell and Tishkoff, 2010), the obligate tubercle bacillus co-evolved with him (Gutierrez et al., 2005a;Hershberg et al., 2008). The migration into the southern parts of the Indian sub-continent spread with time to the northern parts of India, Europe and East Asia, establishing what we now know as the modern lineages of Mtb (Hershberg et al., 2008). The advent of farming increased population size and density compared with the small hunter-gatherer groups of the early hominids (Wirth et al., 2008). This resulted in an increase in infection, coupled with the spread of people living in densely populated areas (Hershberg et al., 2008).

With the continued relationship of the tubercle bacillus with its human host, a stable association developed. It is evident in the geographical distribution of the bacterium, which correlates with human migrations, or origin of birth, in both the near and distant past (Gagneux et al., 2006a;Hirsh et al., 2004). Studies have revealed that certain strains of M. tuberculosis are more adapted to cause disease in people of a particular origin of birth. An example is the association found between a polymorphism in the human TLR2 gene and disease caused by the Beijing strain (Caws et al., 2008).


As mentioned earlier, PGG1 members are the most ancestral of the 3 PGGs and, following the spoligotyping classification system, are composed of 4 strain families: MANU, East African Indian (EAI), Central Asian (CAS) and W-Beijing. The clades are predominant in specific geographical areas, in particular the Indian sub-continent and East Asia (Brudey et al., 2006b;Sun et al., 2004;Singh et al., 2004b;Singh et al., 2007b). Information on these strains as a group was absent from the literature reviewed. Furthermore, with the exception of the Beijing family, there was a dearth of information on individual strain families. This was especially evident for the MANU family. Considering that the Indian sub-continent and the Far East contribute more than a third of the global TB burden (WHO, 2009a;WHO, 2009b), an understanding of these strain families is an important step in the global control of the disease (Sola et al., 2003).

Estimated epidemiological burden of TB, 2008 (Adapted from WHO. Global tuberculosis control: a short update to the 2009 report)


MANU

Lineage Description

The MANU lineage was initially described in India in a study that incorporated spoligotyping (Singh et al., 2004b). The name is derived from a Hindu mythological figure supposed to be the world's first king and father of the human race (Singh et al., 2004b). It is described as an ancestral strain on account of being TbD1+ (Flores et al., 2007a) and has been divided into 3 sub-lineages according to the SpolDB4 database. The strain family can also be considered part of the Indo-Oceanic lineage of Gagneux et al. on account of carrying the RD239 deletion (Gagneux et al., 2006a). It should, however, be noted that the geographical distribution of MANU does not follow the typical distribution from which the name "Indo-Oceanic" derives (Gagneux et al., 2006a). MANU 1 has a single spacer missing at position 34, MANU 2 has spacers deleted at positions 33 and 34, and MANU 3 has 3 spacers deleted at positions 34, 35 and 36 (Brudey et al., 2006a).

Family Shared Type No. Spoligotype Pattern

MANU 1 100 

MANU 2 54 

MANU 3 1378 

Fig 1: MANU Spoligotypes (Brudey et al., 2006a)
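The three MANU prototypes described above differ only in which spacers are absent, so they can be told apart with a small lookup. The sketch below matches the whole set of absent spacers exactly. That is a deliberate simplification: variant patterns with additional deletions return None here, and this is not how the SpolDB4 database itself assigns families. All function names are illustrative.

```python
# Absent spacers for each prototype, as described in the text above
MANU_PROTOTYPES = {
    frozenset({34}): "MANU 1",
    frozenset({33, 34}): "MANU 2",
    frozenset({34, 35, 36}): "MANU 3",
}


def absent_spacers(pattern):
    """Return the set of 1-based spacer positions marked '0' in a 43-character pattern."""
    if len(pattern) != 43 or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be 43 characters of '0' or '1'")
    return frozenset(i + 1 for i, c in enumerate(pattern) if c == "0")


def manu_sublineage(pattern):
    """Return the MANU prototype name for an exact match, otherwise None."""
    return MANU_PROTOTYPES.get(absent_spacers(pattern))


def pattern_without(spacers):
    """Helper for the example: build a pattern lacking the given spacers."""
    return "".join("0" if i in spacers else "1" for i in range(1, 44))


print(manu_sublineage(pattern_without({33, 34})))  # MANU 2
print(manu_sublineage(pattern_without(set())))     # None (all 43 spacers present)
```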

In a recent study in Egypt, a single isolate of a MANU ancestor harbouring all 43 spacers was identified, in addition to a variant of the same (Helal et al., 2009). The absence of deletion analysis in this study can, however, raise doubt as to whether the strain identified as a MANU ancestor was indeed such. This is because homoplasy can exist for spoligotypes with a full set of spacers or only a few deletions (Flores et al., 2007b). Beijing strains with a full set of spacers have been observed, and these were classified as Beijing based on the deletion of RD105 (a marker for all Beijing strains) and on being TbD1-. In the absence of such deletion analysis, and with classification based only on spoligotyping patterns, there is a chance that the identified MANU ancestor could be a Beijing strain ancestor.

In addition to being TbD1+, all MANU strains carry the RD239 deletion. The family is also likely to have 2 alleles at locus 24 in MIRU/VNTR analysis, as others have shown this for isolates that are TbD1+ (Sun et al., 2004).

IS6110 Copy Numbers

To our knowledge, the number of IS6110 elements in the MANU family has not been described. This needs to be done, as it might provide greater insight into whether IS6110 typing can better discriminate MANU isolates in epidemiological studies.

Geographical Spread

MANU strains have been reported in different countries, but only in Egypt and in the city of Delhi were they found to be the predominant strain (Helal et al., 2009;Singh et al., 2004b;Singh et al., 2007a;Eldholm et al., 2006). These countries include India, Saudi Arabia, Madagascar, Tanzania, Tunisia and the USA (Eldholm et al., 2006;Al-Hajoj et al., 2007;Ferdinand et al., 2005a). A few isolates have also been obtained from South Africa, and these have mainly been MANU 2 strains (Institut Pasteur de Guadeloupe, 2010).

Host Pathogen Association and Transmission

As mentioned above, there are very few geographic locations where MANU is the predominant strain. Egypt is the only country where this has been observed, and it is here that an association between host and pathogen may exist (Helal et al., 2009). It is possible that the strain has been in Egypt for a long time. Mtb strains found in mummies have been termed M. africanum-like (Helal et al., 2009). It would be interesting to analyse these strains further to see how close they are to the MANU strain family.

Transmission of the strain family was established in the Egyptian study and was evident from the clustering and high proportion of MANU observed (Helal et al., 2009). However, no analysis was done of how these strains segregated by age cohort. This would have indicated whether the predominance of MANU was due to reactivation in the older population, active transmission in the younger population, or a combination of both. A table showing distribution amongst age groups was presented, but it was a composite of all isolates in the area, which makes it difficult to ascertain the transmission of a single family within the studied community. Results were reported similarly with respect to age in India, where MANU was found to be a predominant strain in Delhi (Singh et al., 2004b). Studies have shown that strains can be associated with age in an area, as was shown in India for the EAI and CAS lineages (Singh et al., 2007b). From such studies it can be inferred whether a strain is emerging in an area through transmission among the younger population or is predominantly reactivating in the older population.

The MANU 2 sub-lineage (ST54) has been observed to be the most prevalent of this family, as reported in different studies (Singh et al., 2004b;Helal et al., 2009;Al-Hajoj et al., 2007) and as seen from the entries in the SITVIT database (Institut Pasteur de Guadeloupe, 2010).

A number of variants of MANU 2 were also observed in Egypt, and this sub-lineage had more variant types than either MANU 1 or MANU 3. These had not been entered into the SITVIT database at the time the Egyptian study was published (Helal et al., 2009).



MANU 2 Variants in Egypt

 `


Fig 2: MANU 2 Variant Spoligotypes (Helal et al., 2009)

In the absence of a second genotyping method besides spoligotyping to describe MANU clustering, it is difficult to ascertain whether there is transmission of a particular strain. This is due to the fact that spoligotyping can overestimate clustering of some lineages as is the case for Beijing (Flores et al., 2007b;Warren et al., 2002). More research in this regard needs to be done on MANU.

East African Indian (EAI) [Part of the Indo-Oceanic lineage (according to Gagneux et al.)]

Lineage Description

The East African Indian (EAI) clade is divided into 9 sub-lineages having common deletions at spacers 29-32 and spacer 34 (Brudey et al., 2006a). EAI_5 is the most ancestral of the spoligotypes of the clade with only the above mentioned spacers deleted.

Family Shared Type No. Spoligotype Pattern

EAI-5 236 

EAI 1-SOM 48 

EAI2-Manilla 19 

EAI2-Nonthaburi 89 

EAI3-IND 11 

EAI4-VNM 139 

EAI6-BGD1 591 

EAI6-BGD1 1898 

EAI8-MDG 109 

Fig 1: EAI Spoligotypes (Brudey et al., 2006a)

The other sub-classes exhibit further deletions from the EAI_5 type in different numbers and positions. RD analysis shows that the lineage also has RD239 deleted (shared with MANU), and it is named the Indo-Oceanic lineage under the LSP classification system of Gagneux et al. (Gagneux et al., 2006a). Additionally, the lineage is TbD1+ and has 2 alleles at locus 24 in MIRU/VNTR analysis (Sun et al., 2004). The lineage can thus be considered ancestral in nature (Brosch et al., 2002a). The group also belongs to PGG1 based on SNPs in the katG and gyrA genes (Sreevatsan et al., 1997). Taken as a whole, the lineage forms an ancestral sub-set of PGG1 together with MANU.

Geographical Spread and Transmission

The Indo-Oceanic lineage is predominant in South East Asia, East Africa and some parts of Europe and Oceania (Brudey et al., 2006a;Singh et al., 2007a). The sub-lineages occur at different frequencies in different geographical locations. The highest numbers of ST11 of EAI3_IND, for example, have been reported in the Indian sub-continent and South East Asia. In India, ST11 is found in the southern parts of the country, where it is responsible for the majority of TB cases (Singh et al., 2004b;Singh et al., 2007a). Other areas can have a mixture of sub-lineages predominating, with each sub-lineage composed of different STs. This scenario is evident in Bangladesh (Rahim et al., 2007), Myanmar (Phyu et al., 2009) and Madagascar (Ferdinand et al., 2005b).

Host Pathogen Association and Transmission

India provides the classic case of host-pathogen association for the lineage. The strain is prevalent amongst people in the south of India as opposed to the north. This has been attributed to the people in the north having a different origin, and a different introduction to the bacillus, from the people in the south (Singh et al., 2004a;Singh et al., 2007b;Arora et al., 2009). Additionally, the strain family is associated with people older than 46 years and with people who have never taken TB treatment before (Arora et al., 2009;Singh et al., 2004b;Singh et al., 2007b). EAI also demonstrated an association with people in the northwest of Madagascar, who are believed to have originated in South East Asia, where the strain is prevalent (Ferdinand et al., 2005a). As mentioned earlier, different strain types are predominant in different geographical areas, and in the south of India ST11 was the main causative agent of disease. Contrasting the Indian situation, where the older people of society were associated with disease, Myanmar had most people > 45 having active disease (Phyu et al., 2009). It should be borne in mind, though, that this statistic was given for the combined transmission of Beijing and EAI. Additionally, the EAI STs transmitting there were different from the ones found in South India.

IS6110 Copy Numbers

The EAI strain family has been characterised by low IS6110 copy numbers in India (Das et al., 2005). This is also seen in a number of other countries, including Singapore and Madagascar (Sun et al., 2004;Ferdinand et al., 2005b). It was observed in India that whereas RFLP based on IS6110 fell short in discriminating between isolates with a low copy number of IS6110, spoligotyping was highly efficient (Narayanan et al., 2008). In the south of India, where up to 80% of TB cases are accounted for by this strain family with low copies of IS6110, the ST of the family was ST11. On the other hand, ST89 and ST292 in Yangon, Myanmar were observed to have IS6110 copy numbers ≥7 (Phyu et al., 2009). In Madagascar, where ST109 was found, the ST had low copies of IS6110 (Ferdinand et al., 2005b). Taken together, the EAI strain family was generally observed to have low copies of IS6110; high copy numbers are the exception and were observed only for ST89 and ST292.
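The practical consequence of copy number for method choice can be written as a simple rule of thumb. The sketch below uses the "fewer than 5 bands" criterion quoted earlier in this review as its threshold; the function name and the example copy numbers are hypothetical, and the rule is only an illustration, not a validated laboratory protocol.

```python
MIN_INFORMATIVE_BANDS = 5  # below this, IS6110 RFLP resolves isolates poorly


def suggest_typing_method(is6110_copies):
    """Suggest a first-line typing method from an isolate's IS6110 copy number."""
    if is6110_copies < 0:
        raise ValueError("copy number cannot be negative")
    if is6110_copies < MIN_INFORMATIVE_BANDS:
        # Low-copy isolates: spoligotyping discriminated better in the Indian data
        return "spoligotyping"
    return "IS6110 RFLP"


# Hypothetical copy numbers for a low-copy and a high-copy isolate
for name, copies in {"low_copy_isolate": 1, "high_copy_isolate": 12}.items():
    print(name, suggest_typing_method(copies))
```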

Central Asian (CAS) [East African Indian lineage (according to Gagneux et al.)]

Lineage Description

The Central Asian lineage is also known as the East African Indian lineage under the naming system by Gagneux et al based on LSP (Gagneux et al., 2006b). Under the spoligotyping description, the clade has spacers 4-7 and 23-34 deleted (Singh et al., 2004b;Brudey et al., 2006a).

Four sub-lineages are described in the SpolDB4 database with deletions as indicated in the figure below.

Strain Shared Type No. Spoligotype Pattern

CAS1-Delhi 26 

CAS1-Kili 21 

CAS1-variant 25 

CAS2 288 

The CAS clade is TbD1- (Brosch et al., 2002a) and has a characteristic RD750 deletion (Gagneux et al., 2006a). Additionally, a study in India reported that a silent C→T mutation in codon 65 of pncA is lineage specific for CAS strains. However, the mutation was not observed for 2 isolates of ST26 out of a total of 16 isolates (Stavrum et al., 2009), indicating that there may be some CAS strains that do not harbour it.
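The deletion markers cited in this review (TbD1, RD239, RD750 and RD105) can be combined into a small decision sketch. It takes the set of regions found to be deleted in an isolate and reports whether the isolate is ancestral or modern, together with the lineage its deletion points to. The marker-to-lineage table only restates what the text says; the function name is illustrative, and returning None when no marker or more than one marker is deleted is an assumption of this sketch.

```python
# Lineage-defining deletions as described in this review
LINEAGE_MARKERS = [
    ("RD239", "Indo-Oceanic (EAI, MANU)"),
    ("RD750", "East African Indian (CAS)"),
    ("RD105", "East Asian (Beijing)"),
]


def classify_by_lsp(deleted_regions):
    """Return (ancestral_or_modern, lineage_or_None) from a set of deleted regions."""
    deleted = set(deleted_regions)
    # TbD1 intact (TbD1+) marks ancestral strains; TbD1 deleted marks modern ones.
    age = "modern" if "TbD1" in deleted else "ancestral"
    hits = [lineage for region, lineage in LINEAGE_MARKERS if region in deleted]
    # An unambiguous call needs exactly one lineage marker to be deleted.
    lineage = hits[0] if len(hits) == 1 else None
    return age, lineage


print(classify_by_lsp({"RD239"}))          # TbD1 intact, RD239 deleted
print(classify_by_lsp({"TbD1", "RD750"}))  # TbD1 and RD750 deleted
print(classify_by_lsp({"TbD1", "RD105"}))  # TbD1 and RD105 deleted
```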

Geographical Spread

Information from the SITVIT database, which is an upgrade of the SpolDB4 database (Al-Hajoj et al., 2007), shows that most isolates of CAS in the database come from the Indian sub-continent. Studies have also indicated that the CAS lineage is predominant in India, Pakistan and Central Asia (Brudey et al., 2006a;Singh et al., 2007a;Stavrum et al., 2009). The sub-lineage found in the Indian sub-continent is typically CAS1_Delhi (ST26), which is prominent in Northern India and Pakistan (Singh et al., 2007a;Stavrum et al., 2009;Tanveer et al., 2008). CAS1_Kili (ST21) is prominent in Tanzania (Kibiki et al., 2007), whilst CAS2 is prominent in Madagascar (Ferdinand et al., 2005a). Other geographical areas with a large proportion of the strain include the Middle East (Al-Hajoj et al., 2007). Furthermore, areas that have received large numbers of people originating from the Indian sub-continent, the Middle East and Central Asia also have a significant proportion of their TB burden attributed to this lineage (Singh et al., 2004b;Singh et al., 2007a;Brudey et al., 2006a;Brown et al., 2010). These include areas in Africa, Europe and the United States, where the movement of people occurred in the distant as well as the near past (Singh et al., 2004b;Singh et al., 2007b;Al-Hajoj et al., 2007).

Host Pathogen Association and Transmission

The classic case of host-pathogen association for the CAS lineage is best presented by India, where the North and South differ in the origin of their human populations (Arora et al., 2009). People in the south of the country have been less associated with the CAS lineage, whereas those in the North have been positively associated with it (Singh et al., 2004, 2007;Arora et al., 2009). The prominence of CAS in Madagascar and Saudi Arabia can also be attributed to the movement of people from the sub-continent and Central Asia in the distant and relatively near past (Ferdinand et al., 2006;Al-Hajoj et al., 2007). The situation is also seen in the western world, where the lineage is associated with people of Asian origin even when they were born in the USA (Hirsh et al., 2004). Taken together, it is clear that the human host and the pathogen have co-evolved and established a transmission relationship, as reported in different studies by others (Hirsh et al., 2004).

Studies from India indicated that the CAS strain family was mostly transmitted in the age group below 45 years, in contrast to EAI, which affected the older population (Singh et al., 2004b;Singh et al., 2007b). ST26 was the predominant shared type responsible for CAS transmission in the Indian sub-continent (Singh et al., 2004b;Singh et al., 2007b), and in general particular shared types contributed greatly to CAS transmission wherever CAS was a major strain. In Tanzania the major strain type was CAS1_Kili (ST21).

IS6110 Copy Numbers

CAS members generally have more than 5 IS6110 copies, as is evidenced from the major areas where the lineage is found (Singh et al., 2004b;Singh et al., 2007a;Stavrum et al., 2009;Tanveer et al., 2008;Arora et al., 2009). This makes IS6110-based RFLP an appropriate typing method for distinguishing strains in epidemiological studies.

Beijing [East Asian (Gagneux LSP name)]

Lineage Description

The classic spoligotype pattern of the Beijing lineage is characterised by the deletion of spacers 1-34 (Brudey et al., 2006b). The lineage is defined by the deletion of RD105 and, additionally, of RD207, which produces the classic spoligotype pattern through the loss of spacers 1-34 (Flores et al., 2007b). Deletion of RD105 alone encompasses all Beijing types, including ancestral forms of the strain that lack the classical deletion of spacers 1-34 in the DR region (Flores et al., 2007b). The ancestral forms include strains with the full complement of 43 spacers as well as those with 1 spacer missing (Flores et al., 2007b).

Strain Shared Type No. Spoligotype Pattern

Beijing 1 

Fig 6: Classical Beijing(Brudey et al., 2006a)

Strain Shared Type No. Spoligotype Pattern

Beijing 777777777777771 

Beijing 777777767777771 

Fig 7: Ancient Beijing(Flores et al., 2007b)
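The spacer signatures quoted in this review for the Beijing, CAS and EAI clades can be gathered into one rough screening sketch. A clade is reported when every spacer in its signature is absent, and the largest signature is tested first because the Beijing deletion (spacers 1-34) contains the CAS signature, which in turn contains the EAI signature. This is a simplification for illustration and not the SpolDB4 rule set; in particular, ancestral Beijing strains with all spacers present return None, which reflects the homoplasy problem discussed for spoligotyping alone.

```python
# Spacers that must be absent for each clade, as quoted in this review
CLADE_SIGNATURES = {
    "Beijing": set(range(1, 35)),                     # spacers 1-34
    "CAS": set(range(4, 8)) | set(range(23, 35)),     # spacers 4-7 and 23-34
    "EAI": set(range(29, 33)) | {34},                 # spacers 29-32 and 34
}


def assign_clade(pattern):
    """Return the first clade whose signature spacers are all absent, or None."""
    if len(pattern) != 43 or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be 43 characters of '0' or '1'")
    absent = {i + 1 for i, c in enumerate(pattern) if c == "0"}
    # Test the largest signature first so Beijing is not mistaken for CAS or EAI.
    ordered = sorted(CLADE_SIGNATURES.items(), key=lambda kv: len(kv[1]), reverse=True)
    for clade, signature in ordered:
        if signature <= absent:
            return clade
    return None


def pattern_without(spacers):
    """Helper for the example: build a pattern lacking the given spacers."""
    return "".join("0" if i in spacers else "1" for i in range(1, 44))


print(assign_clade(pattern_without(CLADE_SIGNATURES["Beijing"])))  # Beijing
print(assign_clade(pattern_without(CLADE_SIGNATURES["CAS"])))      # CAS
print(assign_clade(pattern_without(CLADE_SIGNATURES["EAI"])))      # EAI
print(assign_clade("1" * 43))                                      # None
```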

Geographical Spread and Transmission

The lineage is ubiquitous in distribution, with particular predominance in China, Central Asia, the Far East and Eurasia (Brudey et al., 2006a;Mokrousov, 2008;Rahim et al., 2007;Phyu et al., 2009). It has also been shown to be associated with people of a young age and can thus be regarded as an emerging strain (Singh et al., 2004b;Singh et al., 2007b;Ferdinand et al., 2005b;Phyu et al., 2009). Additionally, the strain is associated with hypervirulence, drug resistance and the ability of drug resistant strains to transmit without the loss of fitness due to the acquisition of drug resistance (Stavrum et al., 2009;Buu et al., 2009).

Host Pathogen Association

The Beijing lineage is said to have originated in China and is associated with people of Chinese origin in both the near and distant past (Phyu et al., 2009;Sun et al., 2004;Baker et al., 2004;Filliol et al., 2006). This comprises populations that originated directly from China as well as those from secondary countries that had populations of Chinese immigrants (Mokrousov, 2008). The frequency of disease among people of Chinese origin in Singapore is an example of the association of Beijing with the Chinese host type (Sun et al., 2004).


Mycobacterium tuberculosis, the causative agent of tuberculosis in man, is part of the MTBC, whose members are 99.9% similar in their genomes with the exception of M. canettii. This is a result of clonal expansion following an evolutionary bottleneck some 20,000-35,000 years ago, with subsequent speciation. In spite of this, Mtb is diverse in its genome, with its different strain families exhibiting specific characteristics when analysed with different genotyping tools. Furthermore, strain families co-evolved with their human hosts in the different environments those hosts occupied. As human populations evolved differently in diverse geographical locations, Mtb strain families formed stable associations with them that resulted in different expressions of disease, both in transmission and in virulence. This has been demonstrated in different geographical locations and has implications for the global control of tuberculosis.

The Indian sub-continent and the Far East contribute extensively to the global tuberculosis burden, and the predominant strains in the area are the evolutionarily older PGG1 members. Of significance in this area was the observation that specific Shared Types are predominant in transmission and that there is an association between these and host populations, including the origin of the host. Association also existed at age cohort level and was exhibited differently for different strain types. EAI, for example, was associated with the older generation and with the southern parts of India. Additionally, BCG had poor efficacy in the human hosts in these areas. This was in contrast to CAS, which was more associated with younger cohorts and was thus seen as an emerging strain in India. As for the genotyping methods, discriminatory power differed for some STs within strain families. ST11, for example, had low IS6110 copy numbers, which makes RFLP a less appropriate genotyping tool in areas where ST11 is predominant. It was also noted that strain families harbour specific deletions or SNPs that differentiate them from each other.

Looking holistically at global TB control, the Indian sub-continent and the Far East make a huge contribution to the global TB burden through predominantly ancient strains. The strains associate with different host populations based on origin and age, and in some cases exhibit differences in how TB manifests. Considering that the world is becoming more interconnected, it is imperative that appropriate genotyping tools be used in specific geographical areas to aid decision making on the transmission and control of strains. Such tools help determine whether disease in an individual is due to reactivation or to recent infection, as well as whether an epidemic is spreading. In the developing world, where the threat of TB is greatest in conjunction with HIV/AIDS, fast and effective genotyping methods are needed that can discriminate strains well enough to distinguish reactivation, infection and re-infection. Cost and reliability of methods would be important factors in this case, as would the transfer of knowledge of these methods to these areas. Furthermore, differences in how hosts from areas with particular predominant strains respond to the current vaccines should be considered when developing new vaccines. This is particularly important in the case of PGG1 in the Indian sub-continent and the Far East, bearing in mind the contribution these strains make to the global TB burden. For the MANU strain in particular, much more genotyping work needs to be done, given the potential evolutionary position the strain may have in Mtb. One might ask whether MANU gave rise to certain strain families and what characteristics were lost or gained in this evolution. Answering this could contribute to understanding the overall evolution of the bacterium.

In conclusion, studies have demonstrated that though Mtb is 99.9% similar in its genome, diversity exists among the strain families, which has implications for transmission and association with the human host. Appropriate markers or genotyping methods are important in the understanding and control of tuberculosis, and where these are deficient, appropriate tools need to be developed. In addition, host-pathogen phylogeographic association needs to be considered when developing vaccines, and high burden TB countries need to be considered in this regard. In the Indian sub-continent and the Far East, PGG1 members are the predominant strains and need to be incorporated.