A Study of Mitochondrial DNA



D-Loop mtDNA

Mitochondria are organelles found in all eukaryotic cells, whose prime function is the production of adenosine triphosphate (ATP), which is utilised as a source of chemical energy. In addition to its role as a "cellular power-plant", the mitochondrion is involved in the maintenance of cellular metabolism. Each mitochondrion has a chromosome composed of DNA that is significantly different from the DNA found in the nucleus. The mitochondrial chromosome is a smaller, circular structure and its DNA is fundamentally different, in that it exists outside the nuclear genome (unlike microsatellite DNA) and possesses a number of characteristics that lend themselves to phylogenetic analysis. It has a five-fold higher rate of nucleotide substitution compared to nuclear DNA and is inherited only along the maternal pathway. These properties make it immensely useful for studying both the evolution and division of species and uncovering the matrilineal haplotypes of breeds.

In equines mtDNA exists as a closed circular molecule of approximately 16,660 base pairs. The number of base pairs has been shown to be highly variable, with between 2 and 29 copies of the GTGCACCT motif. It is inherited clonally, as a single copy, and does not undergo recombination. Thus genetic variation develops almost exclusively as single point mutations. Mitochondrial DNA evolves at a much greater rate than nuclear DNA, with certain regions evolving more rapidly than others. In particular, the control region (displacement loop or D-loop) evolves fastest of all regions, and this lends itself to studies of livestock biodiversity.

The D-loop is a short segment of the normally double-stranded DNA molecule, in which the two strands are separated by a third strand of DNA. This third strand has a base sequence that replicates that of the heavy strand (H-strand) of the main molecule, which it displaces. Instead it is lightly hydrogen bonded to the light strand (L-strand). The D-loop occurs in the main non-coding area of the mitochondrial DNA molecule.

Phylogeny of horse breeds

The variation in the D-loop has been used by a large number of studies to examine matrilineal diversity within equine breeds. In addition it has been used to study the evolutionary relationships between equine breeds. As well as contemporary samples, Vila et al. (2001) used mtDNA obtained from archaeological samples of Alaskan wild horses preserved in permafrost to shed light on the history of horse domestication. Vila et al. examined mtDNA control region sequences from 191 domestic equines and found considerable matrilineal diversity. They suggested that this explains the high diversity observed in equine mtDNA haplotypes. They were able to identify six divergent sequence clades, which they labelled A to F. This nomenclature has been adopted by subsequent studies. The majority of Northern European domestic equines appear in Clade C, whilst the most frequent haplotype globally is D1.

Jansen et al. (2002) used a larger set of 652 equine mtDNA d-loop sequences to discover 93 different mtDNA haplotypes, which grouped into 17 distinct phylogenetic clusters. Many of the sequences used in this study were obtained from the GenBank database, which acts as a central repository for results of nucleotide analysis. Jansen et al. were not the first to envisage such a study of the diversity of equine breeds, but their identification of the 17 distinct sequence motifs that define each of the clusters of mtDNA has laid the framework for defining equine mtDNA haplotypes in subsequent research.

The National Center for Biotechnology Information database (GenBank) acts as a globally accessible repository for nucleotide sequences and population sets. This resource includes domestic equine sequences that now number in the thousands. These include d-loop sequences from many of the principal equine studies, amongst which are five Cleveland Bay sequences. These had been obtained as part of an unpublished Master's research project at the University of Kentucky, USA. The first study to include these Cleveland Bay sequences in any comparative analysis was one examining mitochondrial d-loop diversity in Zemaitukai horses. After truncating the sequences to the standard 247 base pairs for genetic cluster analysis, the authors placed the Cleveland Bay horse in cluster C3, along with Akhal-Teke, Belgian, Haflinger, Zemaitukai, Polish Heavy, Garrano and Noriker horses.

The Flannery Cleveland Bay mtDNA sequences are freely downloadable from the GenBank nucleotide database (sequences AY246209.1 to AY246213.1). When the sequences are aligned using MEGA4 and subjected to cluster analysis with Clustal W, three haplotypes can be identified amongst the five samples. Unfortunately there are no records of which specific animals these samples were taken from, so no association can be made between the haplotypes and Cleveland Bay matrilines. However, it is possible that detailed analysis of mtDNA haplotypes might shed further light on the nature of matrilineal diversity within the breed, and provide new information about the relationship of Cleveland Bay horses to other domestic equines. This chapter reports on the d-loop sequence analysis of 96 unique Cleveland Bay samples of known pedigree, and goes on to investigate the within-breed and between-breed relationships that can be deduced from the study.

Materials and Methods

Sample Collection

Mitochondrial DNA was obtained from mane hair follicles of 96 Cleveland Bay animals. Samples were obtained from the UK, France, USA, Canada and Australia, from breeders who had been contacted through their various Societies and also online discussion forums. Permits were obtained from the UK Department for Environment, Food and Rural Affairs (DEFRA) for importing hair samples from outside of the European Union.

A total of 183 samples were offered, of which 125 were submitted. These were screened to determine female ancestry and a final selection of 96 samples was chosen to best represent the living Cleveland Bay population. The distribution of ancestry lines across the 96 samples tested is shown in the table below.

Female Ancestry Line       | Number of Samples | Sample References | % total
Stainthorpe's Star         |                   | CB004 - CB017     |
Depper 39                  |                   |                   |
Daisy 318                  |                   | CB019 - CB031     |
Marvellous 72              |                   |                   |
Depper 42                  |                   | CB032 - CB051     |
Trimmer 268                |                   | CB052 - CB079     |
Brilliant GR               |                   | CB080 - CB093     |
Church House Queenie GR60  |                   | CB094 - CB095     |
Curlew GR                  |                   |                   |
                           |                   | CB001 - CB003     |

Table: Female Ancestry Line Representation in the 96 samples tested for mtDNA

Extraction, Amplification & PCR

Mitochondrial DNA testing was conducted on a commercial basis at the Dublin laboratory of Source Bioscience Ltd, who were contracted to carry out all of the processes of extraction, amplification, PCR and sequencing.

DNA Extraction

Hair samples were subjected to Proteinase K extraction using the following method.

300 μl Buffer ATL, 20 μl proteinase K, and 20 μl 1M DTT were added to a 1.5 ml micro-centrifuge tube.

A 0.5-1 cm section containing the hair follicle was cut from the base of approximately 6 hairs from each sample, and transferred to the micro-centrifuge tube. The tube contents were mixed by pulse vortexing for 10 seconds.

The resulting samples were incubated at 56°C until they were completely lysed. Samples were vortexed occasionally during incubation to aid dispersion. Complete lysis of the hair follicle was observed after approximately 2 hours.

Once lysis was complete, the samples were vortexed again for 15 seconds.

Following this, 300 μl Buffer AL was added to each sample, and thoroughly mixed by vortexing. 300 μl ethanol (100%) was added, and the samples were mixed again thoroughly by vortexing.

The resulting mixture (including any precipitate) was transferred by pipette to a DNeasy Mini spin plate, which was placed on a 2 ml collection plate. Samples were centrifuged at ≥6000 x g (8000 rpm) for 1 min. Any flow-through was discarded.

The DNeasy Mini spin plate was placed onto a new 2 ml collection plate, to which was added 500 μl Buffer AW1. This was centrifuged for 1 min at ≥6000 x g (8000 rpm), and the flow-through discarded.

The DNeasy Mini spin plate was placed onto a fresh 2 ml collection plate and 500 μl Buffer AW2 were added. This was centrifuged for 3 min at 20,000 x g (14,000 rpm) to dry the DNeasy membrane and once again any flow-through was discarded.

The DNeasy Mini spin column was placed in a clean 2 ml microcentrifuge plate and 200 μl Buffer AE were added directly onto the DNeasy membrane. This was incubated at room temperature for 1 min, and then centrifuged for 1 min at ≥6000 x g (8000 rpm) to elute.

To maximise DNA yield the previous stage was repeated once more. 1 µl of the resulting eluate was used as the template for PCR reactions.

PCR Reactions:

Polymerase chain reactions (PCR) were carried out on an MJ Research DNA Engine Tetrad PTC-225. Forward and reverse primers were designed according to previously published work for the D-loop equine Reference Sequence X79547.



PCR amplification of mtDNA was carried out in a 96 well microtitre plate, with each well containing:

5µl Primer 1 (10nM)

5µl Primer 2 (10nM)

10 µl Qiagen HotStarTaq Plus Master Mix.

1µl template.

The PCR reaction took place under the following thermal cycling sequence: the reaction mixture was heated to 95°C for 5 minutes, followed by 30 cycles of denaturing at 94°C for 40 seconds, annealing at 55°C for 45 seconds and extension at 72°C for 45 seconds. Thermocycling was concluded with extension at 72°C for 10 minutes, following which the product was held at 12°C. Following the completion of the PCR reaction all PCR products were cleaned using a Zymo Research ZR-96 DNA Clean & Concentrator™-5. Samples were eluted in 50 µl of DNA-free water.
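The cycling programme described above can be written down as data and its programmed duration added up. This is only a sketch: the class and variable names are hypothetical, ramping time between temperatures is ignored, and the open-ended 12°C hold is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Programme as stated in the text; the final 12 °C hold is open-ended and omitted.
INITIAL = Step("initial denaturation", 95, 5 * 60)
CYCLE = [
    Step("denaturation", 94, 40),
    Step("annealing", 55, 45),
    Step("extension", 72, 45),
]
N_CYCLES = 30
FINAL = Step("final extension", 72, 10 * 60)


def programmed_seconds(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


total = programmed_seconds(INITIAL, CYCLE, N_CYCLES, FINAL)
print(f"{total} s = {total / 60:.0f} min of programmed hold time")
```

On a real instrument the run will take longer than this figure, because heating and cooling between steps add time that depends on the machine.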


Sequencing of PCR products was carried out using the BigDye™ Terminator Cycle Sequencing Kit (Applied Biosystems Inc.).

3 µl Sequencing primer 1 (3.2nM), 3 µl Sequencing template (purified PCR product) and 4 µl Big Dye terminator v 3.1 (ABI) were thermocycled under the following regime:

95°C for 5 minutes

Then 96°C for 10 seconds

Then 50°C for 5 seconds

Then 60°C for 3 minutes

This cycle was repeated 25 times, after which the product was heated to 72°C for 10 minutes and subsequently maintained at 12°C prior to sequencing.

All products of the sequencing PCR were cleaned via passage through individual Sephadex clean up columns, to remove any unincorporated dye terminator products and diluted with 10µl of DNA free water. The resulting samples were run on an ABI 3730xl 96 capillary DNA analyser to produce individual forwards and reverse AB1 sequence files for each of the 96 samples.

Sequence Analysis

The forward and reverse AB1 sequence files for each sample were assembled into contigs using the software Geneious v 4.8. All 96 contigs were then aligned to the equine mtDNA reference sequence (GenBank X79547). Haplotype and DNA polymorphism analysis was conducted using DnaSP.
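As a rough illustration of what haplotype and polymorphism analysis measures, the following Python sketch counts haplotypes, variable positions and the average number of pairwise differences in a small alignment. The sequences are invented toy data and the function names are hypothetical; DnaSP's own handling of gaps and missing data is not reproduced here.

```python
from collections import Counter
from itertools import combinations


def variable_sites(seqs):
    """Indices of alignment columns in which more than one base is observed."""
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def haplotype_counts(seqs):
    """Collapse identical sequences into haplotypes and count each one."""
    return Counter(seqs)


def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences (k) over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total / len(pairs)


# Toy alignment: hypothetical, equal-length, gap-free sequences.
toy = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "ACGAACGC"]

sites = variable_sites(toy)
haps = haplotype_counts(toy)
k = mean_pairwise_differences(toy)
pi = k / len(toy[0])  # nucleotide diversity: k per site

print(f"{len(haps)} haplotypes, variable sites at {sites}, k = {k:.3f}, pi = {pi:.4f}")
```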

To investigate haplotype sharing with other equine breeds, BLAST searches of each of the haplotypes identified in this study were conducted against the GenBank nucleotide database using Geneious.

To compare the Cleveland Bay sequences with other modern and ancient horses, 928 previously published equine mtDNA d-loop sequences, with standard 247 base pair lengths (15495 - 15740), were obtained from the NCBI database (http://www.ncbi.nlm.nih.gov/Genbank). These samples represent 76 separate breeds of horse. Sequences were aligned to the Cleveland Bay mtDNA sequences and neighbor joining trees were constructed using Geneious 4.8.


Cleveland Bay Mitochondrial Haplotypes

Sequence analysis of 421 base pairs across the 96 Cleveland Bay contigs demonstrated 11 different haplotypes with 27 variable positions. These are shown in the table of polymorphic sites below. Haplotype diversity (h) across the sample set was determined to be 0.7973, whilst nucleotide diversity (π) was 0.1537. The average number of nucleotide differences (k) was 7.363. Tests for neutrality gave the following results:

Tajima's D test: 1.218 Not Significant, P >0.10

Fu and Li's D test: 1.175 Not Significant, P >0.10

Fu and Li's F test: 1.426 Not Significant, P >0.10

Fu's Fs statistic: 6.183

Strobeck's S statistic (probability that NHap <= 11) = 0.006

(Probability that NHap = 11) = 0.004
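To show how the first of these statistics relates to the summary figures given above, the sketch below computes Tajima's D from the sample size, the number of variable positions and the average number of nucleotide differences, using the standard formula from Tajima (1989). It assumes that the 27 variable positions can be treated as segregating sites; the function name is hypothetical, and the result need not agree exactly with the DnaSP output.

```python
from math import sqrt


def tajimas_d(n, s, k):
    """Tajima's D from sample size n, segregating sites s and mean pairwise differences k."""
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Values taken from the text: 96 sequences, 27 variable positions, k = 7.363.
d = tajimas_d(n=96, s=27, k=7.363)
print(f"Tajima's D = {d:.3f}")
```

A positive value arises when the average pairwise difference is larger than expected from the number of segregating sites, which is the pattern reported here.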

Two Cleveland Bay sequences shared the haplotype of the reference sequence. These animals (CB001 and CB003) are of Grading Registry origin and so are descended from animals that have been brought into the studbook from outside the breed, being selected for reasons of pedigree or phenotype.

CB Hap 1 is shared by 25 individuals, representing 26% of the animals sequenced. This haplotype is shared by members of both Female Ancestry Lines 1 and 3, indicating that they are of common maternal origin.

CB Hap 2 is shared by 27 of the horses sequenced, representing 28% of the sample. This haplotype is unique to animals from Female Ancestry Line 6.

CB Hap 3 is common to 13 horses, representing 13.5% of the sample. All of these animals have maternal origins in Female Ancestry Line 7.

CB Hap 4 has 21 members representing 21.9% of the sample. This haplotype is unique to members of Female Ancestry Line 5.

These four Cleveland Bay haplotypes represent 89.5% of all of the animals sampled.

CB Hap 5 is unique to one animal (CB018) who is the only representative of Female Ancestry Line 2.

CB Hap 6 is only found in one animal (CB002). This horse has an application for Grading Register status pending with the Cleveland Bay Horse Society and so is of unconfirmed pedigree or maternal origin.

CB Hap 7 is unique to one animal (CB017) whose pedigree places her in Female Ancestry Line 1. This being the case one would expect it to display Haplotype 1. The discrepancy may be explained by sequencing error, as there is only one base difference (at position 15597) between these two haplotypes. Site 15597 is also reported to be a mutational hotspot, which could equally explain the haplotype difference. Both of these haplotypes are also from cluster C, as defined by the original work on equine mitochondrial haplotype sharing.

CB Hap 8 is shared by two seemingly unrelated animals (CB026 and CB082). These horses trace back to Female Ancestry Lines 3 and 7 respectively. Logically they would be expected to be of haplotypes 1 and 3. A BLAST search on the GenBank nucleotide database for similar sequences shows that this haplotype equates to D1, which is globally the most common of all domestic horse haplotypes. There are 9 variable positions between this haplotype and the reference sequence, which suggests that the difference may not be down to sequencing errors. The reasons for these two seemingly unrelated animals appearing to share a non-Cleveland Bay haplotype warrant further investigation.

CB Hap 9 is shared by two individuals (CB094 and CB095). These two animals trace back to Female Ancestry Line 8, which is a grading registry line of relatively recent origin. They are the only representatives of Line 8 in the sample.

CB Hap 10 is unique to one animal (CB096). This horse is the only representative of the most recent female ancestry line - Curlew - identified in the earlier study. Again the origins of this line trace back to the Grading Register, and to equine mtDNA Clade B. The Neighbour Joining tree of individual Cleveland Bay mtDNA contigs shows the relationship between each of the 96 animals sequenced and the haplotypes they display.
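The haplotype counts listed above are enough to recompute haplotype diversity by hand. The sketch below assumes Nei's unbiased estimator, h = n/(n-1) × (1 - Σp²), which is the usual definition; whether the original analysis used exactly this form is an assumption, and the labels are only shorthand for the haplotypes described in the text.

```python
# Number of sampled animals carrying each haplotype, as listed in the text.
COUNTS = {
    "CB Hap 0 (reference haplotype)": 2,
    "CB Hap 1": 25,
    "CB Hap 2": 27,
    "CB Hap 3": 13,
    "CB Hap 4": 21,
    "CB Hap 5": 1,
    "CB Hap 6": 1,
    "CB Hap 7": 1,
    "CB Hap 8": 2,
    "CB Hap 9": 2,
    "CB Hap 10": 1,
}


def haplotype_diversity(counts):
    """Nei's unbiased haplotype diversity from a list of haplotype counts."""
    n = sum(counts)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))


n_total = sum(COUNTS.values())
h = haplotype_diversity(list(COUNTS.values()))
main_four = sum(COUNTS[f"CB Hap {i}"] for i in (1, 2, 3, 4)) / n_total

print(f"n = {n_total}, h = {h:.4f}, four main haplotypes = {main_four:.1%}")
```

With these counts the estimator gives a value of about 0.793, and the four main haplotypes together account for just under 90% of the sample.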

Alignment of the 96 samples from this study with the 5 Cleveland Bay sequences downloaded from GenBank (AY246209 - AY246213) demonstrates that 4 of the sequences share two of the more frequent Cleveland Bay haplotypes. Samples AY246209, AY246210 and AY246213 share CB Hap1 (Lines 1 and 3 - Clade C1), already common to 25 of the samples from this study. Sample AY246211 is of CB Hap4 (Line 5, which is also Clade C). Sample AY246212 is a singleton, possessing a haplotype that has not been found amongst the sequences tested. However, this sequence also sits with Clade C in the Neighbour Joining tree that aligns the 96 sequences from the present study with the 5 GenBank sequences.

Table: Polymorphic sites of Cleveland Bay horses and the Reference Sequence (GenBank X79547) in the control region of horse mtDNA D-loop sequences. * Sites 15585, 15597 and 15650 have been previously identified as mutation hotspots.

Figure: Neighbour Joining tree of individual Cleveland Bay mtDNA contigs.

Figure: Neighbour Joining tree showing alignment of the 96 sequences from the present study with 5 Cleveland Bay mtDNA sequences in GenBank.

Relationships with other breeds

To investigate relationships of the Cleveland Bay horse with other breeds, BLAST searches were conducted on each of the CB haplotypes to identify sequence sharing with those held in the GenBank database.

The four main Cleveland Bay mtDNA haplotypes produced significant matching with other domestic horse breeds. CB Hap1 gave 100% matches in both pairwise identity and identical sites with four Kerry Bog Pony sequences. There were no complete matches for CB Hap 2. However, there was 99.8% matching with sequences from Irish Draught, Arab and Akhal-Teke horses. CB Hap3 was a complete match for two Irish Draught sequences, as well as three from Orlov horses. CB Hap4 showed 99.6% identity with three Irish Draught horse sequences and one from a Zhongdian horse.

Of the minor haplotypes, the reference sequence and two Cleveland Bay grading register animals gave 100% matches in both pairwise identity and identical sites with three Irish Draught Horse sequences and with three Kerry Bog Pony sequences. CB Hap5 was a 99.6% match to four Irish Draught horse sequences as well as one of Mongolian origin. Again no exact match was found for CB Hap6, but there was >99.4% matching with Kerry Bog Pony, Irish Draught, Mongolian and Zhongdian horse sequences. CB Hap7 was best matched at 97% similarity with 5 Kerry Bog Pony sequences. There were 100% matches between CB Hap 8 and Akhal-Teke, Irish Draught and Chinese Guan Mountain horses. CB Hap9 showed identity >99.8% with 5 Irish Draught Horse sequences, whilst there was >99.8% matching between CB Hap10 and Irish Draught, Kerry Bog Pony, Polish Arabian and Orlov sequences.

To further analyse genetic relationships between Cleveland Bay horses and other domestic horses, a range of mtDNA sequences were downloaded from GenBank, including representatives of common British and European Cold and Warm Blood, Asian and Ancient animals from archaeological remains. In addition, 12 sequences representing the six main clades (A-F) previously described for the horse mitochondrial D-loop were included as references.

Figure: Neighbour Joining tree showing relationship between Cleveland Bay samples and previously identified Equine haplotypes.

These sequences were truncated to 247 base pairs in order to compare homologous regions, and haplotype networks were constructed with SplitsTree v4.1.1.3. The neighbor joining tree showing the relationship between the individual Cleveland Bay samples and the twelve reference sequences is shown in the figure above.
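Truncating sequences to a homologous region amounts to selecting the alignment columns that correspond to a range of reference coordinates. The following is a minimal sketch of that idea, not the procedure actually used in Geneious: the function name and the toy alignment are hypothetical, and it assumes the reference row starts at or before the window and has no leading gaps.

```python
def slice_by_reference(ref_aligned, seq_aligned, ref_first_pos, start, end):
    """Return the part of seq_aligned spanning reference positions start..end (inclusive).

    ref_aligned and seq_aligned are two rows of the same alignment ('-' marks a gap).
    ref_first_pos is the reference coordinate of the first base in ref_aligned.
    """
    pos = ref_first_pos - 1  # coordinate of the most recent reference base seen
    columns = []
    for i, base in enumerate(ref_aligned):
        if base != "-":
            pos += 1
            if start <= pos <= end:
                columns.append(i)
        elif start <= pos < end:
            # Gap in the reference: an insertion in the sample inside the window.
            columns.append(i)
    return "".join(seq_aligned[i] for i in columns)


# Toy alignment (hypothetical): the reference row covers positions 101-108.
ref_row = "AC-GTACGT"
sample_row = "ACTGTAC-T"
window = slice_by_reference(ref_row, sample_row, ref_first_pos=101, start=102, end=105)
print(window)
```

Gap characters inside the window are kept, so that every truncated sequence still has the same number of columns and can be compared position by position.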

All of the animals from CB Female Ancestry Line 6 (Cleveland Bay Haplotype 2) cluster with the two reference sequences from Clade A, one of which is a Danish Horse, the other being of an unrecorded breed. The two horses with CB Haplotype 0 (sharing a haplotype with reference sequence X79547) do not share a common branch with any of the haplotype reference sequences. Horses from CB Female Ancestry Lines 7 and 9 share a branch of the tree with the reference sequences from Clade B. These reference sequences are from horses of Arab and Thoroughbred origin. The animals from CB lines 1, 3 and 5 cluster with the Clade C reference sequences. These are from Exmoor and Icelandic Ponies. A skeleton median joining network of previously defined equine clades, and a network illustrating how the Cleveland Bay haplotypes fit this model, are shown in the two figures below.


Figure: Median Joining Network showing previously defined Equine Clades. Skeleton network after McGahern.

Figure: Median Joining Network showing relationship of Cleveland Bay Haplotypes to previously defined Equine Clades.

Relationships with Ancient Horses

To investigate possible ancient origins of the breed, 26 mtDNA sequences from ancient horse remains, previously deposited in GenBank, were aligned with the Cleveland Bay samples. The ancient samples were from bone and dental fragments, obtained from ancient horse skeletons recovered from archaeological digs in Ireland, England (Derbyshire) and the Iberian Peninsula. An additional 5 samples of Viking age were drawn from an earlier study.

Two of the ancient Irish samples, from Waterford and Sligo, cluster with CB Female Ancestry Line 6. Grading Register samples CB001 and CB003 share a branch of the tree with ancient samples from Iberia and Clare in Ireland. The two CB Female Ancestry Line 8 samples cluster with the ancient sample from Derbyshire in addition to sample ATA07 from Portalón, on the Northern Iberian plateau. Radiocarbon dating of the latter places it at 1010 +/- 40 years before present. A further ancient sample from the Iberian set (ATA04) was found to cluster with the modern Cleveland Bay samples from female ancestry lines 1 and 3. This ancient sample has been radiocarbon dated to 3690 +/- 40 years before present.

The Viking age sample AF326678 clusters with CB056 in Line 6. Other Viking age sequences have less robust relationships with other sequences. It should be noted that these Viking age sequences are not of the standardised 247 base pairs, typically being 160 bp. As such, any relationships inferred to Cleveland Bay sequences must be interpreted with caution. A neighbour joining tree illustrating the relationships to ancient horse samples is shown in the figure below.

Figure: Neighbour Joining Tree showing the relationship between Cleveland Bay and Ancient Horse Samples.


Mitochondrial DNA analysis of the Cleveland Bay horse has allowed new insight into the diversity of the female founders of the breed. Whilst the studbook records dating back to the late 18th Century suggest that as many as 17 different female founders contributed to the breed, it is probable that only four female ancestors contribute to the modern day population. Whilst this study has identified eleven different haplotypes, four account for in excess of 89% of all of the samples tested and the other 7 minor haplotypes have been shown to be linked to relatively recent introductions into the breed. Indeed two of the latter are from animals that still have registration pending status.

The four female founders are:

Line One/Three: Stainthorpe's Star (foaled circa 1850 by Grand Turk 138). This mare predates Dais(y) 318 (the previously recorded founder of Line 3) by some 26 years, and as both share the same mtDNA haplotype we can deduce that both lines share a common female ancestor. This haplotype was found in 26% of the samples tested, and projection from pedigree records indicates that it is present in 33% of the reference population.

Line Two / Seven: Depper 39 (foaled 1855 by Ottenburgh 222). Whilst Line Two is virtually extinct in direct descent in the reference population (n=3), this study has shown that a very similar haplotype is shared with the more recent and more numerous Line Seven. The latter was established by the breeding of Mr J Sunley of Gerrick House, in the 1930s, from the Grading Register mare Brilliant. This mare will have carried CB Hap 3, which is only 2 mutations different from the haplotype carried by Line 2 animals (CB Hap 5). It is probable that these lines have a common maternal origin, but we are unable to deduce from the mutational differences whether the link is in recent or historic generations. Also of very similar haplotype is the very recent Grading Register addition of Female Line 9 (Curlew). This has only one mutational difference from Female Line Two and belongs to the same mtDNA haplotype Clade B. CB Haplotype 3 was found in 13.5% of the samples tested and is present in 15.2% of the reference population.

Line Six: Trimmer 269 (foaled 1880 by Wonderful 359) carries the unique CB Hap 2. This haplotype was found in 28% of the samples tested and by pedigree analysis is present in 28.6% of the reference population. This haplotype matches the previously defined type A1. The only other haplotype from the same Clade found in the samples tested was CB Hap 0 - shared by two Grading Register animals and the Reference sample. This equates to Jansen's type A5.

Line Five: Depper 42 (foaled 1880 by Barnaby 21) carries CB Hap 4, which is of Jansen's Clade C origins. 22% of the animals tested carry this haplotype, which is reflected in 20% of the reference population.

Of the minor haplotypes, CB Haps 8 and 9 originate from Jansen's Clade D. Two animals demonstrate haplotype D1 and the remaining two haplotype D2. CB Hap 9 corresponds to Female Ancestry Line 8 - to which only 1% of the reference population belong. This ancestry line is of recent grading registry origin, tracing back to Church House Queenie GR60 by Kingmaker 1807. Whilst this mare will have been of Cleveland Bay phenotype, in order to satisfy the inspectors and breed committee members of the Cleveland Bay Horse Society, the evidence from the mitochondrial DNA analysis suggests that this was introgression into the breed of a female of non-Cleveland Bay origins. As she was placed in the grading register, and subject to an upgrading process, the influence of Line 8 on the breed will have been limited, with dilution of the non-Cleveland genes at each generation. Despite this process of upgrading, because of the strictly maternal inheritance of mitochondrial DNA, descendants of this line will maintain this non-Cleveland haplotype. Two other animals, registered in the pure studbook, demonstrate CB haplotype 8, which originates in Jansen's Clade D1. This is the most common Clade amongst domestic equines. However, the appearance of this haplotype amongst animals of supposedly pure bred origin casts doubt upon the accuracy of some studbook records. The fact that two animals share this haplotype within the breed suggests that it is not an issue of sequencing error, but warrants further investigation to determine a common female ancestor of these two animals. According to pedigree records the first of these two animals (ref CB082 of this study - studbook number withheld pending discussion with CBHS) traces back to Female Line 7 and should carry CB Hap3 and be of Clade B origin. It is possible that a registration error at the dam, grand dam or previous generations has led to introgression of non-Cleveland Bay horses in this pedigree on the female side. Examination of parentage testing reports based on microsatellite markers, for the animal concerned and for her dam, suggests that any errors have occurred at least three generations back in the pedigree.

Further possible pedigree errors, this time involving discrepancies between female ancestry lines, have been identified in five samples. Females CB038 and CB039 share a common dam. Pedigree records show them as belonging to Female Ancestry Line 5. As such they would be expected to share haplotype CB Hap4 and be of Clade C origins. However, they both share CB Hap 2/Clade A1 and appear to be of Line 6 ancestry. Conversely, three animals (CB053, CB064 and CB077 - all males) which pedigree records suggest should belong to Line 6 carry the haplotype associated with Line 5. Two of these animals are licensed stallions and therefore have considerable potential to influence the breed. However, because of the maternal inheritance of mitochondrial DNA the haplotype discrepancy will not be passed to future generations. There is concern though that there have been errors in the recorded pedigree, and whilst mtDNA has strictly maternal inheritance, most other genetic material is passed on in the normal diploid manner. One of these two stallions has recently been subject to re-registration following anomalies that have come to light on parentage testing of progeny. It may well be that, in the light of evidence from mitochondrial DNA analysis, further investigation into the true breeding of this stallion is warranted.
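The comparison behind these findings can be expressed as a simple check of each animal's observed haplotype against the haplotype expected from its recorded female ancestry line. The sketch below uses only the line-to-haplotype associations and the sample references reported in this chapter; the function and variable names are hypothetical, and it is not the procedure used in the study.

```python
# Haplotype expected for each main female ancestry line, as reported in the results.
EXPECTED = {1: "Hap1", 3: "Hap1", 5: "Hap4", 6: "Hap2", 7: "Hap3"}

# (sample, recorded ancestry line, observed haplotype) for the animals discussed in the text.
RECORDS = [
    ("CB017", 1, "Hap7"),
    ("CB026", 3, "Hap8"),
    ("CB082", 7, "Hap8"),
    ("CB038", 5, "Hap2"),
    ("CB039", 5, "Hap2"),
    ("CB053", 6, "Hap4"),
    ("CB064", 6, "Hap4"),
    ("CB077", 6, "Hap4"),
]


def flag_discrepancies(records, expected):
    """Return (sample, recorded line, observed haplotype, lines whose haplotype matches)."""
    flags = []
    for sample, line, observed in records:
        if expected.get(line) != observed:
            matching = sorted(l for l, hap in expected.items() if hap == observed)
            flags.append((sample, line, observed, matching))
    return flags


flags = flag_discrepancies(RECORDS, EXPECTED)
for sample, line, observed, matching in flags:
    hint = f"matches line(s) {matching}" if matching else "matches no main line"
    print(f"{sample}: recorded line {line}, observed {observed}, {hint}")
```

A flag of this kind only indicates that the record and the sequence disagree; as noted above, sequencing error or mutation at a hotspot can produce the same signal as a pedigree error.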

When the sample set is considered as a whole, the haplotypic diversity calculated for the breed (h = 0.793) is significantly lower than that determined for the majority of other domestic equines. For example: Avar horses h = 0.93, Hungarian ancient horses h = 0.989, modern Akhal Teke h = 0.945; Hispano-Breton heavy horse h = 0.975 and Pre horse h = 0.878; Lusitano h = 1.0, Asturcon h = 0.80, Argentine Criollo h = 1.0, Barb h = 0.933. Breeds with lower haplotypic diversity include Caballo de Corro h = 0.733, Paso Fino h = 0.60, Florida Cracker h = 0.667 and Sulphur Mustang, with the lowest reported h = 0.333. It must be noted that each of the breeds with reported haplotype diversity lower than that found in the Cleveland Bay was represented by a much smaller sample set, with n = 6 in each case, which may have a significant influence on the results.

BLAST searches of the four principal Cleveland Bay haplotypes against those sequences held in the NCBI GenBank nucleotide database reveal substantial haplotype sharing with the Irish Draught horse. Three of the previously identified 35 Irish Draught haplotypes are shared with the Cleveland Bay (CB Hap 2, 3 and 4), whilst the fourth main CB haplotype (CB Hap1) displays significant common identity with one of the Kerry Bog Pony lineages.

The rate of mutation of the equine mitochondrial DNA control region is estimated at 2-4 × 10⁻⁸ per site per year. This equates to approximately one mutation per 100,000 years. However, several authors have identified mutational hotspots within the control region of equine mitochondrial DNA. In particular, positions 15585, 15597 and 15650 are now recognised as being subject to mutation at significantly greater rates than other sites. Within our sample two singleton haplotypes occur that are unique because of mutations at these hotspots. CB Hap6 varies from CB Hap4 by a single mutation at 15585. Similarly, CB Hap7 varies from CB Hap1 by a single mutation, in this case at 15597. In addition, if mutation at 15585 is ignored, then CB Haplotypes 8 and 9 become identical - both clustering in the most numerous of all the domestic equine clades - D1.
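The step from a per-site rate to "one mutation per 100,000 years" depends on how many sites are considered, which the text does not state. The short calculation below assumes the expected waiting time along a single lineage is 1 / (rate × number of sites) and tries both fragment lengths used in this chapter (247 and 421 base pairs); the function name is hypothetical.

```python
def years_per_mutation(rate_per_site_per_year, n_sites):
    """Expected years between mutations along one lineage for a fragment of n_sites."""
    return 1 / (rate_per_site_per_year * n_sites)


RATES = (2e-8, 4e-8)      # range quoted in the text, per site per year
FRAGMENTS = (247, 421)    # fragment lengths used in this chapter, in base pairs

for n_sites in FRAGMENTS:
    slow, fast = (years_per_mutation(r, n_sites) for r in RATES)
    print(f"{n_sites} bp: one mutation every {fast:,.0f} to {slow:,.0f} years")
```

Under these assumptions the quoted figure of roughly 100,000 years corresponds to the upper rate applied to the 247 base pair fragment; the 421 base pair fragment gives somewhat shorter intervals.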

The distribution of the four main Cleveland Bay haplotypes across Clades A - C is consistent with the association of these Clades with horses of Northern European origin. Clade C1 has previously been associated with Exmoor, Fjord, Icelandic and Scottish Highland Ponies. This cluster is geographically restricted to central Europe, the British Isles and Scandinavia, including Iceland. Some horses of Iberian origin have previously been associated with Clade A, and this is consistent with the historical evidence for the Cleveland Bay breed. Horses of Lusitano, Pre and Sorraia origins have been shown to belong to Clade B, whilst cluster D1 is considered representative of Iberian and North African breeds.

Of the Cleveland Bay animals that fall outside each of the four main haplotypes, the vast majority can be traced back to the Grading Register, as opposed to the full studbook. This register was established with the dual purposes of recording animals that were of clear Cleveland Bay phenotype, which had been missed out at the time the studbook was established, and more recently to provide a source of diversity at a time when breed numbers were critically low. Blast searches of these minor haplotypes against those sequences held in the NCBI Genbank nucleotide database reveal that they are significantly different from the four main Cleveland Bay haplotypes, and have more in common with other breeds such as the Thoroughbred and the Arab horse.


Sequencing of the control region of Cleveland Bay mitochondrial DNA has for the first time cast real light onto the matrilineal diversity and possible origins of the breed. Four major haplotypes have been identified, with multiple female ancestry lines being associated with some of these. Horses that are more recent entries into the studbook via the Grading Register have been shown not to be associated with these main haplotypes. Whilst grading register animals have been selected because of Cleveland Bay phenotype, their suitability for maintaining the genetic health of the breed must now be brought into doubt because of the introgression of what would appear to be non-Cleveland Bay genetic material, as opposed to that of "lost" Cleveland Bay origin.

The four main Cleveland Bay haplotypes are associated with multiple clades of domestic horses, A - C. This pattern of origin in multiple clades has been seen in many other breeds of horses. However, there is now growing evidence of biogeographical distribution and regional association with the main Clades. Traditionally the Cleveland Bay horse is said to have evolved on the maternal side from the now extinct Chapman pack horse. If, as reported in Chapter One, the Chapman evolved from the native British pony, then Lines One, Three and Five, belonging to Clade C, support this theory. There are BLAST associations with the Exmoor and Kerry Bog Ponies, both very ancient breeds, pointing to evolution from ponies that were native to post-glacial Britain. Clades A and B have associations with horses of Iberian and North African origins. The historical evidence shows that stallions from Spain and North Africa were brought back to North East England and used on local mares. It is not improbable that good quality mares were imported from these same origins at that time, and that these were covered by Cleveland Bay stallions. If this is the case then mares of Lines 6 and 7 (and by association the almost extinct Line 2) may not be of maternal Chapman descent, but may originate from Iberian and Barb mares.

A better understanding of the maternal origins of the breed may contribute to future developments in breed management. Whilst management through minimizing kinship remains the "gold standard" solution, being able to divide the breed up into lines associated by haplotype, and controlling inbreeding within these lines, may well prove to be a more acceptable model for breeders to follow, and in turn lead to a more sustainable maintenance of genetic diversity in the future.