Studies of human genetic history using mtDNA variation

At fertilization, the genetic contribution of the oocyte to the zygote differs from that of the spermatozoon since the latter does not contribute viable mitochondria. These cytoplasmic organelles harbor numerous copies of a circular genome (~16 570 bp), which is characterized by a much higher rate of evolution (10- to 20-fold) than that of the average nuclear gene. Thus, human mitochondrial DNA (mtDNA) is maternally transmitted, and its sequence variation, which cannot be reshuffled by recombination as in autosomal genes, has been generated exclusively by the sequential accumulation of new mutations along radiating maternal lineages. Because this process of molecular differentiation is relatively fast and occurred mainly during and after the recent human colonization of, and diffusion into, different regions and continents, the different subsets of mtDNA variation tend to be restricted to different geographic areas and population groups.

As a result of these peculiar features, human mtDNA has a single genealogical history, which can be reconstructed as a gene tree (or network) (Bandelt et al., 1995); migrations (instances of gene flow between regions) can be detected by incorporating the geographical origin of subjects into the tree; and time depth within the tree can be estimated using a molecular clock (Richards and Macaulay, 2001). The application of these principles, now referred to as the phylogeographic approach (Avise, 2000), was pioneered in humans by Douglas C. Wallace in the early 1980s.
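The molecular-clock step can be illustrated with a minimal sketch. It assumes one estimator commonly used in this literature, usually called rho: the average number of mutations separating each sampled sequence in a clade from the clade's founder haplotype, multiplied by an assumed number of years per mutation. The source does not name a specific estimator, so the function name `rho_age`, the example counts, and the calibration value below are all illustrative assumptions rather than values from any cited study.

```python
def rho_age(mutation_counts, years_per_mutation):
    """Return (rho, age_in_years) for one clade.

    mutation_counts: number of mutations between the founder haplotype
        and each sampled sequence in the clade (one entry per sample).
    years_per_mutation: assumed clock calibration, i.e. the average time
        needed for one mutation to accumulate along a lineage.
    """
    if not mutation_counts:
        raise ValueError("need at least one sampled sequence")
    # rho is the mean distance, in mutations, from the founder haplotype
    rho = sum(mutation_counts) / len(mutation_counts)
    return rho, rho * years_per_mutation


# Hypothetical data: four sequences that are 0, 1, 1, and 2 mutations
# away from the founder haplotype of their clade.
example_counts = [0, 1, 1, 2]

# Placeholder calibration, NOT a published rate; substitute a rate that
# matches the sequenced region before interpreting any result.
YEARS_PER_MUTATION = 20_000

rho, age = rho_age(example_counts, YEARS_PER_MUTATION)
print(f"rho = {rho:.2f} mutations, estimated age = {age:.0f} years")
```

An estimate of this kind depends entirely on the calibration and on correctly identifying the founder haplotype, which is one reason mtDNA dates are usually compared with evidence from other genetic systems and disciplines.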

Before providing some examples of the results obtained in the last 20 years by mtDNA studies, it should be pointed out that, despite its unique features for studying human genetic history, mtDNA remains a single locus, prone to genetic drift and possibly under selection. It should therefore not be used as the only tool for inference in human evolution. Conclusions based on mtDNA data require corroboration by other genetic systems (Y chromosome (see Article 4, Studies of human genetic history using the Y chromosome, Volume 1) and autosomal DNA) and other disciplines (for instance, linguistics, archaeology, and even climatology).

The earliest mtDNA work began by digesting the entire genome with a few restriction enzymes (often five or six) on fairly large sample sets (Johnson et al., 1983). However, this approach attracted public attention only in 1987, when Rebecca Cann, Mark Stoneking, and Allan Wilson, using high-resolution restriction analysis (12 enzymes), obtained a much more detailed mtDNA phylogeny. Their analysis of 147 mtDNAs from the different continents identified what was improperly defined as “the mitochondrial Eve” and led to the hypothesis that all modern mtDNAs descend from a woman who lived in Africa about 200 000 years ago (Cann et al., 1987). This proposal generated a fierce debate between the proponents of a recent African origin of anatomically modern humans and those who favored multiregional alternatives. Although the African root was also supported by sequencing data from the fast-evolving first and second hypervariable segments (HVS-I and HVS-II) of the mtDNA control region (Vigilant et al., 1991), the debate lasted for almost a decade. It was for the most part resolved only when the control region of the first Neanderthal specimen was finally PCR-amplified and sequenced (Krings et al., 1997), revealing that its sequence fell outside the variation of modern humans and that Neanderthals did not contribute to the current mtDNA pool.

During the early 1990s, in parallel with the studies addressing the origin of Homo sapiens sapiens, the high-resolution restriction analysis also began to be applied to individual continents on large numbers of samples, with the objective of determining human origins in each major geographic area. This resulted in a much more refined picture of the mtDNA world phylogeny – one in which the haplotype clades, or haplogroups, characterized by one or several diagnostic restriction markers, were usually restricted geographically, some to sub-Saharan Africans, others to Europeans, and yet others to East Asians. Many of these haplogroups could not be distinguished by control-region data alone (despite the popularity of this approach in the 1990s and its wide application in forensics), although they could often be picked out from control-region data after an exploratory combined control-region/restriction analysis (Torroni et al., 1993, 1996). The haplogroups could, moreover, be subdivided into smaller evolutionary units by including control-region information (Macaulay et al., 1999).
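The idea that a haplogroup is recognized by one or several diagnostic markers can be sketched as a simple lookup. This is only an illustration: the haplogroup labels (`Hg1`, `Hg2`) and marker names (`+siteA`, `-siteB`, `+siteC`) are hypothetical placeholders, not real restriction sites, and the function `assign_haplogroups` is an assumed helper rather than a published classification method.

```python
# Hypothetical marker table: labels and marker names are placeholders,
# not real haplogroups or restriction sites. A leading "+" stands for
# the presence of a site and "-" for its absence.
DIAGNOSTIC_MARKERS = {
    "Hg1": {"+siteA"},
    "Hg2": {"-siteB", "+siteC"},
}


def assign_haplogroups(observed_markers, table=DIAGNOSTIC_MARKERS):
    """Return the haplogroups whose diagnostic markers are all observed."""
    observed = set(observed_markers)
    # A haplogroup matches only if every one of its markers is present
    return sorted(hg for hg, required in table.items() if required <= observed)


# A haplotype carrying the single marker that defines Hg1
print(assign_haplogroups(["+siteA", "+siteZ"]))

# A haplotype carrying only one of the two markers that define Hg2
print(assign_haplogroups(["-siteB"]))
```

Requiring every diagnostic marker to be present is a deliberately strict choice for this sketch; real classification also has to deal with recurrent mutations and missing data.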

The first large comprehensive population studies were carried out in Native Americans and focused on the origin, timing, and numbers of ancestral migrations from Asia. These revealed that virtually all Native American mtDNAs belonged to haplogroups A, B, C, and D (Torroni et al., 1993), later joined by the uncommon haplogroup X, and that only one or two founder haplotypes for each haplogroup were shared between the Native Americans and their Asian counterparts. This indicated that a limited number of mtDNAs arrived in the Americas, in one or two population expansions from Siberia/Beringia, a conclusion that appears to be entirely supported by the data recently obtained from the analysis of entire mtDNA sequences (Bandelt et al., 2003).

In Europe, mtDNA variation was studied for the first time by a number of groups in the early 1990s, mostly focusing on the HVS-I region. Initially, it seemed that the European mtDNA landscape might be so flat as to be almost entirely uninformative with respect to European prehistory, suggesting that mtDNA may not be a useful demographic marker system (perhaps due to selection or high rates of female gene flow in recent times). However, high-resolution restriction analysis studies showed that this was not true. Indeed, by supplementing HVS-I data with additional informative variants from the coding region, mtDNA variation was dissected into nine major haplogroups (H, I, J, K, T, U, V, W, and X) (Torroni et al., 1996; Macaulay et al., 1999), which are now fully supported by the sequence analysis of entire mtDNAs (Finnila et al., 2001). The first results from European mtDNA (Richards et al., 1998) suggested that only a small minority of lineages dated to the Neolithic, with the remainder dating back to between 15 000 and about 50 000 years ago. The majority appeared to descend from founders of Middle or Late Upper Paleolithic origin. These clades were strikingly starlike, indicating dramatic population expansions, which suggested that they were mainly the result of reexpansions in the Late Glacial or Postglacial period. The results were rather tentative, because they relied on comparisons with a very small and inadequate sample from the Near East. Further work (Richards et al., 2000, 2002) nevertheless showed that, with a sufficiently large sample size and a better resolved phylogeny, clades of mtDNA do indeed exhibit gradients similar to those of other marker systems (Cavalli-Sforza et al., 1994), and provided evidence that more than three-fourths of the present-day European mtDNAs could be from indigenous Paleolithic ancestors.

This conclusion is supported by some analyses of the paternally transmitted Y chromosome (Semino et al., 2000; Rootsi et al., 2004), but is largely rejected by other studies that infer admixture coefficients considering different potential parental populations (Dupanloup et al., 2004). In this debated context, first the molecular dissection of one autochthonous European mtDNA clade (haplogroup V), and much more recently that of haplogroup H, the most common haplogroup in Europe (40-50%), were particularly informative in revealing that there was indeed a dramatic Late Glacial expansion from the Franco-Cantabrian glacial refuge, which repopulated much of the western part of the continent with Paleolithic mtDNAs from about 11 000-15 000 years ago (Achilli et al., 2004). To determine whether the refuge area(s) of Eastern Europe played a similar role on the other side of Europe would require, in phylogeographic studies, the identification of similarly informative mtDNA subhaplogroups, and, in admixture studies, the evaluation of Eastern Europe as a potential homeland for a parental population.

After having passed through a number of technological and methodological stages, the analysis of mtDNA variation is now in the era of complete mtDNA sequences, and this has opened new, interesting perspectives. Complete sequencing is allowing the identification of new subhaplogroup markers that can also be very informative at the microgeographical level. In addition, the phylogeny of complete sequences has shown that some internal clades are disproportionately derived compared with others, a result not consistent with a simple model of neutral evolution with a uniform molecular clock, hinting at a role for selection in the evolution of human mtDNA (Torroni et al., 2001). This notion has been further developed by Ruiz-Pesini et al. (2004), who proposed that the relative frequency and amino acid conservation of internal branch replacement mutations have increased from tropical Africa to temperate Europe and Siberia, and that the same haplogroups correlate with increased propensity for energy deficiency diseases as well as longevity. According to this proposal, specific mtDNA replacement mutations permitted our ancestors to adapt to more northern climates, and these same variants influence our health today.
