Hitchhiking mapping (Genomics)

1. The principle of hitchhiking mapping

Hitchhiking mapping is one approach toward the identification and characterization of genes with a beneficial effect in a given context (Schlotterer, 2002; Schlotterer, 2003). The underlying principle is that a beneficial mutation will either be lost or increase in frequency until it becomes fixed in the population. The spread of a beneficial mutation also affects neutral variation linked to it (“hitchhiking”; Maynard Smith and Haigh, 1974). As a consequence, the pattern of sequence variation in the affected genomic region differs from neutral expectations. Population genetics has provided a large repertoire of statistical tests for the identification of genomic regions deviating from neutral expectations (Kreitman, 2000; Otto, 2000; see also Article 7, Genetic signatures of natural selection, Volume 1).

One of the possible consequences of the spread of a beneficial mutation is a reduction in variability. Figure 1 depicts the reduction in variability around a selected site, obtained from an average over 100 independent computer simulations of a selection event at the same site. In this simulation, the target of selection was unambiguously identified as the genomic region with the most pronounced reduction in variability.
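The reduction in variability described above can be made concrete with a small sketch. It assumes that variability is measured as gene diversity (expected heterozygosity) with the usual small-sample correction, and that the input is a list of sampled alleles per locus. The function names and the toy data are hypothetical and are not taken from the simulation program mentioned in this article.

```python
from collections import Counter


def gene_diversity(alleles):
    """Unbiased gene diversity for one locus: n/(n-1) * (1 - sum of p_i^2)."""
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled alleles")
    counts = Counter(alleles)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def most_reduced_locus(samples_by_locus):
    """Return the index of the least diverse locus and all per-locus diversities."""
    diversities = [gene_diversity(a) for a in samples_by_locus]
    idx = min(range(len(diversities)), key=diversities.__getitem__)
    return idx, diversities


# Hypothetical toy data: repeat counts at three microsatellite loci,
# eight sampled chromosomes each. The middle locus is nearly monomorphic.
toy = [
    [10, 11, 12, 13, 10, 11, 12, 13],
    [10, 10, 10, 10, 10, 10, 10, 11],
    [9, 10, 11, 12, 9, 10, 11, 12],
]
idx, divs = most_reduced_locus(toy)
print(idx, [round(d, 3) for d in divs])
```

In real data, unlike the averaged simulations, the locus with the lowest diversity is only a candidate, because neutral loci also vary in diversity by chance.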

2. Different phases of a hitchhiking mapping study

Hitchhiking mapping studies are carried out on a genome-wide scale to identify those parts of the genome that carry a recent beneficial mutation. In the first phase, a large number of loosely linked markers are analyzed. This primary screen identifies the loci that show the most extreme distortion in the allele frequency spectrum. Because a very large number of loci can be tested in such a screen, some loci will appear extreme by chance alone, and additional testing is required to distinguish these false positives from genomic regions subjected to directional selection.
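The false-positive problem of a large primary screen can be illustrated with a short calculation. The sketch assumes that each locus already has a p-value from some neutrality test. The Bonferroni correction used here is one common, conservative choice made for this illustration and is not prescribed by the article. All names and numbers are hypothetical.

```python
def expected_false_positives(n_loci, alpha):
    """Expected number of neutral loci passing a per-locus threshold alpha."""
    return n_loci * alpha


def bonferroni_threshold(n_loci, alpha):
    """Per-locus threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_loci


def rank_candidates(p_values, alpha=0.05):
    """Sort loci by p-value; flag those that survive the Bonferroni threshold."""
    threshold = bonferroni_threshold(len(p_values), alpha)
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    return [(i, p_values[i], p_values[i] <= threshold) for i in order]


# Hypothetical example: 1000 neutral loci tested at alpha = 0.05
n_false = expected_false_positives(1000, 0.05)

# Hypothetical p-values for four loci
ranked = rank_candidates([0.2, 0.00001, 0.03, 0.6], alpha=0.05)
print(n_false, ranked)
```

A locus that fails a strict correction is not necessarily neutral, which is one reason why the second phase examines the flanking regions of the top candidates rather than relying on the primary screen alone.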


The second phase of hitchhiking mapping focuses on the genomic region flanking one of the candidate regions identified in the primary screen. As linked sites are more strongly correlated after a selective sweep than under a neutral evolution scenario, the pattern of variation at linked genomic regions could be used to verify genomic regions subjected to a recent selective sweep.
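One simple way to look at the pattern at linked regions is to compare the diversity of markers flanking the candidate with the diversity of markers further away. The sketch below assumes evenly ordered markers and a per-marker diversity estimate. The window size, function name, and data are hypothetical, and the ratio is a descriptive summary, not a formal test.

```python
def flanking_diversity_ratio(diversities, candidate, window=2):
    """Mean diversity of markers flanking the candidate, relative to distant markers.

    The candidate marker itself is excluded, because it was already used to
    select the region in the primary screen.
    """
    lo = max(0, candidate - window)
    hi = min(len(diversities), candidate + window + 1)
    near = [diversities[i] for i in range(lo, hi) if i != candidate]
    far = [d for i, d in enumerate(diversities) if i < lo or i >= hi]
    if not near or not far:
        raise ValueError("need markers both inside and outside the window")
    mean_near = sum(near) / len(near)
    mean_far = sum(far) / len(far)
    return mean_near / mean_far


# Hypothetical diversities along a chromosome; marker 3 is the candidate
toy_diversities = [0.8, 0.8, 0.5, 0.2, 0.5, 0.8, 0.8]
ratio = flanking_diversity_ratio(toy_diversities, candidate=3, window=1)
print(round(ratio, 3))
```

A ratio clearly below 1 is consistent with a sweep that also reduced variation at linked markers, whereas a ratio near 1 suggests that the candidate may have been an isolated chance outlier.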

Figure 1 Mean gene diversity determined for 35 evenly spaced microsatellites over 100 simulation runs. For each of the simulations, a selective sweep was assumed to have occurred at the microsatellite No. 10, which shows the most pronounced reduction in variability. Computer simulations were performed with a computer program written by Y. Kim and modified for microsatellites by T. Wiehe. Simulation parameters were: microsatellite spacing = 12 kb, t = 0.001, 5 = 0.001, e = 5, r = 5 × 10^-9

After the successful verification of a candidate region, the final step of a hitchhiking mapping study involves a detailed analysis of the genomic region affected by the selective sweep. A comparison of multiple populations with and without a selective sweep could be highly informative for the identification of the molecular changes responsible for the selective sweep.

3. Which marker to use?

The primary screen of many unlinked markers is greatly facilitated if a highly informative and cost-effective marker is used. Microsatellites are highly polymorphic markers present at a moderate density in most eukaryotic species, making them a good marker choice for first-pass genome scans (Schlotterer, 2004). Genome scans based on SNPs (Akey et al., 2002) or DNA sequence analysis (Glinka et al., 2003) have also been performed. Nevertheless, microsatellites remain the best choice for this phase: the information content of a single SNP is lower than that of a microsatellite locus, and DNA sequencing is more expensive and complicated by the presence of indels.
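The difference in information content between marker types can be illustrated with the maximum expected heterozygosity of a locus, which is reached when all alleles are equally frequent. Using heterozygosity as the measure of information content is an assumption made for this illustration, and the allele number chosen for the microsatellite is hypothetical.

```python
def max_heterozygosity(n_alleles):
    """Maximum expected heterozygosity for a locus with n equally frequent alleles."""
    if n_alleles < 1:
        raise ValueError("a locus needs at least one allele")
    return 1.0 - 1.0 / n_alleles


# A SNP has two alleles; a microsatellite is assumed here to have ten
snp_max = max_heterozygosity(2)
msat_max = max_heterozygosity(10)
print(snp_max, msat_max)
```

A biallelic SNP can therefore never exceed a heterozygosity of 0.5, while a multiallelic microsatellite can come much closer to 1, so a sweep has more variability to remove and the reduction is easier to detect at a single locus.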

The second phase requires polymorphism data for several linked genomic regions. Very often, microsatellites are not available at a high enough density. Therefore, DNA sequencing of short (400-800 bp) genomic regions is often the best strategy for the second hitchhiking mapping phase. High-density SNP analysis has also been shown to be informative (Sabeti et al., 2002).

The final phase of a hitchhiking mapping project requires a detailed analysis of the polymorphism in the candidate region, which is best achieved by DNA sequencing. Thus, different classes of markers and methods are preferable at the various stages of a hitchhiking mapping study.

4. Potential and limitations of hitchhiking mapping

Recent studies in yeast suggested that even the loss of gene function often does not result in a phenotype that is easily recognized under laboratory conditions (Winzeler et al., 1999). Thus, a large fraction of genes cannot be studied by classical genetic approaches. This applies, in particular, to ecologically relevant genes, which, by definition, are highly dependent on the ecological context in which an organism resides. Through the comparison of two groups of individuals adapted to different conditions (e.g., habitat, resistance against diseases, parasites, etc.), hitchhiking mapping provides the opportunity for the identification of genes that recently acquired a mutation, resulting in the phenotypic difference of interest. When the groups are unambiguously defined, hitchhiking mapping offers the advantage that no phenotype needs to be scored in the laboratory. Rather, natural selection has recognized the advantage of the beneficial mutation, which results in the typical molecular signature of a selective sweep. Therefore, hitchhiking mapping can identify even mutations with a subtle or environment-dependent phenotype.

One further advantage of hitchhiking mapping is that no experimental genetic crosses are required. Like linkage disequilibrium mapping, hitchhiking mapping builds upon meiotic recombination events that have occurred in natural populations. As a larger number of meiotic recombination events have occurred in natural populations, hitchhiking mapping could result in a higher mapping precision than quantitative trait locus (QTL) studies requiring experimental crosses.

The signature of a selective sweep is gradually lost as new mutations accumulate (Wiehe, 1998). Hence, hitchhiking mapping is limited to beneficial mutations that occurred in the recent past. Markers with a high mutation rate (such as microsatellites) are better suited for more recent selective sweeps than DNA sequence data. Nevertheless, in Drosophila, hitchhiking mapping was successfully applied to the detection of selective sweeps that occurred about 10 000 years (50 000-100 000 generations) ago. Both microsatellites and DNA sequence analysis detected the signature of the same selective sweep (Harr et al., 2002).

Probably the most challenging aspect of hitchhiking mapping is the functional verification of the identified alleles. As the phenotypic effects of these alleles are difficult to study, a comparison of putatively functionally diverged alleles is not straightforward. Nevertheless, at least for some of the identified genes, a sensitized background could be used to test the functional impact of naturally occurring alleles.
