Synteny mapping (Genomics)

Comparisons between genomes reveal homologous sequences that reflect their common evolutionary origin and subsequent conservation. Segments of DNA that have function are more likely to retain their sequence than nonfunctional segments, as they are under the constraints of natural selection during evolution. Therefore, DNA segments that are conserved between species are more likely to encode similar function. Sequence comparisons between species provide information on gene structures and may reveal regulatory elements. Experience has shown that such comparisons benefit from the use of sequences from a variety of species representing a range of evolutionary divergence. Sequence conservation between species, within genic and nongenic regions, can be utilized for the construction of physical maps. These clone-based maps can underpin the generation of genome-wide sequence, provide regional coverage for directed sequencing efforts, or provide resources for genomic interrogation, for example, using fluorescence in situ hybridization (FISH), comparative genomic hybridization (CGH), or array CGH.

Similarity between genomes is evident at the level of long-range sequence organization where the order of multiple genes on a single chromosome is conserved, or where the chromosomal location of multiple genes, but not necessarily their precise order, is conserved (Nadeau and Taylor, 1984; DeBry and Seldin, 1996; Nadeau and Sankoff, 1998). In general, the degree of similarity at all levels is higher between species that are more closely related on an evolutionary scale, that is, diverged more recently from a common ancestor. Ultimately, comparison of the finished reference sequence of each organism is required to detect every conserved segment, and from this to deduce all the chromosome rearrangements (translocations, inversions, duplications, deletions, and gene conversion events) that have occurred between species. The ability to align the different genome maps over their entire length simultaneously defines the syntenic relationship between them at a new level of resolution and accelerates the process of sequence generation and other biological studies.
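The two levels of conservation described above (shared chromosomal location versus shared gene order) can be illustrated with a minimal sketch. It is not a published algorithm. The tuple layout, the function names `conserved_synteny` and `conserved_segments`, and the example genes are all hypothetical, and gene ranks are assumed to count only genes that have an ortholog in the other species.

```python
from collections import defaultdict


def conserved_synteny(orthologs):
    """Group genes by chromosome pair, ignoring gene order.

    orthologs: list of (gene, chrom_a, rank_a, chrom_b, rank_b) tuples
    (hypothetical layout; rank = ordinal position along the chromosome).
    """
    groups = defaultdict(list)
    for gene, chrom_a, _rank_a, chrom_b, _rank_b in orthologs:
        groups[(chrom_a, chrom_b)].append(gene)
    return dict(groups)


def conserved_segments(orthologs):
    """Split orthologs into runs in which gene order is also conserved.

    A run is extended while both chromosomes stay the same and the rank in
    species B moves by exactly one step in a consistent direction
    (+1 for the same orientation, -1 for an inverted segment).
    """
    ordered = sorted(orthologs, key=lambda o: (o[1], o[2]))
    segments, current, direction = [], [], 0
    for o in ordered:
        if current:
            prev = current[-1]
            step = o[4] - prev[4]
            same_chroms = o[1] == prev[1] and o[3] == prev[3]
            if same_chroms and step in (1, -1) and direction in (0, step):
                current.append(o)
                direction = step
                continue
            segments.append(current)
        current, direction = [o], 0
    if current:
        segments.append(current)
    return segments


# Hypothetical example data: six genes on chromosome "A1" of species A.
example = [
    ("g1", "A1", 1, "B4", 10),
    ("g2", "A1", 2, "B4", 11),
    ("g3", "A1", 3, "B4", 12),
    ("g4", "A1", 4, "B4", 30),
    ("g5", "A1", 5, "B4", 29),
    ("g6", "A1", 6, "B2", 5),
]
segment_genes = [[o[0] for o in seg] for seg in conserved_segments(example)]
synteny_groups = conserved_synteny(example)
```

In this toy data, five genes share the chromosome pair A1/B4, but only g1 to g3 and the inverted pair g4, g5 retain their order, so order-based segments are shorter than location-based groups. Real comparisons would also need to tolerate missing orthologs and small local rearrangements, which this sketch does not attempt.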


The recent revolution in large-scale genomic analysis has already yielded near-complete DNA sequences of a diverse range of organisms, including bacteria, yeast, worm, fly, dog, mouse, and man (Fleischmann et al., 1995; Churcher et al., 1997; The yeast genome directory, 1997; The C. elegans Sequencing Consortium, 1998; Adams et al., 2000; Lander et al., 2001; Venter et al., 2001; Waterston et al., 2002; Kirkness et al., 2003). Assembly of each large genome sequence to date has been underpinned by production of a comprehensive map of overlapping large-insert bacterial clones (e.g., cosmids; Collins and Hohn, 1978) or bacterial artificial chromosome (BAC) clones (Shizuya et al., 1992; Coulson et al., 1986; Olson et al., 1986; Bentley et al., 2001; McPherson et al., 2001; Gregory et al., 2002) for sequencing, and in some cases also for integration with whole genome shotgun sequence data (Adams et al., 2000; Venter et al., 2001). Mapped clones provide invaluable information to identify and help eliminate incorrect assemblies between repetitive sequences, to provide substrates for targeted finishing (e.g., to >99.99% accuracy; Green, 2001; Dunham et al., 1999; Waterston and Sulston, 1998), and as a resource for experimental studies such as FISH (du Manoir et al., 1993) and metaphase and array-based CGH (Kallioniemi et al., 1992; Pinkel et al., 1998; Ishkanian et al., 2004). The study of other large genomes, particularly those with high levels of repetitive sequence (like that of the mouse), requires physical maps of a similar standard as a prerequisite for the production of finished sequence, either on a genome-wide scale or to provide access to any region of interest, which may be located in the map using landmarks such as known genes or genetic markers.
The clones used to assemble these physical maps allow specific regions to be targeted for further investigation and, in particular, allow the complete and accurate DNA sequence of each clone to be determined separately from that of other clones in the map. Because the genomic sequence is generated clone by clone, problems encountered during sequence assembly are confined to individual clones, which makes them far simpler to resolve than problems in whole genome sequence assemblies.

Figure 1. Construction of the physical map of the mouse genome using human genomic sequence as a reference. Finished human sequence from large-insert bacterial clones (c), originating from the physical map (b) of human chromosome 6 (a), provides the template for the alignment of mouse BAC end sequences (d) that had previously been assembled into fingerprint contigs. Contig assembly using the described strategy resulted in rapid assembly of sequence-ready contig coverage (e) of the mouse genome, including mouse chromosome 4 (f).

The similarity in sequence organization between two genomes provides the opportunity for a reference genome, such as the finished sequence of the human genome, to be used as a framework to assemble the physical map of a second genome, such as the mouse (Gregory et al., 2002). The phased construction of such a physical map of a second genome relies upon the existence of a highly redundant restriction digest database (>10-fold redundancy), the availability of BAC end sequences (BESs), and a genome-wide marker set. Initially, restriction fingerprints of clones from the secondary organism are assembled into contigs within a database, such as FingerPrinted Contigs (FPC) (Soderlund et al., 2000). BESs of the clones contained within these assembled contigs are then aligned to the reference genome, and independently mapped genomic markers are subsequently included to position the contigs correctly within the secondary organism (Figure 1). The juxtaposition of the clone contigs along the reference genome greatly accelerates the physical map construction process and develops a homology map between the two organisms. The proven success of assembling genome-wide physical maps, the cost of constructing a >10-fold genomic BAC library, and the ease with which genome-wide fingerprint databases can be assembled have led to the construction of several genomic fingerprint databases. While genome-wide fingerprint maps will facilitate the large-scale characterization of many varied species, the construction of small, region-specific sequence-ready maps will continue to be important for detailed interspecies sequence comparisons (Thomas et al., 2002).
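The step in which fingerprint contigs are juxtaposed along the reference genome can be sketched roughly as follows. This is a simplified illustration and not the procedure of Gregory et al. (2002) or any FPC function. The input layout, the `place_contigs` name, the `min_hits` threshold, and the clone and contig identifiers are hypothetical. The sketch also omits the later step in which independently mapped markers are used to position contigs within the second genome.

```python
from collections import defaultdict
from statistics import median


def place_contigs(bes_hits, clone_to_contig, min_hits=2):
    """Order fingerprint contigs along a reference genome.

    bes_hits: iterable of (clone, ref_chrom, ref_pos) alignments of BAC end
        sequences to the reference (hypothetical layout).
    clone_to_contig: dict mapping clone name to its fingerprint contig.
    min_hits: assumed minimum number of supporting BES hits per contig.

    Returns (ref_chrom, median_ref_pos, contig) tuples sorted along the
    reference.
    """
    # Collect reference positions per contig and per reference chromosome.
    hits = defaultdict(lambda: defaultdict(list))
    for clone, chrom, pos in bes_hits:
        contig = clone_to_contig.get(clone)
        if contig is not None:
            hits[contig][chrom].append(pos)

    placements = []
    for contig, per_chrom in hits.items():
        # Keep the reference chromosome supported by the most BES hits.
        chrom, positions = max(per_chrom.items(), key=lambda kv: len(kv[1]))
        if len(positions) >= min_hits:
            placements.append((chrom, median(positions), contig))
    return sorted(placements)


# Hypothetical example: three mouse contigs with BESs aligned to human.
clone_to_contig = {
    "c1": "ctgA", "c2": "ctgA",
    "c3": "ctgB", "c4": "ctgB",
    "c5": "ctgC",
}
bes_hits = [
    ("c1", "chr6", 1_200_000),
    ("c2", "chr6", 1_300_000),
    ("c3", "chr6", 400_000),
    ("c4", "chr6", 500_000),
    ("c4", "chr2", 9_000_000),  # stray hit, outvoted by the chr6 hits
    ("c5", "chr6", 2_000_000),  # single hit, below min_hits
]
contig_order = [contig for _, _, contig in place_contigs(bes_hits, clone_to_contig)]
```

With these toy inputs, ctgB is placed before ctgA on the reference chromosome, and ctgC is left unplaced because a single BES hit is treated as insufficient support. Using the median position and a majority vote across chromosomes is one simple way to damp the effect of spurious alignments, for example in repetitive sequence. A real pipeline would also have to handle contigs that span a break in conserved synteny.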
