Biology Reference
In-Depth Information
so-called expression territories containing clusters of coordinately
expressed genes identified in Drosophila. Indeed, the correlation of gene
expression with gene order data indicated that eukaryotic gene
arrangements are not entirely random, and that similar and/or coordinated
expression patterns are often observed for physically clustered genes. 53
However, besides loosely defined gene territories, there seem to be few
evolutionary constraints to preserve orthologous gene arrangements in
synteny blocks. Indeed, assessing the conservation of microsynteny
among several insect genomes from Diptera to Hymenoptera identified
only a few hundred genes that might be linked by selection. 54
Interestingly, the size distribution of synteny blocks was found to follow
a power law, which implies that chromosomal breakpoints are distributed
nonuniformly, i.e. exponentially clustered around breakage hot spots,
rather than uniformly as in the commonly assumed random breakage model.
It also appears that the rate of rearrangements within chromosomes
(chromosomal arms in Diptera) is much higher than between chromosomes.
As a result, orthologous relations between chromosomes can be clearly
established by the excess of shared orthologous markers over a random
expectation, while synteny blocks gradually become almost randomly
scattered along the chromosomes. 55-57
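The comparison of shared orthologous markers with a random expectation can be illustrated with a small sketch. It assumes a hypothetical list of one-to-one ortholog pairs, each labeled with the chromosome it occupies in two species, and computes for every chromosome pair the observed number of shared orthologs divided by the number expected if orthologs were assigned to chromosomes independently (row total × column total / grand total). The function name, the input format, and the toy data are illustrative assumptions and are not taken from the cited studies.

```python
from collections import Counter


def ortholog_enrichment(pairs):
    """Return {(chrom_a, chrom_b): (observed, expected, ratio)}.

    `pairs` is a list of (chrom_a, chrom_b) tuples, one per ortholog pair
    (a hypothetical input format). The expected count assumes that the
    chromosome of a gene in species A is independent of the chromosome of
    its ortholog in species B.
    """
    observed = Counter(pairs)
    total = len(pairs)
    count_a = Counter(a for a, _ in pairs)  # orthologs per chromosome, species A
    count_b = Counter(b for _, b in pairs)  # orthologs per chromosome, species B

    result = {}
    for a in count_a:
        for b in count_b:
            expected = count_a[a] * count_b[b] / total
            obs = observed.get((a, b), 0)
            result[(a, b)] = (obs, expected, obs / expected)
    return result


# Toy data (invented): chromosomes A1/A2 in one species, B1/B2 in the other.
toy_pairs = (
    [("A1", "B1")] * 8 + [("A1", "B2")] * 2
    + [("A2", "B1")] * 2 + [("A2", "B2")] * 8
)

for key, (obs, exp, ratio) in sorted(ortholog_enrichment(toy_pairs).items()):
    print(key, obs, exp, round(ratio, 2))
```

A ratio well above 1 for a chromosome pair suggests an orthologous relation between the two chromosomes, while ratios near 1 are what independent assignment would produce. A real analysis would also need a significance test and a correction for the number of chromosome pairs examined.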
The delineation of orthologous genomic regions at the level of multiple
DNA sequence alignments provides an opportunity to identify
much more elusive non-protein-coding functional sequences. For closely
related species, such multiple alignments are analyzed for patterns of
minor variations and population polymorphisms, an approach termed
“phylogenetic footprinting” or “shadowing”. For more distantly related species, the
alignment of orthologous genomic blocks is more complex than that of
protein sequences because nucleotides are information-poor compared
to amino acids (as there are only four nucleotides and at least 20
common amino acids). For regulatory elements, the information content is
generally low due to their short lengths; and for non-protein-coding
RNA genes, sequence conservation is weak as structural properties of
base pairing are more important. More crucially, the dynamic programming
alignment approaches designed for protein sequence analysis cannot cope
with sequence rearrangements such as duplications, inversions, and
transpositions, which are common among genomic sequences. The effectiveness