Alu Sequences (Molecular Biology)

The genomes of almost all higher eukaryotes contain highly repetitive DNA sequences that are not clustered together. Instead, they are distributed throughout the genome, interspersed with longer stretches of DNA with unique (or moderately repetitive) sequences. In the human genome, the majority of such sequences belong to a single family of short interspersed nuclear elements (SINEs) called the Alu family. Each sequence is about 300 base pairs long. Although the many copies present are recognizably related, they are not precisely conserved in sequence. Their name derives from the fact that most contain a single cleavage site for the restriction enzyme AluI near their middle. More than 500,000 Alu sequences are present in the human genome, accounting for 3 to 6% of the total DNA. Any particular segment of DNA of 5000 bp or longer has a high probability of containing at least one Alu sequence. Most Alu sequences are flanked by short direct repeats of DNA and move like transposable elements, creating target-site duplications when they insert.
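The copy number, element length, and genome fraction quoted above can be checked against one another with a little arithmetic. The sketch below is a rough illustration only. It assumes a haploid genome of about 3 × 10^9 bp (a figure not given in the text) and treats insertions as uniformly and independently placed, which real Alu elements are not. The names `GENOME_BP`, `genome_fraction`, and `prob_at_least_one` are hypothetical.

```python
import math

# Figures quoted in the text.
ALU_COPIES = 500_000   # stated lower bound on copy number
ALU_LENGTH_BP = 300    # approximate length of one element

# Assumption (not from the text): haploid human genome of ~3e9 bp.
GENOME_BP = 3_000_000_000


def genome_fraction(copies, length_bp, genome_bp):
    """Fraction of the genome occupied by `copies` elements of `length_bp`."""
    return copies * length_bp / genome_bp


def prob_at_least_one(segment_bp, copies, genome_bp):
    """Poisson estimate of finding at least one element in a segment.

    Assumes elements are scattered uniformly and independently,
    which is a simplification.
    """
    expected = segment_bp * copies / genome_bp
    return 1.0 - math.exp(-expected)


fraction = genome_fraction(ALU_COPIES, ALU_LENGTH_BP, GENOME_BP)
p_5kb = prob_at_least_one(5000, ALU_COPIES, GENOME_BP)

print(f"Genome fraction: {fraction:.1%}")
print(f"P(at least one Alu in 5000 bp): {p_5kb:.2f}")
```

Under these assumptions the fraction comes to 5%, inside the quoted 3 to 6% range, and the mean spacing is one element per 6000 bp, so a 5000-bp segment has a better-than-even chance of containing one. Because 500,000 is a lower bound, the true probability would be higher.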

On average, members of the Alu family share about 80% sequence identity, but certain internal regions are more conserved: an internal 40-bp region and two sets of sequences, one near the 5′ end and another farther downstream in the direction of transcription, that are homologous to sequences found in the promoter for RNA polymerase III.
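To make the 80% figure concrete, percent identity between two family members can be computed from a pairwise alignment. The sketch below assumes the two sequences are already aligned to equal length with `-` marking gaps, and it counts only ungapped columns, which is one common convention among several. The function name and the toy fragments are hypothetical and are not real Alu sequences.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy fragments for illustration only.
example_a = "GGCCGGGCGC"
example_b = "GGCTGGGCAC"
identity = percent_identity(example_a, example_b)
print(f"{identity:.1f}% identity")
```

The two toy fragments differ at 2 of 10 positions, giving 80% identity. Applying the same calculation to windows along a real alignment would show the more conserved internal regions as stretches with higher values.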

One end of the Alu DNA segment can be defined precisely by comparing several Alu sequences. The other end occurs at, or is adjacent to, a run of A bases of variable length that may be interrupted occasionally by other bases. The internal structure of an Alu sequence is dimeric and may have arisen from an ancestral duplication of a segment of approximately 150 bp. In some rodents, a major SINE is 130 bp long and has sequence similarity to one half of the primate Alu sequence. As in Alu, it is bounded on one side by a poly(dA) sequence.


1. Origin

The Alu sequence derives from an internally deleted host cell 7SL RNA gene, which encodes the RNA component of the signal-recognition particle (SRP) that functions in protein biosynthesis (1, 2). Consequently, an Alu sequence can be considered either a transposable element or an unusually mobile pseudogene. Alu sequences are transcribed from the 7SL RNA promoter, a polymerase III promoter internal to the transcript, so each element carries the information necessary for its own transcription wherever it moves. To transpose, however, it must borrow a reverse transcriptase.

2. Evolution

The Alu sequences may be grouped into discrete subfamilies on the basis of their sequences. Distinct subfamilies have amplified within the human genome in recent evolutionary history (3). The Human Specific, or Predicted Variant, subfamily is one of the most recently formed groups of Alu sequences; it amplified to 500 copies within the human genome sometime after the human/great ape divergence, which is thought to have occurred 4 to 6 million years ago. Comparisons of the sequences and locations of Alu elements in different mammals suggest that they have multiplied only recently.

Alu insertion polymorphism differs from other types of polymorphism, such as Variable Number of Tandem Repeat (VNTR, or minisatellite DNA) or Restriction Fragment Length Polymorphism (RFLP). Individuals who share an Alu insertion do so through identity by descent from a common ancestor, because each insertion results from a single event that occurred once within the human population (4). In contrast, VNTR and RFLP polymorphisms have arisen multiple times within a population. Alu sequences therefore represent a unique source of human genetic variation and a molecular fossil record of genomic evolutionary history. These sequences are natural landmarks for physical gene mapping and for reconstructing the evolutionary history and expansion of tandemly arrayed gene families (4).

3. Possible Functions

The physiological role of Alu elements is unknown, although it has been proposed that they are involved in DNA replication, regulation of transcription, and transport of signal recognition particle RNA to the nucleus. Alu RNA and proteins that bind to Alu elements have been identified in human cells. In particular, it has been demonstrated that some Alu sequences in human gene regions have been altered in sequence so that they are now important in controlling and enhancing transcription (5). The consensus sequence of one of the major Alu families contains a functional retinoic acid response element (see Response Element). The random insertion throughout the primate genome of thousands of Alu repeats containing a retinoic acid response element might have altered the expression of numerous genes, thereby contributing to evolutionary potential (6).
