Bioinformatics

Operon finding in bacteria (Bioinformatics)

Bacterial genes are often organized into multigene transcriptional units (TUs), each a series of genes that are transcribed together into one messenger RNA (mRNA) molecule. A TU starts with a promoter, which initiates transcription, and ends with a terminator, which terminates transcription (Figure 1a). The expression of the genes in a TU is controlled by one […]

Eukaryotic regulatory sequences (Bioinformatics)

1. Introduction. In metazoan organisms, cells have to express specific sets of genes to maintain cellular housekeeping as well as the cell’s specific identity and to respond to external stimuli that induce developmental differentiation, growth, and survival. To achieve the regulatory specificity, transcription of eukaryotic genes is controlled by a complex modular machinery of […]

Alternative splicing in humans (Bioinformatics)

1. Introduction. In the human genome, the protein coding sequence of most genes is organized into discrete blocks of sequence, the exons, which are interrupted by noncoding sequences called introns. Exons and introns are transcribed together as long nuclear pre-mRNAs that must be spliced so as to excise the introns and ligate the exons, forming […]

Gene finding using multiple related species: a classification approach (Bioinformatics)

1. Introduction. Ideally, we should be able to systematically discover all the functional genes in a newly sequenced genome from its sequence alone. Computational discovery methods rely both on the direct signals used by the cell to guide transcription, splicing, and translation, and also on indirect signals such as evolutionary conservation. In this paper, we […]

Exonic splicing enhancers and exonic splicing silencers (Bioinformatics)

1. Signals affecting the splicing of messenger RNA precursors. RNA splicing is the process by which some sections of a primary RNA transcript (the introns) are removed, and those sections that are retained (the exons) are joined together. Splicing is carried out by the spliceosome, a large macromolecular machine consisting of five spliceosomal RNAs (U1, […]

Dynamic programming for gene finders (Bioinformatics)

1. Introduction. Dynamic programming (DP), a technique commonly used in bioinformatics algorithms to reduce the evaluation time for recurrence relations, is especially prevalent in the implementation of gene-finding software. The task of gene finding has traditionally been formulated as that of choosing a single parse, or collection of zero or more nonoverlapping gene models, of […]

Computational motif discovery (Bioinformatics)

Given the vast amounts of sequence data being generated by numerous genome and proteome projects, on which regions should biologists focus their attention? The goal of computational motif discovery is to predict relatively short subsequences that are good candidates to serve some biological function. This article focuses on the computational prediction of protein-binding sites […]

Gene structure prediction by genomic sequence alignment (Bioinformatics)

1. Introduction. Gene prediction is the first and most fundamental step in genome analysis and annotation (see Article 13, Prokaryotic gene identification in silico, Volume 7; Article 14, Eukaryotic gene finding, Volume 7; Article 21, Gene structure prediction in plant genomes, Volume 7; and Article 26, Dynamic programming for gene finders, Volume 7). Consequently, the […]

In silico approaches to functional analysis of proteins (Bioinformatics)

1. Introduction. Proteins are unrivaled as the primary functional agents of all biological systems. Hence, understanding how a protein's sequence and structure relate to its function is tantamount to decoding the most basic aspects of any given biological system. The emergence of the genomic era has resulted in an explosive growth of available protein sequences and […]

Contextual inference of protein function (Bioinformatics)

1. Introduction. Ever since the first genome sequence of Haemophilus influenzae was determined in 1995, there has been an explosion in the number of organisms whose genomes have been completely sequenced and made available in public databases. As of December 2004, there were 235 organisms whose complete genome sequences were available, including that of human […]