tion than, for example, a genetic locus, but it describes a relationship
that can be easily identified computationally.” 30 Of course, no laboratory
biologist would think of a gene in this way. To bridge this gap,
Ensembl's designers created a set of “adapters,” or Application Program
Interfaces (APIs), that mediate between the database and the user. The
APIs are software elements, written in the object-oriented programming
language Perl, that create familiar biological objects from the database:
Ensembl models real-world biological constructs as data ob-
jects. For example, Gene objects represent genes, Exon objects
represent exons, and RepeatFeature objects represent repetitive
regions. Data objects provide a natural, intuitive way to access
the wide variety of information that is available in a genome. 31
The APIs make it possible for software accessing the database to use
concepts familiar to laboratory biologists. For example, code to retrieve
a slice of DNA sequence from a region of chromosome 1 would be writ-
ten as follows:
use Bio::EnsEMBL::DBSQL::DBAdaptor;

# makes a connection to the database
my $db = Bio::EnsEMBL::DBSQL::DBAdaptor->new(
    -host   => 'ensembl.db.org',
    -user   => 'anonymous',
    -dbname => 'homo_sapiens_core_19_34',
);

# gets an object (called a slice adaptor) that slices up sequences
my $slice_adaptor = $db->get_SliceAdaptor();

# uses the slice adaptor to retrieve the sequence of chromosome 1
# between coordinates 10 million and 20 million
my $slice = $slice_adaptor->fetch_by_chr_start_end('1', 10_000_000, 20_000_000);

# extracts particular features from this piece of sequence
my $features = $slice->get_all_protein_align_features();
The language of the Perl code suggests motion and action: “slice,” “adaptor,”
“get,” and “fetch” create the sense that the program is moving around and
cutting up real objects.
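The adaptor idea described above, in which software turns database rows into objects such as genes, can be sketched without the Ensembl libraries. The following is a minimal illustration only, not Ensembl's implementation: the Gene and GeneAdaptor packages, the fetch_all_by_region method, the in-memory table, and the gene names are all hypothetical stand-ins chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a database table: each row is a plain hash.
my @gene_rows = (
    { name => 'geneA', chr => '1', start => 10_500_000, end => 10_520_000 },
    { name => 'geneB', chr => '1', start => 25_000_000, end => 25_030_000 },
    { name => 'geneC', chr => '2', start => 12_000_000, end => 12_010_000 },
);

{
    # A data object that presents one row as a biological concept.
    package Gene;

    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }
    sub name  { return $_[0]{name} }
    sub start { return $_[0]{start} }
    sub end   { return $_[0]{end} }
    sub span  { return $_[0]{end} - $_[0]{start} + 1 }   # length in base pairs
}

{
    # The adaptor mediates between the raw rows and the Gene objects.
    package GeneAdaptor;

    sub new {
        my ($class, $rows) = @_;
        return bless { rows => $rows }, $class;
    }

    # Returns a reference to an array of Gene objects that overlap
    # the requested region of the given chromosome.
    sub fetch_all_by_region {
        my ($self, $chr, $start, $end) = @_;
        my @hits;
        for my $row (@{ $self->{rows} }) {
            next unless $row->{chr} eq $chr;
            next unless $row->{start} <= $end && $row->{end} >= $start;
            push @hits, Gene->new(%$row);
        }
        return \@hits;
    }
}

my $adaptor = GeneAdaptor->new(\@gene_rows);
my $genes   = $adaptor->fetch_all_by_region('1', 10_000_000, 20_000_000);

printf "%s: %d bp\n", $_->name, $_->span for @$genes;
```

The calling code at the bottom never touches rows or column names; it asks for genes in a region and receives objects with biological accessors. This separation is what the adaptors are said to provide.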
To create a further level of abstraction between the database and
the user, Ensembl also allows any piece of sequence to be treated as if it
were a single database object. The use of “virtual contigs” means that
the biologist can effectively treat the sequence as a continuous object
despite its underlying fragmented representation. The Ensembl design-