1. Genome annotation: identification of features present in the
   genomic sequence.
2. Gene annotation: analysis of each protein-coding gene.
3. Literature-based genome curation: integration of literature
   information relevant to genes or the genome.
4. Post-sequence curation: integration of data from experimental
   sources.
5. Comparative genomics: comparisons at three different levels:
   (a) whole genome alignments, (b) identification of orthologs at
   the gene level, and (c) identification of SNPs and other
   polymorphisms at the nucleotide/residue level.
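The nucleotide-level comparison in the last item can be illustrated with a minimal sketch. It assumes two sequences that have already been pairwise aligned, with "-" marking gaps. The function name find_snps and the example sequences are hypothetical and are not part of the PATRIC pipeline.

```python
def find_snps(ref_aligned, query_aligned):
    """Return (ref_position, ref_base, query_base) for each single-base mismatch.

    Both inputs are assumed to be rows of the same pairwise alignment,
    so they must have equal length. Positions are 1-based and counted
    on the ungapped reference sequence.
    """
    if len(ref_aligned) != len(query_aligned):
        raise ValueError("aligned sequences must have equal length")

    snps = []
    ref_pos = 0
    for ref_base, query_base in zip(ref_aligned.upper(), query_aligned.upper()):
        if ref_base != "-":
            ref_pos += 1
        # A gap in either row is an insertion/deletion, not a SNP.
        if ref_base == "-" or query_base == "-":
            continue
        if ref_base != query_base:
            snps.append((ref_pos, ref_base, query_base))
    return snps


# Hypothetical aligned pair containing one insertion, one substitution
# and one deletion relative to the reference.
example_snps = find_snps("ATGC-TAGGA", "ATGCATCG-A")
```

Only the substitution is reported here; the two gapped columns would fall under "other polymorphisms" and need separate handling.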
Pathogen Background Information
In addition to genome curation, PATRIC's curation team will generate
detailed background information on pathogens using the XML-based
Pathogen Information Markup Language (PIML), as described by He
et al. [4] PIML allows for a portable, system-independent,
machine-parseable, and human-readable representation of general
information for any pathogen. This body of pathogen information can be
queried and displayed graphically from a web service
(http://staff.vbi.vt.edu/pathport/services/wsdls/piml.wsdl) in
ToolBus/PathPort [5], which provides a custom graphical visualization
module for PIML documents and interoperability with other types of
infectious disease data (e.g. genomic).
A web-based query and display system was also developed to query
either the complete pathogen information or a specific topic across
multiple pathogens (http://www.vbi.vt.edu/pathport/pathinfo/query.html).
These documents are expected to provide the scientific community with
up-to-date information on various aspects of the pathogens, along
with the corresponding bibliography.
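To make the idea of a machine-parseable pathogen document concrete, the following is a minimal sketch of querying one topic's bibliography from an XML document. The element and attribute names (pathogen, topic, title, text, reference) and the sample content are hypothetical placeholders chosen for illustration; they are not taken from the actual PIML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical PIML-like document; the real PIML schema may differ.
SAMPLE_DOCUMENT = """
<pathogen name="Example pathogen">
  <topic title="Taxonomy">
    <text>Placeholder text.</text>
    <reference>Ref A</reference>
  </topic>
  <topic title="Transmission">
    <text>Placeholder text.</text>
    <reference>Ref B</reference>
  </topic>
</pathogen>
"""


def topics_with_references(xml_text):
    """Map each topic title to the list of references cited under it."""
    root = ET.fromstring(xml_text.strip())
    return {
        topic.get("title"): [ref.text for ref in topic.findall("reference")]
        for topic in root.findall("topic")
    }


bibliography_by_topic = topics_with_references(SAMPLE_DOCUMENT)
```

Running the same function over several such documents and selecting a single title would approximate a query on one topic across multiple pathogens.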
Schema for Curation of Biological Data
A significant challenge in designing PATRIC is the storage and
retrieval of diverse kinds of biological data. Our computational
infrastructure must be robust, scalable and flexible to permit the
integration of data on different organisms from multiple sources and allow