possible, this should be done computationally to ensure synchronization of the data
with the source databases. This is not always possible, especially when information is
extracted from the literature. The second step would involve curators who manually
validate the data and resolve any conflicts that arise. The main advantage of this approach
is that the majority of the work can be done computationally, thereby allowing the
curators to focus on the validation. This strategy reduces the overall amount of effort
while maintaining a high level of quality.
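The division of labor described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `merge_sources`, the dictionary-based record layout, and the field names are hypothetical and do not come from any particular allergen database. Fields on which two sources agree are merged automatically, and only the disagreements are passed to a curator.

```python
def merge_sources(primary, secondary):
    """Merge two field -> value dicts describing the same allergen.

    Hypothetical helper for illustration only. Fields that agree, or that
    only one source supplies, are accepted automatically into `merged`.
    Fields whose values differ are left out of `merged` and returned in
    `conflicts` so that a curator can resolve them by hand.
    """
    merged, conflicts = {}, {}
    for field in sorted(set(primary) | set(secondary)):
        a, b = primary.get(field), secondary.get(field)
        if a is None or b is None or a == b:
            merged[field] = a if a is not None else b
        else:
            conflicts[field] = (a, b)
    return merged, conflicts


# Placeholder values, not real database content.
merged, conflicts = merge_sources(
    {"name": "Example allergen", "source": "species A"},
    {"name": "Example allergen", "source": "species B", "family": "F"},
)
```

In this sketch the curator would see only the `source` field, since the two inputs disagree on it; everything else is handled computationally.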
Access to the allergen databases is another required feature. Different users
have different types of access requirements and the allergen databases should aim
to satisfy all of these needs. Wet lab biologists generally require Web
interface access to the individual records. Relevant search facilities are
required so that records of interest can be located quickly. The records should be
presented in a manner that is easy to interpret. Suitable data visualization methods
should be used for data that is difficult to represent textually. An example would
be 3D protein structure information, which is usually presented in a protein
structure viewer.
Bioinformaticians studying allergens would need a different type of database
access. Bioinformatics analysis typically requires large sets of data rather than
individual records in order to extract meaningful results. Moreover, the
information contained in these records must be in a computer-readable form.
Therefore, the format of the records is far more important to the bioinformatician
than to the wet lab biologist. At the very least, the records have to be presented
in some structured form. A structured record would allow for the efficient
parsing of the information into a computer-readable form for further
computational analysis. The Extensible Markup Language (XML) is ideal for this
purpose because most biological data can be represented in it without
difficulty. Furthermore, the provision of an XML schema would permit rapid parsing
and validation of the records. For efficient linking of database records to other
resources, access to the individual records in the database should also be
available as hyperlinks.
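As a minimal sketch of what structured records make possible, the following Python example parses a small XML document into dictionaries and checks each record for required fields. The element names, the `REQUIRED_FIELDS` tuple, and the sample values are assumptions made for illustration and do not reflect the format of any existing allergen database. The required-field check is only a simple stand-in for validation against a real XML schema, which the Python standard library does not provide and which would typically need a third-party library.

```python
import xml.etree.ElementTree as ET

# Hypothetical record layout with placeholder values.
SAMPLE_XML = """\
<allergens>
  <allergen id="A0001">
    <name>Example allergen 1</name>
    <source>Example species 1</source>
    <sequence>MKTAYIAK</sequence>
  </allergen>
  <allergen id="A0002">
    <name>Example allergen 2</name>
    <source>Example species 2</source>
  </allergen>
</allergens>
"""

# Assumed set of mandatory fields; a real schema would define this.
REQUIRED_FIELDS = ("name", "source", "sequence")


def parse_records(xml_text):
    """Turn every <allergen> element into a dict of its child elements."""
    root = ET.fromstring(xml_text)
    records = []
    for node in root.iter("allergen"):
        record = {"id": node.get("id")}
        for child in node:
            record[child.tag] = (child.text or "").strip()
        records.append(record)
    return records


def missing_fields(record):
    """List the required fields that are absent or empty in a record."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]


records = parse_records(SAMPLE_XML)
incomplete = {r["id"]: missing_fields(r) for r in records if missing_fields(r)}
```

Here `incomplete` collects the identifiers of records that fail the check, so a whole data set can be screened in one pass before further analysis.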
An allergen database should also provide analysis tools that capitalize on its
underlying data to serve the research community. The reasons for
this have been discussed in the previous section.
5.2.3 Existing Allergen Databases
An excellent review of existing databases was published in 2003 (Brusic, Millot,
Petrovsky, Gendel, Gigonzac, and Stelman 2003). Here, we exclude the databases
covered in that review, except for the IUIS and Swiss-Prot lists of allergens, and
highlight recent additions to the growing list of allergen databases (Table 1).
Most of the databases covered in the review article lack one or more
desired features of an allergen database. Only a few databases provide bioinformatics
tools and permit the downloading of data. Furthermore, many of the databases
described are not actively maintained and lag behind in recording new allergens or
changes to existing allergen information.