healthcare costs involved in the detection and treatment of allergies. Thirty-nine million
people in the United States suffer from allergic rhinitis, but only 12.3% seek medical
attention. Even so, this has led to $1.23 billion in healthcare costs (Malone et al.
1997). The percentage of people who seek medical attention is likely to rise, leading to
even higher healthcare costs.
The problem is further compounded by the fact that, in addition to natural
sources of allergens such as house dust mites and pollen, the introduction of
recombinant proteins, made possible by molecular genetics, into food, medicine, and
other products is increasing the number of potential allergens in our environment.
The allergenicity of these new recombinant proteins is unknown, which makes the
safety of products containing them a paramount concern. In addition to recombinant proteins,
hidden allergens are also found in unexpected sources that people typically do not
guard against. For example, milk proteins in processed food are a source of hidden
potential allergens that most people would not suspect (Cantani 1999).
In view of these safety concerns, the FAO (Food and Agriculture
Organization) and the WHO (World Health Organization) have jointly produced a
procedure for evaluating the potential allergenicity of any novel protein (FAO/WHO
2001; FAO/WHO 2003). This scheme uses bioinformatics as an initial
step to determine whether the protein in question has any allergenic potential. This
is accomplished by determining whether the primary sequence of the novel protein
bears significant sequence similarity to a known allergen. Similarity is considered
significant if it is greater than 35% over a window of 80 amino acids, or if the
protein shares a contiguous stretch of 6 to 8 identical amino acids with any known allergen.
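The two criteria above can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the FAO/WHO procedure itself: windows are compared position by position without gaps, and the fraction of identical residues stands in for the similarity score, whereas a real screen would rely on a sequence alignment program and a curated allergen database. All function names and default parameters here are hypothetical.

```python
def shares_identical_stretch(query: str, allergen: str, k: int = 6) -> bool:
    """Return True if any k contiguous residues of query occur verbatim in allergen."""
    if len(query) < k or len(allergen) < k:
        return False
    allergen_kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return any(query[i:i + k] in allergen_kmers
               for i in range(len(query) - k + 1))


def best_window_identity(query: str, allergen: str, window: int = 80) -> float:
    """Highest fraction of identical positions between any query window and any
    equally long allergen segment (ungapped; an assumption of this sketch)."""
    best = 0.0
    for i in range(len(query) - window + 1):
        q = query[i:i + window]
        for j in range(len(allergen) - window + 1):
            a = allergen[j:j + window]
            matches = sum(1 for x, y in zip(q, a) if x == y)
            best = max(best, matches / window)
    return best  # stays 0.0 when either sequence is shorter than the window


def flag_potential_allergen(query: str, allergen: str, window: int = 80,
                            threshold: float = 0.35, k: int = 6) -> bool:
    """Flag the query if either criterion is met against one known allergen."""
    return (shares_identical_stretch(query, allergen, k)
            or best_window_identity(query, allergen, window) > threshold)


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not real allergens.
    known = "ACDEFGHIKLMNPQRSTVWY" * 4
    novel = "M" + known[:40] + "G" * 40
    print(flag_potential_allergen(novel, known))
```

Setting `k` to 8 gives the stricter end of the 6 to 8 residue range. In practice the check would be repeated against every entry in an allergen database, and a flagged protein would go on to the later steps of the assessment rather than being declared allergenic outright.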
The increasing importance of allergy has also fueled extensive research in this
field and generated large amounts of data. This has been reflected by the number of
research articles appearing in the literature. Data from PubMed indicate that
the number of allergen articles published per year doubled between 1993 and 2003,
reaching 1023 in 2003. The rapid growth of sequence information in major public
databases like GenBank (Benson, Karsch-Mizrachi, Lipman, Ostell and Wheeler
2003) and Swiss-Prot (O'Donovan, Martin, Gattiker, Gasteiger, Bairoch, and
Apweiler 2002) has also contributed significant amounts of allergen-related sequence
information. The number of allergen 3D structures is also growing, although
not as spectacularly as the sequence databases.
Traditionally, bioinformatics applications have been used in the analysis of
individual allergens (Izumi, Sugiyama, Matsuda, and Nakamura 1999; Mills,
Hart, Lynch, Thomas, and Smith 1999; Ichikawa, Vailes, Pomes, and Chapman
2001; Iyer, Koonin, and Aravind 2001). In this respect, bioinformatics
applications like sequence similarity searches (Mills et al. 1999; Ichikawa et al.
2001), protein structure comparison (Iyer et al. 2001), sequence profile searches
(Iyer et al. 2001), multiple sequence alignments (Iyer et al. 2001), secondary
structure prediction (Izumi et al. 1999), protein sequence analysis (Izumi et al.
1999), and homology modeling (Ichikawa et al. 2001) have greatly aided the
study of allergens by providing further insights into how allergens work. These
methods are applied much as they are in other fields, so we will not describe
them in detail. Instead, we will discuss the various specific issues
involved in the management of allergen data as well as some specific recent