Biomedical Engineering Reference
matching capabilities. They automatically search multiple databases using a variety of heuristics and
return results preformatted according to user preferences. Although intelligent agents vary in
capabilities, in general they automatically convert simple keyword searches to advanced
pattern-matching searches and, in some systems, concept searches. Instead of relying on a literal
match for a keyword, intelligent agents increase search resolution by restricting word proximity
and by using Boolean operators to exclude user-specified associations.
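As a rough illustration of proximity restriction combined with Boolean exclusion, the following Python sketch accepts a document only if two terms occur within a given number of words of each other and no excluded term appears. The function names, the `max_gap` parameter, and the sample documents are assumptions made for this example; they do not describe how any particular agent is implemented.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def proximity_match(text, term_a, term_b, max_gap=5, exclude=()):
    """Return True if term_a and term_b occur within max_gap words of each
    other and none of the excluded terms appear anywhere in the text."""
    tokens = tokenize(text)
    # Boolean NOT: reject the document if any excluded term is present.
    if any(term.lower() in tokens for term in exclude):
        return False
    positions_a = [i for i, t in enumerate(tokens) if t == term_a.lower()]
    positions_b = [i for i, t in enumerate(tokens) if t == term_b.lower()]
    # Proximity restriction: some pair of occurrences must be close together.
    return any(abs(i - j) <= max_gap for i in positions_a for j in positions_b)

# Hypothetical documents, invented for illustration.
docs = [
    "Chronic hypertension often leads to renal disease.",
    "Hypertension was discussed early; later sections of this long review "
    "cover unrelated topics such as renal anatomy.",
    "Pulmonary hypertension may accompany renal failure.",
]

hits = [
    d for d in docs
    if proximity_match(d, "hypertension", "renal", max_gap=5, exclude=("pulmonary",))
]
print(hits)
```

Only the first document passes: the second contains both terms but too far apart, and the third is rejected by the exclusion.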
Table 4-2. Search Engine Technologies. Many of the technologies applicable
to general-purpose search engines can be applied to searching
bioinformatics databases.

Search Engine Technology                        Example
General-Purpose Intelligent Agents (Desktop)    Intelliseek, Copernic, Lexibot, WebFerret, SearchPad, WebStorm, and NetAttache
General-Purpose Intelligent Agents (Internet)   Dogpile, Ixquick, MetaCrawler, QbSearch, ProFusion, SurfWax, and Vivisimo
Internal (Intranet) Search Engines              AskMe, Cadenza
General-Purpose Search Engines                  Google, Lycos, Yahoo!, Excite, AltaVista, AllTheWeb, CompletePlanet
Sequence Match (Desktop and Internet)           FASTA, BLAST, and BLAST derivatives
Utilities                                       Connection optimizers, browser extensions, personal firewalls, file-transfer programs, download managers
Bioinformatics Portals                          Entrez, SRS, BioKRIS, PubMed Central, Discovery Space
Interface Tools                                 Natural Language Processing (NLP), query by example, controlled vocabulary

Intelligent agents that support concept searching perform searches based on the concept represented
by the keywords entered by the user. A concept search can be as simple as executing a search on a
synonym list, or as complex as inferring relationships between the keywords entered in the system.
For example, an agent-mediated search on "hypertension" could perform multiple keyword searches
on "hypertension" as well as "high blood pressure." A more sophisticated system could infer
additional search terms, such as co-morbidities of hypertension—specific renal and retinal diseases
resulting from high blood pressure, for example.
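The simplest form of concept search described above, running a search over a synonym list, might be sketched in Python as follows. The `SYNONYMS` table, the function names, and the sample documents are hypothetical; a real agent would draw on a curated thesaurus rather than a hand-written dictionary.

```python
# Hypothetical synonym list, invented for this example.
SYNONYMS = {
    "hypertension": ["high blood pressure"],
}

def expand_query(keyword, synonyms=SYNONYMS):
    """Return the keyword followed by any synonyms listed for it."""
    key = keyword.lower()
    return [key] + synonyms.get(key, [])

def concept_search(keyword, documents, synonyms=SYNONYMS):
    """Return documents containing the keyword or any of its synonyms."""
    terms = expand_query(keyword, synonyms)
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]

documents = [
    "Patient has a history of hypertension.",
    "High blood pressure noted at intake.",
    "No cardiovascular complaints.",
]

results = concept_search("hypertension", documents)
print(results)
```

A single query on "hypertension" retrieves the first two documents. Inferring co-morbidities, as a more sophisticated system would, requires knowledge of relationships between concepts and is beyond this sketch.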
Concept-based searching is especially applicable in instances where the vocabulary may not be
consistent. For example, in a patient's medical record, a clinician might record the patient's complaint
of "chest pain" as "angina." A simple keyword search, whether mediated by an agent or submitted
directly to a search engine, would miss the alternate phrasing.
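The difference can be made concrete with a small Python sketch that compares a literal keyword search with a search over a controlled vocabulary, in which alternate phrasings are mapped to one shared concept identifier. The `CONCEPT_OF` mapping, the identifier, and the sample records are assumptions for illustration only; they follow the text in treating "angina" as an alternate recording of "chest pain" and are not taken from any real clinical vocabulary.

```python
# Hypothetical controlled vocabulary: each phrase maps to a concept identifier.
CONCEPT_OF = {
    "chest pain": "CONCEPT_CHEST_PAIN",
    "angina": "CONCEPT_CHEST_PAIN",
}

def keyword_search(query, records):
    """Literal, case-insensitive substring match."""
    return [r for r in records if query.lower() in r.lower()]

def concepts_in(text):
    """Return the set of concept identifiers whose phrases occur in the text."""
    lowered = text.lower()
    return {cid for phrase, cid in CONCEPT_OF.items() if phrase in lowered}

def search_by_concept(query, records):
    """Match records that share at least one concept with the query."""
    wanted = concepts_in(query)
    return [r for r in records if wanted & concepts_in(r)]

records = [
    "Patient reports chest pain on exertion.",
    "Angina, stable; continue nitrates.",
    "Follow-up for knee injury.",
]

literal = keyword_search("chest pain", records)
by_concept = search_by_concept("chest pain", records)
print(literal)
print(by_concept)
```

The literal search returns only the first record, while the concept-based search also returns the record phrased as "angina".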
Advanced pattern search techniques don't necessarily involve concepts or recognizable keywords.
Nucleotide sequence searches use advanced techniques to identify incomplete or approximate
sequence matches. At this point in the development of molecular biology databases, higher-level
concept searches are still rare. However, researchers are quickly moving to provide the capability of
searching a database with a term such as "obesity" and viewing not only the physiological and
psychological components of obesity, but related protein structures and nucleotide sequences as well.
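To give a feel for approximate sequence matching, the Python sketch below slides a query along a subject sequence and reports every window that differs from the query in at most a given number of positions. This is a deliberately simple substitution-only comparison chosen for illustration; it handles neither insertions nor deletions and is not the heuristic used by FASTA or BLAST. The function name, the `max_mismatches` parameter, and the sequences are assumptions.

```python
def approximate_matches(query, subject, max_mismatches=1):
    """Return (start, window, mismatches) for every window of the subject
    that differs from the query in at most max_mismatches positions."""
    query, subject = query.upper(), subject.upper()
    hits = []
    for start in range(len(subject) - len(query) + 1):
        window = subject[start:start + len(query)]
        # Count positions where the bases differ (substitutions only).
        mismatches = sum(1 for q, s in zip(query, window) if q != s)
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits

# Hypothetical sequences, invented for illustration.
subject = "ACGTTGCAACGTAGCA"
hits = approximate_matches("ACGTAG", subject, max_mismatches=1)
print(hits)  # [(0, 'ACGTTG', 1), (8, 'ACGTAG', 0)]
```

The query matches exactly at position 8 and with one substitution at position 0, a match that a literal string search would not report.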
Portals