to get information about the famous jazzman Miles Davis, we have: subject keyword := “Davis” and
domain keyword := “music.” We want to retrieve the pages that are interesting from the user's
perspective, without considering those related to the Davis Cup in tennis, which pertain to the sport
domain. Our system must be able to retrieve and rank results, taking into account the Semantics of the
pages and the interaction with the user. In other words, the system performs the following tasks:
Fetching: Fetching consists of searching for Web documents that contain the keywords specified in
the query. This task can be accomplished using traditional search engines.
Preprocessing: This task is needed to remove from Web documents all those elements that do not
represent useful information (HTML tags, scripts, applets, etc.).
Mining: Mining consists of analyzing the content of the documents from a Semantic point of view,
assigning them a score with respect to the query.
Reporting: This task consists of ranking and returning the documents relevant to the query, with
some functionality for relevance feedback.
We use external search engines in the fetching step.
The system implementation is based on several services. In this context, each software module
performs one of the actions previously described.
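As a minimal sketch of how the four tasks could be chained, with one function per task, consider the following Python example. It is an illustration under stated assumptions, not the implementation described here: the names Query, fetch, preprocess, mine, report and search are hypothetical, the external search engine is replaced by an in-memory dictionary of pages, and the Semantic analysis is replaced by a toy count of words drawn from a small, invented per-domain vocabulary (DOMAIN_TERMS). Relevance feedback is left out.

```python
import re
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass
class Query:
    subject: str  # subject keyword, e.g. "Davis"
    domain: str   # domain keyword, e.g. "music"


# Assumption: a tiny hand-made vocabulary per domain, for illustration only.
DOMAIN_TERMS = {
    "music": {"jazz", "trumpet", "album"},
    "sport": {"tennis", "cup", "match"},
}


class _TextExtractor(HTMLParser):
    """Collects text content, skipping what sits inside script/style/applet."""

    SKIPPED = {"script", "style", "applet"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def fetch(query, corpus):
    """Fetching: stand-in for an external search engine.

    Keeps the pages whose raw content mentions the subject keyword.
    """
    needle = query.subject.lower()
    return {url: html for url, html in corpus.items() if needle in html.lower()}


def preprocess(html):
    """Preprocessing: drop tags and scripts, normalize whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def mine(text, query):
    """Mining: toy score, the number of words found in the domain vocabulary."""
    vocabulary = DOMAIN_TERMS.get(query.domain, set())
    words = re.findall(r"[a-z]+", text.lower())
    return sum(1 for word in words if word in vocabulary)


def report(scored):
    """Reporting: rank (url, score) pairs by descending score."""
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def search(query, corpus):
    """Runs the four tasks in order and returns the ranked results."""
    pages = fetch(query, corpus)
    scored = [(url, mine(preprocess(html), query)) for url, html in pages.items()]
    return report(scored)


# Hypothetical pages standing in for search-engine results.
corpus = {
    "page-a": "<html><body><p>Miles Davis played jazz trumpet on this album</p></body></html>",
    "page-b": "<html><script>var jazz = 1;</script><body>Davis Cup tennis match</body></html>",
    "page-c": "<p>Unrelated page about cooking</p>",
}

results = search(Query(subject="Davis", domain="music"), corpus)
print(results)
```

With the query subject “Davis” and domain “music,” the page about the jazzman is ranked above the Davis Cup page, and the page that does not mention “Davis” is never fetched. Keeping each task in its own function mirrors the one-module-per-action organization, so any stage (for example, the toy scoring in mine) could be replaced without touching the others.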
Figure 1 presents a complete architectural view of the proposed system.
Figure 1. System architecture