information about partners, schemas, constraints, mappings, answers). Using the query interface (QI), a
user formulates a query. The query execution module (QE) controls the process of query reformulation,
query propagation to partners, merging of partial answers, discovering missing values, and returning
partial answers (Brzykcy et al., 2008; Pankowski, 2008b; Pankowski, 2008c; Pankowski et al., 2007;
Pankowski, 2008d). Communication between peers (QAP) is realized by means of Web Services
technology.
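The control flow handled by the QE module can be sketched as follows. This is only an illustrative sketch, not the 6P2P implementation: answers are modeled as flat Python records rather than XML, partners are plain callables rather than Web Services, and the names execute_query, reformulate, merge and concat_merge are hypothetical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]   # assumption: a flat record stands in for an XML answer
Answer = List[Record]


def execute_query(
    query: str,
    partners: Dict[str, Callable[[str], Answer]],
    reformulate: Callable[[str, str], str],
    merge: Callable[[List[Answer]], Answer],
) -> Answer:
    """Reformulate the query for each partner, propagate it, merge the results."""
    partial_answers: List[Answer] = []
    for partner_name, ask_partner in partners.items():
        # Rewrite the query so that it fits the partner's schema.
        partner_query = reformulate(query, partner_name)
        # Propagate the query and collect the partner's partial answer.
        partial_answers.append(ask_partner(partner_query))
    return merge(partial_answers)


def concat_merge(partials: List[Answer]) -> Answer:
    """Hypothetical merge step: concatenate and drop exact duplicates."""
    merged: Answer = []
    for partial in partials:
        for record in partial:
            if record not in merged:
                merged.append(record)
    return merged


# Two stand-in partners that ignore the query text and return fixed answers.
partners = {
    "peerB": lambda q: [{"title": "XML", "year": "2005"}],
    "peerC": lambda q: [{"title": "XML", "year": "2005"},
                        {"title": "P2P", "year": "2007"}],
}
# The identity function is used as a placeholder for query reformulation.
result = execute_query("title,year", partners, lambda q, name: q, concat_merge)
```

Passing the reformulation and merge steps in as functions keeps the orchestration separate from the mapping-specific and constraint-specific logic.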
6P2P Modeling Concepts
Basic notions constituting the 6P2P data model are: peers, data sources, schemas, constraints, mappings,
queries, answers, and propagation.
1. A peer, @P, is identified by its URL, which also identifies the Web Service representing the
peer. A peer exports two methods: sendAnswer, used by peers to send to @P the
answer to a query received from @P, and propagateQuery, used by peers to send to @P a query
to be answered (possibly with further propagation).
2. A data source is an XML document or an XML view over centralized or distributed data. Different
techniques can be used to implement such a view: it can be built from AXML documents (Abiteboul et al.,
2002; Milo et al., 2005), or it can be a gateway to another 6P2P system or even to different data integration
systems. Thus, a community of various collaborating information integration engines can be created.
3. A schema is used to specify structural properties of the data source and also the structure of the
intermediate answers to queries. In 6P2P, schemas are defined as tree-pattern formulas discussed
in previous sections.
4. Constraints deliver additional knowledge about the data. Two kinds of constraints are taken into
consideration: tree-pattern XML functional dependencies (TP-XFDs) and XML keys (Arenas &
Libkin, 2004; Buneman et al., 2003; Pankowski, 2008c). TP-XFDs are used to control query
propagation and answer merging (especially to discover some missing data), and keys are used for
eliminating duplicates and for appropriate nesting of elements. In this chapter we restrict ourselves to
TP-XFDs.
5. Mappings specify how data structured under a source schema is to be transformed into data
conforming to a target schema (Fagin et al., 2004; Pankowski, 2008c). Information provided by mappings
is also used for query reformulation. In previous sections we presented algorithms translating
high-level specifications of mappings, queries and constraints into XQuery programs.
6. A query issued against a peer can be split into many query threads, one query thread for each trace
incoming to the peer (corresponding to one propagation). Partial answers to all query threads are
merged to produce the answer to the query. A peer can receive many threads of the same query.
7. An answer is the result of query evaluation. There are partial and final answers. A partial answer is
an answer delivered by a partner to which the query was propagated. All partial answers are merged
and transformed to obtain the final answer. In some cases (when the peer decides to discover
missing values), the peer's whole data source may be involved in the merging process. In
(Pankowski, 2008b) we discuss a method of dealing with hard inconsistent data, i.e., data that is
not null and violates TP-XFDs. The method is based on the
trustworthiness of data sources.
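The interplay of the notions listed above can be illustrated with a small simulation. It is a simplified sketch under several assumptions, not the 6P2P implementation: records are flat dictionaries instead of XML trees, calls are synchronous in-process calls instead of Web Service invocations, an ordinary functional dependency (title determines year) stands in for a TP-XFD, and the query itself is omitted, so every peer simply returns all of its records. The methods propagate_query and send_answer mirror the exported methods propagateQuery and sendAnswer; the class Peer, the function merge_with_fd and the example URLs are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]   # assumption: flat record instead of XML
Trace = Tuple[str, ...]             # URLs of the peers a query thread has visited


def merge_with_fd(records: List[Record], lhs: str, rhs: str) -> List[Record]:
    """Merge records, using the dependency lhs -> rhs to fill missing rhs values.

    Conflicting non-null values (hard inconsistencies) are not handled here:
    the first known value is used.
    """
    known: Dict[Optional[str], Optional[str]] = {}
    for record in records:
        if record.get(rhs) is not None:
            known.setdefault(record[lhs], record[rhs])
    merged: List[Record] = []
    for record in records:
        filled = dict(record)
        if filled.get(rhs) is None:
            filled[rhs] = known.get(filled[lhs])   # discover the missing value
        if filled not in merged:                   # drop exact duplicates
            merged.append(filled)
    return merged


class Peer:
    def __init__(self, url: str, data: List[Record],
                 fd: Tuple[str, str] = ("title", "year")) -> None:
        self.url = url
        self.data = data
        self.fd = fd
        self.partners: List["Peer"] = []
        # Partial answers received so far, grouped by query thread (trace).
        self.partial_answers: Dict[Trace, List[Record]] = {}
        self.threads_seen: List[Trace] = []

    def send_answer(self, trace: Trace, answer: List[Record]) -> None:
        """Called by a partner to deliver a partial answer for one thread."""
        self.partial_answers.setdefault(trace, []).extend(answer)

    def propagate_query(self, trace: Trace, reply_to: "Peer") -> None:
        """Answer one query thread, propagating it to partners not on the trace."""
        my_trace = trace + (self.url,)
        self.threads_seen.append(my_trace)
        for partner in self.partners:
            if partner.url not in my_trace:        # avoid cycles
                partner.propagate_query(my_trace, reply_to=self)
        collected = self.partial_answers.pop(my_trace, [])
        answer = merge_with_fd(self.data + collected, *self.fd)
        reply_to.send_answer(trace, answer)


a = Peer("http://a", [{"title": "XML", "year": None}])
b = Peer("http://b", [{"title": "XML", "year": "2005"}])
c = Peer("http://c", [{"title": "P2P", "year": "2007"}])
a.partners, b.partners, c.partners = [b, c], [c], [a]

client = Peer("http://client", [])    # stands in for the query interface
a.propagate_query((), reply_to=client)
final_answer = client.partial_answers[()]
```

In this run, peer c receives two threads of the same query (one via a and b, one directly from a), and the year missing at peer a is recovered from the partial answer delivered by b.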