has been experimentally tested on three different
domains (microbial risk in food, chemical risk
in food and aeronautics): three OWL ontologies
have been created within a couple of hours thanks
to preexisting information retrieved from local
databases and a very simple tool which automatically
translates CSV files containing the metadata
into an OWL ontology; second, the structure of
data tables is highly variable (even tables in the
same paper don't have the same structure) and
terms appear in tables with no linguistic context,
which invalidates the annotation techniques that
learn wrappers based on structure and/or textual
context such as Lixto (Baumgartner & al., 2001)
or BWI (Freitag & Kushmerick, 2000). Our approach
can be compared to the construction of
frames from tables described in Pivk & al. (2004)
but they use a generic ontology and create new
relations according to the table signature, whereas
we want to recognize predefined relations in an
ontology specific to the target domain.
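The CSV-to-OWL translation mentioned above can be pictured with a minimal sketch. It is not the authors' tool: the column names `name` and `parent`, the base IRI, and the function `csv_to_owl_turtle` are all assumptions made for illustration, and the output is written in Turtle syntax using only the Python standard library.

```python
import csv
import io


def csv_to_owl_turtle(csv_text, base_iri="http://example.org/onto#"):
    """Translate CSV rows into OWL class declarations in Turtle syntax.

    Assumed (hypothetical) layout: a 'name' column holding the term and a
    'parent' column holding the more general term, empty for root terms.
    """
    lines = [
        f"@prefix : <{base_iri}> .",
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
        "",
    ]
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = (row.get("name") or "").strip()
        parent = (row.get("parent") or "").strip()
        if not label:
            continue  # skip rows without a term
        # Simplification: only spaces are replaced when building local names.
        local_name = label.replace(" ", "_")
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f":{local_name} a owl:Class ;")
        if parent:
            lines.append(f"    rdfs:subClassOf :{parent.replace(' ', '_')} ;")
        lines.append(f'    rdfs:label "{escaped}" .')
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "name,parent\nFood product,\nDairy product,Food product\n"
    print(csv_to_owl_turtle(sample))
```

Each CSV row becomes one class with a label and, when a parent is given, a subclass link, which is enough to bootstrap a small term hierarchy from metadata already stored in a local database.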
In the framework of XML database flexible
querying, different approaches have been proposed
to extend either XPATH or SPARQL. Campi & al.
(2006) propose FUZZYXPATH, a fuzzy extension
of XPATH to query XML documents. The extensions
are of two kinds: (i) the 'deep-similar' function
permits a relaxed comparison in terms of structure
between the query tree and the data tree; (ii) the
'close' and 'similar' predicates extend the equality
comparison to a similarity comparison between
the content of a node and a given value expressed
in the query. Hutardo & al. (2006) propose an
extension of the SPARQL 'Optional' clause (called
Relax). This clause makes it possible to compute a set of
generalizations of the RDF triples involved in
the SPARQL query, using in particular the declarations
made in the RDF Schema. Corby & al. (2004)
also propose the same kind of extension of the
SPARQL query using a distance function applied
to the classes and properties of the RDF Schema.
The originality of our approach to flexible querying
is that we propose a complete and integrated
solution which makes it possible (1) to annotate data tables
with the vocabulary defined in an OWL ontology, and
(2) to execute a flexible query on the annotated
tables using the same vocabulary and taking into
account the pertinence degrees generated by the
annotation system.
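The schema-based relaxation discussed above for SPARQL can be illustrated with a small sketch. It does not reproduce any of the cited systems: the class hierarchy `SUBCLASS_OF`, the sample instances, and the function names are hypothetical, and the number of generalization steps simply stands in for a distance over the RDF Schema.

```python
# Hypothetical RDFS hierarchy: each class maps to its direct superclass.
SUBCLASS_OF = {
    "Cheese": "DairyProduct",
    "Milk": "DairyProduct",
    "DairyProduct": "FoodProduct",
    "Beef": "MeatProduct",
    "MeatProduct": "FoodProduct",
}


def ancestors(cls):
    """Yield (class, distance) pairs, starting with cls itself at distance 0."""
    distance = 0
    while cls is not None:
        yield cls, distance
        cls = SUBCLASS_OF.get(cls)
        distance += 1


def is_subclass(cls, target):
    """True if cls equals target or is a (transitive) subclass of it."""
    return any(ancestor == target for ancestor, _ in ancestors(cls))


def relaxed_query(query_class, instances):
    """Return (instance, relaxation level) pairs, exact matches first.

    The query class is generalized one step at a time; an instance is
    reported at the first level where its class falls under the
    generalized query class.
    """
    levels = {}
    for general, level in ancestors(query_class):
        for name, cls in instances:
            if name not in levels and is_subclass(cls, general):
                levels[name] = level
    return sorted(levels.items(), key=lambda item: item[1])


if __name__ == "__main__":
    data = [("brie", "Cheese"), ("whole milk", "Milk"), ("steak", "Beef")]
    for name, level in relaxed_query("Cheese", data):
        print(name, level)
```

A query for `Cheese` returns the exact match first, then instances reached by generalizing to `DairyProduct` and `FoodProduct`, so the relaxation level can serve as a ranking criterion.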
Finally, the ontology alignment problem has
been widely investigated in the literature (Castano
& al., 2007; Euzenat & Shvaiko, 2007; Kalfoglou
& Schorlemmer, 2003; Noy, 2004). The originality
of our approach is to treat that problem as a rule application
problem where a source ontology, considered as a
fact base, is aligned with a target one, considered
as a rule base.
FUTURE RESEARCH DIRECTIONS
The domain ontology is the central element of our
data integration system. In the future, we want
to carry on our work on ontology-based data
integration.
First, we intend to enhance the performance
of the annotation system using machine learning
techniques (Doan & al., 2003) on the knowledge
of the ontology but without manual training on a
subset of the corpus. For example, a new classifier
for symbolic types can be added to the existing one
and trained using the domain of values associated
with the symbolic type in the ontology. Second,
we want to integrate the user's opinion on the
query result in order to improve the underlying
semantic annotation process and consequently
to enrich the ontology. Third, since our flexible
querying system allows the user to uniformly query
several sources indexed by the same ontology, we
want to extend it so that it can
query several sources relying on distinct ontologies
which have been previously aligned. Fourth,
one important feature which must be added to @Web
is the ability to detect that data included in
tables retrieved from different documents on the
Web are redundant. We want to use reference
reconciliation methods (Sais & al., 2007) to deal
with this problem.
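The first direction above, a classifier for symbolic types trained only on the domain of values found in the ontology, could look roughly like the following sketch. It is an assumption-laden stand-in, not the planned system: a nearest-neighbour match on character trigrams replaces whatever learning technique would actually be used, and the class `SymbolicTypeClassifier`, the threshold value, and the sample domains are invented for illustration.

```python
def trigrams(text):
    """Return the set of character trigrams of a normalized, padded string."""
    padded = f"  {text.lower().strip()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class SymbolicTypeClassifier:
    """Nearest-neighbour classifier whose only training data is the ontology.

    `domains` maps each symbolic type to the terms of its domain of values
    (hypothetical structure), so no manually annotated corpus is needed.
    """

    def __init__(self, domains):
        self.examples = [
            (trigrams(term), type_name)
            for type_name, terms in domains.items()
            for term in terms
        ]

    def classify(self, cell_value, threshold=0.3):
        """Return (type or None, score) for the closest ontology term."""
        grams = trigrams(cell_value)
        best_score, best_type = max(
            ((jaccard(grams, g), t) for g, t in self.examples),
            default=(0.0, None),
        )
        if best_score < threshold:
            return None, best_score
        return best_type, best_score


if __name__ == "__main__":
    clf = SymbolicTypeClassifier({
        "Food product": ["whole milk", "minced beef", "smoked salmon"],
        "Microorganism": ["Listeria monocytogenes", "Salmonella"],
    })
    print(clf.classify("Whole milk"))
    print(clf.classify("salmonella spp."))
```

The score returned with each decision could play the same role as a pertinence degree, and values below the threshold are left unclassified rather than forced into a type.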