and produce work from the data allowing modifications and transformations. Due to the
presence of this specific license, the flight search engine is able to re-use this dataset to
pull geo-spatial information and feed it to the search engine.
Intrinsic Dimensions. Intrinsic dimensions are those that are independent of the user's
context. These dimensions focus on whether information correctly represents the real
world and whether information is logically consistent in itself. There are six dimensions
that are part of this group, namely, accuracy, objectivity, validity-of-documents,
interlinking, consistency and conciseness.
Definition 16 (Accuracy). Accuracy can be defined as the extent to which data is correct,
that is, the degree to which it correctly represents real-world facts and is also
free of error. In particular, we associate accuracy mainly with semantic accuracy, which
relates to the correctness of a value with respect to the actual real-world value, that is,
accuracy of the meaning.
Metrics. Accuracy can be measured by checking the correctness of the data in a data
source, that is, by detecting outliers or by identifying semantically incorrect values
through the violation of functional dependency rules. Accuracy is one of the dimensions
that is affected by assuming a closed or open world. When assuming an open
world, it is more challenging to assess accuracy, since more logical constraints need to
be specified for inferring logical contradictions.
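As an illustration of the functional dependency check mentioned above, the following is a minimal sketch rather than a metric taken from the text. It assumes records are plain dictionaries and that an airport code should determine exactly one country. The function name fd_violations, the field names and the sample records are hypothetical, and the second record is deliberately wrong.

```python
from collections import defaultdict


def fd_violations(rows, lhs, rhs):
    """Return every lhs value that maps to more than one rhs value.

    A functional dependency lhs -> rhs holds only if each lhs value is
    associated with a single rhs value, so any key returned here marks
    at least one semantically incorrect record.
    """
    seen = defaultdict(set)
    for row in rows:
        seen[row[lhs]].add(row[rhs])
    return {key: values for key, values in seen.items() if len(values) > 1}


# Hypothetical sample data; the second record carries a wrong country.
airports = [
    {"iata": "CDG", "city": "Paris", "country": "France"},
    {"iata": "CDG", "city": "Paris", "country": "United States"},
    {"iata": "JFK", "city": "New York", "country": "United States"},
]

violations = fd_violations(airports, "iata", "country")
print(violations)
```

The check only reports that the records for a key disagree. Deciding which of the conflicting values is the correct one still requires a trusted reference or manual inspection.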
Example. In the use case, let us suppose that a user is looking for flights between Paris
and New York. Instead of returning flights starting from Paris, France, the search returns
flights between Paris in Texas and New York. This kind of semantic inaccuracy in terms
of labelling as well as classification can lead to erroneous results.
Definition 17 (Objectivity). Objectivity is defined as the degree to which the interpre-
tation and usage of data is unbiased, unprejudiced and impartial. This dimension highly
depends on the type of information and therefore is classified as a subjective dimension.
Metrics. Objectivity cannot be measured qualitatively, but only indirectly, by checking the
authenticity of the source responsible for the information and whether the dataset is neutral
or the publisher has a personal influence on the data provided. Additionally, it can be
measured by checking whether independent sources can confirm a single fact.
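The last check, whether independent sources confirm a single fact, can be sketched as a simple agreement ratio. This is only one possible reading, under the assumption that each source asserts exactly one value for the fact in question. The function name confirmation_ratio, the source names and the ratings are hypothetical.

```python
from collections import Counter


def confirmation_ratio(claims):
    """Return the most frequently asserted value and its share of sources.

    claims is a list of (source, value) pairs. Each source is counted
    once; if a source appears several times, its last value is used.
    """
    by_source = dict(claims)
    counts = Counter(by_source.values())
    value, supporters = counts.most_common(1)[0]
    return value, supporters / len(by_source)


# Hypothetical safety ratings for one airline from four review sites.
claims = [
    ("site_a", "4 stars"),
    ("site_b", "4 stars"),
    ("site_c", "2 stars"),
    ("site_d", "4 stars"),
]

result = confirmation_ratio(claims)
print(result)
```

A high ratio suggests agreement only if the sources really are independent. Sites that copy from one another would inflate it without adding any confirmation.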
Example. In the example flight search engine, consider the reviews available for each
airline regarding safety, comfort and prices. It may happen that an airline belonging
to a particular alliance is ranked higher than others when in reality it is not so. This could
be an indication of a bias where the review is falsified due to the provider's preference
or intentions. This kind of bias or partiality affects the user, as she might be provided
with incorrect information from expensive flights or from malicious websites.
Definition 18 (Validity-of-documents). Validity-of-documents refers to the valid us-
age of the underlying vocabularies and the valid syntax of the documents (syntactic
accuracy).
Metrics. A syntax validator can be employed to assess the validity of a document, i.e. its
syntactic correctness. The syntactic accuracy of entities can be measured by detecting