user-stated properties that are true about the data. The system then
uses these properties in order to insert missing tuples or correct conflict-
ing tuples. In the event of groups of conflicting tuples, a probability of
correctness is assigned to each tuple. Thus, the StreamClean approach
transforms the data into a probabilistic representation, in which explicit
probability values are assigned to tuples. The approach then transforms
constraints on the tuples into constraints on the underlying probability
values. This also allows the possibility of soft constraints, in which a
probability of a fact being correct is specified, rather than a hard
constraint, in which the fact is deterministically known to be correct. The
StreamClean method uses a non-linear optimization method, where the
objective is to determine a probability assignment that maximizes en-
tropy while satisfying the integrity constraints. The intuition behind
maximizing entropy [32] is that in the absence of additional knowledge,
the underlying solutions should be as uniform as possible. For example,
the use of entropy maximization results in the explicit assumption that
in the absence of stated constraints, the probabilities of different input
tuples are independent of each other. While this may not necessarily be
true in all solutions, it is the most reasonable assumption to make in the
absence of prior beliefs about such tuples.
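
The effect of entropy maximization can be illustrated with a minimal
sketch. It is not the StreamClean optimizer, which solves a general
non-linear program over many constraints. It covers only an assumed special
case: one group of mutually exclusive conflicting tuples whose
probabilities must sum to 1 (a hard constraint), optionally with some
tuples pinned to a stated probability (soft constraints). In that special
case the entropy maximizer has a closed form, which spreads the remaining
probability mass uniformly over the unconstrained tuples. The function
names and the example readings are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def max_entropy_assignment(tuples, soft_constraints=None):
    """Maximum-entropy probabilities for one group of conflicting tuples.

    Assumption: exactly one tuple in the group is correct, so the
    probabilities sum to 1. Soft constraints pin individual tuples to a
    stated probability; the remaining mass is shared uniformly among the
    other tuples, which is the entropy maximizer for this special case.
    """
    fixed = dict(soft_constraints or {})
    if any(t not in tuples for t in fixed):
        raise ValueError("soft constraint refers to an unknown tuple")
    unknown = [t for t in tuples if t not in fixed]
    remaining = 1.0 - sum(fixed.values())
    if remaining < -1e-12 or (not unknown and abs(remaining) > 1e-12):
        raise ValueError("constraints are inconsistent")
    share = remaining / len(unknown) if unknown else 0.0
    return {t: fixed.get(t, share) for t in tuples}


# Three conflicting location readings for the same tag (hypothetical data).
readings = ["shelf_A", "shelf_B", "dock"]

# No extra knowledge: every reading is equally likely.
uniform = max_entropy_assignment(readings)

# Soft constraint: the shelf_A reading is correct with probability 0.6.
skewed = max_entropy_assignment(readings, {"shelf_A": 0.6})
```

Without soft constraints each reading receives probability 1/3. With the
soft constraint on shelf_A, the remaining 0.4 is split evenly between the
other two readings, since no stated knowledge favors one over the other.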
It has been observed in [29] that RFID data exhibits a considerable
amount of redundancy because of multiple scans of the same item, even
when it is stationary at a given location. In practice, one needs to
track only interesting movements and activities on the item. This is
an issue which we will discuss in some detail in the next section on
data management and warehousing. RFID tag readings also exhibit a
considerable amount of spatial redundancy because of scans of the same
object from the RFID readers placed in multiple zones. This is primarily
because of the spatial overlap in the range of different sensor readers.
This provides seemingly inconsistent readings because of the inconsistent
(virtual) locations reported by the different sensors scanning the same
object. It has been observed in [15] that the redundancy is both a
blessing and a curse. While the redundancy causes inconsistent readings,
it also provides useful information about the location of an object in
cases where the intended reader fails to perform its function.
In addition, it has been observed in [15] that a considerable amount of
background information is often available, which can be used in order
to enhance accuracy. This background information is as follows:
Prior knowledge about tagged objects and readers can be used in
order to improve accuracy.