related multi-dimensional time-series data represented by sensory data
streams. Correlations within and across sensor data streams and the
spatio-temporal context of data offer new opportunities for privacy at-
tacks. The challenge is to perturb a user's sequence of data values such
that (i) the individual data items and their trend (i.e., their changes with
time) cannot be estimated without large error, whereas (ii) the distri-
bution of the data aggregation results at any point in time is estimated
with high accuracy. For instance, in a health-and-fitness social sensing
application, it may be desired to find the average weight loss trend of
those on a particular diet or exercise routine as well as the distribution
of weight loss as a function of time on the diet. This is to be accom-
plished without being able to reconstruct any individual's weight and
weight trend without significant error.
Examples of data perturbation techniques can be found in [14, 13, 59]. The general idea is to add random noise with a known distribution to the user's data, after which a reconstruction algorithm is used to estimate the distribution of the original data.
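As a minimal sketch of this general idea, consider independent zero-mean Gaussian noise with a publicly known variance: each user perturbs a value locally, and an aggregator reconstructs the mean and variance of the original distribution by subtracting the known noise moments. The data, names, and parameters below are illustrative assumptions, not taken from the cited works.

    import numpy as np

    rng = np.random.default_rng(0)

    # Original private values, e.g., the weights of 10,000 users (synthetic).
    true_values = rng.normal(loc=80.0, scale=12.0, size=10_000)

    # Each user adds independent noise drawn from a publicly known
    # distribution: here, zero-mean Gaussian with standard deviation 5.
    noise_std = 5.0
    perturbed = true_values + rng.normal(0.0, noise_std, size=true_values.size)

    # Reconstruction of aggregate statistics: the noise is zero-mean, so the
    # sample mean is unbiased; the known noise variance is subtracted to
    # estimate the variance of the original distribution.
    est_mean = perturbed.mean()
    est_var = perturbed.var() - noise_std**2

    print(f"true mean {true_values.mean():.2f}, estimated {est_mean:.2f}")
    print(f"true var  {true_values.var():.2f}, estimated {est_var:.2f}")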
Early approaches relied on adding independent random noise. These approaches were shown to be inadequate: for example, a technique based on random matrix theory was proposed in [95] that recovers the original user data with high accuracy. Later approaches considered hiding individual data values collected from different private parties, taking into account that data from different individuals may be correlated [86]. However, these approaches make no assumptions about the model describing how a given party's data values evolve over time, and such a model can be exploited to jeopardize the privacy of data streams. Perturbation techniques must therefore explicitly consider the data evolution model to prevent attacks that extract regularities from correlated data, such as spectral filtering [95] and Principal Component Analysis (PCA) [86].
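The toy experiment below illustrates why independent noise is inadequate for correlated streams, in the spirit of these filtering attacks. The low-rank synthetic data and all parameters are assumptions made purely for illustration.

    import numpy as np

    rng = np.random.default_rng(1)

    # Correlated streams for 200 users over 100 time steps: each user's
    # series is a mixture of a few shared latent trends (rank 3).
    n_users, n_steps, rank = 200, 100, 3
    latent = rng.normal(size=(rank, n_steps))
    mixing = rng.normal(size=(n_users, rank))
    data = mixing @ latent                       # low-rank "true" data

    # Independent per-entry noise, as in the early perturbation schemes.
    noisy = data + rng.normal(0.0, 1.0, size=data.shape)

    # PCA-style attack: keep only the top principal components, which carry
    # the shared structure while most of the independent noise is discarded.
    u, s, vt = np.linalg.svd(noisy, full_matrices=False)
    recovered = (u[:, :rank] * s[:rank]) @ vt[:rank]

    print(f"mean error of raw noisy data:   {np.abs(noisy - data).mean():.3f}")
    print(f"mean error after PCA filtering: {np.abs(recovered - data).mean():.3f}")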
In addition to data perturbation, numerous group-based anonymization methods have been proposed, such as k-anonymity and ℓ-diversity [9]. In k-anonymity methods, the data features are perturbed so that adversarial attacks always retain an ambiguity spanning at least k different participants. In ℓ-diversity, criteria are imposed on a group to ensure that the values of the sensitive attributes are sufficiently diverse within the group. This is motivated by the observation that k-anonymity may fail to protect individual sensitive values when all sensitive values within an anonymized group are the same.
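The sketch below checks one anonymized group against both criteria, using the simple distinct-values form of ℓ-diversity (one of several formulations in the literature); the records are hypothetical.

    def check_group(group, k, l):
        """Check one anonymized group of (quasi_identifier, sensitive_value)
        records for k-anonymity (at least k records in the group) and
        distinct-values l-diversity (at least l distinct sensitive values)."""
        k_ok = len(group) >= k
        l_ok = len({sensitive for _, sensitive in group}) >= l
        return k_ok, l_ok

    # A group that is 3-anonymous yet fails 2-diversity: every record shares
    # the same sensitive value, so the grouping hides identities but still
    # reveals the condition of each member.
    group = [(("ZIP 537**", "age 30-40"), "diabetes"),
             (("ZIP 537**", "age 30-40"), "diabetes"),
             (("ZIP 537**", "age 30-40"), "diabetes")]
    print(check_group(group, k=3, l=2))  # -> (True, False)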
In work discussed earlier in this chapter [61], it was shown that privacy
of time-series data can be preserved if the noise used to perturb the
data is itself generated from a process that approximately models the
measured phenomenon. For instance, in the weight watchers example,
we may have an intuitive feel for the time scales and ranges of weight changes.
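One way to realize this idea, sketched below under purely illustrative assumptions (the random-walk model and its parameters are not those of [61]): noise is drawn from a generative process whose time scales and ranges mimic plausible weight trajectories, so that model-based filtering cannot easily separate it from the genuine signal, while its zero mean still lets aggregate statistics be reconstructed.

    import numpy as np

    rng = np.random.default_rng(2)

    def weight_like_series(n_days, start, drift_std, jitter_std):
        # A slowly varying random-walk drift plus small day-to-day jitter,
        # mimicking the time scales and ranges of weight evolution.
        drift = np.cumsum(rng.normal(0.0, drift_std, n_days))
        return start + drift + rng.normal(0.0, jitter_std, n_days)

    # The user's real weight stream over 90 days (synthetic stand-in).
    real = weight_like_series(90, start=82.0, drift_std=0.15, jitter_std=0.3)

    # Noise drawn from the same kind of process, centered at zero: it shares
    # the temporal character of genuine weight data, so spectral filtering
    # and PCA gain little traction against it.
    noise = weight_like_series(90, start=0.0, drift_std=0.15, jitter_std=0.3)

    published = real + noise
    print(published[:5].round(2))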