related multi-dimensional time-series data represented by sensory data
streams. Correlations within and across sensor data streams and the
spatio-temporal context of data offer new opportunities for privacy at-
tacks. The challenge is to perturb a user's sequence of data values such
that (i) the individual data items and their trend (i.e., their changes with
time) cannot be estimated without large error, whereas (ii) the distri-
bution of the data aggregation results at any point in time is estimated
with high accuracy. For instance, in a health-and-fitness social sensing
application, it may be desired to find the average weight loss trend of
those on a particular diet or exercise routine as well as the distribution
of weight loss as a function of time on the diet. This is to be accom-
plished without being able to reconstruct any individual's weight and
weight trend without significant error.
Examples of data perturbation techniques can be found in [14, 13, 59]. The general idea is to add random noise with a known distribution to the user's data, after which a reconstruction algorithm is used to estimate the distribution of the original data.
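As a minimal sketch of this general idea, consider independent zero-mean Gaussian noise with a publicly known variance: each user perturbs a value locally, and an aggregator reconstructs the mean and variance of the original distribution by subtracting the known noise moments. The data, names, and parameters below are illustrative assumptions, not taken from the cited works.

    import numpy as np

    rng = np.random.default_rng(0)

    # Original private values, e.g., the weights of 10,000 users (synthetic).
    true_values = rng.normal(loc=80.0, scale=12.0, size=10_000)

    # Each user adds independent noise drawn from a publicly known
    # distribution: here, zero-mean Gaussian with standard deviation 5.
    noise_std = 5.0
    perturbed = true_values + rng.normal(0.0, noise_std, size=true_values.size)

    # Reconstruction of aggregate statistics: the noise is zero-mean, so the
    # sample mean is unbiased; the known noise variance is subtracted to
    # estimate the variance of the original distribution.
    est_mean = perturbed.mean()
    est_var = perturbed.var() - noise_std**2

    print(f"true mean {true_values.mean():.2f}, estimated {est_mean:.2f}")
    print(f"true var  {true_values.var():.2f}, estimated {est_var:.2f}")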
Early approaches relied on adding independent random noise. These approaches were shown to be inadequate: for example, a technique based on random matrix theory was proposed in [95] that recovers the original user data with high accuracy. Later approaches considered hiding individual data values collected from different private parties, taking into account that data from different individuals may be correlated [86]. However, these approaches make no assumptions about the model describing how a given party's data values evolve over time, and such a model can be exploited to jeopardize the privacy of data streams. Perturbation techniques must therefore explicitly consider the data evolution model to prevent attacks that extract regularities from correlated data, such as spectral filtering [95] and Principal Component Analysis (PCA) [86].
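The toy experiment below illustrates why independent noise is inadequate for correlated streams, in the spirit of these filtering attacks. The low-rank synthetic data and all parameters are assumptions made purely for illustration.

    import numpy as np

    rng = np.random.default_rng(1)

    # Correlated streams for 200 users over 100 time steps: each user's
    # series is a mixture of a few shared latent trends (rank 3).
    n_users, n_steps, rank = 200, 100, 3
    latent = rng.normal(size=(rank, n_steps))
    mixing = rng.normal(size=(n_users, rank))
    data = mixing @ latent                       # low-rank "true" data

    # Independent per-entry noise, as in the early perturbation schemes.
    noisy = data + rng.normal(0.0, 1.0, size=data.shape)

    # PCA-style attack: keep only the top principal components, which carry
    # the shared structure while most of the independent noise is discarded.
    u, s, vt = np.linalg.svd(noisy, full_matrices=False)
    recovered = (u[:, :rank] * s[:rank]) @ vt[:rank]

    print(f"mean error of raw noisy data:   {np.abs(noisy - data).mean():.3f}")
    print(f"mean error after PCA filtering: {np.abs(recovered - data).mean():.3f}")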
In addition to data perturbation, numerous group-based anonymization methods have been proposed, such as k-anonymity and ℓ-diversity [9]. In k-anonymity methods, the data features are perturbed so that adversarial attacks always retain an ambiguity spanning at least k different participants. In ℓ-diversity, criteria are imposed on a group to ensure that the values of the sensitive attributes are sufficiently diverse within the group. This is motivated by the observation that k-anonymity may fail to protect individual sensitive values when all sensitive values within an anonymized group are the same.
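The sketch below checks one anonymized group against both criteria, using the simple distinct-values form of ℓ-diversity (one of several formulations in the literature); the records are hypothetical.

    def check_group(group, k, l):
        """Check one anonymized group of (quasi_identifier, sensitive_value)
        records for k-anonymity (at least k records in the group) and
        distinct-values l-diversity (at least l distinct sensitive values)."""
        k_ok = len(group) >= k
        l_ok = len({sensitive for _, sensitive in group}) >= l
        return k_ok, l_ok

    # A group that is 3-anonymous yet fails 2-diversity: every record shares
    # the same sensitive value, so the grouping hides identities but still
    # reveals the condition of each member.
    group = [(("ZIP 537**", "age 30-40"), "diabetes"),
             (("ZIP 537**", "age 30-40"), "diabetes"),
             (("ZIP 537**", "age 30-40"), "diabetes")]
    print(check_group(group, k=3, l=2))  # -> (True, False)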
In work discussed earlier in this chapter [61], it was shown that privacy
of time-series data can be preserved if the noise used to perturb the
data is itself generated from a process that approximately models the
measured phenomenon. For instance, in the weight watchers example,
we may have an intuitive feel for the time scales and ranges of weight changes.
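One way to realize this idea, sketched below under purely illustrative assumptions (the random-walk model and its parameters are not those of [61]): noise is drawn from a generative process whose time scales and ranges mimic plausible weight trajectories, so that model-based filtering cannot easily separate it from the genuine signal, while its zero mean still lets aggregate statistics be reconstructed.

    import numpy as np

    rng = np.random.default_rng(2)

    def weight_like_series(n_days, start, drift_std, jitter_std):
        # A slowly varying random-walk drift plus small day-to-day jitter,
        # mimicking the time scales and ranges of weight evolution.
        drift = np.cumsum(rng.normal(0.0, drift_std, n_days))
        return start + drift + rng.normal(0.0, jitter_std, n_days)

    # The user's real weight stream over 90 days (synthetic stand-in).
    real = weight_like_series(90, start=82.0, drift_std=0.15, jitter_std=0.3)

    # Noise drawn from the same kind of process, centered at zero: it shares
    # the temporal character of genuine weight data, so spectral filtering
    # and PCA gain little traction against it.
    noise = weight_like_series(90, start=0.0, drift_std=0.15, jitter_std=0.3)

    published = real + noise
    print(published[:5].round(2))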