challenge and responses makes the protocol efficient. However, PDP in its original
form does not work efficiently for dynamic data.
Another, similar approach is based on inserting sentinels, or special markers,
inside the stored file. In this Proof of Retrievability (POR) approach [20], clients
can send small challenges for file blocks, and the presence of unmodified sentinels
provides a probabilistic guarantee about the integrity of files.
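A minimal sketch of this sentinel idea is given below (the block size, the number of sentinels, and the function names are illustrative assumptions, not details from [20]): the client hides random sentinel blocks at secret positions before upload, later challenges a few of those positions, and accepts only if the returned blocks match the sentinel values it kept locally.

import os
import random

BLOCK_SIZE = 64  # illustrative block size

def prepare(file_blocks, num_sentinels=4):
    """Client side: insert random sentinel blocks at secret positions."""
    blocks = list(file_blocks)
    secrets = {}  # secret position -> sentinel value, kept by the client
    for _ in range(num_sentinels):
        sentinel = os.urandom(BLOCK_SIZE)
        pos = random.randint(0, len(blocks))
        blocks.insert(pos, sentinel)
        # shift previously recorded positions at or after the insertion point
        secrets = {(p + 1 if p >= pos else p): v for p, v in secrets.items()}
        secrets[pos] = sentinel
    return blocks, secrets  # blocks go to the cloud; secrets stay with the client

def respond(stored_blocks, positions):
    """Server side: return the blocks named in the challenge."""
    return [stored_blocks[p] for p in positions]

def verify(responses, secrets, positions):
    """Client side: every returned block must equal the stored sentinel value."""
    return all(responses[i] == secrets[p] for i, p in enumerate(positions))

data = [os.urandom(BLOCK_SIZE) for _ in range(10)]
stored, secrets = prepare(data)
challenge = random.sample(list(secrets.keys()), 2)
print(verify(respond(stored, challenge), secrets, challenge))  # True if intact

Because corrupting or discarding a sizable portion of the file is likely to touch at least one sentinel, a small number of such challenges already yields the probabilistic guarantee described above.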
19.3.4 Confidentiality of Data and Computation
Research Question 4: How can we ensure confidentiality of data and computations
in a cloud?
Many users need to store sensitive data items in the cloud. For example, healthcare
and business data need extra protection mandated by many government regulations.
However, storing sensitive and confidential data with an untrusted third-party cloud
provider exposes the data both to the cloud provider and to malicious intruders who
have compromised the cloud.
Encryption can be a simple solution for ensuring the confidentiality of data sent to a
cloud. However, encryption comes at a cost: searching and sorting encrypted data
is expensive and reduces performance. A potential solution is to use homomorphic
encryption to compute on encrypted data in the cloud. However, fully homomorphic
encryption is very inefficient, and to this day no practical fully homomorphic
encryption scheme has been developed.
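As a toy illustration of the homomorphic idea (not a practical or secure scheme, and only multiplicatively rather than fully homomorphic), textbook RSA with tiny, hard-coded parameters already lets a server combine two ciphertexts without ever seeing the plaintexts:

p, q = 61, 53                # toy primes, far too small for real use
n = p * q                    # 3233
phi = (p - 1) * (q - 1)      # 3120
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent (requires Python 3.8+)

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

m1, m2 = 6, 7
c1, c2 = encrypt(m1), encrypt(m2)

# The cloud multiplies the ciphertexts without learning m1 or m2 ...
c_product = (c1 * c2) % n

# ... and the client decrypts to obtain the product of the plaintexts.
print(decrypt(c_product))    # 42, i.e., m1 * m2

A fully homomorphic scheme would additionally support addition, and hence arbitrary computation, on ciphertexts, which is exactly where the efficiency problems mentioned above arise.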
19.3.5 Privacy
Research Question 5: How do we perform outsourced computation while guaranteeing
user privacy [28]?
For Big Data sets of very large scale, clients or one-time users of such data sets
often do not have the capability to download the data to their own systems. A
very common technique is to divide the system into a data provider (which holds the
data objects), a computation provider (which supplies the code), and a computational
platform (such as a MapReduce framework on which the code is run over the data),
as sketched below. However, for data sets containing personal information, a big
challenge is to prevent unauthorized leaks of private information back to the clients.
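The sketch below illustrates this three-role split under simplifying assumptions (the class and function names are made up for illustration): only the computational platform ever sees both the code and the raw records, and the client receives only aggregate results.

class DataProvider:
    """Data provider: holds the raw records and never ships them to the client."""
    def __init__(self, records):
        self._records = records
    def records(self):
        return iter(self._records)

# Computation provider: supplies only the analysis code, here a map and a
# reduce function in the MapReduce style mentioned above.
def mapper(record):
    yield ("age_sum", record["age"])

def reducer(key, values):
    return key, sum(values)

def run(data_provider, mapper, reducer):
    """Computational platform: the only party that sees both code and data."""
    groups = {}
    for record in data_provider.records():
        for key, value in mapper(record):
            groups.setdefault(key, []).append(value)
    return dict(reducer(k, vs) for k, vs in groups.items())

provider = DataProvider([{"age": 34}, {"age": 29}, {"age": 41}])
print(run(provider, mapper, reducer))  # {'age_sum': 104}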
As an example, suppose that a researcher wants to run an analysis on the medical
records of 100,000 patients of a hospital. The hospital cannot release the data to the
researcher due to privacy issues, but it can make the data accessible to a trusted
third-party computational platform, where the code supplied by the researcher
(computation provider) is run on the data, with the results being sent back to the researcher.
However, this model has risks: if the researcher is malicious, he can write
code that leaks private information from the medical records, either directly through
the result data or via indirect means. To prevent such privacy violations, researchers
have proposed techniques that use the notion of differential privacy. For example, the
Airavat framework [28] modifies the MapReduce framework to incorporate differential
privacy, thereby preventing the leakage of private information. However, the
current state of the art in this area is very inefficient in terms of performance, often
incurring more than 30% overhead for privacy protection.
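The core mechanism behind such differential-privacy guarantees can be sketched as follows (this is the generic Laplace mechanism, not Airavat's actual implementation; the epsilon value and the toy records are illustrative assumptions): the platform adds calibrated random noise to each aggregate result before releasing it, so the output reveals very little about any single patient.

import random

def laplace_noise(scale):
    # The difference of two i.i.d. exponentials with mean `scale` is
    # Laplace(0, scale)-distributed.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon=0.1):
    # A counting query has sensitivity 1 (adding or removing one record changes
    # the count by at most 1), so noise with scale 1/epsilon suffices.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

patients = [{"diagnosis": "diabetes"}] * 120 + [{"diagnosis": "other"}] * 880
print(private_count(patients, lambda r: r["diagnosis"] == "diabetes"))

Releasing only such noised aggregates trades a small amount of accuracy for privacy, in addition to the runtime overhead noted above.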