Fig. 8. Total time according to m
Fig. 9. Comparative study
5 Related Works
The problem we address in this work is the protection of MapReduce application
data in public environments. Existing solutions focus on data once it has been
sent for processing: they apply either access control techniques such as
Mandatory Access Control (MAC) [Abr90, McC04], or result verification and result
control techniques [Dwo10, Dwo11], which have been used by systems such as
Airavat [RSK+10] and SecureMR [WDYG09] to ensure security, integrity and privacy
for MapReduce while the application is running. None of these systems considers
the threats that may arise while data is being dispersed over the worker
machines deployed on public clouds or desktop grids.
IDA has mainly been used in file sharing systems [Rab90, DFM00]. It was applied
to provide secure and reliable storage of information, and fault-tolerant and
efficient transmission of information in networks. The concern was how to
prevent data loss in storage without having to duplicate data and provision
enormous capacity, or in transmission without sending multiple copies and
overloading the network. With IDA, since any m of the n pieces created can
reconstruct the data, the loss of some pieces during storage or routing can be
tolerated.
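The m-of-n property can be illustrated with a minimal sketch in the spirit of Rabin's scheme [Rab90]. It is not the prototype evaluated in this paper. The function names `disperse` and `reconstruct`, the choice of the prime field GF(257), and the evaluation points 1..n are assumptions made for illustration only. Each block of m bytes is read as the coefficients of a polynomial, piece x stores the value of that polynomial at x, and any m pieces determine the coefficients through a Vandermonde system.

```python
P = 257  # assumption: smallest prime above 255, so every byte is a field element


def disperse(data: bytes, n: int, m: int):
    """Split data into n pieces such that any m of them can rebuild it."""
    if not 1 <= m <= n < P:
        raise ValueError("need 1 <= m <= n < 257")
    padded = data + bytes(-len(data) % m)  # zero-pad to a multiple of m
    pieces = {x: [] for x in range(1, n + 1)}
    for start in range(0, len(padded), m):
        block = padded[start:start + m]
        for x in pieces:
            # Horner evaluation of c0 + c1*x + ... + c_{m-1}*x^(m-1) mod P
            acc = 0
            for coeff in reversed(block):
                acc = (acc * x + coeff) % P
            pieces[x].append(acc)
    return pieces, len(data)


def reconstruct(subset: dict, m: int, length: int) -> bytes:
    """Rebuild the data from any m pieces, given as {point: values}."""
    xs = sorted(subset)[:m]
    if len(xs) < m:
        raise ValueError("need at least m pieces")
    out = bytearray()
    for k in range(len(subset[xs[0]])):
        # Augmented Vandermonde system: sum_j c_j * x^j = y (mod P)
        rows = [[pow(x, j, P) for j in range(m)] + [subset[x][k]] for x in xs]
        for col in range(m):  # Gauss-Jordan elimination mod P
            pivot = next(r for r in range(col, m) if rows[r][col])
            rows[col], rows[pivot] = rows[pivot], rows[col]
            inv = pow(rows[col][col], -1, P)  # modular inverse, Python 3.8+
            rows[col] = [v * inv % P for v in rows[col]]
            for r in range(m):
                if r != col and rows[r][col]:
                    f = rows[r][col]
                    rows[r] = [(a - f * b) % P
                               for a, b in zip(rows[r], rows[col])]
        out.extend(row[m] for row in rows)  # recovered coefficients = bytes
    return bytes(out[:length])


data = b"MapReduce input split"
pieces, length = disperse(data, n=5, m=3)
subset = {x: pieces[x] for x in (2, 4, 5)}  # any 3 of the 5 pieces
recovered = reconstruct(subset, 3, length)
```

Each piece holds one value per block of m bytes, so the n pieces together take roughly n/m times the original size rather than the n times that full replication would require. The sketch solves the linear system again for every block to stay short; a practical implementation would invert the m x m matrix once and reuse it.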
6 Conclusion
We have proposed a new approach to securing data distribution for MapReduce
applications, using the Information Dispersal Algorithm. IDA is a mechanism that
splits a file into pieces so that, when the pieces are carefully dispersed, no
single node can reconstruct the data unless it cooperates with others. We have
implemented a prototype that adapts IDA to MapReduce needs while respecting
privacy constraints. We have conducted several experiments to evaluate the
performance of our prototype.
In future work, we plan to integrate our protocol into distributed MapReduce
frameworks such as Hadoop and BitDew.