of this step depends upon the particular implementation of MapReduce that is used, and the exact nature of the distributed data. For example, the data may be distributed over a local cluster of computers (with the use of an implementation such as Hadoop), or it may be geographically distributed because the data was originally created at that location, and it is too expensive to move the data around. The latter scenario is much more likely in the IoT framework. Nevertheless, the steps for collecting the intermediate results from the different Map steps may depend upon the specific implementation and scenario in which the MapReduce framework is used.
The Reduce function is then applied in parallel to each group, which in turn produces a collection of values in the same domain. Next, we apply Reduce(k2, list(v2)) in order to create list(v3). Typically, the Reduce calls over the different keys are distributed over the different nodes, and each such call will return one value, though it is possible for a call to return more than one value. In the previous example, the input to Reduce will be a list of the form (Year, [local_max_1, local_max_2, ..., local_max_r]), where the local maximum values are determined by the execution of the different Map functions. The Reduce function will then determine the maximum value over the corresponding list in each call of the Reduce function.
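The grouping and reduction described above can be sketched in a few lines of single-process Python. This is only an illustration of the pattern, not the API of Hadoop or any other framework; the function names, the sample records, and the shuffle step are all illustrative choices.

```python
from collections import defaultdict

def map_phase(records):
    """Emit (year, temperature) pairs; in a real deployment each
    mapper would see only its own split of the data."""
    for year, temperature in records:
        yield year, temperature

def shuffle(pairs):
    """Group intermediate values by key, producing (k2, list(v2))."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Each Reduce call over a key returns one value: the maximum
    over the list of local maxima for that year."""
    return {year: max(values) for year, values in groups.items()}

# Illustrative input: (year, local maximum) pairs from different mappers.
records = [(2019, 31), (2019, 35), (2020, 28), (2020, 33), (2020, 30)]
result = reduce_phase(shuffle(map_phase(records)))
# result == {2019: 35, 2020: 33}
```

In a distributed setting the shuffle step is performed by the framework itself, which routes all values with the same key to the node that executes the corresponding Reduce call.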
The MapReduce framework is very powerful in terms of enabling distributed search and indexing capabilities across the semantic web. An overview paper in this direction [77] explores the various data processing capabilities of MapReduce used by Yahoo! for enabling efficient search and indexing. The MapReduce framework has also been used for distributed reasoning across the semantic web [104, 105]. The work in [105] addresses the issue of semantic web compression with the use of the MapReduce framework. The work is based on the fact that, since the number of RDF statements is rapidly increasing over time (because of a corresponding increase in the number of "things"), the compression of these statements would be useful for storage and retrieval. One of the most often used techniques for compressing data is called dictionary encoding. It has been experimentally estimated that statements on the semantic web require about 150-210 bytes. If each of the three terms in a statement is replaced with an 8-byte number, the same statement requires only 24 bytes, which is a significant saving. The work in [105] presents methods for performing this compression with the use of the MapReduce framework. Methods for computing the closure of the RDF graph with the use of the MapReduce framework are proposed in [104].
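The dictionary-encoding idea can be sketched as follows: each distinct term (URI or literal) in the triples is assigned a fixed-width integer identifier, so a 150-210 byte statement shrinks to three 8-byte numbers. This is a minimal in-memory sketch, not the distributed method of [105]; the sample triples and function names are illustrative.

```python
def build_dictionary(triples):
    """Assign a unique integer ID to every distinct term."""
    dictionary = {}
    for triple in triples:
        for term in triple:
            if term not in dictionary:
                dictionary[term] = len(dictionary)
    return dictionary

def encode(triples, dictionary):
    """Replace each term of each (subject, predicate, object) triple
    with its integer ID; each ID would be stored as an 8-byte number."""
    return [tuple(dictionary[t] for t in triple) for triple in triples]

# Illustrative RDF statements (hypothetical example.org URIs).
triples = [
    ("<http://example.org/sensor/42>",
     "<http://example.org/hasReading>", '"21.5"'),
    ("<http://example.org/sensor/42>",
     "<http://example.org/locatedIn>", "<http://example.org/room/7>"),
]
dictionary = build_dictionary(triples)
encoded = encode(triples, dictionary)
# encoded == [(0, 1, 2), (0, 3, 4)]
```

Repeated terms such as the sensor URI above are stored only once in the dictionary, which is where most of the compression comes from; the challenge addressed in [105] is building such a dictionary in a distributed fashion with MapReduce.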
The Hadoop implementation of the MapReduce framework is an open source implementation provided by Apache. This framework implements