Spatial Index Schemes for Cloud Environments - Geographical Information Systems: Trends and Technologies


when the amount of data is large (Nishimura et al. 2013). On the other

hand, distributed relational database management systems (DRDBMSs)

have been developed and are able to deal with multi-attribute accesses.

However, DRDBMSs cannot maintain and retrieve data across

servers efficiently, because they spend considerable time keeping the data

consistent by locking and updating it appropriately.

To handle huge amounts of data efficiently and flexibly, cloud

computing now plays an important role, and new cloud data

management systems (CDMs), which are NoSQL databases (Stonebraker 2010),

have been developed. The most prevalent NoSQL CDMs, such as HBase

(Khetrapal and Ganesh 2008), Cassandra (Lakshman and Malik 2010) and

Amazon Simple Storage (Varia 2008), are based on the BigTable

(Chang et al. 2008) management system. Compared with DRDBMSs, these

management systems offer high scalability, high

availability and fault tolerance because they can effectively and efficiently

handle a large number of data updates even if failure events occur. In

addition, a BigTable management system stores data as <key, value>

pairs, and thus these BigTable-like management systems can retrieve data

efficiently owing to the following characteristics: 1) each <key, value> pair is stored

on multiple servers; and 2) each key owns multiple versions of a value.

In other words, the first characteristic improves the efficiency of retrieving

data, and the second characteristic eliminates the waiting time for making

data consistent. Due to the inherent restriction of a BigTable data structure,

however, these management systems only support some basic operations,

such as Get, Set and Scan. A Get operation retrieves values mapped by a

key; a Set operation inserts/modifies values according to a corresponding

key; and a Scan operation returns all values mapped by a range of keys.

However, these basic operations do not directly support multi-attribute

accesses.
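
The three operations described above can be sketched as a small in-memory store. This is only an illustration under stated assumptions, not the API of HBase, Cassandra or any other CDM: the class name VersionedKVStore, its method names, the integer version counter and the half-open Scan range are all hypothetical. The helper range_query_by_filter shows why a two-attribute (x, y) query is not directly supported: when the key is a plain identifier, every pair has to be scanned and filtered.

```python
import bisect


class VersionedKVStore:
    """Toy sorted <key, value> store with multiple versions per key (hypothetical)."""

    def __init__(self):
        self._keys = []   # row keys kept in sorted order
        self._data = {}   # key -> list of (version, value), newest last
        self._clock = 0   # assumed logical version counter

    def set(self, key, value):
        """Insert a new key or append a new version of an existing key."""
        self._clock += 1
        if key not in self._data:
            bisect.insort(self._keys, key)
            self._data[key] = []
        self._data[key].append((self._clock, value))

    def get(self, key):
        """Return the newest value mapped by key, or None if the key is absent."""
        versions = self._data.get(key)
        return versions[-1][1] if versions else None

    def scan(self, start_key, stop_key):
        """Return (key, newest value) for start_key <= key < stop_key."""
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        return [(k, self._data[k][-1][1]) for k in self._keys[lo:hi]]


def range_query_by_filter(store, x_min, x_max, y_min, y_max):
    """Two-attribute range query without an index: scan every pair, then filter."""
    hits = []
    for key, (x, y) in store.scan("", "\uffff"):
        if x_min <= x <= x_max and y_min <= y <= y_max:
            hits.append(key)
    return hits


store = VersionedKVStore()
store.set("p1", (1.0, 5.0))
store.set("p2", (4.0, 2.0))
store.set("p3", (2.5, 2.5))
store.set("p2", (4.0, 3.0))  # a second version of p2; the first is kept as well

matches = range_query_by_filter(store, 2.0, 5.0, 2.0, 4.0)
```

In this sketch the filter touches all stored pairs regardless of how small the queried rectangle is, which is the cost that a multi-dimensional index on top of the key space is meant to avoid.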

In this chapter, to support efficient multi-attribute accesses of skewed

data on CDMs, we propose a novel multi-dimensional index, called the

KR+-index, on CDMs by designing Key names for leaves of the R+-tree. A

challenging issue is the cost of filtering out data after a query, which results from large

differences in the volume of data between grids. For convenience,

the volume of data in a grid is referred to in this chapter

as the grid size. However, dividing a map more finely could

reduce the differences in the grid sizes but could also reduce the efficiency

of accessing data. For example, for a range query, we need to retrieve

more grids for the same spatial range. According to the aforementioned

observations, we would like the differences in the grid sizes to be

smaller and the time spent on grid accesses to be lower at the same time.
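
The trade-off between the two goals can be made concrete with a rough sketch. It assumes a uniform square grid over the unit map, which is a simplification chosen for illustration and is not the KR+-index partitioning; the function names grid_sizes, size_spread and grids_touched and the sample points are hypothetical.

```python
from collections import Counter


def grid_sizes(points, cells_per_axis, extent=1.0):
    """Count the points that fall in each cell of a uniform square grid."""
    step = extent / cells_per_axis
    counts = Counter()
    for x, y in points:
        col = min(int(x / step), cells_per_axis - 1)
        row = min(int(y / step), cells_per_axis - 1)
        counts[(col, row)] += 1
    return counts


def size_spread(points, cells_per_axis):
    """Difference between the largest and smallest grid size (empty grids count as 0)."""
    counts = grid_sizes(points, cells_per_axis)
    sizes = [counts.get((c, r), 0)
             for c in range(cells_per_axis) for r in range(cells_per_axis)]
    return max(sizes) - min(sizes)


def grids_touched(query, cells_per_axis, extent=1.0):
    """Number of grids overlapped by a rectangle (x_min, y_min, x_max, y_max)."""
    x_min, y_min, x_max, y_max = query
    step = extent / cells_per_axis
    last = cells_per_axis - 1
    col_lo, col_hi = min(int(x_min / step), last), min(int(x_max / step), last)
    row_lo, row_hi = min(int(y_min / step), last), min(int(y_max / step), last)
    return (col_hi - col_lo + 1) * (row_hi - row_lo + 1)


# Skewed sample data: a dense cluster near the origin plus a few isolated points.
cluster = [(0.05 + 0.01 * i, 0.05 + 0.01 * j) for i in range(10) for j in range(10)]
points = cluster + [(0.9, 0.9), (0.6, 0.2), (0.3, 0.8)]
query = (0.2, 0.2, 0.6, 0.6)

for n in (2, 4, 8):
    print(n, size_spread(points, n), grids_touched(query, n))
```

For this sample, moving from a 2 x 2 to an 8 x 8 partition lowers the spread in grid sizes while the same query rectangle overlaps more grids, which is the tension described above.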

Consequently, how to divide a map into grids to reach a balance between

the two points plays an important role for CDMs. In this chapter, we first
