Spatial Index Schemes for Cloud Environments - Geographical Information Systems: Trends and Technologies


Carweb, which is a data collection machine. The data collected by Carweb is extremely skewed. We also generate a uniform dataset of 440,912 GPS location points. In the following, we report results for both data distributions. Furthermore, to study the scalability of the proposed index method, we generate datasets ranging in size from 200,000 to 1,000,000 GPS location points.

Before comparing our method with others, we ran some experiments on HBase and Cassandra to identify the characteristics of these CDMs, and we designed the index structure around those characteristics. We observe that fetching a set of contiguous keys in one scan is more efficient than fetching single keys repeatedly, and that these CDMs perform poorly when a single key stores too much data. Figure 7a compares scanning a set of n data items once with getting one key n times; the single scan is clearly faster. Figure 7b shows that the response time increases rapidly when the number of data items n grows from 25600 to 51200.

Fig. 7. The features of the CDMs. (a) One Scan vs. n Get. (b) Scan with large n. The horizontal axis in both panels is the number of data n (10 to 6400 in (a), 3200 to 102400 in (b)); the plotted series are labeled 1*size(n) and n*size(1).

Color image of this figure appears in the color plate section at the end of the topic.
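The observation that one scan over contiguous keys beats many single-key gets can be illustrated with a toy model. The sketch below is not HBase or Cassandra client code: `ToySortedStore`, its `get` and `scan` methods, and the `cellNNNNN` key format are hypothetical stand-ins, and the only cost counted is the number of simulated client requests.

```python
import bisect


class ToySortedStore:
    """In-memory stand-in for a key-ordered store (hypothetical, not a real client)."""

    def __init__(self, rows):
        self._rows = dict(rows)
        self._keys = sorted(self._rows)
        self.round_trips = 0  # one per simulated client request

    def get(self, key):
        """Fetch a single key: one request per call."""
        self.round_trips += 1
        return self._rows.get(key)

    def scan(self, start_key, stop_key):
        """Fetch every key in [start_key, stop_key) with a single request."""
        self.round_trips += 1
        lo = bisect.bisect_left(self._keys, start_key)
        hi = bisect.bisect_left(self._keys, stop_key)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


# Zero-padded keys keep lexicographic order equal to numeric order.
store = ToySortedStore((f"cell{i:05d}", i) for i in range(1000))
wanted = [f"cell{i:05d}" for i in range(100, 200)]

# n gets: one request per key.
by_gets = [(k, store.get(k)) for k in wanted]
gets_trips = store.round_trips

# One scan over the same contiguous key range.
store.round_trips = 0
by_scan = store.scan("cell00100", "cell00200")
scan_trips = store.round_trips

print(gets_trips, scan_trips)
```

Both strategies return the same rows, but the scan issues a single request. This is why an index that maps nearby locations to contiguous keys suits such stores.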

Tables 2 and 3 show the range query and the k-NN query on Cassandra, respectively. We compare our KR+ with the Hilbert curve and with no index. When there is no index, we scan the database to find the location points that fall within the query. The Hilbert curve of order 4, 5 and 6 uniformly divides the map along the x-axis and y-axis into 2^4 × 2^4, 2^5 × 2^5 and 2^6 × 2^6 cells. The Scan DB method is clearly very slow, taking about 105s to 203s for the range query and 105s to 127s for the k-NN query. The Hilbert curve method is much faster than scanning the database for the range query: with order 4, its fastest times are 4s for a 1 km × 1 km range and 9.4s for a 40 km × 40 km range. The time increases with the order of the Hilbert curve, since the number of sub-queries increases with the order. Our KR+ with order 4 grids is much faster than the Hilbert curve since it has the feature of balancing the number of
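As a rough illustration of the Hilbert-curve baseline, the sketch below maps a GPS point to a cell of a 2^order × 2^order grid and then to that cell's position along the Hilbert curve, using the standard bit-wise conversion. The function names, the bounding box, and the sample coordinates are assumptions made for illustration; the source does not specify how the baseline was implemented.

```python
def hilbert_index(order, x, y):
    """Position of cell (x, y) along a Hilbert curve on a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def grid_cell(lon, lat, bbox, order):
    """Map a point inside bbox = (min_lon, min_lat, max_lon, max_lat) to a grid cell."""
    min_lon, min_lat, max_lon, max_lat = bbox
    n = 1 << order
    # Points on the upper edges are clamped into the last cell.
    cx = min(int((lon - min_lon) / (max_lon - min_lon) * n), n - 1)
    cy = min(int((lat - min_lat) / (max_lat - min_lat) * n), n - 1)
    return cx, cy


# Hypothetical bounding box and point, chosen only for illustration.
bbox = (120.0, 22.0, 122.0, 25.5)
order = 4  # 2^4 x 2^4 = 256 cells
cx, cy = grid_cell(121.5, 25.0, bbox, order)
key = hilbert_index(order, cx, cy)
print(cx, cy, key)
```

A range query then becomes a set of sub-queries, one per run of consecutive Hilbert indices covered by the query rectangle. A higher order produces more cells and therefore more runs, which is consistent with the increase in query time reported above.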
