A Decentralized Directory Service for Peer-to-Peer-Based Telephony - Cloud, Grid and High Performance Computing: Emerging Applications

Information Technology Reference

In-Depth Information

Figure 5. Generating a 32 char identifier for an entry

Squid, the hot spots in terms of data load could be

avoided, but as a trade-off this results in a large

number of peers to be queried to find an entry.

In the following section, we introduce the

Extended Prefix Hash Tree as a way of enabling

efficient range queries, while preserving the ad-

vantages of the Random Oracle Model for hashing,

which results in a balanced distribution of the

entries among the peers in the DHT.

to filter out the right results when a user searched

for “Müller”.

The identifier length must be sufficient to en-

sure that a unique identifier can be built for each

entry with high probability. In our evaluation, the

identifiers were 32 characters long. Identifiers that

are longer than that are truncated; identifiers that

are shorter are padded with random characters.

Growing the Tree

EXTENDED PREFIX HASH

TREE ALGORITHM

The structure of an EPHT is shown in Figure

6. There are two parameters that determine the

shape of the tree:

Each entry in the distributed phone book is

associated with an identifier. The identifier is

a fixed-length string, consisting of the capital

characters [A-Z]. Identifiers are built by concat-

enating keywords from the entry. In the example

in Figure 5, we used the keywords last name, first

name, and city.

The order of the keywords determines the

relevance of these keywords for range queries. If

identifiers are built as in Figure 5, it is possible

to search for the last name without knowing the

city, but it is not possible to search for the city

without knowing the last name. This corresponds

to the hierarchical structure of printed phone

books, where entries are ordered by city, last name,

first name, etc. In order to allow alternative key-

word orders, the application must maintain sev-

eral trees in parallel.

Special characters like whitespaces or the Ger-

man ä, ö, ü, ß are omitted. That way, both German

names “Müller” and “Möller” map into the same

string “MLLER”. It is up to the application layer

1. n is the number of children per node. Each

edge is labeled with a character set, like

[S-Z]. The partitioning of the alphabet into

character sets is fixed and globally known,

and cannot be changed dynamically during

runtime. n is the number of character sets,

which can be any number between 2 and 26.

In the section on evaluation, we show that

the best performance is achieved with n =26.

2. m is the maximum load of the root node,

i.e. the maximum number of entries that

can be stored on the root node. If the root

node exceeds its maximum load, it splits up

into n child nodes and distributes all entries

among the children. The maximum load of

each child equals the maximum load of the

parent node plus one. The reason for incre-

menting the maximum load is to prevent

recursive splits, if all entries happen to be

stored on the same child. If a child node's

prefix length (see below for the definition

Cloud, Grid and High Performance Computing: Emerging Applications

Search WWH ::

Custom Search

Home