Information Technology Reference
In-Depth Information
Figure 4. Mercury hub
traffic load. This could be avoided with Squid by
introducing many dimensions in order to distrib-
ute the entries among many different peers. But
introducing many dimensions results in a tangled-
up SFC. As a result, many short fragments of the
curve need to be processed for each keyword that
is not specified in a query. We evaluated Squid
and found that this results in a very high number
of peers to be queried in order to find an entry.
Mercury
rupted and the user is asked to specify the query
more precisely, e.g. by including the first name
in the query.
That way, peers being responsible for fre-
quent names are relieved. As the last names are
Zipf-distributed, there are only a few names to
be included in the blacklist in order to achieve
significant load balancing.
However, in spite of the load balancing
achieved with fusion dictionaries, the approaches
introduced above still do not fulfill the scalability
and performance requirements of large scale com-
munication platforms. In this article, we present the
EPHT, which is a search index that does not result
in overloaded peers. That way it is unnecessary to
introduce additional load balancing techniques.
Like Squid, the Mercury approach (Bharambe et
al, 2004) supports multi-dimensional keywords.
Each dimension is handled within a separate hub,
which is a ring-shaped formation of peers. An ex-
ample of a Mercury hub is illustrated in Figure 4.
The ID range within a hub is ordered linearly,
which results in the same load balancing problems
as with the other approaches. However, Mercury
suggests that peers are moved around dynami-
cally to balance the load. Although this might be
a reasonable approach in other scenarios, this
raises difficulties in the distributed phone book
scenario. First, there are a few very popular last
names. A peer being responsible for one of these
popular last names cannot be relieved by moving
around other peers, and it will stay a hot spot in
terms of data load. Second, if peers may choose
their position in the overlay deliberately, this
raises certain security issues, because an attacker
who wants to make a person unreachable can
position its peer in a way that it becomes respon-
sible for routing queries to the victim's entry.
Summary
The brief survey of related work showed that
there are several difficulties with previous range
query solutions when applied in the distributed
phone book scenario. A more detailed overview
of search methods in peer-to-peer systems can
be found in (Risson et al, 2006). An analysis of
arbitrary search in structured peer-to-peer systems
was published in (Hautakorpi et al, 2010).
Approaches supporting real multi-dimensional
keywords like Mercury and Squid have the prob-
lem of hot spots with very popular last names. Ad-
ditionally, approaches relying on linear keywords
instead of real hashing suffer from overloaded
peers being responsible for popular prefixes. In
Fusion Dictionary
Fusion Dictionaries (Liu et al, 2004) are not a
distributed search index, but a load balancing tech-
nique that can be combined with search indexes.
The idea is to maintain a blacklist of names that
are very common, and to cache blacklist entries
in large parts of the DHT. If a user queries a last
name that is in the blacklist, the query is inter-
Search WWH ::




Custom Search