Hadjieleftheriou et al. [33] introduced further optimizations for the
special case of Cosine similarity. Lp-norm based filtering has also been
used by Li et al. [48] and Xiao et al. [75].
A detailed analysis of threshold-based algorithms is conducted by
Fagin et al. [29]. Improved termination conditions for these algorithms
are discussed by Sarawagi and Kirpal [60], Bayardo et al. [10], and
Hadjieleftheriou et al. [33]. The heaviest-first algorithm for weighted
intersection based on prefix and suffix lists is based on ideas introduced
by Sarawagi and Kirpal [60] and Chaudhuri et al. [19]. The same
algorithm, assuming unit token weights, was extended for arbitrary prefix
lengths by Li et al. [48]. The heaviest-first algorithm for arbitrary
token weights was introduced by Hadjieleftheriou et al. [33].
The partitioning strategy for all-match join queries with memory
constraints was proposed by Sarawagi and Kirpal [60]. The incremen-
tal indexing for self-join queries was first proposed by Sarawagi and
Kirpal [60]. The improved algorithm for Jaccard, Dice, and Cosine
similarity based on deleting elements from the top of token lists was
proposed by Bayardo et al. [10]. The block nested loop self-join
algorithm for the case of memory constraints was also proposed by Bayardo
et al. [10]. Various techniques for answering top-k queries using inverted
indexes and the multiway merge strategy were discussed by Vernica
and Li [69].
Efficient online updates for inverted indexes have been studied
extensively by Lester et al. [46]. Update propagation for inverted
indexes stored in a relational database was addressed by Koudas
et al. [45]. Index construction and update-related issues with regard
to Lp-norm computation are discussed in detail by Hadjieleftheriou
et al. [34].