Indexing Example
For an introduction to the world of indexing, examine the code that accompanies the
book (see “Accompanying Source Code” on page 15) in the folder chapters/indexing,
or in the /db/apps/exist-book/indexing collection if you have installed the XAR
package. The data subcollection contains two XML data files with some old Encyclopedia
Britannica entries. The contents of these files are exactly the same, except
that one is in the tei namespace and the other is not.
For a look at the index definitions for these files, open
/db/system/config/db/apps/exist-book/indexing/data/collection.xconf:
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:tei="http://www.tei-c.org/ns/1.0">
        <create qname="tei:name" type="xs:string"/>
        <ngram qname="tei:p"/>
        <lucene>
            <text qname="tei:p"/>
        </lucene>
    </index>
</collection>
This file defines three indexes on data in the /db/apps/exist-book/indexing/data
collection:
• A range index on all tei:name elements in this collection. A range index
optimizes searches on the content of elements and attributes.
• An NGram index on all tei:p elements in this collection. An NGram index
optimizes searches on substrings within the contents of elements and attributes.
• A full-text index on all tei:p elements in this collection. Full-text indexes
optimize searches on words and phrases within the contents of elements and
attributes.
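As a rough illustration of the kind of query each index type is meant to support, the following sketch runs one query per index over the collection. The search strings are invented for illustration and do not come from the book's data or scripts; only the collection path and the element names are taken from the configuration above.

```xquery
xquery version "3.0";

import module namespace ft = "http://exist-db.org/xquery/lucene";
import module namespace ngram = "http://exist-db.org/xquery/ngram";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Assumption: the search strings below are hypothetical placeholders. :)
let $data := collection("/db/apps/exist-book/indexing/data")
return
    <results>
        <!-- Range index: comparison against the full content of tei:name -->
        <range>{ count($data//tei:name[. = "Aristotle"]) }</range>
        <!-- NGram index: substring search inside tei:p -->
        <ngram>{ count($data//tei:p[ngram:contains(., "arch")]) }</ngram>
        <!-- Full-text (Lucene) index: word search inside tei:p -->
        <fulltext>{ count($data//tei:p[ft:query(., "ancient")]) }</fulltext>
    </results>
```

The range comparison is ordinary XQuery and returns the same hits with or without an index; the index only affects how quickly they are found. ft:query, by contrast, only returns hits for nodes that are covered by a Lucene index.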
Exactly how this works, and which index type to use in which situation, is covered
in this chapter and the next. The effect of these definitions, however, is that the
data in the tei-namespaced file is indexed, but the (same) data without a namespace
is not. This allows us to easily run the same query over the indexed and nonindexed
data and compare the results.
The test-indexes.xq script does exactly this. It runs a number of queries over the
contents of the two files and outputs the results as an XML fragment.
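The idea behind such a comparison can be sketched as follows. This is not the book's test-indexes.xq script, only a minimal approximation of the approach: the local:timed helper, the search string, and the shape of the output are all assumptions made for illustration. The non-namespaced element name is inferred from the statement that both files have the same contents.

```xquery
xquery version "3.0";

import module namespace util = "http://exist-db.org/xquery/util";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Hypothetical helper: evaluates a query, counts the hits,
   and reports the elapsed time as a duration. :)
declare function local:timed($label as xs:string, $query as function(*)) as element(run) {
    let $start := util:system-dateTime()
    let $hits := count($query())
    let $elapsed := util:system-dateTime() - $start
    return
        <run label="{$label}" hits="{$hits}" elapsed="{$elapsed}"/>
};

(: Assumption: "Aristotle" is a placeholder search string. :)
let $data := collection("/db/apps/exist-book/indexing/data")
return
    <comparison>
        {
            (: Same comparison, once on indexed and once on nonindexed elements :)
            local:timed("indexed (tei namespace)",
                function() { $data//tei:name[. = "Aristotle"] }),
            local:timed("nonindexed (no namespace)",
                function() { $data//name[. = "Aristotle"] })
        }
    </comparison>
```

With only two small files the difference in elapsed time may be hard to observe, so timings from a sketch like this should be read as indicative at best.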