Indexing Example
For an introduction to the world of indexing, examine the code that accompanies the book (see “Accompanying Source Code” on page 15) in the folder chapters/indexing, or in the /db/apps/exist-book/indexing collection if you have installed the XAR package. The data subcollection contains two XML data files with some old Encyclopedia Britannica entries. The contents of these files are exactly the same, with the exception that one is in the tei namespace and the other is not.
For a look at the index definitions for these files, open /db/system/config/db/apps/exist-book/indexing/data/collection.xconf:
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index xmlns:tei="http://www.tei-c.org/ns/1.0">
        <create qname="tei:name" type="xs:string"/>
        <ngram qname="tei:p"/>
        <lucene>
            <text qname="tei:p"/>
        </lucene>
    </index>
</collection>
This file defines three indexes on data in the /db/apps/exist-book/indexing/data collection:
• A range index on all tei:name elements in this collection. A range index optimizes searches on the content of elements and attributes.
• An NGram index on all tei:p elements in this collection. An NGram index optimizes searches on substrings within the contents of elements and attributes.
• A full-text index on all tei:p elements in this collection. Full-text indexes optimize searches on words and phrases within the contents of elements and attributes.
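To make the difference between the three kinds of lookup concrete, here is a minimal sketch in plain Python. It is not how eXist implements its indexes; the sample texts, node ids, and function names are all invented for illustration. It only shows the idea behind each type: a lookup by whole value, a lookup by substring via three-character n-grams, and a lookup by word.

```python
from collections import defaultdict

# Hypothetical paragraph texts keyed by a made-up node id (not from the book's data).
PARAGRAPHS = {
    1: "The city stands on the river",
    2: "A river port of some importance",
    3: "Known for its university",
}


def build_value_index(docs):
    """Map each complete text value to the nodes holding it (whole-value lookup)."""
    index = defaultdict(set)
    for node_id, text in docs.items():
        index[text].add(node_id)
    return index


def build_ngram_index(docs, n=3):
    """Map every n-character slice of each text to the nodes containing it."""
    index = defaultdict(set)
    for node_id, text in docs.items():
        lowered = text.lower()
        for i in range(len(lowered) - n + 1):
            index[lowered[i:i + n]].add(node_id)
    return index


def build_word_index(docs):
    """Map each lowercased, whitespace-separated word to the nodes containing it."""
    index = defaultdict(set)
    for node_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(node_id)
    return index


def ngram_search(index, docs, needle, n=3):
    """Find nodes whose text contains `needle` as a substring."""
    needle = needle.lower()
    if len(needle) < n:
        # Too short to form an n-gram: fall back to scanning every text.
        return {i for i, t in docs.items() if needle in t.lower()}
    grams = [needle[i:i + n] for i in range(len(needle) - n + 1)]
    # Only nodes that contain every n-gram of the needle can match.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Confirm the n-grams really occur contiguously in each candidate.
    return {i for i in candidates if needle in docs[i].lower()}


value_index = build_value_index(PARAGRAPHS)
ngram_index = build_ngram_index(PARAGRAPHS)
word_index = build_word_index(PARAGRAPHS)

exact_hits = value_index.get("Known for its university", set())
substring_hits = ngram_search(ngram_index, PARAGRAPHS, "iver")
word_hits = word_index.get("river", set())
```

In this toy data the substring search for "iver" also finds the "university" paragraph, while the word search for "river" does not, which is the kind of difference between substring and word matching that the list above describes.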
How exactly this works, and which index type to use when, is covered in this chapter and the next. The effect of these definitions, however, is that the data in the tei-namespaced file is indexed, but the (same) data without a namespace is not. This allows us to easily run the same query over the indexed and nonindexed data and compare the results.
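The reason only one file is affected is that a qualified name such as tei:p matches an element only when the element is in the namespace bound to the tei prefix. The following sketch illustrates that matching rule with Python's standard xml.etree.ElementTree module rather than with eXist itself. The two tiny documents are invented stand-ins for the book's data files, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace URI bound to the tei prefix in collection.xconf.
TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented miniature documents: same content, with and without the namespace.
WITH_NS = f'<TEI xmlns="{TEI_NS}"><p>Some entry text</p></TEI>'
WITHOUT_NS = "<TEI><p>Some entry text</p></TEI>"


def count_tei_paragraphs(xml_text):
    """Count the p elements that are in the TEI namespace."""
    root = ET.fromstring(xml_text)
    return len(root.findall(".//tei:p", {"tei": TEI_NS}))


namespaced_matches = count_tei_paragraphs(WITH_NS)
plain_matches = count_tei_paragraphs(WITHOUT_NS)
```

Only the namespaced document yields a match for tei:p. The p element in the other document has the same local name and content but no namespace, so a definition keyed on tei:p does not apply to it.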
The test-indexes.xq script does exactly this. It runs a number of queries over the contents of the two files and outputs the results as an XML fragment.