type= "java.io.Set" >
<value> the </value>
<value> a </value>
<value> an </value>
</param>
</analyzer>
<text qname= "p" />
<text qname= "h1" analyzer= "a2" />
</lucene>
Now the h1 element is indexed with only the stopwords the, a, and an.
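For orientation, here is a sketch of how the complete collection.xconf could look. Only the tail of the configuration appears above, so the surrounding collection and index elements, the default analyzer declaration, and the param's name attribute are assumptions based on eXist's usual Lucene index configuration:

<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index>
        <lucene>
            <!-- Default analyzer, used when no analyzer attribute is given -->
            <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
            <!-- Analyzer a2: a standard analyzer with a custom stopword set -->
            <analyzer id="a2" class="org.apache.lucene.analysis.standard.StandardAnalyzer">
                <param name="stopwords" type="java.util.Set">
                    <value>the</value>
                    <value>a</value>
                    <value>an</value>
                </param>
            </analyzer>
            <!-- p uses the default analyzer, h1 uses a2 -->
            <text qname="p"/>
            <text qname="h1" analyzer="a2"/>
        </lucene>
    </index>
</collection>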
Manual Full-Text Indexing
There is yet another way to use the Lucene full-text indexer inside eXist. You can
manually (through your own XQuery code) create an index associated with a
resource in the database. You can then use this index to query the contents of this
resource. Interestingly enough, the resource does not have to be an XML document,
so, in conjunction with the contentextraction extension module (see "contentextraction"),
you can create indexes to search binary content!
Here is how it works:
1. For some resource in your database (XML or otherwise), extract (or create) the
   text fragments you want to index. For instance, assume we have an XHTML
   document for which we want to index all the p and h3 elements. We also want to
   be able to search the p and h3 elements separately.
2. Create an XML fragment with root element doc in which you list all these text
   fragments and add them to so-called fields. A field can be seen as a subindex on a
   document, so in our case we create two fields: one for the h3 elements, called
   headers, and one for the p elements, called paras. Here is the code that does this:
declare namespace xhtml = "http://www.w3.org/1999/xhtml";

let $resource := '/db/path/to/your/xhtml/document'
let $index-def :=
    <doc>
    {
        for $header in doc($resource)//xhtml:h3
        return
            <field name="headers" store="yes">{string($header)}</field>
    }
    {
        for $para in doc($resource)//xhtml:p
        return
            <field name="paras" store="yes">{string($para)}</field>
    }
    </doc>
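The code above only constructs the $index-def fragment; the step that actually creates and queries the index falls outside the text shown here. As a hedged sketch of how it could continue, eXist's Lucene extension module (prefix ft, namespace http://exist-db.org/xquery/lucene) provides ft:index and ft:search; the exact call shapes and the sample query term chapter are assumptions, not taken verbatim from this section:

(: ...continuing the FLWOR expression above :)
(: Attach the manual index, built from $index-def, to the resource :)
let $create := ft:index($resource, $index-def)
return
    (: Query only the headers field; 'chapter' is just an example term :)
    ft:search($resource, 'headers:chapter')

Storing the fields (store="yes") keeps the original text in the index, which is what later allows that text to be retrieved again without reparsing the resource.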