        type="java.util.Set">
      <value>the</value>
      <value>a</value>
      <value>an</value>
    </param>
  </analyzer>
  <text qname="p"/>
  <text qname="h1" analyzer="a2"/>
</lucene>
Now the h1 element is indexed with only the stopwords the, a, and an.
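With that configuration in place, full-text queries against h1 use the a2 analyzer and its stopword list. As a minimal sketch (the collection path /db/data and the search string are assumptions for illustration, not part of the configuration above), such a query might look like:

```xquery
(: Query h1 elements through the Lucene index; "the", "a", and "an"
   in the query string are dropped as stopwords by analyzer a2. :)
for $hit in collection('/db/data')//h1[ft:query(., 'the indexing')]
return $hit
```

Queries against p still go through the default analyzer, since no analyzer attribute was given for that element.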
Manual Full-Text Indexing
There is yet another way to use the Lucene full-text indexer inside eXist. You can manually (through your own XQuery code) create an index associated with a resource in the database. You can then use this index to query the contents of this resource. Interestingly enough, the resource does not have to be an XML document, so, in conjunction with the contentextraction extension module (see contentextraction), you can create indexes to search binary content!
Here is how it works:
1. For some resource in your database (XML or otherwise), extract (or create) the text fragments you want to index. For instance, assume we have an XHTML document for which we want to index all the p and h3 elements. We also want to be able to search the p and h3 elements separately.
2. Create an XML fragment with root element doc in which you list all these text fragments and add them to so-called fields. A field can be seen as a subindex on a document, so in our case we create two fields: one for the h3 elements, called headers, and one for the p elements, called paras. Here is the code that does this:
declare namespace xhtml = "http://www.w3.org/1999/xhtml";

let $resource := '/db/path/to/your/xhtml/document'
let $index-def :=
  <doc>
    {
      for $header in doc($resource)//xhtml:h3
      return
        <field name="headers" store="yes">{string($header)}</field>
    }
    {
      for $para in doc($resource)//xhtml:p
      return
        <field name="paras" store="yes">{string($para)}</field>
    }
  </doc>
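The doc fragment built above is not yet an index; it still has to be associated with the resource. As a hedged sketch of that step, assuming eXist's ft:index function from the Lucene module (its exact usage has not been shown in the text up to this point):

```xquery
(: Attach the field definitions in $index-def to the resource,
   creating a manually maintained Lucene index for it. :)
ft:index($resource, $index-def)
```

Because store="yes" was set on both fields, the indexed text itself can later be retrieved from the index, not just matched against.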