The index type that Oracle Text needs is CTXSYS.CONTEXT (line 2). On line 3,
we specify that the index should be refreshed whenever a commit is issued.
There are many more options that can be used with Oracle Text, such as
searching for alternative spellings, searching for words in a certain context,
or searching independently of diacritic characters. More suggestions on using
Oracle Text are listed in Chapter 1, Prepare and Build. All these features are
outside the scope of this chapter.
When you create the index, you will notice that a number of tables are created to
support it. The names of these tables start with the prefix DR$DOC, and Oracle
uses them to support Oracle Text searches. This index will allow you to search
through large amounts of text such as Word, PDF, XML, HTML, or plain text documents.
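To see the supporting tables for yourself, you can query the data dictionary. The following is a minimal sketch, assuming the index is named doc_index (as in the snippet call used in this chapter) and that you are connected as the schema that owns it. The exact set of tables returned varies by Oracle version.

```sql
-- Sketch: list the tables Oracle Text created to support the index.
-- Assumption: the index is named DOC_INDEX and lives in the current schema.
select table_name
  from user_tables
 where table_name like 'DR$DOC%'
 order by table_name;
```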
For the following example, I have uploaded the document containing this chapter
into our Documents table, so we might find some text that the reviewer told me to
remove. Because the Oracle Text index is in place, we can use the functions
available to us to search for certain keywords.
SQL> col mimetype format a22
SQL> col snippet format a38 word wrapped
SQL> select doc.mimetype
  2       , ctx_doc.snippet ('doc_index'
  3                         ,id
  4                         ,'express'
  5                         ) snippet
  6    from documents doc
  7   where contains(doc.document, 'express') > 0
  8  /

MIMETYPE               SNIPPET
---------------------- --------------------------------------
application/msword     Application <b>Express</b>. And that i
                       s true up to a certain point. You prob
                       ably know that the Oracle Application
                       <b>Express</b> engine is

1 row selected.
In the preceding query, we have used the contains query operator to search for
express (line 7). The contains operator returns a relevance score for every
selected row. Because we want all rows where the word "express" appears in the
text of the document column of the documents table, we check that the score is
greater than zero. You may notice that we have put express in lowercase and we
still get results back, even though we didn't use "express" in lowercase in this
chapter (until this part of it at least). contains can search through text in a
case-insensitive manner.
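Because contains returns a score rather than a simple yes or no, the same query can also rank its results. The following is a minimal sketch, not part of the original example: it adds a label (the third argument to contains) so that the score operator can report the relevance of each row. The search term is written in uppercase on purpose. With the default, case-insensitive index settings assumed here, it should match the same rows as the lowercase search.

```sql
-- Sketch: rank matches by relevance instead of only filtering on > 0.
-- The label 1 ties score(1) to the contains call that carries the same label.
-- Assumption: the index uses the default (case-insensitive) lexer settings.
select doc.id
     , doc.mimetype
     , score(1) as relevance
  from documents doc
 where contains(doc.document, 'EXPRESS', 1) > 0
 order by score(1) desc;
```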