Excluded Words. Inverted indexing allows users to designate noise words
(words such as “the” or “an” that are typically useless for retrieval)
to be excluded from indexing. This feature reduces the time it takes to
load indexes and the storage space that indexes require.
RDBMSs are unable to exclude values from indexing.
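The effect of excluded words can be illustrated with a minimal sketch of a keyword index. The noise-word list, the function name build_keyword_index, and the sample rows are assumptions made for illustration only; they are not taken from any particular product.

```python
from collections import defaultdict

# Hypothetical noise-word list; a real installation would let the user define it.
NOISE_WORDS = frozenset({"the", "an", "a", "of"})


def build_keyword_index(rows, noise_words=frozenset()):
    """Map each lowercased word to the set of row ids whose text contains it."""
    index = defaultdict(set)
    for row_id, text in rows.items():
        for word in text.lower().split():
            if word not in noise_words:  # excluded words never enter the index
                index[word].add(row_id)
    return dict(index)


rows = {
    1: "The Bank of Boston",
    2: "An Apple a Day",
    3: "The Apple Bank",
}

full = build_keyword_index(rows)                  # indexes every word
trimmed = build_keyword_index(rows, NOISE_WORDS)  # skips the noise words

print(len(full), len(trimmed))  # 8 4
print(sorted(trimmed["bank"]))  # [1, 3]
```

In this toy data set the noise words account for half of the index entries, which is where the savings in load time and storage would come from.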
Composite Keys. A composite key is a virtual key that allows the
redefinition of one or more existing columns. Users can easily create
indexes from entire fields or parts of fields. For example, a user can
break an ACCOUNT-NUMBER column into its components (DIVISION, DEPARTMENT,
NATURAL ACCOUNT) without duplicating the data. In addition, composite
keys can reorganize the bytes of a column into a new key. An example
would be rearranging an MMDDYY date column to YYMMDD.
RDBMSs do not allow composite keys. They require an index to be
composed of an entire column, in its existing order, or a combination of
columns.
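A composite key can be thought of as a function applied to the stored column value at indexing time. The sketch below assumes a hypothetical nine-character account number layout (two characters of division, three of department, four of natural account) and hypothetical helper names; the stored rows themselves are never changed or duplicated.

```python
from collections import defaultdict

# Assumed layout of ACCOUNT_NUMBER as (start, end) character offsets.
ACCOUNT_LAYOUT = {
    "DIVISION": (0, 2),
    "DEPARTMENT": (2, 5),
    "NATURAL_ACCOUNT": (5, 9),
}


def component(account_number, name):
    """Return one component of the account number without copying the column."""
    start, end = ACCOUNT_LAYOUT[name]
    return account_number[start:end]


def mmddyy_to_yymmdd(value):
    """Rearrange the bytes of an MMDDYY date so that keys sort by year first."""
    return value[4:6] + value[0:2] + value[2:4]


def build_composite_index(rows, column, key_func):
    """Index rows by a key derived from (not stored in) the given column."""
    index = defaultdict(list)
    for row_id, row in rows.items():
        index[key_func(row[column])].append(row_id)
    return dict(index)


rows = {
    1: {"ACCOUNT_NUMBER": "120450078", "POSTED": "123198"},
    2: {"ACCOUNT_NUMBER": "120460099", "POSTED": "010599"},
    3: {"ACCOUNT_NUMBER": "340450012", "POSTED": "061598"},
}

by_department = build_composite_index(
    rows, "ACCOUNT_NUMBER", lambda v: component(v, "DEPARTMENT")
)
by_date = build_composite_index(rows, "POSTED", mmddyy_to_yymmdd)

print(by_department["045"])  # [1, 3]
print(sorted(by_date))       # ['980615', '981231', '990105']
```

Sorting the rearranged keys puts the dates in chronological order within a century, which sorting the raw MMDDYY strings would not do.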
Grouping of Columns. Grouping is a powerful feature that lets users
combine several keyword indexes in one index, thus providing the
flexibility to query several similar columns at one time. Suppose, for
example, that ADDRESS1, ADDRESS2, and ADDRESS3 contain various address
information, including city, state, and country. By grouping these three
columns, the index treats them as one logical key or retrieval unit.
Users can easily retrieve on city, state, or address information,
regardless of which column the data was entered into.
RDBMSs do not have a grouping capability.
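Grouping can be sketched as a single keyword index fed from several columns. The function name build_grouped_index and the sample address rows are assumptions for illustration; note that the city appears in a different column in each of the first two rows.

```python
from collections import defaultdict


def build_grouped_index(rows, columns):
    """Build one keyword index over several columns treated as one logical key."""
    index = defaultdict(set)
    for row_id, row in rows.items():
        for column in columns:
            text = row.get(column, "").lower().replace(",", " ")
            for word in text.split():
                index[word].add(row_id)
    return dict(index)


rows = {
    1: {"ADDRESS1": "12 Main Street", "ADDRESS2": "Denver, CO", "ADDRESS3": "USA"},
    2: {"ADDRESS1": "Suite 400", "ADDRESS2": "9 Elm Road", "ADDRESS3": "Denver, CO, USA"},
    3: {"ADDRESS1": "5 Oak Lane", "ADDRESS2": "Austin, TX", "ADDRESS3": ""},
}

address_index = build_grouped_index(rows, ["ADDRESS1", "ADDRESS2", "ADDRESS3"])

# One lookup finds the city regardless of which column it was entered into.
print(sorted(address_index["denver"]))  # [1, 2]
```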
PERFORMANCE BENCHMARKS
In summary, inverted file indexes allow a variety of sophisticated search
techniques: full keyword searches (e.g., find all customers with the word
“Mark” somewhere in the company name), multidimensional searches
(e.g., find all customers with the word “Mark” somewhere in the company
name who have done business with the company in the last 6 months), range
searches (e.g., find all records with transactions between June and
December), Soundex (e.g., find all records with any word that sounds like
Mark [Marc] in the company name), plurality, synonym searches, and
searches that ignore differences in capitalization. In addition, inverted
file indexes can deliver performance improvements of as much as 1000% on
multiple-selection searches, allowing retrievals that might otherwise take
minutes or even hours to be completed in seconds.
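Several of these search techniques reduce to choosing a different key function for the same index structure and then intersecting sets of record ids. The sketch below uses the standard American Soundex coding rules; the sample customers, the field names, and the build_index helper are assumptions for illustration and say nothing about how a given product implements these searches.

```python
from collections import defaultdict

# Letter-to-digit table for American Soundex.
_SOUNDEX_CODES = {
    ch: digit
    for letters, digit in (
        ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
        ("L", "4"), ("MN", "5"), ("R", "6"),
    )
    for ch in letters
}


def soundex(word):
    """Return the four-character Soundex code of a word ('' if it has no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code after a vowel counts
    return (result + "000")[:4]


def build_index(records, field, key_func):
    """Index each word of a text field under key_func(word)."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for word in record[field].split():
            index[key_func(word)].add(record_id)
    return dict(index)


customers = {
    1: {"company": "Mark Industries", "months_since_last_order": 2},
    2: {"company": "MARC Supply", "months_since_last_order": 9},
    3: {"company": "Acme Tools", "months_since_last_order": 1},
}

keyword_index = build_index(customers, "company", str.lower)  # ignores capitalization
sound_index = build_index(customers, "company", soundex)      # sounds-like lookup

exact_hits = keyword_index.get("mark", set())
sound_hits = sound_index.get(soundex("Mark"), set())

# Multidimensional search: intersect the sounds-like hits with a second criterion.
recent = {rid for rid, rec in customers.items() if rec["months_since_last_order"] <= 6}
recent_sound_hits = sound_hits & recent

print(soundex("Mark"), soundex("Marc"))  # M620 M620
print(sorted(exact_hits))                # [1]
print(sorted(sound_hits))                # [1, 2]
print(sorted(recent_sound_hits))         # [1]
```

Because each criterion yields a set of record ids, combining criteria is a set intersection rather than a scan of the underlying rows.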