Automatic Discovery Programs—a.k.a. Data Mining
The number and size of operational databases are increasing at an
accelerating rate. Because of their number, size, and complexity, a
tremendous amount of valuable knowledge remains locked up and
undiscovered. Because most modern organizations tend to cut back on
staff, there will never be enough analysts to interpret the data in all
of these databases.
Over two decades ago, Kamran Parsaye (1990) coined the term intelligent
databases. The goal of intelligent databases is to manage information in
a natural way, making the information they hold easy to store, access,
and use. The prototypical intelligent
database would have some robust requirements. It would need to provide
some high-level tools for data analysis, discovery, and integrity control.
These tools would be used to allow users not only to extract knowledge
from databases, but also to apply knowledge to data. So far, it is not
possible to scan through the pages of a database as easily as it is to
flip through the pages of a book. For the label intelligent database to
be valid, this feature is necessary. Users should be able to retrieve
information from a computerized database as easily as they can get it
from a helpful human expert. Finally, an intelligent database must be
able to retrieve knowledge as opposed to data. To do this, it needs to
use inferencing capabilities to determine what a user needs to know.
In developing the theory behind intelligent databases, Parsaye et al.
(1990) enumerated three basic levels in dealing with the database:
1. We collect data, e.g., we maintain records on clients, products, sales,
etc.
2. We query data, e.g., “Which products had increasing sales last
month?”
3. We try to understand data, e.g., “What makes a product successful?”
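The three levels above can be made concrete with a small sketch. It is only an illustration, not anything taken from Parsaye et al.: the sales table, its columns (product, month, units, discounted), and the sample rows are all hypothetical. The first function collects data, the second answers the level 2 question about increasing sales, and the third takes one crude step toward level 3 by comparing average sales for discounted and non-discounted records.

```python
import sqlite3


def build_demo_db():
    """Level 1: collect data. Schema and rows are made up for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "product TEXT, month TEXT, units INTEGER, discounted INTEGER)"
    )
    rows = [
        ("widget", "2024-01", 100, 0),
        ("widget", "2024-02", 140, 1),
        ("gadget", "2024-01", 80, 0),
        ("gadget", "2024-02", 60, 0),
        ("gizmo", "2024-01", 50, 1),
        ("gizmo", "2024-02", 75, 1),
    ]
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    return conn


def products_with_increasing_sales(conn, prev_month, month):
    """Level 2: query data. Which products sold more in month than before?"""
    query = """
        SELECT cur.product
        FROM sales AS cur
        JOIN sales AS prev ON prev.product = cur.product
        WHERE cur.month = ? AND prev.month = ? AND cur.units > prev.units
        ORDER BY cur.product
    """
    return [row[0] for row in conn.execute(query, (month, prev_month))]


def average_units_by_discount(conn):
    """Level 3 (a first step): compare average units with and without a discount."""
    query = "SELECT discounted, AVG(units) FROM sales GROUP BY discounted"
    return {bool(flag): avg for flag, avg in conn.execute(query)}


if __name__ == "__main__":
    conn = build_demo_db()
    print(products_with_increasing_sales(conn, "2024-01", "2024-02"))
    print(average_units_by_discount(conn))
```

In this sketch a person still has to decide which question to ask and which attribute to compare; nothing is discovered unless someone goes looking for it.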
In general, most current database systems passively permit these func-
tions. A database is a static repository of information that will provide
answers when a human initiates a session and asks pertinent questions.
Parsaye came up with the idea of automatic discovery software, the purpose
of which was to analyze large databases and discover patterns, rules, and
often unexpected relationships. Automatic discovery software uses
statistics and machine learning to generate easy-to-read rules that characterize