Furthermore, the query can easily be enhanced and customized as needed,
since it is well defined and based on simple concepts.
3. Integration with RDBMS
Seamless integration with the RDBMS reduces the cost of maintenance and
maximizes portability, since the differences between platforms can be
absorbed by the RDBMS. In addition, mature technologies used in RDBMS,
such as query optimization, parallelization, indexes, checkpoints, and so on,
are available at no extra cost.
In our evaluation we employ a modified version of SETM. For the implementation
on a commercial parallel RDBMS, we could utilize some other techniques to enhance
the query. Here we introduce the use of views and subqueries to reduce disk I/O.
Recently, a pure SQL implementation of the well-known Apriori algorithm 2) has
been reported, but its performance is far behind that of its object-oriented SQL
extensions or of other, more loosely integrated approaches 11) .
Sarawagi et al. extended the query to mine generalized association rules with
taxonomy 10) . In addition, they extended the query further to handle sequential
patterns as well. Analysis of the execution plan has given some hints for improving
the performance of the Apriori-based query 12) .
4 Representation of Transaction Data
The transaction data can be represented in a relational database in first
normal form, as illustrated in Table 1. The schema for the table is SALES(TID,
item), where TID represents the transaction ID and item represents the item code or
item name. For each customer transaction that takes place, tuples corresponding to
every item are inserted into SALES.
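As a minimal sketch of this layout, the snippet below builds the SALES(TID, item) table in an in-memory SQLite database through Python's sqlite3 module. SQLite, the sample transactions, and the helper name load_sales are assumptions made for illustration only; they are not the platform or the data used in the evaluation.

```python
import sqlite3

# Hypothetical sample data: transaction ID -> items bought in that transaction.
TRANSACTIONS = {
    1: ["bread", "milk"],
    2: ["bread", "butter", "milk"],
    3: ["butter"],
}


def load_sales(conn, transactions):
    """Create SALES(TID, item) and insert one tuple per (transaction, item) pair."""
    conn.execute("CREATE TABLE SALES (TID INTEGER, item TEXT)")
    rows = [(tid, item) for tid, items in transactions.items() for item in items]
    conn.executemany("INSERT INTO SALES (TID, item) VALUES (?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
inserted = load_sales(conn, TRANSACTIONS)
sales_rows = conn.execute(
    "SELECT TID, item FROM SALES ORDER BY TID, item"
).fetchall()
print(inserted, sales_rows)
```

Each transaction contributes as many tuples as it has items, so a transaction with three items occupies three rows that share the same TID.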
4.1 Modified SETM
The first SQL query available for mining flat association rules is called SETM 4) .
In our experiments we employed an ordinary standard SQL query that is similar
to the SETM algorithm. We modified the query to enable hash join execution. It is
shown in Figure 1.
In the first pass we simply gather the count of each item. Items that satisfy the
minimum support are inserted into the large itemsets table C_1, which takes the
form (item, item count). Then the transaction data that match the large itemsets
are stored in R_1.
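The first pass can be sketched as follows, again using SQLite through Python's sqlite3 module. This is an illustration under stated assumptions, not the query of Figure 1: the sample data, the column name item_count, the R_1 layout (TID, item), and a minimum support given as an absolute count of 2 are all hypothetical, and each (TID, item) pair is assumed to occur at most once in SALES.

```python
import sqlite3

MIN_SUPPORT = 2  # hypothetical minimum support, as an absolute transaction count

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALES (TID INTEGER, item TEXT)")
conn.executemany(
    "INSERT INTO SALES (TID, item) VALUES (?, ?)",
    [
        (1, "bread"), (1, "milk"),
        (2, "bread"), (2, "butter"), (2, "milk"),
        (3, "butter"), (3, "jam"),
    ],
)

# C_1(item, item_count): items whose count satisfies the minimum support.
conn.execute("CREATE TABLE C_1 (item TEXT, item_count INTEGER)")
conn.execute(
    """INSERT INTO C_1
       SELECT item, COUNT(*) FROM SALES
       GROUP BY item
       HAVING COUNT(*) >= ?""",
    (MIN_SUPPORT,),
)

# R_1(TID, item): transaction tuples whose item appears in C_1.
conn.execute("CREATE TABLE R_1 (TID INTEGER, item TEXT)")
conn.execute(
    """INSERT INTO R_1
       SELECT s.TID, s.item FROM SALES s, C_1 c
       WHERE s.item = c.item"""
)

c1 = conn.execute("SELECT item, item_count FROM C_1 ORDER BY item").fetchall()
r1 = conn.execute("SELECT TID, item FROM R_1 ORDER BY TID, item").fetchall()
print(c1)
print(r1)
```

In this sample, jam appears in only one transaction, so it is left out of C_1 and its tuple is filtered out of R_1. The join between SALES and C_1 is a plain equality join, the kind a query optimizer is free to execute as a hash join.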