2. Methodology of implementing OLAM, which includes integration of data mining and
data warehousing techniques into a unified framework that ensures data availability,
flexibility, and an integrated information-processing environment for data analysis.
3. The resultant cluster of web pages frequently visited by users for marketing use,
which includes identifying potential customers for e-commerce, evolving the web
sites to achieve the business objectives, enhancing the quality and delivery of Internet
information services to the end user, and helping web design to improve the web site
topology.
RELATED WORK
Association Rules Discovery
The concept of association rules was first introduced in Agrawal, Imielinski and
Swami (1993). The problem of mining association rules has been studied extensively
(Harinarayan, Rajaraman & Ullman, 1996; Agrawal & Srikant, 1994; Bayardo, 1998;
Cheung, Han, Ng & Wong, 1996; Han, Karypis & Kumar, 1997; Park, Chen & Yu, 1995b;
Savasere, Omiecinski & Navathe, 1995; Fukuda, Morimoto, Morishita & Tokuyama, 1996;
Sarawagi, Thomas & Agrawal, 1998). These studies cover a broad range of topics and
variations, most of them aimed at further improving the performance of the mining
algorithm. They include fast algorithms based on the Apriori algorithm (Agrawal & Srikant,
1994), incremental updating and parallel algorithms (Cheung, Han, Ng & Wong, 1996; Park,
Chen & Yu, 1995b; Han, Karypis & Kumar, 1997), and the mining of generalized, multi-level,
and multi-dimensional rules (Han & Fu, 1995; Zhao, Deshpande & Naughton, 1997). A
hash-based technique was used to reduce the size of the candidate k-itemsets; a scan reduction
technique was used to reduce the number of database scans; and a transaction reduction
technique was used to reduce the number of transactions scanned in future iterations (Park,
Chen & Yu, 1995a). Recently, a strategy based on partitioning the data proved more
effective than the other scan reduction methods, reducing the number of database scans
required to two (Savasere, Omiecinski & Navathe, 1995).
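The level-wise search that the Apriori family shares can be illustrated with a minimal sketch. It is not the implementation from any of the cited papers: the function name apriori, the absolute min_support threshold, and the sample baskets are assumptions made for illustration, and none of the hash-based, scan reduction, or partitioning optimizations are included.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support count} for every frequent itemset.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune step: every (k-1)-subset must itself be frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(union, k - 1)):
                candidates.add(union)

        # One database scan per level to count the surviving candidates.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical market-basket data, used only to exercise the sketch.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent_itemsets = apriori(baskets, min_support=3)
```

Each level costs one full pass over the transactions, which is the cost that the scan reduction and partitioning techniques mentioned above set out to lower.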
Sequential Patterns Mining
The problem of mining sequential patterns is to find inter-transaction
patterns such that the presence of a set of items is followed by another item in the
timestamp-ordered transaction set. It was first introduced by Agrawal and Srikant (1995), whose
AprioriAll algorithm finds all frequent patterns. Later, the same authors (Srikant &
Agrawal, 1996a) presented the GSP algorithm, which outperforms AprioriAll by up to 20 times.
The GSP algorithm is a variation of the Apriori algorithm.
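The notion of one set of items being followed by another can be made concrete with a small support-counting sketch. It is neither AprioriAll nor GSP, only the containment test such algorithms rely on. It assumes that transactions have already been grouped per customer and sorted by timestamp, and the names contains, sequence_support, and the sample data are hypothetical.

```python
def contains(sequence, pattern):
    """True if the itemsets in `pattern` occur, in order, within `sequence`.

    `sequence` is one customer's timestamp-ordered list of transactions
    (sets of items); `pattern` is an ordered list of itemsets.
    """
    pos = 0
    for element in pattern:
        # Advance to the next transaction that includes this pattern element.
        while pos < len(sequence) and not set(element) <= set(sequence[pos]):
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next element must match a strictly later transaction
    return True


def sequence_support(customer_sequences, pattern):
    """Fraction of customer sequences that contain the pattern."""
    hits = sum(contains(seq, pattern) for seq in customer_sequences)
    return hits / len(customer_sequences)


# Hypothetical customer sequences, used only to exercise the sketch.
customer_sequences = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a", "b"}, {"d"}],
    [{"b"}, {"a"}, {"c"}],
    [{"a"}, {"c"}, {"d"}],
]
support = sequence_support(customer_sequences, [{"a"}, {"d"}])
```

A pattern is called frequent when this fraction reaches a user-chosen minimum support. Matching each element against the earliest possible transaction is sufficient here, because an earlier match never rules out a later one.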
Mannila, Toivonen and Verkamo (1995) presented the problem of finding frequent
episodes in a single long sequence of events. An episode is defined as a set of events occurring
in a partially defined order and within a given time bound. They generalized their work
to allow one to express arbitrary unary conditions on the individual event attributes, or to
give binary conditions on the pairs of event attributes. Their experiments were performed
using a web server-level log file.
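One way to make "within a given time bound" concrete is to slide a fixed-width window over the event sequence and count the windows in which the episode occurs. The sketch below covers only the special case of a serial episode (a total order). It assumes integer, distinct timestamps, and the function names and sample events are hypothetical rather than taken from the cited paper.

```python
def serial_episode_in_window(events, episode, start, width):
    """True if the event types in `episode` occur in order in [start, start + width).

    `events` is a list of (time, event_type) pairs sorted by time.
    Timestamps are assumed to be distinct integers (a simplification).
    """
    idx = 0
    for time, kind in events:
        if start <= time < start + width and kind == episode[idx]:
            idx += 1
            if idx == len(episode):
                return True
    return False


def episode_frequency(events, episode, width):
    """Fraction of sliding windows of the given width that contain the episode."""
    first, last = events[0][0], events[-1][0]
    # Every window that overlaps at least one event time, moved one tick at a time.
    starts = range(first - width + 1, last + 1)
    hits = sum(serial_episode_in_window(events, episode, s, width)
               for s in starts)
    return hits / len(starts)


# Hypothetical event log, used only to exercise the sketch.
events = [(1, "A"), (2, "B"), (4, "A"), (5, "C"), (6, "B")]
frequency = episode_frequency(events, ("A", "B"), width=3)
```

Here "A followed by B" is found in three of the eight windows of width 3. An episode with a partial order would need a more general matching step than the single left-to-right scan used above.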
Oates and Cohen (1996) introduced the problem of detecting strong dependencies among
multiple streams of data. Their measure of dependency strength is based on the statistical