to mine both frequent and rare patterns. Experimental results are discussed in Section
4. Finally, conclusions and future work are presented in Section 5.
Fig. 1. An ontology schema for tourism information database
2 Problem Definition
In this section, we discuss the rare item problem and an extended version of frequent
patterns under multiple min_sups.
Table 1. Encoded transaction database

TID   Items                    Items (ordered by MIS value)
1     {111, 121, 211, 21*}     {21*, 111, 211, 121}
2     {111, 121, 221, 22*}     {111, 22*, 221, 123}
3     {111, 123, 212, 21*}     {21*, 111, 212, 123}
4     {112, 122, 222, 22*}     {112, 22*, 222, 122}
5     {112, 122, 211, 21*}     {112, 21*, 211, 122}
6     {112, 122, 222, 22*}     {112, 22*, 222, 122}
7     {111, 123, 211, 21*}     {21*, 111, 211, 123}
8     {112, 123, 211, 21*}     {112, 21*, 211, 123}
9     {112, 123, 222, 22*}     {112, 22*, 222, 123}
10    {112, 123, 212, 21*}     {112, 21*, 212, 123}
2.1 Rare Item Problem
Real-world datasets are mostly non-uniform in nature, containing both frequent and
rare items. If the frequencies of items in a database vary widely, the
following issues will be encountered while mining frequent patterns under a single
min_sup framework:
1. If min_sup is set too high, the frequent patterns containing rare items will not
be discovered.
2. If min_sup is set too low, frequent patterns that involve both
frequent and rare items can be found. However, this can also cause a combinatorial
explosion, generating too many meaningless frequent patterns.
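The two issues above can be illustrated on the "Items" column of Table 1. The following is a minimal sketch, not the mining algorithm of this chapter: it enumerates every itemset by brute force, treats min_sup as an absolute transaction count, and uses two thresholds (5 and 2) that are chosen here purely for illustration. The names TRANSACTIONS and frequent_patterns are hypothetical.

```python
from collections import Counter
from itertools import combinations

# "Items" column of Table 1, one set per transaction (TID 1 to 10).
TRANSACTIONS = [
    {"111", "121", "211", "21*"},
    {"111", "121", "221", "22*"},
    {"111", "123", "212", "21*"},
    {"112", "122", "222", "22*"},
    {"112", "122", "211", "21*"},
    {"112", "122", "222", "22*"},
    {"111", "123", "211", "21*"},
    {"112", "123", "211", "21*"},
    {"112", "123", "222", "22*"},
    {"112", "123", "212", "21*"},
]


def frequent_patterns(transactions, min_sup):
    """Return {itemset: support count} for itemsets with count >= min_sup.

    Brute-force enumeration of all non-empty subsets of each transaction;
    only suitable for a toy database such as Table 1.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(transaction)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    return {itemset: c for itemset, c in counts.items() if c >= min_sup}


# Assumed thresholds (absolute counts), picked only to show the trade-off.
high = frequent_patterns(TRANSACTIONS, min_sup=5)
low = frequent_patterns(TRANSACTIONS, min_sup=2)

print("patterns with min_sup=5:", len(high))
print("patterns with min_sup=2:", len(low))
print("rare item 212 kept at min_sup=5:", any("212" in s for s in high))
print("rare item 212 kept at min_sup=2:", any("212" in s for s in low))
```

With the high threshold, only the single items 112, 123 and 21* survive, so no pattern involving a rare item such as 212 is reported. With the low threshold, patterns containing 212 appear, but the total number of reported patterns grows considerably even on this ten-transaction database.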