7.7.1 Applications in Web Mining
In a conceptual paper, Secker et al. (2003) investigated the relevance of DT to
Web mining. They present an adaptive mailbox filter that essentially performs a
dynamic classification task. The system accepts or temporarily ignores incoming
e-mails depending on an importance measure decided by the user at a specific
instant of time. An antigen represents a processed original e-mail along with its
class. Clonal selection and mutation evolve the antibody set, which changes and
is updated (including culling) over time, reflecting users' changing preferences and
the changing nature of received e-mails. The authors state that their ultimate idea is
inspired by combined artificial tissues capable of releasing artificial danger signals.
Nasraoui et al. (2002) proposed the fuzzy artificial recognition ball (ARB),
which represents a fuzzy set over the domain of discourse consisting of the training
dataset, as an improvement on the original ARB. The final fuzzy ARB population
can be consolidated by a crossover that randomly exchanges chromosomes, or by
any other reasonable aggregation such as arithmetic averaging. The method is
applied to mining synthesized data and Web usage data. For Web usage, the
final merged ARBs correspond to typical profiles of the users accessing a given
Web site. The averaged attributes reflect the relevance of the individual URLs to the
combined ARBs.
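The averaging step can be illustrated with a minimal sketch. It assumes, purely for illustration, that each ARB is stored as a mapping from URL to a relevance weight; the function name `average_profile` and the example URLs are hypothetical and do not come from Nasraoui et al. (2002).

```python
from typing import Dict, List


def average_profile(arbs: List[Dict[str, float]]) -> Dict[str, float]:
    """Merge ARBs into one profile by arithmetic averaging per URL.

    Assumption: an ARB is a dict of URL -> weight; a URL missing
    from an ARB counts as weight 0.0 for that ARB.
    """
    if not arbs:
        raise ValueError("at least one ARB is required")
    urls = sorted({url for arb in arbs for url in arb})
    n = len(arbs)
    return {url: sum(arb.get(url, 0.0) for arb in arbs) / n for url in urls}


# Hypothetical ARBs built from two groups of user sessions.
arbs = [
    {"/index.html": 1.0, "/courses.html": 1.0},
    {"/index.html": 1.0, "/faculty.html": 1.0},
]
profile = average_profile(arbs)
```

In this example the averaged weight of `/index.html` is 1.0 and that of the other two URLs is 0.5, so the weight can be read as the relevance of each URL to the merged profile.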
Nasraoui et al. (2006) proposed a scalable immune-inspired clustering methodology
to continuously learn and adapt to new incoming patterns in Web mining.
In this work, the Web server plays the role of the human body, and the incoming
requests play the role of foreign antigens/bacteria/viruses that need to be detected
by the proposed immune-based clustering technique. Hence, this immune algorithm
is used to continuously cluster the incoming noisy data. The
authors claim that the proposed approach exhibits superior learning abilities while
requiring modest memory and computational costs. An important advantage of
this method is its adaptation to the dynamic environment that characterizes several
applications, particularly in mining data streams. The performance of the proposed
approach is tested on mining user profiles from Web clickstream data in a single
pass under different usage trend-sequencing scenarios.
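To make the idea of single-pass, bounded-memory clustering concrete, the sketch below uses a simple leader-style scheme. It is not the immune-based algorithm of Nasraoui et al. (2006); the function name, the distance threshold `radius`, the cap `max_clusters`, and the rule of culling the least-supported cluster are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def one_pass_cluster(stream: Sequence[Sequence[float]],
                     radius: float,
                     max_clusters: int) -> List[list]:
    """Cluster a stream in one pass; each cluster is [centroid, count]."""
    clusters: List[list] = []
    for x in stream:
        # Find the closest existing centroid, if any.
        nearest = min(clusters, key=lambda c: math.dist(c[0], x), default=None)
        if nearest is not None and math.dist(nearest[0], x) <= radius:
            # Absorb the point: update the count and the running mean.
            nearest[1] += 1
            n = nearest[1]
            nearest[0] = [m + (xi - m) / n for m, xi in zip(nearest[0], x)]
        else:
            # Keep memory bounded by culling the least-supported cluster.
            if len(clusters) >= max_clusters:
                clusters.remove(min(clusters, key=lambda c: c[1]))
            clusters.append([list(x), 1])
    return clusters


# Hypothetical two-dimensional session vectors arriving as a stream.
sessions = [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]
clusters = one_pass_cluster(sessions, radius=1.0, max_clusters=10)
```

Each input is seen once and then discarded, so memory grows with the number of clusters rather than with the length of the stream.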
7.7.2 Application in Anomaly Detection
Dasgupta and Forrest (1996) and Dasgupta and Gonzalez (2002) proposed using
the NS algorithm to detect anomalies in general time series
data. A number of experiments were performed using the Mackey-Glass time series
and other datasets (the algorithmic steps are shown in Figure 7.11).
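The general shape of negative selection can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of the steps in Figure 7.11: samples are assumed to be real-valued vectors in the unit hypercube, a detector is assumed to match a sample when their Euclidean distance is at most `radius`, and the names `generate_detectors` and `is_anomalous` are hypothetical. The cited works use their own encodings and matching rules.

```python
import math
import random
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def generate_detectors(self_set: Sequence[Vector], n_detectors: int,
                       radius: float, dim: int, rng: random.Random,
                       max_tries: int = 100_000) -> List[Vector]:
    """Keep random candidates that match no self (normal) sample."""
    detectors: List[Vector] = []
    tries = 0
    while len(detectors) < n_detectors and tries < max_tries:
        tries += 1
        candidate = tuple(rng.random() for _ in range(dim))
        # Censoring: discard any candidate that covers a self sample.
        if all(math.dist(candidate, s) > radius for s in self_set):
            detectors.append(candidate)
    return detectors


def is_anomalous(sample: Vector, detectors: Sequence[Vector],
                 radius: float) -> bool:
    """A sample is flagged when at least one detector matches it."""
    return any(math.dist(sample, d) <= radius for d in detectors)


# Hypothetical self set: a single normal pattern in two dimensions.
rng = random.Random(0)
self_set = [(0.5, 0.5)]
detectors = generate_detectors(self_set, n_detectors=50, radius=0.2,
                               dim=2, rng=rng)
```

By construction no detector lies within `radius` of a self sample, so `is_anomalous((0.5, 0.5), detectors, 0.2)` is false, while samples far from the self set are flagged only if some detector happens to cover them.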
In most work on anomaly detection, a sliding window scheme was used
for data preprocessing, as illustrated in Figure 7.12.
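A sliding window turns a one-dimensional series into fixed-length vectors that a detector can work on. The sketch below shows one common form; the window `width`, the `step` size, and the function name `sliding_windows` are assumptions for illustration, since the exact scheme in Figure 7.12 is not reproduced here.

```python
from typing import List, Sequence


def sliding_windows(series: Sequence[float], width: int,
                    step: int = 1) -> List[List[float]]:
    """Return consecutive windows of `width` values, advancing by `step`."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    last_start = len(series) - width
    return [list(series[i:i + width]) for i in range(0, last_start + 1, step)]


series = [1, 2, 3, 4, 5]
windows = sliding_windows(series, width=3)
# windows == [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```

A series shorter than `width` yields no windows. With `step` equal to `width` the windows do not overlap.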
Gonzalez et al. (2002) implemented an RNS and compared it against an unsupervised
learning algorithm using different datasets in anomaly detection. A further