Irrelevant Feature and Rule Removal for Structural Associative Classification Using Structure-Preserving Flat Representation - Feature Selection for Data and Pattern Recognition


\beta_i \, at_i = \sum_{r=1}^{n} \left\{ y_r \ln\left[ \pi(at_i \, val_r) \right] + (1 - y_r) \ln\left[ 1 - \pi(at_i \, val_r) \right] \right\}    (10.3)

The statistical hypothesis is then used to determine whether the input attributes

are significantly related to the class attribute. A number of models can be developed

from logistic regression analysis, and each produces a different selection of attributes.

The model that fits the data well and has the highest predictive capability is selected.

Hence, logistic regression is used to discard any fA_k, fB_k or fC_k for which there exists an attribute at_i, contained in x of x → y, whose β_i at_i value is not significant towards the class attribute Y (logistic regression analysis in Eq. 10.3).
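As a minimal sketch, the right-hand side of Eq. 10.3 can be read as the usual Bernoulli log-likelihood of a logistic model for one attribute. The function names, the intercept `beta0`, and the toy data below are assumptions made for illustration and do not come from the chapter; the significance test itself is not shown.

```python
import math


def sigmoid(z):
    """Logistic function, playing the role of pi(.) in Eq. 10.3."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(beta0, beta, values, labels):
    """Sum over records r of y_r*ln(p_r) + (1 - y_r)*ln(1 - p_r).

    values: numeric values of one attribute, one per record (hypothetical encoding)
    labels: binary class labels y_r in {0, 1}
    Note: fails if a predicted probability is exactly 0 or 1.
    """
    total = 0.0
    for val, y in zip(values, labels):
        p = sigmoid(beta0 + beta * val)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


# Toy data: the attribute value separates the two classes well.
values = [0, 0, 1, 1]
labels = [0, 0, 1, 1]

null_fit = log_likelihood(0.0, 0.0, values, labels)   # coefficient set to zero
good_fit = log_likelihood(-2.0, 4.0, values, labels)  # coefficient aligned with the class
```

A coefficient whose fit barely improves on the zero-coefficient model is the kind of attribute a significance test would flag for removal.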

Redundant and Contradictive Rule Removal : To remove redundant rules, we utilize the concept of productive rules [ 4 ]. This approach is based on the minimum improvement redundant rule constraint [ 4 ], which discards any rule x → y if confidence(x → y) ≤ max(confidence(z → y)) ∀ z ⊂ x. In other words, a rule x → y with confidence value c1 is considered redundant if there exists another rule z → y with confidence value c2, where z ⊂ x and c1 ≤ c2. The contradictory rule constraint [ 53 ] is then utilised to discard two or more rules that have the same precedent but imply a different class value.
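The two constraints can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each rule is represented as a hypothetical triple `(antecedent, class_value, confidence)` with the antecedent stored as a `frozenset` of items, and all contradicting rules are dropped together.

```python
from collections import defaultdict


def remove_redundant(rules):
    """Drop x -> y if some z -> y with z a proper subset of x has confidence >= that of x -> y."""
    kept = []
    for x, y, c1 in rules:
        redundant = any(
            z < x and y2 == y and c1 <= c2  # `<` on frozensets tests proper subset
            for z, y2, c2 in rules
        )
        if not redundant:
            kept.append((x, y, c1))
    return kept


def remove_contradictory(rules):
    """Drop every rule whose antecedent also appears with a different class value."""
    classes = defaultdict(set)
    for x, y, _ in rules:
        classes[x].add(y)
    return [(x, y, c) for x, y, c in rules if len(classes[x]) == 1]


rules = [
    (frozenset({"a"}), "yes", 0.90),
    (frozenset({"a", "b"}), "yes", 0.80),  # redundant: {"a"} -> yes is at least as confident
    (frozenset({"a", "c"}), "yes", 0.95),  # productive: improves on {"a"} -> yes
    (frozenset({"d"}), "yes", 0.70),       # contradicts the next rule
    (frozenset({"d"}), "no", 0.60),
]

pruned = remove_contradictory(remove_redundant(rules))
```

Here the redundancy check runs first, mirroring the order in which the text introduces the two constraints.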

Rules Accuracy and Rules Coverage : A measure needs to be applied to verify whether the removal of a large volume of rules, based on statistical analysis and on redundancy and contradiction assessment, will still enable the discovery of all the interesting and significant subtree patterns. To this end, the quality of the subtree patterns will be demonstrated through their accuracy and coverage values, which will be measured at every stage of this task. This measure is crucial as it determines the quality of the discovered rules. Additionally, this analysis will reveal the trade-off between accuracy rate and coverage rate.
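The chapter does not spell out the formulas for the two measures at this point, so the sketch below rests on assumed definitions: coverage is the fraction of instances matched by at least one rule, and accuracy is the fraction of covered instances whose class agrees with the most confident matching rule. The rule triples and the `evaluate` helper are hypothetical.

```python
def evaluate(rules, instances):
    """Return (accuracy, coverage) of a rule set on labelled instances.

    rules: list of (antecedent frozenset, class_value, confidence)
    instances: list of (set of items, true class_value)
    """
    covered = 0
    correct = 0
    for items, label in instances:
        matching = [r for r in rules if r[0] <= items]  # antecedent is a subset of the instance
        if not matching:
            continue  # uncovered instance: counts against coverage only
        covered += 1
        best = max(matching, key=lambda r: r[2])  # most confident matching rule decides
        if best[1] == label:
            correct += 1
    coverage = covered / len(instances) if instances else 0.0
    accuracy = correct / covered if covered else 0.0
    return accuracy, coverage


rules = [
    (frozenset({"a"}), "yes", 0.9),
    (frozenset({"b"}), "no", 0.8),
]
instances = [
    ({"a", "c"}, "yes"),  # covered, correct
    ({"b"}, "yes"),       # covered, wrong
    ({"c"}, "no"),        # not covered
    ({"a", "b"}, "yes"),  # covered by both rules, the 0.9 rule wins, correct
]

accuracy, coverage = evaluate(rules, instances)
```

Re-running such a measurement after each pruning step makes the trade-off visible: removing rules may raise accuracy while lowering coverage.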

10.5 Experimental Evaluation

In this section we present the experiments performed using the CRM dataset (real estate property management records in XML), the CSLOGS dataset (web access trees) and an academic institution dataset (web access trees). Their structural characteristics are shown in Table 10.8, whose notation covers: number of transactions (independent tree instances); number of unique labels; number of nodes (size) in a transaction; depth; and fan-out factor (or degree).

Please note that in [ 52 ], where structural/XML classification was first proposed, it was demonstrated that a simpler classifier that does not take the structure into account cannot achieve equally good results. Similarly, in [ 51 ] it was empirically shown that tree-structured web-browsing patterns are more informative and useful than their itemset/sequential pattern counterparts. Hence, this study is not repeated in this work, but rather an experimental study is presented on the use of
