A Comparison of Rule Induction Using Feature Selection and the LEM2 Algorithm - Feature Selection for Data and Pattern Recognition

Chapter 8

A Comparison of Rule Induction Using Feature Selection and the LEM2 Algorithm

Jerzy W. Grzymała-Busse

Abstract The main objective of this chapter is to compare a strategy of rule induction based on feature selection, exemplified by the LEM1 algorithm, with another strategy, not using feature selection, exemplified by the LEM2 algorithm. The LEM2 algorithm uses all possible attribute-value pairs as the search space. It is shown that LEM2 significantly outperforms LEM1, the strategy based on feature selection, in terms of error rate (5% significance level, two-tailed test). At the same time, the LEM2 algorithm induces smaller rule sets with a smaller total number of conditions. The time complexity of both algorithms is the same.

Keywords Rough set theory · Feature selection · LERS data mining system · LEM1 and LEM2 rule induction algorithms

8.1 Introduction

In 1982 an approach to feature selection, under the name of attribute reduction, using rough set theory, was introduced in [26], see also [27, 28]. In the rough set community, reducing the original set of attributes is one of the main and most frequently used techniques.

Feature selection is the process of selecting a subset of relevant features. Research on feature selection, see, e.g., [2, 6, 20-23, 29, 31], includes finding the smallest set of features, thereby improving the efficiency of data processing. Data are presented in tables, with rows labeled as cases (examples or entries) and columns labeled as features (variables or attributes).
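To make the table view concrete, the following minimal sketch searches for the smallest sets of features under one common reading of attribute reduction: a subset of attributes is acceptable if any two cases that agree on that subset also agree on the decision. It is an illustration only, not the LEM1 algorithm. The table contents, the attribute names, and the function names preserves_decision and smallest_subsets are all assumptions introduced here, and the search is a plain exhaustive enumeration.

```python
from itertools import combinations

# Hypothetical decision table: each row is a case, each key a feature;
# "Flu" plays the role of the decision. All values are invented.
CASES = [
    {"Temperature": "high",   "Headache": "yes", "Cough": "yes", "Flu": "yes"},
    {"Temperature": "high",   "Headache": "no",  "Cough": "yes", "Flu": "yes"},
    {"Temperature": "normal", "Headache": "yes", "Cough": "no",  "Flu": "no"},
    {"Temperature": "normal", "Headache": "no",  "Cough": "yes", "Flu": "no"},
    {"Temperature": "high",   "Headache": "yes", "Cough": "no",  "Flu": "yes"},
]
ATTRIBUTES = ["Temperature", "Headache", "Cough"]
DECISION = "Flu"


def preserves_decision(cases, subset, decision):
    """Return True if cases that agree on every attribute in `subset`
    also agree on the decision value."""
    seen = {}
    for case in cases:
        key = tuple(case[a] for a in subset)
        # Remember the first decision seen for this key; any later
        # disagreement means the subset cannot separate the decisions.
        if seen.setdefault(key, case[decision]) != case[decision]:
            return False
    return True


def smallest_subsets(cases, attributes, decision):
    """Return every smallest attribute subset that preserves the decision,
    trying subset sizes in increasing order (exhaustive search)."""
    for size in range(len(attributes) + 1):
        found = [list(c) for c in combinations(attributes, size)
                 if preserves_decision(cases, c, decision)]
        if found:
            return found
    return []  # the full table is itself inconsistent


if __name__ == "__main__":
    print(smallest_subsets(CASES, ATTRIBUTES, DECISION))
```

Because the sketch enumerates subsets by increasing size, its cost grows exponentially with the number of attributes, so it is suitable only for very small tables such as this one.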
