size, without pruning. The tree is generated by the classification and regression tree (CART) algorithm [23], with the modification that a random feature subset is considered at each split. The random tree learning procedure and the combination method are the same as in RF. Note that RF can also be used as a normal learning algorithm after balancing the training data by under-sampling or over-sampling; in that usage, however, the ensemble method itself is not used to handle the imbalanced data, so it is different from BRF.
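To make this distinction concrete, the following is a minimal sketch (assuming binary labels with 1 as the minority class, and using scikit-learn and imbalanced-learn as stand-in tools; the function and parameter names are illustrative, not part of the chapter) of RF used as a normal learner on data balanced once by under-sampling, with no imbalance handling inside the ensemble itself:

```python
from imblearn.under_sampling import RandomUnderSampler
from sklearn.ensemble import RandomForestClassifier

def rf_on_balanced_data(X, y, n_trees=100, seed=0):
    # Balance the training data once by random under-sampling ...
    X_bal, y_bal = RandomUnderSampler(random_state=seed).fit_resample(X, y)
    # ... then train an ordinary random forest on the balanced set;
    # the ensemble itself plays no special role in handling the imbalance.
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    return rf.fit(X_bal, y_bal)
```

In BRF, by contrast, each tree is built on its own balanced bootstrap sample, so the imbalance is handled inside the ensemble procedure rather than by a single preprocessing step.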
Bagging-style class imbalance methods also have other variants, such as those in [24, 25]; these are very similar to the methods introduced earlier.
4.3.2 Boosting-Based Methods
Boosting-based CIL methods place more focus on the minority class examples, either by increasing the number of minority class examples or decreasing the number of majority class examples in each boosting round, as in SMOTEBoost [4] and RUSBoost [7], and/or by directly rebalancing the weight distribution, as in DataBoost-IM [8].
4.3.2.1 SMOTEBoost SMOTEBoost adopts a boosting architecture (AdaBoost.M2) to improve SMOTE, a very popular over-sampling method [4]. In each round of boosting, SMOTE is used to generate synthetic examples for the minority class, and the weight distribution is then adjusted accordingly, so the weights of the minority class examples become higher. It can also be used in multiclass cases. The algorithm is outlined below.
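The following is only a rough sketch of the idea for the binary case (1 as the minority class); it uses AdaBoost.M1-style weight updates instead of the original AdaBoost.M2 formulation and imbalanced-learn's SMOTE, and all names are assumptions rather than the authors' exact procedure:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from imblearn.over_sampling import SMOTE

def smoteboost_sketch(X, y, n_rounds=10, seed=0):
    # Boosting weights are maintained on the original examples only.
    n = len(y)
    w = np.full(n, 1.0 / n)
    rng = np.random.RandomState(seed)
    learners, alphas = [], []
    for _ in range(n_rounds):
        # Over-sample the minority class with SMOTE for this round; the
        # synthetic examples are used only to train the weak learner.
        # (The real SMOTEBoost also trains the weak learner according to
        # the current weight distribution; that detail is omitted here.)
        X_res, y_res = SMOTE(random_state=rng).fit_resample(X, y)
        stump = DecisionTreeClassifier(max_depth=2).fit(X_res, y_res)
        pred = stump.predict(X)
        err = w[pred != y].sum()
        if err == 0 or err >= 0.5:
            break
        beta = err / (1.0 - err)
        # Down-weight correctly classified examples; misclassified (often
        # minority) examples keep their weight and so gain after normalization.
        w[pred == y] *= beta
        w /= w.sum()
        learners.append(stump)
        alphas.append(np.log(1.0 / beta))
    return learners, alphas  # final prediction: weighted vote over learners
```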
4.3.2.2 RUSBoost RUSBoost is very similar to SMOTEBoost; the only difference between them is that RUSBoost uses random under-sampling to remove majority class examples in each boosting round [7].
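Under the same simplified skeleton as above, the RUSBoost variant would swap the per-round SMOTE call for random under-sampling of the majority class; a stand-in helper (again assuming binary labels with 1 as the minority class, names illustrative) might look like:

```python
import numpy as np

def random_undersample(X, y, rng):
    # Keep every minority (label 1) example and an equally sized random
    # subset of majority (label 0) examples for this boosting round.
    minority = np.flatnonzero(y == 1)
    majority = np.flatnonzero(y == 0)
    kept = rng.choice(majority, size=len(minority), replace=False)
    idx = np.concatenate([minority, kept])
    return X[idx], y[idx]
```

In the earlier sketch, this call would replace the line that invokes SMOTE.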
4.3.2.3 DataBoost-IM DataBoost-IM designs two strategies to focus not only on the misclassified examples but also on the minority class examples [8]. One is to select "hard" examples (examples that are easily misclassified) from the majority class and the minority class separately and to generate synthetic examples that are added into the training data; the other is to balance the total weights of
the majority class and the minority class. In detail, in each round, the N_s = n × err examples with the highest weights in the current training data are selected, where n is the size of the training data and err is the error rate of the current weak learner; among them, N_smaj are majority class examples and N_smin are minority class examples. From these hard examples, M_L = min(n−/n+, N_smaj) majority class examples and M_s = min((n− × M_L)/n+, N_smin) minority class examples are selected as seed examples, where n− and n+ denote the numbers of majority and minority class examples. Then, synthetic examples are generated from the seed examples for the majority class and the minority class separately, each synthetic example receiving an initial weight equal to its seed example's weight divided by the number of synthetic examples generated from that seed, and the synthetic examples are
added into the training data. Thus, the hard minority class examples outnumber the hard majority class examples. Finally, the total weights of the majority class and the minority class are rebalanced.
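To make the counting above concrete, the following is a small sketch of just the seed-selection step (binary labels assumed, with 1 as the minority class; names are illustrative), omitting the generation of synthetic examples and the final weight rebalancing:

```python
import numpy as np

def select_databoost_seeds(w, y, err):
    # w: current boosting weights, y: labels in {0, 1}, err: current error rate.
    w, y = np.asarray(w), np.asarray(y)
    n = len(y)
    n_min = int(np.sum(y == 1))          # n+ : number of minority examples
    n_maj = n - n_min                    # n- : number of majority examples
    # Hard examples: the N_s = n * err examples with the highest weights.
    n_s = int(np.ceil(n * err))
    hard = np.argsort(w)[::-1][:n_s]
    hard_maj = hard[y[hard] == 0]        # the N_smaj hard majority examples
    hard_min = hard[y[hard] == 1]        # the N_smin hard minority examples
    # Seed counts: M_L = min(n-/n+, N_smaj), M_s = min((n- * M_L)/n+, N_smin).
    m_l = int(min(n_maj / n_min, len(hard_maj)))
    m_s = int(min(n_maj * m_l / n_min, len(hard_min)))
    return hard_maj[:m_l], hard_min[:m_s]
```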