The developed models can be represented by a common notation. Each model
belongs to an approach (A) and is composed of fields (F), a type
of variable (TV), a DM technique (TDM) and a sampling method (SM). The fields
are indexed F_1, F_2, ..., F_14.
This notation for representing DM models makes it possible to present an
example of an implemented model. A DM model following the classification
approach, using the data from the fields F_1, F_2, F_3, F_4, F_5, F_6, F_7, F_10 and
F_11, the SVM technique with RBF kernel and the 10-folds CV sampling method is
expressed by:
A = Classification; F = {F_1, F_2, F_3, F_4, F_5, F_6, F_7, F_10, F_11}; TDM = SVM (RBF kernel); SM = 10-folds CV.
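As an illustration, the example model above can be written as a small record. This is only a sketch: the class name `DMModel`, its attribute names and the `describe` helper are hypothetical and not part of the original notation, and the type of variable (TV) is left unset because the example does not state it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DMModel:
    """One DM model: approach (A), fields (F), type of variable (TV),
    DM technique (TDM) and sampling method (SM). Names are hypothetical."""
    approach: str
    fields: Tuple[str, ...]
    technique: str
    sampling: str
    variable_type: Optional[str] = None  # TV is not given in the example

    def describe(self) -> str:
        # Render the components in a compact, readable form.
        field_list = ", ".join(self.fields)
        return (f"A={self.approach}; F={{{field_list}}}; "
                f"TDM={self.technique}; SM={self.sampling}")


# The classification example described in the text.
example = DMModel(
    approach="Classification",
    fields=("F_1", "F_2", "F_3", "F_4", "F_5", "F_6", "F_7", "F_10", "F_11"),
    technique="SVM (RBF kernel)",
    sampling="10-folds CV",
)
print(example.describe())
```

Keeping the components in one immutable record makes it easy to enumerate and compare many model configurations.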
3.5 Evaluation
To evaluate the results produced by the DM models, three metrics were
considered: Accuracy, Specificity and Sensitivity.
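The three metrics can be computed from the counts of a binary confusion matrix. The sketch below uses the standard definitions (accuracy as the share of correct predictions, sensitivity as the true positive rate, specificity as the true negative rate). The function name `evaluate` and the counts are illustrative assumptions, not values from the study.

```python
def evaluate(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "ACC": (tp + tn) / total,  # correct predictions over all cases
        "SEN": tp / (tp + fn),     # true positive rate
        "SPE": tn / (tn + fp),     # true negative rate
    }


# Illustrative counts only, not taken from the study.
metrics = evaluate(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```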
The dataset used in the training phase was divided into mutually exclusive subsets
through 10-folds CV. Each dividing procedure was executed ten times, giving
about 100 experiments for each test. Table 2 presents the four best models obtained and their
metrics: Accuracy (ACC), Sensitivity (SEN) and Specificity (SPE).
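One way to read the procedure above is ten repetitions of a 10-fold split, which yields 100 train/test experiments. The following is a minimal sketch under that assumption, using only the standard library. The function names, the per-repetition seed and the sample count of 250 are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k mutually exclusive folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def repeated_cv_splits(n_samples, k=10, repetitions=10):
    """Yield (train, test) index lists for every fold of every repetition."""
    for rep in range(repetitions):
        folds = k_fold_indices(n_samples, k, seed=rep)  # new shuffle per repetition
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, test


# Hypothetical dataset size; 10 repetitions x 10 folds gives 100 experiments.
splits = list(repeated_cv_splits(n_samples=250))
print(len(splits))
```

Each test fold is disjoint from its training set, so every sample is used for testing exactly once per repetition.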
Table 2. Valuation models
Model   Fields   TDM   SM   ACC      SEN      SPE
                            98.71%   99.33%   94.95%
                            95.52%   98.28%   78.72%
                            96.54%   98.02%   87.55%
                            96.77%   98.83%   87.74%
The technique that provided the best result was the SVM with Linear kernel.
However, the kernel was also extremely useful for decision trees, demonstrating which
fields had greater importance/relevance to the creation of the models. Naive Bayes
techniques did not show much relevance in these models because they were always