The developed models can be represented by the following expression:
{A; F; TV; TDM; SM}.
The model belongs to an approach (A) and is composed of fields (F), a type
of variable (TV), a DM technique (TDM) and a sampling method (SM):
F = {F_1, F_2, F_3, F_4, F_5, F_6, F_7, F_8, F_9, F_10, F_11, F_12, F_13, F_14}
, , , ,
10
This notation for representing DM models makes it possible to present an
example of an implemented model. A DM model following the classification
approach, using the data from the fields F_1, F_2, F_3, F_4, F_5, F_6, F_7, F_10 and
F_11, the SVM technique with RBF kernel and the 10-folds CV sampling method is
expressed by:
{Classification; F_1, F_2, F_3, F_4, F_5, F_6, F_7, F_10, F_11; TV; SVM (RBF kernel); 10-folds CV}.
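As an illustration of the notation, the five components of a model (A, F, TV, TDM, SM) can be held in a small record type. The sketch below is a minimal one under stated assumptions: the class name `DMModel`, its attribute names and the `notation` helper are hypothetical and do not come from the original work, and the type of variable is left unset because the text does not state it for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DMModel:
    """Hypothetical container for the five components of a DM model."""
    approach: str                  # A: the DM approach, e.g. classification
    fields: Tuple[int, ...]        # F: indices of the fields used (F_1 ... F_14)
    variable_type: Optional[str]   # TV: type of variable, None when not stated
    technique: str                 # TDM: the DM technique
    sampling: str                  # SM: the sampling method

    def notation(self) -> str:
        """Render the model as a semicolon-separated expression."""
        fields = ", ".join(f"F_{i}" for i in self.fields)
        # Keep the symbol TV as a placeholder when no value was given.
        tv = self.variable_type if self.variable_type is not None else "TV"
        return (f"{{{self.approach}; {fields}; {tv}; "
                f"{self.technique}; {self.sampling}}}")


# The example model described in the text (TV is not given there).
example = DMModel(
    approach="Classification",
    fields=(1, 2, 3, 4, 5, 6, 7, 10, 11),
    variable_type=None,
    technique="SVM (RBF kernel)",
    sampling="10-folds CV",
)
print(example.notation())
```

Keeping the fields as indices rather than names makes it easy to compare which subsets of F_1 to F_14 two models share.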
3.5 Evaluation
In order to evaluate the results presented by the DM models three metrics were
considered: Accuracy, Specificity and Sensitivity.
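The three metrics follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions (accuracy as the share of correct predictions, sensitivity as the true positive rate, specificity as the true negative rate). The function name `evaluation_metrics` and the counts in the usage lines are illustrative assumptions, not values from the study.

```python
def evaluation_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute accuracy, sensitivity and specificity from confusion counts.

    tp/tn: correctly predicted positives/negatives
    fp/fn: incorrectly predicted positives/negatives
    """
    total = tp + tn + fp + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("each class needs at least one observation")
    return {
        "ACC": (tp + tn) / total,   # share of all cases classified correctly
        "SEN": tp / (tp + fn),      # share of actual positives detected
        "SPE": tn / (tn + fp),      # share of actual negatives detected
    }


# Illustrative counts only (not taken from the paper).
metrics = evaluation_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are unbalanced, since a model can reach high accuracy while missing most cases of the minority class.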
The dataset used in the training phase was divided into mutually exclusive subsets
through 10-folds CV. Each dividing procedure was executed ten times. About 100
experiments were performed for each test. Table 2 presents the four best models
obtained and their metrics: Accuracy (ACC), Sensitivity (SEN) and Specificity (SPE).
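One way to read the figure of about 100 experiments is as ten executions of a 10-fold split, giving 10 x 10 train/test pairs; that reading is an assumption, since the text does not spell out the arithmetic. The sketch below generates such splits with the Python standard library only. The function name `repeated_kfold_indices`, the seed and the sample size are hypothetical.

```python
import random


def repeated_kfold_indices(n_samples: int, k: int = 10,
                           repeats: int = 10, seed: int = 0):
    """Yield (run, fold, train_idx, test_idx) for repeated k-fold CV.

    Each run reshuffles the sample indices and cuts them into k
    mutually exclusive folds; every fold serves once as the test set.
    """
    rng = random.Random(seed)
    for run in range(repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        # Striding by k yields k disjoint folds of near-equal size.
        folds = [indices[i::k] for i in range(k)]
        for fold_id, test_idx in enumerate(folds):
            held_out = set(test_idx)
            train_idx = [i for i in indices if i not in held_out]
            yield run, fold_id, train_idx, test_idx


# Assumed sample size, for illustration only.
splits = list(repeated_kfold_indices(n_samples=50))
print(len(splits))  # number of train/test pairs: repeats * k
```

Each pair would then be used to train one model and compute its metrics, with the values averaged over all pairs for a given test.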
Table 2. Valuation models
Model   Fields   TDM   SM   ACC      SEN      SPE
                            98.71%   99.33%   94.95%
                            95.52%   98.28%   78.72%
                            96.54%   98.02%   87.55%
                            96.77%   98.83%   87.74%
The technique that provided the best result was the SVM with a Linear kernel.
However, decision trees were also extremely useful for showing which
fields had the greatest importance/relevance to the creation of the models. Naive Bayes
techniques did not show much relevance in these models because they were always
 