The goal is to predict a Plateau Pressure class, although this pressure is a quantitative variable. The variable was therefore divided into two classes, in accordance with scientific studies and ICU physicians: values below 30 cmH2O are classified as normal.
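As an illustration, the discretization described above could be written in R as follows. This is a minimal sketch: the data frame, the column name plateau_pressure, and the label for the non-normal class are assumptions for illustration, not taken from the study.

    # Hypothetical ICU data; the study's real dataset and column names differ.
    icu <- data.frame(plateau_pressure = c(18, 25, 31, 42))

    # Values below 30 cmH2O are labelled "normal"; the second label is assumed.
    icu$plateau_class <- cut(icu$plateau_pressure,
                             breaks = c(-Inf, 30, Inf),
                             labels = c("normal", "abnormal"),
                             right  = FALSE)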
3.4 Modelling
The data mining techniques used to induce the classification models were Support Vector Machines (SVM), Decision Trees (DT) and Naive Bayes (NB). The choice of these techniques was based on two characteristics: interpretability and efficiency. SVM meets only the second characteristic, whereas DT and NB meet both. To implement the evaluation mechanisms and to test the induced models, 10-fold Cross-Validation (10-fold CV) was applied. 10-fold CV was adopted due to the good results it has demonstrated on multidisciplinary data [13]. All techniques underwent tuning with the tune function provided by the e1071 package. Its main objective is to perform a grid search over previously supplied ranges of hyperparameter values and sequentially identify the best model and its hyperparameters.
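A minimal R sketch of this tuning procedure is given below, using e1071's tune function with 10-fold cross-validation. The built-in iris data stands in for the study's ICU dataset, and the grid shown is purely illustrative.

    library(e1071)

    # 10-fold cross-validation, as adopted in the study.
    ctrl <- tune.control(sampling = "cross", cross = 10)

    # tune() performs a grid search: it fits one model per hyperparameter
    # combination and keeps the one with the lowest cross-validated error.
    tuned <- tune(svm, Species ~ ., data = iris,
                  ranges = list(cost = 2^(0:4), gamma = 2^(-1:1)),
                  tunecontrol = ctrl)
    summary(tuned)         # error for every grid point
    tuned$best.parameters  # hyperparameters of the winning model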
The use of the SVM technique is based on the application of two kernels: Linear and Radial Basis Function (RBF). The two kernels require different parameterizations because the hyperparameters differ for each kernel. Depending on the kernel used by the SVM, a range of values for the parameter C was defined. Its range was given by the powers of two, 2^n, with n = 0, …, 16. The cost parameter C introduces some flexibility when separating the categories, in order to control the trade-off between training errors and rigid margins [14]. The hyperparameter Gamma (γ) was defined in the same way as C, its range being given by the powers 2^n with n ∈ {−1, 0, 1}, i.e., γ ∈ {0.5, 1, 2}. This parameterization was used only in the RBF kernel. The γ value determines the curvature of the decision boundary [15].
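The kernel-specific grids could be reproduced with e1071's tune.svm, as sketched below. Iris again replaces the ICU data, and the exponent range for C is reconstructed from the text, so it may differ from the original study.

    library(e1071)

    cost.grid  <- 2^(0:16)   # C = 2^n, n = 0, ..., 16 (reconstructed range)
    gamma.grid <- 2^(-1:1)   # gamma in {0.5, 1, 2}, RBF kernel only

    # Linear kernel: only the cost parameter C is tuned.
    svm.linear <- tune.svm(Species ~ ., data = iris,
                           kernel = "linear", cost = cost.grid)

    # RBF kernel: C and gamma are tuned jointly.
    svm.radial <- tune.svm(Species ~ ., data = iris,
                           kernel = "radial", cost = cost.grid,
                           gamma  = gamma.grid)

    svm.linear$best.parameters
    svm.radial$best.parameters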
The application of the DT technique was achieved with the CART algorithm. Two attribute selection (splitting) measures were applied: Information Gain (IG) and the Gini Index (GI). The measure IG determines the attribute with the highest information gain and uses it to split a node [16]. IG is defined as the difference between the original information requirement (i.e., based only on the proportion of the classes) and the new requirement (i.e., obtained after partitioning on A). The difference can be expressed as follows:

Gain(A) = Info(D) − Info_A(D).

The attribute A with the highest information gain, Gain(A), is chosen as the splitting attribute for node N [16]. The objective of GI is to compute a value for each attribute and to select, for the node, the attribute with the lowest impurity index [1]. The GI measures the impurity of D, a data partition or training set of tuples, as

Gini(D) = 1 − Σ_{i=1}^{m} p_i²,

where p_i corresponds to the probability that a tuple in D belongs to class C_i, estimated by |C_{i,D}| / |D|, and the sum is computed over the m classes [16]. For instance, a node holding two equally frequent classes has Gini(D) = 1 − (0.5² + 0.5²) = 0.5, the maximum impurity for m = 2.
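In R, both splitting criteria are available through the rpart package, a standard CART implementation; the study does not name a specific package, so this mapping is an assumption.

    library(rpart)

    # CART with the Gini Index (rpart's default splitting criterion).
    dt.gini <- rpart(Species ~ ., data = iris, method = "class",
                     parms = list(split = "gini"))

    # CART with Information Gain (the "information" splitting criterion).
    dt.info <- rpart(Species ~ ., data = iris, method = "class",
                     parms = list(split = "information"))

    printcp(dt.gini)  # complexity table of the Gini-based tree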
Finally, no hyperparameter configuration was required for the NB algorithm but, as described earlier, the tune function was still applied to it in order to use the same sampling method; all of its configuration was determined beforehand.
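A corresponding sketch for NB is given below. It assumes that e1071's generic tune accepts a method without a ranges grid and then simply reports the cross-validated error; if that assumption does not hold in a given e1071 version, naiveBayes can be fitted and validated directly instead.

    library(e1071)

    # Naive Bayes with no hyperparameter grid: tune() is used only to apply
    # the same 10-fold CV sampling as for the other techniques (assumption).
    nb.cv <- tune(naiveBayes, Species ~ ., data = iris,
                  tunecontrol = tune.control(sampling = "cross", cross = 10))
    nb.cv$best.performance  # 10-fold cross-validated error of the NB model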