A Novel Model Using Generalized Regression Neural Network (GRNN) for Estimating Sleep Apnea Index in the Elderly Suffering from Sleep Disturbance Part 1

Abstract

Objective: The main objective of this paper is to present a novel model for classifying senior patients into different apnea/hypopnea index (AHI) categories based on their clinical variables.

Methods and materials: The proposed model is a generalized regression neural network (GRNN). Three important variables were first selected from the original 30 clinical variables. The GRNN was trained using 75 patients randomly selected from the 117 patients. The remaining 42 patients were used for testing the GRNN model. The design parameter of the network, i.e., the spread of the radial basis function, was empirically optimized. To reduce model complexity, the original AHI values were dichotomized into two groups, i.e., AHI>13 and AHI<=13. The use of GRNN for this application appears fairly novel, notwithstanding the extensive literature on predicting obstructive sleep apnea (OSA) syndrome from demographic or other easily assessed clinical variables.

Results: The proposed model has a sensitivity and specificity of 95.7% and 50.0%, respectively, for the training cases, and 88.0% and 52.9%, respectively, for the testing cases.

Conclusions: The proposed neural network model outperformed the compared classification approaches (linear and logistic regression) in terms of classification accuracy and generalization. It could therefore potentially be used in clinical applications, which would reduce the need for in-laboratory nocturnal sleep studies.


Keywords: AHI, sleep apnea, elderly, GRNN, ROC

Abbreviations

AHI = apnea/hypopnea index

AUC = area under curve

BMI = body mass index

ROC = receiver operator characteristics

ANN = artificial neural network

GRNN = generalized regression neural network

NC = neck circumference

NN = neural network

OSA = obstructive sleep apnea

Introduction

Sleep-related breathing disorder (SRBD) is present in 4% of men and 2% of women above 40 years of age. However, fewer than 3% of patients with SRBD syndrome are diagnosed, owing to a lack of awareness of the disease among health care practitioners and patients. Polysomnography (PSG) has been used as the gold standard for diagnosing SRBD; however, this test is available only in selected centers. [1-5]

Studies using neural network techniques to predict OSA were fairly sparse until recent years. In 2005, Fontenla et al. [6] presented a novel approach for sleep apnea classification. Their goal was to classify each apnea into one of three basic types: obstructive, central, and mixed. [6]

More recently, in 2007, Liu et al. [7, 8] developed an innovative signal classification method capable of differentiating subjects with sleep disorders that cause excessive daytime sleepiness (EDS) from normal control subjects who do not have a sleep disorder. The aim of their study was to develop an artificial neural network to predict sleep disordered breathing in the elderly.

Clinical Subjects and Materials

The data were collected from 1 January 2002 to 31 January 2003 at the Sleep Medicine Center, Changhua Christian Medical Centre, Taiwan. Patient confidentiality was maintained, and access to patient records was approved by the ethics committee of Changhua Christian Medical Centre. Among the subjects who underwent nocturnal polysomnography (PSG), no patients with heart failure or chronic obstructive lung disease were admitted. Data from subjects younger than 65 years were also excluded. As a result, the clinical data included a total of 124 elderly subjects aged 65 to 88.5 years. Of the 124 subjects, 117 had both weight and height recorded, from which body mass index (BMI) was calculated. The apnea/hypopnea index (AHI), defined as the number of apnea/hypopnea events per hour of sleep, was measured from PSG, which documented the objective sleep criteria. Although PSG has been the gold standard for the diagnosis of obstructive sleep apnea syndrome (OSAS), it is highly invasive, time-consuming, and expensive.

The female-to-male ratio in this data set is 1:1.7. The mean age of the male subjects is 71.6 ± 4.47 years, and that of the female subjects is 72.3 ± 5.47 years. All subjects had a chief complaint of sleep disturbance.

The reasons for selecting this elderly age group are as follows. First, sleep in this age group is a rather understudied field. All patients are vulnerable, but the elderly are especially so. Sleep studies in this group are difficult because of the problems faced in recruiting volunteers, as well as the philosophical and theological undertones that people in general associate with ‘the sleep when they are at the end stage of their life’. There is also a general misconception about studying sleep in ‘those who haven’t many years left anyhow’. Second, more than half of community-living people aged 65 years and over experience sleep disturbances. Third, sleep onset is often reported to be more difficult, and nighttime awakenings more prevalent, in the elderly.

The clinical data come from the corresponding author’s own study of 124 elderly subjects aged 65 to 88.5 years. These data, predominantly those on total sleep time (TST) and excluding those related to body height and its distribution, have been reported elsewhere [9].

Nocturnal in-Laboratory Polysomnography

Nocturnal polysomnography (PSG) (Alice 4 Sleep Diagnostic System, Respironics, Carlsbad, Calif., USA) was performed from about 9:30 pm to 6:30 am the next morning in the sleep laboratory. The following parameters were measured and recorded by the PSG: (i) chest and abdominal wall motion by uncalibrated respiratory inductance plethysmography; (ii) heart rate by ECG; (iii) inspired and end-tidal carbon dioxide pressure (PETCO2), sampled at the nose or mouth at a rate of 60 mL/min by mass spectrometry (model 1100 Medical Gas Analyzer, Perkin Elmer, Pomona, Calif., USA) or by capnography (model 1000 Capnograph, Nellcor, Hayward, Calif., USA); (iv) combined oral and nasal airflow, sampled with a three-pronged thermistor placed at the upper lip; (v) arterial oxygen saturation by pulse oximetry (model N 200, Nellcor, Hayward, Calif., USA); (vi) oximeter pulse waveform; (vii) electro-oculogram; (viii) EEG in overnight PSG; (ix) chin electromyogram; (x) actigraphy (placed on the hand); and (xi) a microphone placed over the neck to monitor snoring. The transducers and lead wires permitted normal positional changes during sleep. Bedtime and awakening time were at each subject’s discretion; the PSG was terminated after the final awakening.

Clinical Classification of Obstructive Sleep Apnea Syndrome

Apnea was defined as a decrease in airflow of > 90% for a minimum of 10 seconds. Hypopnea was defined as a > 30% decrease in airflow, with desaturation requiring a > 3% decrease in oxygen saturation, for a minimum of 10 seconds. The apnea/hypopnea index (AHI) was calculated as the sum of apneas and hypopneas divided by the nocturnal hours of sleep.
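The AHI definition above is a simple ratio, and a minimal sketch of it is given here. The function name and the event counts are hypothetical and are not taken from the study data.

```python
def apnea_hypopnea_index(n_apneas, n_hypopneas, sleep_hours):
    """Events per hour of sleep: (apneas + hypopneas) / hours asleep."""
    if sleep_hours <= 0:
        raise ValueError("sleep_hours must be positive")
    return (n_apneas + n_hypopneas) / sleep_hours


# Hypothetical counts for illustration only.
example_ahi = apnea_hypopnea_index(n_apneas=40, n_hypopneas=50, sleep_hours=6.0)
print(example_ahi)  # 15.0
```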

Based on the protocol of the American Academy of Sleep Medicine Task Force (1999) [11], the degree of severity of sleep apnea is defined in Table 1.

Table 1. Degrees of Severity of Sleep Apnea (elevated AHI)

| Sleep variables with Apnea (changes of AHI) | The Degree |
|---|---|
| Apnea (AHI<5) | Zero degree |
| Apnea (AHI 5~15) | First degree |
| Apnea (AHI 15~30) | Second degree |
| Apnea (AHI>30) | Third degree |
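The grading in Table 1 can be sketched as a small lookup. Table 1 does not say which band the boundary values 15 and 30 belong to, so this sketch assumes they fall in the lower band. The function name is hypothetical, and the example AHI values are those of the five misclassified cases in Table 4.

```python
def apnea_degree(ahi):
    """Map an AHI value to the degree of severity listed in Table 1.

    Assumption: boundary values 15 and 30 are assigned to the lower band,
    since Table 1 writes the bands as <5, 5~15, 15~30 and >30.
    """
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return 0  # zero degree
    if ahi <= 15:
        return 1  # first degree
    if ahi <= 30:
        return 2  # second degree
    return 3      # third degree


misclassified_ahi = [17.6, 16.2, 40.4, 14.2, 25.5]  # RDI/T column of Table 4
degrees = [apnea_degree(a) for a in misclassified_ahi]
print(degrees)  # [2, 2, 3, 1, 2]
```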

Sleep staging followed the criteria of Rechtschaffen et al. (1963) [12].

Methods

The generalized regression neural network (GRNN) is a special type of neural network. A GRNN is a universal approximator that can approximate a continuous function to arbitrary accuracy, given a sufficient number of neurons [14]. Compared with conventional multilayer perceptron networks, the GRNN has several advantages: 1) it can accurately approximate functions from sparse and noisy data; 2) it converges to the conditional mean surface as the number of data samples increases; 3) it has only one design parameter (the spread factor); and 4) it is easy to train. These advantages led us to choose the GRNN as our model for predicting OSA syndrome.
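To make the single-parameter nature of the GRNN concrete, the following is a minimal sketch of the standard kernel-weighted-average formulation: the output is the average of the training targets, weighted by a Gaussian function of the distance to each training pattern. It is not the implementation used in this study. The function name, the toy data, and the exact scaling of the spread inside the exponent are assumptions, and software packages may scale the spread differently.

```python
import math


def grnn_predict(train_x, train_y, query, spread):
    """GRNN output for one query point (kernel-weighted average of targets).

    Each training pattern gets the weight exp(-d^2 / (2 * spread^2)),
    where d^2 is its squared Euclidean distance to the query.
    """
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * spread ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed (query far from every pattern relative to
        # the spread); fall back to the plain mean. This is a design choice.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Toy data (hypothetical): one input feature, binary targets.
toy_x = [[0.0], [1.0], [2.0], [3.0]]
toy_y = [0.0, 0.0, 1.0, 1.0]
midpoint_output = grnn_predict(toy_x, toy_y, [1.5], spread=0.5)
print(midpoint_output)  # approximately 0.5, by symmetry of the toy data
```

"Training" here amounts to storing the patterns, which is why the spread is the only quantity left to tune: a small spread makes the output follow the nearest patterns, while a large spread pulls it toward the overall mean of the targets.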

In this study, the single design parameter of the GRNN, i.e., the spread factor, was obtained via empirical optimization. The input variables to the GRNN model were also determined empirically, based on classification performance. The three variables used in our final model are BMI, neck circumference (NC), and nocturnal total sleep time (TST). It is worth pointing out that the TST values used in our model were dichotomized into two levels, <=6 hours and >6 hours, before being input to the model. It is also interesting to note that including age as an input did not improve model performance.
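The text does not describe how the empirical optimization of the spread was carried out. One common way to do it, shown here purely as an assumption, is a grid search that keeps the candidate spread with the lowest leave-one-out error on the training cases. All names, the candidate grid, and the toy data are hypothetical.

```python
import math


def grnn_predict(train_x, train_y, query, spread):
    """Gaussian-weighted average of training targets (GRNN output)."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, query)) / (2.0 * spread ** 2))
        for x in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def loo_error(xs, ys, spread):
    """Mean squared leave-one-out prediction error for one spread value."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (grnn_predict(rest_x, rest_y, xs[i], spread) - ys[i]) ** 2
    return total / len(xs)


def pick_spread(xs, ys, candidates):
    """Return the candidate spread with the smallest leave-one-out error."""
    return min(candidates, key=lambda s: loo_error(xs, ys, s))


# Toy, well-separated data (hypothetical), so the smallest spread wins.
toy_x = [[0.0], [0.2], [0.4], [1.6], [1.8], [2.0]]
toy_y = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
best_spread = pick_spread(toy_x, toy_y, candidates=[0.1, 0.5, 5.0])
print(best_spread)  # 0.1
```

The same loop could score candidate subsets of input variables instead of candidate spreads, which is one way to read "determined empirically, based on classification performance".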

To reduce model complexity, the original AHI values (the dependent variable of our model) were also dichotomized into two groups, i.e., AHI>13 and AHI<=13. That is, our GRNN model is designed to perform two-class classification.
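The two dichotomizations (TST at 6 hours and AHI at 13) can be sketched as follows. The 0/1 coding, the function names, and the example values are assumptions made for illustration; the text only states that each variable is reduced to two levels.

```python
def encode_inputs(bmi, neck_circumference, tst_hours):
    """Return the three model inputs, with TST reduced to a two-level flag."""
    tst_flag = 1 if tst_hours > 6 else 0  # assumed coding: <=6 h -> 0, >6 h -> 1
    return [bmi, neck_circumference, tst_flag]


def encode_target(ahi, cutoff=13):
    """Two-class target: 1 for AHI > cutoff, 0 for AHI <= cutoff."""
    return 1 if ahi > cutoff else 0


# Hypothetical subject, not taken from the study data.
features = encode_inputs(bmi=24.0, neck_circumference=36.0, tst_hours=5.5)
target = encode_target(ahi=17.6)
print(features, target)  # [24.0, 36.0, 0] 1
```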

Results

The 117 cases were randomly split into two disjoint subsets: 75 cases for training and 42 cases for testing (validation).
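A random, disjoint 75/42 split of this kind can be sketched as below. The seed and the use of integer case identifiers are assumptions; the text does not state how the random selection was implemented.

```python
import random


def split_cases(case_ids, n_train, seed):
    """Shuffle the case identifiers and cut them into two disjoint subsets."""
    rng = random.Random(seed)  # fixed seed so the split can be repeated
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# 117 cases in total, 75 for training and the remaining 42 for testing.
train_ids, test_ids = split_cases(range(117), n_train=75, seed=0)
print(len(train_ids), len(test_ids))  # 75 42
```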

To evaluate the GRNN model, the following performance metrics are used in this study: 1) accuracy; 2) sensitivity/specificity; 3) PPV/NPV; 4) kappa statistic; and 5) AUC of ROC.

GRNN Performance

Figure 1 shows the ROC curves of the GRNN model for the training and testing sets. For the training set, the area under the curve (AUC) of the ROC is 0.8405, with a 95% confidence interval from 0.8304 to 0.8506. For the testing set, the AUC is 0.751, with a 95% confidence interval from 0.728 to 0.774.


Figure 1. ROC curve for the training set
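For readers who want to reproduce an AUC from model outputs, the following sketch uses the rank interpretation of the AUC: the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The text does not state how the AUC or its confidence interval was computed, so this is only one standard way to obtain the point estimate. The scores below are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the share of positive-negative pairs that are ordered correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correctly ordered pair
    return wins / (len(positives) * len(negatives))


# Hypothetical model outputs; 1 = AHI > 13, 0 = AHI <= 13.
toy_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(round(toy_auc, 3))  # 0.889 (8 of 9 pairs ordered correctly)
```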

Because the desired sensitivity and specificity are not known in advance, we chose a decision threshold of 0.4 for the GRNN model, which gives a sensitivity and specificity of 95.7% and 50%, respectively, for the training set. The confusion matrices corresponding to the decision threshold of 0.4 are given in Table 2, from which the other performance measures are derived. Those performance measures are summarized in Table 3.
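Turning continuous GRNN outputs into a confusion matrix at a fixed threshold can be sketched as follows. Whether an output exactly equal to the threshold counts as positive is not stated in the text; this sketch assumes it does. The scores are hypothetical.

```python
def confusion_counts(scores, labels, threshold=0.4):
    """Count TP, FN, FP and TN when score >= threshold is called positive."""
    counts = {"tp": 0, "fn": 0, "fp": 0, "tn": 0}
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold  # assumed tie handling
        if label == 1:
            counts["tp" if predicted_positive else "fn"] += 1
        else:
            counts["fp" if predicted_positive else "tn"] += 1
    return counts


# Hypothetical model outputs; 1 = AHI > 13, 0 = AHI <= 13.
toy_scores = [0.9, 0.45, 0.3, 0.5, 0.2]
toy_labels = [1, 1, 1, 0, 0]
toy_counts = confusion_counts(toy_scores, toy_labels)
print(toy_counts)  # {'tp': 2, 'fn': 1, 'fp': 1, 'tn': 1}
```

Lowering the threshold trades specificity for sensitivity, which fits a screening setting where missing a subject with elevated AHI is the more costly error.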

Table 2. Confusion matrix of the training and testing sets


Table 3. Summary of performance metrics of GRNN model

| Performance metrics | Training set | Testing set |
|---|---|---|
| Accuracy | 0.787 | 0.738 |
| Sensitivity | 0.957 | 0.880 |
| Specificity | 0.500 | 0.529 |
| Positive predictive value | 0.763 | 0.733 |
| Negative predictive value | 0.875 | 0.750 |
| Kappa | 0.501 | 0.430 |
| Kappa (95% CI) | [0.302, 0.700] | [0.154, 0.705] |
| AUC | 0.8405 | 0.751 |
| AUC (95% CI) | [0.830, 0.851] | [0.728, 0.774] |

Table 3 helps to show how often subjects with moderate to severe OSAHS (the second and third degrees), who badly need PSG, were misclassified as AHI <=13 by the model: the rate is 12% (3 of 25) in the testing set. Of the five subjects misclassified as AHI <=13 (two in the training set and three in the testing set), only one had AHI > 25 and one other had AHI > 40. The remaining three all had AHI < 18 per hour.

From Table 2 one can observe the following. Of the 47 training subjects whose AHI measured in the nocturnal sleep study with in-laboratory PSG is greater than 13, 45 were correctly classified, whereas 2 were misclassified as having AHI less than or equal to 13. Of the 42 testing subjects, 25 have AHI greater than 13. Of those 25 subjects, our model correctly classifies 22 and misclassifies 3, which gives a sensitivity of 88.0%. Similarly, of the 17 subjects whose AHI is less than or equal to 13, our model correctly classifies 9 and misclassifies 8, which yields a specificity of 52.9%.
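The testing-set figures quoted above, and the corresponding column of Table 3, follow directly from the four counts given in the text (22 and 3 among subjects with AHI > 13, 9 and 8 among subjects with AHI <= 13). The sketch below recomputes them. It assumes that the kappa statistic is Cohen's kappa, which reproduces the reported value; the function name is hypothetical.

```python
def summarize(tp, fn, fp, tn):
    """Standard two-class performance measures from confusion-matrix counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance from the row and column totals.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (accuracy - p_chance) / (1.0 - p_chance),  # Cohen's kappa (assumed)
    }


# Testing-set counts as stated in the text.
testing = summarize(tp=22, fn=3, fp=8, tn=9)
for name, value in testing.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.738, sensitivity: 0.880, specificity: 0.529,
# ppv: 0.733, npv: 0.750, kappa: 0.430
```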

Characteristics of the two misclassified training subjects are listed in Table 4, from which one can see that both subjects are women and that their AHIs are 17.6 and 16.2, respectively. Characteristics of the three testing subjects whose AHI is greater than 13 but who were misclassified as having AHI less than or equal to 13 are also listed in Table 4.


Table 4. Misclassified cases

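| Set | Sex | Age | Height | Weight | BMI | NC | Latency | TST | Snore | RDI/T |
|---|---|---|---|---|---|---|---|---|---|---|
| Training | F | 64.85 | 150 | 48.2 | 21.42 | 34 | 12 | 351 | 534 | 17.6 |
| Training | F | 72.96 | 150 | 54.3 | 24.13 | 35 | 8 | 247.5 | 0 | 16.2 |
| Testing | F | 72.53 | 164 | 57.5 | 21.38 | 34.5 | 6.5 | 315 | 674 | 40.4 |
| Testing | F | 69.88 | 159 | 53.1 | 21.00 | 35 | 22.5 | 295 | 752 | 14.2 |
| Testing | M | 66.72 | 145.5 | 54 | 25.51 | 34 | 38.5 | 227.5 | 1399 | 25.5 |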

The unweighted positive and negative likelihood ratios are 1.914 and 0.086, respectively, for the training set, and 1.868 and 0.227, respectively, for the testing set.
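These likelihood ratios follow from the sensitivity and specificity in Table 3 through the usual definitions, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A minimal check, using the rounded values from Table 3 as inputs (the function name is hypothetical):

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a two-class test."""
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


train_lr = likelihood_ratios(sensitivity=0.957, specificity=0.500)
test_lr = likelihood_ratios(sensitivity=0.880, specificity=0.529)
print([round(v, 3) for v in train_lr])  # [1.914, 0.086]
print([round(v, 3) for v in test_lr])   # [1.868, 0.227]
```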

Table 5 a. Comparison among different models – training set

| Performance metrics | Linear regression | Logistic regression | GRNN |
|---|---|---|---|
| Accuracy | 0.667 | 0.667 | 0.787 |
| Sensitivity | 0.809 | 0.809 | 0.957 |
| Specificity | 0.429 | 0.429 | 0.500 |
| Positive predictive value | 0.704 | 0.704 | 0.763 |
| Negative predictive value | 0.571 | 0.571 | 0.875 |
| Kappa | 0.250 | 0.250 | 0.501 |
| Kappa (95% CI) | [0.026, 0.474] | [0.026, 0.474] | [0.302, 0.700] |
| AUC | 0.628 | 0.636 | 0.841 |
| AUC (95% CI) | [0.613, 0.642] | [0.622, 0.651] | [0.830, 0.851] |

Table 5 b. Comparison among different models – testing set

| Performance metrics | Linear regression | Logistic regression | GRNN |
|---|---|---|---|
| Accuracy | 0.667 | 0.690 | 0.738 |
| Sensitivity | 0.800 | 0.800 | 0.880 |
| Specificity | 0.471 | 0.529 | 0.529 |
| Positive predictive value | 0.690 | 0.714 | 0.733 |
| Negative predictive value | 0.615 | 0.643 | 0.750 |
| Kappa | 0.028 | 0.339 | 0.430 |
| Kappa (95% CI) | [-0.011, 0.574] | [0.05, 0.628] | [0.154, 0.705] |
| AUC | 0.617 | 0.614 | 0.751 |
| AUC (95% CI) | [0.590, 0.643] | [0.588, 0.641] | [0.728, 0.774] |
