5 Experimental Setup
Before describing the experiments carried out and discussing the corresponding results, we first present the database used (Section 5.1), the particular feature extraction process, the kind of classifier, and the parameters of the search algorithms used to illustrate this chapter.
5.1 Available Database
The sound database used for the experiments consisted of a total of 2,627 files, with a
length of 2.5 seconds each. The sampling frequency was 22,050 Hz with 16 bits per
sample. The files correspond to the following categories: speech, music and noise.
Noise sources were varied, including those corresponding to the following environ-
ments: aircraft, bus, cafe, car, kindergarten, living room, nature, school, shop, sports,
traffic, train, and train station. Music files were both vocal and instrumental. The files
with speech in noise presented different Signal to Noise Ratios (SNRs) ranging from
0 to 10 dB.
The database has been divided into three different sets for training, validation and
test, including 943 (35%), 405 (15%) and 1,279 (50%) files respectively. The division
has been made randomly, ensuring that the relative proportion of files of each cate-
gory is preserved for each set.
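The stratified random split described above can be sketched in Python as follows (a minimal illustration; the function name, the per-class file dictionary, and the random seed are assumptions, not part of the original experimental code):

```python
import random

def stratified_split(files_by_class, fractions=(0.35, 0.15, 0.50), seed=0):
    """Randomly split files into train/validation/test sets while
    preserving the relative proportion of each category in every set."""
    rng = random.Random(seed)
    train, val, test = [], [], []
    for label, files in files_by_class.items():
        files = list(files)
        rng.shuffle(files)
        n = len(files)
        n_train = round(fractions[0] * n)
        n_val = round(fractions[1] * n)
        train += [(f, label) for f in files[:n_train]]
        val += [(f, label) for f in files[n_train:n_train + n_val]]
        test += [(f, label) for f in files[n_train + n_val:]]
    return train, val, test
```

Because the split is done per category, each of the three sets keeps roughly the same class proportions as the full database.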
5.2 Feature Extraction Stage
As described in Section 3.2, the particular feature extraction carried out in the ex-
periments may be summarized as follows:
1. The input signal is divided into frames with a length of 512 samples (23.22 ms at the considered sampling frequency), with no overlap between adjacent frames.
2. The Discrete Cosine Transform (DCT) is computed [14].
3. All considered features are calculated.
4. Finally, the mean and standard deviation are computed every 2.5 seconds in order to summarize the frame-level values.
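Assuming NumPy is available, the framing, transform, and aggregation steps above can be sketched as follows (the DCT-II is built from an explicit transform matrix for clarity, and all function names are illustrative, not taken from the original implementation):

```python
import numpy as np

FS = 22050       # sampling frequency (Hz)
FRAME_LEN = 512  # 512 samples, i.e. about 23.22 ms at 22,050 Hz

def frame_signal(x, frame_len=FRAME_LEN):
    """Split the signal into non-overlapping frames, dropping any tail
    shorter than one frame."""
    n_frames = len(x) // frame_len
    return x[:n_frames * frame_len].reshape(n_frames, frame_len)

def dct_ii(frame):
    """DCT-II of one frame, computed via the explicit cosine basis:
    X[k] = sum_n x[n] * cos(pi/N * (n + 0.5) * k)."""
    N = len(frame)
    n = np.arange(N)
    basis = np.cos(np.pi / N * (n[:, None] + 0.5) * n[None, :])
    return basis.T @ frame

def aggregate(per_frame_features):
    """Mean and standard deviation of the per-frame feature values over
    one 2.5 s file, yielding one summary pair per feature."""
    f = np.asarray(per_frame_features)
    return f.mean(axis=0), f.std(axis=0)
```

In a real pipeline one would typically use an optimized DCT routine instead of the explicit matrix product, which costs O(N²) per frame.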
The following initial 37 features were considered:
- Mean and variance of: Spectral Centroid, Spectral Rolloff, Spectral Flux, Zero Crossing Rate (ZCR), Short Time Energy (STE), Spectral Flatness Measure (SFM) [22], and Voice2White (V2W) [23].
- High Zero Crossing Rate Ratio (HZCRR), Low Short Time Energy Ratio (LSTER) [24], and percentage of Low-Energy Frames (LEF).
- 20 Mel Frequency Cepstral Coefficients [25].
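As an illustration of two of the simpler listed features, the per-frame Zero Crossing Rate and Short Time Energy might be computed as below (a minimal sketch; the exact definitions and normalizations used in the chapter may differ):

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    signs = np.sign(frame)
    return np.mean(signs[:-1] != signs[1:])

def short_time_energy(frame):
    """Mean squared amplitude of the frame."""
    return np.mean(frame ** 2)
```

Derived statistics such as HZCRR and LSTER are then obtained by comparing each frame's ZCR or STE against an average over a longer window.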
Since each of the 37 listed features is computed from two signals, namely the original time-domain sound signal and the residual of the linear prediction coefficients (LPC) analysis, the number of features to be selected by the HS algorithm is N_F = 2 × 37 = 74, which together form the final 74-feature vector F. Note that some of these features have been