MINING OF SENSOR DATA IN HEALTHCARE: A SURVEY - Managing and Mining Sensor Data


with sophisticated data filtering and interpolation techniques to remove

and correct, when possible, data anomalies. The pre-processing stage

is also impacted by the lack of standard adoption by medical sensor

manufacturers. Indeed, data generated in different formats needs to be

syntactically aligned before any analysis can take place. Furthermore, a

semantic normalization is often required to cope with differences in the

sensing process. As an illustration, a daily reported heart rate measure

may correspond to a daily average heart rate in some cases, while in

other cases it may represent a heart rate average measured every morning

when the subject wakes up. Comparing these values in a data mining

application can yield incorrect conclusions, especially if they are not

semantically distinguished.
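
The heart rate illustration above can be made concrete with a minimal sketch. It assumes each reading carries an explicit label describing how it was measured; the `HeartRateReading` type, its field names, and the labels `daily_mean` and `waking_mean` are hypothetical and do not come from any sensor standard. Grouping readings by that label before analysis is one simple way to avoid comparing values with different meanings.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class HeartRateReading:
    patient_id: str
    day: str
    bpm: float
    # Hypothetical label describing the sensing process behind the value,
    # e.g. "daily_mean" (average over the day) or "waking_mean"
    # (average measured each morning on waking).
    semantics: str


def group_by_semantics(readings):
    """Partition readings so only like-for-like values are compared."""
    groups = defaultdict(list)
    for reading in readings:
        groups[reading.semantics].append(reading)
    return dict(groups)


readings = [
    HeartRateReading("p1", "2024-01-01", 72.0, "daily_mean"),
    HeartRateReading("p2", "2024-01-01", 58.0, "waking_mean"),
    HeartRateReading("p1", "2024-01-02", 74.0, "daily_mean"),
]

groups = group_by_semantics(readings)
for label, items in groups.items():
    mean_bpm = sum(r.bpm for r in items) / len(items)
    print(label, len(items), mean_bpm)
```

Keeping the label alongside the value, rather than discarding it during syntactic alignment, lets a later mining step decide whether a conversion between the two kinds of measure is meaningful at all.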

Another key pre-processing challenge involves data synchronization.

Sensors report data with timestamps based on their internal clocks.

Given that clocks across sensors are often not synchronized, aligning

the data across sensors can be quite challenging. In addition, sensors

may report data at different rates. Hence, assumptions and alignment

strategies need to be carefully designed.
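
One possible alignment strategy is sketched below, under two assumptions that are not stated in the text: that each sensor's clock differs from a reference clock by a known constant offset, and that linear interpolation between samples is acceptable for the signal at hand. The function names `interpolate` and `align`, and the sample values, are illustrative only.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate at time t; return None outside the sampled range.

    Assumes `times` is strictly increasing.
    """
    if t < times[0] or t > times[-1]:
        return None
    i = bisect_right(times, t)
    if i == len(times):  # t coincides with the last sample
        return values[-1]
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def align(times, values, clock_offset, grid):
    """Remove an assumed constant clock offset, then resample onto `grid`."""
    corrected = [t - clock_offset for t in times]
    return [interpolate(corrected, values, g) for g in grid]


# Sensor A reports every second and is taken as the reference clock.
a_times = [0.0, 1.0, 2.0, 3.0, 4.0]
a_values = [60.0, 62.0, 64.0, 66.0, 68.0]

# Sensor B reports every two seconds and its clock is assumed 0.5 s fast.
b_times = [0.5, 2.5, 4.5]
b_values = [10.0, 20.0, 30.0]

grid = [0.0, 1.0, 2.0, 3.0, 4.0]
a_aligned = align(a_times, a_values, 0.0, grid)
b_aligned = align(b_times, b_values, 0.5, grid)
print(list(zip(grid, a_aligned, b_aligned)))
```

Both assumptions deserve scrutiny in practice: clock drift is rarely constant over long recordings, and interpolation can hide short events in slowly sampled signals.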

2.2.3 Transformation Challenges. Feature extraction is

often the most complex stage of the data mining process. The

transformation of sensor data into spaces where good features can be extracted

requires a deep understanding of the problem at hand and needs to

be driven by domain experts. In medical informatics, this

transformation requires expertise in the physiology of the body. Despite immense

progress in medicine and in our understanding of the human body, there

is still much to learn about all the data that we can sense today. For

instance, in neurological intensive care environments, neuro-intensivists

collect and interpret electroencephalogram (EEG) signals that represent the

brain activity of their patients. These signals are extremely noisy and

not fully understood [14], yet they can be used to diagnose several

conditions (e.g., the onset of diverse forms of seizures). Extracting features

from EEG signals is often restricted to spectral analysis techniques

defined by domain experts.
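
As a rough illustration of what a spectral feature can look like, the sketch below computes the power of a signal in a few frequency bands. The band boundaries in `BANDS` are commonly quoted approximate ranges and are an assumption here, not definitions taken from the text; the synthetic sine-wave input stands in for a real EEG channel. A naive discrete Fourier transform is used only to keep the example dependency-free; an FFT routine from a numerical library would normally replace it.

```python
import cmath
import math

# Approximate, commonly quoted band ranges in Hz (assumed for illustration).
BANDS = {"delta": (0.5, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}


def power_spectrum(signal, fs):
    """Return (freqs, power) for non-negative frequencies using a naive DFT."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2 / n)
    return freqs, power


def band_power(freqs, power, low, high):
    """Sum spectral power over the half-open band [low, high)."""
    return sum(p for f, p in zip(freqs, power) if low <= f < high)


fs = 128  # sampling rate in Hz (assumed)
n = 256   # two seconds of samples
# Synthetic stand-in for one EEG channel: a strong 10 Hz component
# plus a weaker 20 Hz component.
signal = [
    math.sin(2 * math.pi * 10 * i / fs) + 0.2 * math.sin(2 * math.pi * 20 * i / fs)
    for i in range(n)
]

freqs, power = power_spectrum(signal, fs)
features = {name: band_power(freqs, power, lo, hi) for name, (lo, hi) in BANDS.items()}
dominant = max(features, key=features.get)
print(dominant, features)
```

The resulting dictionary of band powers is the kind of fixed-length feature vector a downstream mining algorithm can consume. Real EEG would additionally call for windowing and artifact rejection, which are omitted here.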

In addition to signals that are not well understood, human sensing

adds different types of unstructured data that needs to be effectively

integrated. This includes textual reports from examinations (by physicians

or nurses) that also need to be transformed into relevant features, and

aligned with the rest of the physiological measurements. These inputs

are important to the data mining process as they provide expert data,

personalized to the patients. However, these inputs can be biased by

physician experiences, or other diagnosis and prognosis techniques they
