implemented by means of an innovative software designed by our research group.
It takes as input a list of relevant words (those with the highest TF-IDF values) and,
exploiting a domain thesaurus [8] to identify semantic relations, clusters
them into concepts.
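The clustering step can be pictured with a minimal sketch. It is not the software described above: the thesaurus here is a hypothetical hand-written dictionary (`TOY_THESAURUS`), the synonym "mal di stomaco" is an illustrative assumption, and the function name `cluster_into_concepts` is invented for this example. It only shows the idea of grouping relevant words under the concept their thesaurus entry points to.

```python
from collections import defaultdict

# Hypothetical toy thesaurus: each term maps to the concept it belongs to.
# A real domain thesaurus would carry richer semantic relations.
TOY_THESAURUS = {
    "paziente": "Patient",
    "ansia": "Anxiety",
    "dolori allo stomaco": "Stomach pain",
    "mal di stomaco": "Stomach pain",  # illustrative synonym, not from the text
}


def cluster_into_concepts(terms, thesaurus):
    """Group terms by concept; return (clusters, terms with no thesaurus entry)."""
    clusters = defaultdict(list)
    unmatched = []
    for term in terms:
        concept = thesaurus.get(term.lower())
        if concept is None:
            unmatched.append(term)
        else:
            clusters[concept].append(term)
    return dict(clusters), unmatched


if __name__ == "__main__":
    words = ["paziente", "dolori allo stomaco", "mal di stomaco", "Maalox"]
    clusters, unmatched = cluster_into_concepts(words, TOY_THESAURUS)
    print(clusters)
    print(unmatched)
```

Terms without a thesaurus entry are returned separately rather than dropped, so a later step can decide how to handle them.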
The resource identification step of the Postprocessing module uses the classification
procedure offered by the KNIME [9] workflow tool.
In order to illustrate the processing phases, let's consider a fragment of an
Italian medical record:
“La Signora si presenta con un anamnesi di precedenti ricoveri presso
differenti reparti di questo ospedale. Inquieta ed a tratti aggressiva,
manifesta un forte stato d'ansia e dolori allo stomaco. Vista la storia clinica di
patologie ansiogene del paziente, le sono stati somministrati 10mg di Maalox.”
Although the example is written in Italian, the concepts to which the
relevant terms refer are given in English.
The fragment states that “the patient presents with a history of previous admissions
to different departments of this hospital. Restless and at times aggressive, she shows a
strong state of anxiety and stomach pain. Given the patient's clinical history of
anxiety-inducing conditions, she was given 10mg of Maalox”.
Once the terms of this fragment have been extracted by the Preprocessing
module (for the sake of brevity this step is not described; the interested
reader can find details in [6]), the Transformation module extracts the relevant
terms using, as described above, a statistical measure: all the terms with a
TF-IDF value above an established threshold are selected, namely “paziente” (4.1),
“ansia” (4.2), “dolori allo stomaco” (3.8), “aggressiva” (3.1), “storia clinica” (4.8), and
“Maalox” (4.7). These terms are then linked to the synsets to which they belong.
Each synset refers to a concept, and each concept is then associated with a
document section, as summarized in Figure 3.
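The threshold selection can be illustrated with a short sketch. The six scores are the ones quoted in the text; the threshold value of 3.0 and the two low-scoring terms are assumptions added only so that the filter has something to reject, and `select_relevant_terms` is a hypothetical name, not part of the Transformation module.

```python
# TF-IDF scores quoted in the text for the sample fragment.
SCORES = {
    "paziente": 4.1,
    "ansia": 4.2,
    "dolori allo stomaco": 3.8,
    "aggressiva": 3.1,
    "storia clinica": 4.8,
    "Maalox": 4.7,
    # Hypothetical low-scoring terms, not from the text.
    "ospedale": 1.4,
    "reparti": 0.9,
}

THRESHOLD = 3.0  # assumed value; the text does not state the threshold


def select_relevant_terms(scores, threshold):
    """Keep terms whose score exceeds the threshold, highest score first."""
    kept = [(term, score) for term, score in scores.items() if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for term, score in select_relevant_terms(SCORES, THRESHOLD):
        print(f"{term}: {score}")
```

With these inputs the six terms named in the text survive the filter and the two assumed low-scoring terms are discarded.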
In our example we obtain the concepts associated with the extracted terms:
“Patient”, “anxiety”, “stomach pain”, “aggressive”, “Patient History” and
“Maalox”. The relevant concepts are structured in RDF format and the list of
Fig. 2. Instanced architecture for E-Health documents processing