Biomedical Engineering Reference
3. SPECIALIST Lexicon and lexical tools. The SPECIALIST Lexicon
adds over 200 000 additional terms from various sources and includes
commonly occurring English words. The lexical tools are used to
assist in Natural Language Processing.
UMLS bridges the terminology that users employ when accessing the analytical
data store and the codes contained in the documents. For example, suppose a
user wants to find all documents related to 'Acute Myocardial infarction' in a
clinical data store with documents coded using SNOMED CT. With
UMLS, the user can find a mapping from the English term to the SNOMED
CT code, and then run a second query to find all SNOMED CT codes
whose ancestor is 'Acute Myocardial infarction'. The results of this
second query can be used as a filter in the analytical data store.
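The two-step lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the term table, the hierarchy, the documents, and the codes (X100, X110, and so on) are made-up placeholders, not real SNOMED CT identifiers, and the function names are hypothetical rather than part of any UMLS interface.

```python
from collections import deque

# Hypothetical term-to-code mapping (placeholder codes, not real SNOMED CT ids).
TERM_TO_CODE = {"acute myocardial infarction": "X100"}

# Hypothetical is-a hierarchy: parent code -> direct child codes.
CHILDREN = {
    "X100": ["X110", "X120"],
    "X110": ["X111"],
}

# Hypothetical coded documents held in the analytical data store.
DOCUMENTS = [
    {"id": "doc-1", "code": "X111"},
    {"id": "doc-2", "code": "X900"},
    {"id": "doc-3", "code": "X100"},
    {"id": "doc-4", "code": "X120"},
]


def lookup_code(term):
    """First query: map an English term to its code."""
    return TERM_TO_CODE[term.strip().lower()]


def descendants_and_self(code):
    """Second query: collect the code and every code that has it as an ancestor."""
    seen = {code}
    queue = deque([code])
    while queue:
        current = queue.popleft()
        for child in CHILDREN.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def filter_documents(term):
    """Use the result of the second query as a filter on the documents."""
    codes = descendants_and_self(lookup_code(term))
    return [doc["id"] for doc in DOCUMENTS if doc["code"] in codes]
```

In this sketch the term's own code is kept in the filter set, on the assumption that documents coded with the exact concept should also be returned.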
UMLS does not solve all text matching and text scrubbing problems.
Our experience tells us that the last mile of matching is a continuous
refinement and build-up of rules and samples that can be matched as time
progresses. If this is made configurable, end-users can populate the queries
that help with the mappings and data extraction.
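One way to picture such a configurable rule set is a list of patterns that end-users extend over time. The sketch below is an assumption about how this might look, not the authors' implementation: the rule format, the function names, and the placeholder codes are all hypothetical.

```python
import re

# Hypothetical, user-editable rules: regular expression -> placeholder code.
RULES = [
    (r"\bacute\s+mi\b", "X100"),
    (r"\bheart\s+attack\b", "X100"),
]


def compile_rules(rules):
    """Compile the configured patterns once, ignoring letter case."""
    return [(re.compile(pattern, re.IGNORECASE), code) for pattern, code in rules]


def match_codes(text, compiled_rules):
    """Return the codes whose rules match the text, in rule order, without repeats."""
    found = []
    for pattern, code in compiled_rules:
        if pattern.search(text) and code not in found:
            found.append(code)
    return found
```

Refining the matching then amounts to appending a rule, for example `RULES + [(r"\bstemi\b", "X110")]`, without changing the matching code itself.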
20.6 Open source databases
The next step is data storage. Our use-case poses several challenges for the
choice of technologies. First, it is becoming increasingly difficult to build
a single system that supports the myriad of implementation details of
even small regional sets of healthcare providers. CDA is flexible and
extensible, so similarly flexible mechanisms to store a complete set of
disparate, raw data are required. Second, the volume of data is expected
to be extremely large. Medical records, radiology images, and lab or
research data are notorious for large file components of high-fidelity
information that contain more information than is usable for the
questions at hand. These requirements generally wreak
havoc on traditional application development. Third, as our understanding
of healthcare and the human body evolves, we need to support new
questions being asked of old data. The goal, therefore, is to evaluate open
source technology's ability to meet the following requirements:
1. ability to store extreme amounts of data in a flexible schema;
2. ability to re-process this data on-demand with new business rules;
3. ability to re-process data to create marts or data-cubes that allow ad
hoc analysis on new questions.
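The three requirements can be illustrated with a very small sketch. It assumes an in-memory SQLite table as a stand-in for the raw store (SQLite is used here only because it ships with Python, not because the text selects it), keeps each document as an untyped JSON body, and treats a "business rule" as any function applied to the stored documents. All table, field, and function names are hypothetical.

```python
import json
import sqlite3
from collections import Counter


def open_store():
    """Create a raw store that imposes no schema on the document body."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_documents (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def store_raw(conn, doc_id, document):
    """Requirement 1: keep the complete raw document, whatever its fields."""
    conn.execute(
        "INSERT INTO raw_documents (id, body) VALUES (?, ?)",
        (doc_id, json.dumps(document)),
    )


def reprocess(conn, rule):
    """Requirements 2 and 3: re-run a rule over all raw data to build a small mart.

    The rule maps a document to a grouping key, or None to skip the document.
    """
    counts = Counter()
    for (body,) in conn.execute("SELECT body FROM raw_documents ORDER BY id"):
        key = rule(json.loads(body))
        if key is not None:
            counts[key] += 1
    return dict(counts)


conn = open_store()
store_raw(conn, "d1", {"type": "lab", "provider": "A", "glucose": 5.4})
store_raw(conn, "d2", {"type": "radiology", "provider": "A", "modality": "CT"})
store_raw(conn, "d3", {"type": "lab", "provider": "B"})

# The same raw data answers two different questions without being reloaded.
by_type = reprocess(conn, lambda doc: doc.get("type"))
by_provider = reprocess(conn, lambda doc: doc.get("provider"))
```

Because the raw documents are never reshaped on the way in, a question that arises later only needs a new rule, not a new load of the data.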
 