regular basis is growing at a rapid, almost unprecedented rate [13-16]. At the same
time, the data management practices currently used in research environments rely
on conventional database or file-based management approaches that are
ill-suited to such “big data” sets [14, 16, 17]. Therefore, the use of integrative and
scalable information management platforms is critical to reducing the data management
burden associated with such multi-dimensional data, thus allowing researchers
and their staff to focus on fundamental scientific problems, rather than practical
computing needs [5, 7, 14, 18]. In addition, with the growth of scenarios in which
investigators need to link such high-throughput bio-molecular and phenotypic data
in meaningful ways to better understand potential relationships
between them, it is also imperative that the semantics of such data be well understood
[17, 19, 20]. Such semantic interoperability between data (either within a
given data set or across data sets) requires the use of knowledge engineering
approaches to map among the various representational schemas and codification
regimes of the source data sets [17, 20]. Taken as a whole, the motivating
questions one might encounter relative to the collection and management of
heterogeneous or multi-dimensional data sets can include:
1. What are the optimal tools to allow me to collect or re-use data for my research
project as it is generated via either clinical encounters or research-specific
interactions with participants or populations?
2. How can I store large collections of research data in ways that make it timely
and easy to both index and retrieve depending on downstream data analysis
needs?
3. How can I normalize the coding schemas or data structures for multiple source
data sets so that I can then analyze the interrelationships between variables of
interest contained within those resources?
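
To make the third question more concrete, the following is a minimal sketch, in Python, of one simple way that local codes from two source data sets might be mapped onto a shared set of concepts before the data are analyzed together. It is not a method described in the cited literature. The code maps, the local codes, the concept names, and the helper functions normalize and participants_by_concept are all hypothetical and invented for illustration; in practice such mappings would be drawn from curated terminologies through knowledge engineering work.

```python
from collections import defaultdict

# Hypothetical local-to-shared code maps. Every code and concept name here
# is invented for illustration and does not come from a real terminology.
SITE_A_TO_SHARED = {"D2": "diabetes_type_2", "HT": "hypertension"}
SITE_B_TO_SHARED = {"1001": "diabetes_type_2", "1002": "hypertension"}


def normalize(records, code_map):
    """Rewrite each record's local code as a shared concept.

    Returns the normalized records and the set of local codes that had
    no mapping, so that gaps in the map are reported rather than hidden.
    """
    normalized, unmapped = [], set()
    for rec in records:
        concept = code_map.get(rec["code"])
        if concept is None:
            unmapped.add(rec["code"])
            continue
        normalized.append({"participant": rec["participant"], "concept": concept})
    return normalized, unmapped


def participants_by_concept(*record_sets):
    """Index participants by shared concept across any number of data sets."""
    index = defaultdict(set)
    for records in record_sets:
        for rec in records:
            index[rec["concept"]].add(rec["participant"])
    return dict(index)


# Two toy source data sets that use different local coding schemes.
site_a = [{"participant": "P1", "code": "D2"}, {"participant": "P2", "code": "HT"}]
site_b = [{"participant": "P3", "code": "1001"}, {"participant": "P4", "code": "XYZ"}]

norm_a, missing_a = normalize(site_a, SITE_A_TO_SHARED)
norm_b, missing_b = normalize(site_b, SITE_B_TO_SHARED)
combined = participants_by_concept(norm_a, norm_b)
```

Once both data sets share the same concepts, participants P1 and P3 fall under a single concept even though their source records used different codes, while the unmapped code XYZ is surfaced for review instead of being silently dropped.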
6.3.2 Using Knowledge-Anchored Methods to Discover and Test Hypotheses Concerning Linkages Between Phenotypic and Bio-molecular Variables
Current approaches to hypothesis generation and testing tend to rely primarily upon
the intuition of an individual investigator or their team [13, 21]. As such, these
research questions tend to be limited in scope relative to the knowledge and experience
of those individuals, and not necessarily representative of the full scope of
applicable scientific knowledge or inquiry. Beyond this primary limitation, it is also
important to note that such a human-centered approach is only practicable
when the scale and scope of the scientific data being considered are commensurate with
basic human cognitive capabilities. However, as data sets expand to reach “big data”
proportions (minimally in terms of size and speed at which data are generated), such
an approach rapidly becomes intractable and highly limiting [17, 19]. At the same
time, significant knowledge that could be used to assist in the formulation of