Each application within MoSGrid produces a unique output format, which is
often unstructured. A parser was developed that supports the output formats used in
MoSGrid. It converts all of these formats to MSML by using regular expressions.
New output formats can easily be added by creating new regular expressions for the
parser.
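
The regular-expression approach described above might be sketched as follows. This is a minimal illustration only: the sample tool output, the pattern names, and the simplified XML element names are assumptions made for this example, and they do not reproduce the actual MoSGrid parser or the full MSML schema.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical patterns for an invented tool output; the real MoSGrid
# parser uses its own expressions for each supported application.
PATTERNS = {
    "total_energy": re.compile(r"Total energy\s*=\s*(-?\d+\.\d+)"),
    "steps": re.compile(r"Optimization steps:\s*(\d+)"),
}

def parse_output(text):
    """Return a dict of every property whose pattern matches the raw output."""
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(1)
    return found

def to_xml(properties):
    """Wrap parsed properties in a simplified, MSML-like XML fragment (assumed layout)."""
    root = ET.Element("propertyList")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property", {"dictRef": name})
        ET.SubElement(prop, "scalar").text = value
    return ET.tostring(root, encoding="unicode")

sample = "Job finished\nTotal energy = -76.026765\nOptimization steps: 12\n"
parsed = parse_output(sample)
xml_fragment = to_xml(parsed)
```

In this sketch, supporting a new output format amounts to adding entries to `PATTERNS`, which mirrors the extensibility claim made in the text.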
A multitude of workflows are used to perform complex simulations within each
domain. To adjust general workflows to the specific task, the workflows are
parameterized by entering mandatory and optional parameters in an input mask.
These masks are generated using the respective MSML template, in which the
entered simulation parameters are stored. Since a workflow can use a number of
different simulation tools, the input and output formats vary widely. To meet this
challenge, specialized adapters were developed. These convert the MSML section
containing the input parameters into the input format for a specific application of
the workflow. This can range from some simple commands to highly complex input
formats.
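
The adapter idea can be illustrated with a short sketch: one shared parameter section is read once and then rendered into two different input formats. The parameter section layout, the two target tools (`keyword_tool`, `route_tool`), and their input syntaxes are all invented for this example and are not the real MoSGrid adapters.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified parameter section; real MSML is richer.
MSML_PARAMS = """
<parameterList>
  <parameter dictRef="method"><scalar>B3LYP</scalar></parameter>
  <parameter dictRef="basis"><scalar>6-31G</scalar></parameter>
  <parameter dictRef="charge"><scalar>0</scalar></parameter>
</parameterList>
"""

def read_parameters(xml_text):
    """Collect name/value pairs from the parameter section in document order."""
    root = ET.fromstring(xml_text.strip())
    return {p.get("dictRef"): p.findtext("scalar") for p in root.iter("parameter")}

def keyword_adapter(params):
    """Adapter for an invented tool that expects 'key = value' lines."""
    return "\n".join(f"{key} = {value}" for key, value in params.items()) + "\n"

def route_adapter(params):
    """Adapter for an invented tool that expects a single route line."""
    return f"# {params['method']}/{params['basis']} charge={params['charge']}\n"

# One adapter per application; a workflow picks the one matching each step.
ADAPTERS = {"keyword_tool": keyword_adapter, "route_tool": route_adapter}

def build_input(xml_text, application):
    """Convert the parameter section into the input text for one application."""
    return ADAPTERS[application](read_parameters(xml_text))
```

Calling `build_input(MSML_PARAMS, "route_tool")` and `build_input(MSML_PARAMS, "keyword_tool")` yields two differently formatted inputs from the same stored parameters.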
11.3.3 Integrated Metadata Usage and Search
The amount of data is continuously growing throughout the scientific communities;
therefore, it is essential to be able to search for and find data again. To address this
issue, the UNICORE metadata service is deployed and integrated with the MoSGrid
infrastructure. The metadata service uses the high-performance and widely used
Apache Lucene (2014) search engine library. The information regarding each
simulation is stored in the central MSML data format. This makes it ideally suited as a
basis for searching in MoSGrid data. The Java-based tool, the MSML metadata
extractor, converts MSML to JSON format. This file is sent to the metadata service
for indexing, which in turn makes it available for searching.
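
The extraction step could look roughly like the sketch below. The actual MSML metadata extractor is a Java tool whose internals are not described here; this Python version, the simplified XML layout, and the flat key/value JSON shape are assumptions chosen only to illustrate the XML-to-JSON conversion.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, simplified MSML-like document used only for this example.
MSML_DOC = """
<job>
  <parameter dictRef="method"><scalar>B3LYP</scalar></parameter>
  <parameter dictRef="molecule"><scalar>water</scalar></parameter>
  <property dictRef="total_energy"><scalar>-76.026765</scalar></property>
</job>
"""

def extract_metadata(xml_text):
    """Flatten every element carrying a dictRef and a scalar child into JSON."""
    root = ET.fromstring(xml_text.strip())
    metadata = {}
    for element in root.iter():
        key = element.get("dictRef")
        value = element.findtext("scalar")
        if key and value is not None:
            metadata[key] = value.strip()
    # Sorted keys keep the output stable between runs.
    return json.dumps(metadata, sort_keys=True)

metadata_json = extract_metadata(MSML_DOC)
```

The resulting JSON string is what would then be handed to a metadata service for indexing.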
Searching capabilities are integrated in three ways. First, they are offered via
search fields to enable the users to search through data stored by MoSGrid and
choose it as input. Second, they are integrated in portlets, enabling filtering while
browsing through stored data. Third, it is intended to extend the monitoring
implemented in the domain-specific portlets to enable specific views showing only
relevant data related to the selected workflow.
In all instances, search terms are entered and matched with the metadata extracted
from MSML. Thus, search results return a reference to the indexed MSML file.
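
The matching of search terms against extracted metadata can be pictured with a small inverted index. This is a plain-Python stand-in and not Apache Lucene or the UNICORE metadata service API; the file references and metadata values are invented for the example.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercased metadata token to the set of file references containing it."""
    index = defaultdict(set)
    for reference, metadata in documents.items():
        for value in metadata.values():
            for token in str(value).lower().split():
                index[token].add(reference)
    return index

def search(index, query):
    """Return the references whose metadata contains every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Hypothetical references to indexed MSML files and their extracted metadata.
documents = {
    "jobs/water.msml": {"method": "B3LYP", "molecule": "water"},
    "jobs/benzene.msml": {"method": "B3LYP", "molecule": "benzene"},
}
index = build_index(documents)
hits = search(index, "b3lyp water")
```

As in the text, a search returns references to the indexed files and not the data itself.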
11.4 Domain-Specific Applications
The three targeted research domains in MoSGrid are currently quantum chemistry,
molecular dynamics and docking. In collaboration with the ER-flow (2014) project,
metaworkflows have been developed in MoSGrid. Many workflows consist of