workflow developers from having to care how a processing step is performed -
a paradigm also promoted by the vision of grid computing, where computing
resources are supposed to be transparently allocated from wherever capac-
ity is available” [128]. That is, the ideal workflow system should be capable
of transparently choosing one service out of a group of services that pro-
vide equivalent functionality, without bothering the user with the technical
differences.
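The idea of transparent service selection can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the text: the `Service` record, the `REGISTRY` mapping, the provider names, and the simple "first available provider wins" policy. A real workflow system would likely use richer criteria (load, cost, locality).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Service:
    name: str                  # hypothetical provider identifier
    available: bool            # whether the provider currently has capacity
    run: Callable[[str], str]  # the processing step this provider performs


# Hypothetical registry: one abstract function, several equivalent providers.
REGISTRY: Dict[str, List[Service]] = {
    "reverse_sequence": [
        Service("provider_a", available=False, run=lambda s: s[::-1]),
        Service("provider_b", available=True, run=lambda s: "".join(reversed(s))),
    ],
}


def execute(function: str, data: str) -> str:
    """Run `function` on the first available provider of equivalent functionality."""
    for service in REGISTRY.get(function, []):
        if service.available:
            return service.run(data)
    raise RuntimeError(f"no available service for {function!r}")
```

The workflow developer only calls `execute("reverse_sequence", ...)`; which provider actually performs the step stays hidden behind the registry.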
Semantic Handling of Data: Compatibility
Numerous data formats have been developed by the scientific communities,
reflecting various applications and technical requirements. These formats
remain in use when the tools are provided as remote services, so workflows
often have to deal with heterogeneous and incompatible data formats. In fact,
such formats constitute one of the main obstacles to service composition and
tool interoperation [127]. For instance, there are around 20 common formats for
biological sequences alone, and, to complicate matters further, many available
tools and databases use tool-specific ASCII or binary formats rather than one
of the more or less common formats. Moreover, at the technical level of the
service interfaces, textual formats are too often classified merely as
“strings”, which neither supports reasoning about type compatibility nor helps
users to work with them. Accordingly, workflow systems have to find a more
satisfactory way of dealing with the numerous different data types.
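The limitation of "string" typing can be made concrete with a small Python sketch. The `Port` record and the format labels below are hypothetical and serve only as illustration: two ports that both declare the technical type "string" pass a purely technical compatibility check, although they carry different formats.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    technical_type: str  # what the service interface declares, e.g. "string"
    semantic_type: str   # hypothetical annotation naming the actual format


def technically_compatible(out: Port, inp: Port) -> bool:
    """Compare only the declared interface types."""
    return out.technical_type == inp.technical_type


def semantically_compatible(out: Port, inp: Port) -> bool:
    """Additionally require that the annotated formats match."""
    return technically_compatible(out, inp) and out.semantic_type == inp.semantic_type


fasta_out = Port("string", "FASTA")
genbank_in = Port("string", "GenBank")

print(technically_compatible(fasta_out, genbank_in))   # True
print(semantically_compatible(fasta_out, genbank_in))  # False
```

Only the second check reveals that connecting the two ports directly would hand the consumer data in a format it does not expect.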
In principle, there are two ways to improve the handling of
heterogeneous and incompatible data formats:
1. Standardization, that is, the introduction of a homogeneous system of more
specific data formats. This approach has been taken by several
standardization efforts in all application domains. However, a homogeneous
standard technology that incorporates all data types is hard to achieve. And
even if standards for parts of the data type “jungle” are established, it
is impossible to change all the already existing software accordingly in
order to thoroughly replace all the historically grown formats.
2. Automatic adaptation by adding comprehensive annotations in terms of
semantic metadata to the existing data types [301], and using small ser-
vices that simply perform conversions from one data format into another
(so-called “shim services”) for achieving compatibility. This approach is
indeed more pragmatic than solely striving for standardization, as the
annotation is less invasive and can be applied to any resource at any
time.
This way, if standards exist, they can (and should) still be used, but
their combination with non-standard data types is also managed.
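The second approach, automatic adaptation through shim services, might be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular workflow system: the format names, the `SHIMS` table, and the toy conversion functions are hypothetical, and breadth-first search is just one simple way to find a conversion chain between two annotated formats.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

Shim = Callable[[str], str]

# Hypothetical shim services, keyed by (source format, target format).
SHIMS: Dict[Tuple[str, str], Shim] = {
    ("raw", "fasta"): lambda s: ">seq\n" + s,
    ("fasta", "raw"): lambda s: "".join(s.splitlines()[1:]),
    ("raw", "lowercase_raw"): lambda s: s.lower(),
}


def find_shim_chain(source: str, target: str) -> Optional[List[Tuple[str, str]]]:
    """Breadth-first search for a shortest chain of shims from source to target."""
    queue = deque([(source, [])])
    visited = {source}
    while queue:
        fmt, path = queue.popleft()
        if fmt == target:
            return path
        for (src, dst) in SHIMS:
            if src == fmt and dst not in visited:
                visited.add(dst)
                queue.append((dst, path + [(src, dst)]))
    return None  # no conversion path between the two formats


def adapt(data: str, source: str, target: str) -> str:
    """Convert data by applying each shim in the chain that was found."""
    chain = find_shim_chain(source, target)
    if chain is None:
        raise ValueError(f"no shim chain from {source!r} to {target!r}")
    for step in chain:
        data = SHIMS[step](data)
    return data
```

With these toy shims, `adapt(">seq\nACGT", "fasta", "lowercase_raw")` first strips the header line and then lowercases the sequence. The semantic annotations are what make the search possible: without format labels on the data types, the system could not tell which shims are applicable.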
In analogy to the domain-specific service classifications that have been
outlined in the previous section, a detailed semantic description of the data
 