into a more efficient one. Most current literature on SQOs in XML focuses
on techniques that are either (1) general, applying to both persistent and
streaming XML sources, or (2) specific to a persistent XML source. For example,
query tree minimization is a general technique. It eliminates a pattern
from the query if the pattern is known to always exist. Since the pruned
query involves less computation than the original one, it is more efficient
to evaluate regardless of the nature of the data sources. As another example,
query rewriting using the state extents technique requires indices on
the data. Applications on persistent XML can usually afford the preprocessing
needed to build indices, while this is often not the case for XML stream
applications because their data arrive on the fly. Therefore,
this technique is more suitable for persistent XML.
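The idea behind query tree minimization can be illustrated with a minimal sketch. It is not the algorithm of any cited work. It assumes a query is a tree of element-name patterns connected by child edges only, and that the schema is summarized as a map from a parent element to the children every instance must contain. The names PatternNode, minimize, and required are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PatternNode:
    """One node of a tree-pattern query (hypothetical, child axis only)."""
    name: str
    children: list["PatternNode"] = field(default_factory=list)
    is_output: bool = False  # True if the query returns this node


def contains_output(node: PatternNode) -> bool:
    """Report whether this node or any descendant is returned by the query."""
    return node.is_output or any(contains_output(c) for c in node.children)


def minimize(node: PatternNode, required: dict[str, set[str]]) -> PatternNode:
    """Drop existence-only branches that the schema already guarantees.

    `required` maps a parent element name to the child names that every
    instance of the parent must contain (an assumed schema summary).
    """
    kept = []
    for child in node.children:
        pruned = minimize(child, required)
        guaranteed = pruned.name in required.get(node.name, set())
        # A branch is redundant when the child always exists, nothing below
        # it remains to be checked, and it carries no output node.
        if guaranteed and not pruned.children and not contains_output(pruned):
            continue
        kept.append(pruned)
    return PatternNode(node.name, kept, node.is_output)


# Assumed schema: every book has a title and an author; every author has a name.
required = {"book": {"title", "author"}, "author": {"name"}}

# Query: books that have a title, an author with a name, and a review,
# returning the price.
query = PatternNode("book", [
    PatternNode("title"),
    PatternNode("author", [PatternNode("name")]),
    PatternNode("price", is_output=True),
    PatternNode("review"),
])

minimized = minimize(query, required)
print([c.name for c in minimized.children])
```

Under the assumed schema, the title and author branches are always satisfied, so only the price and review patterns remain to be evaluated. Nothing in the sketch depends on indices or on seeing the whole document, which is why this kind of pruning applies equally to persistent and streaming sources.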
Being rich in semantic elements and expressions is an important feature
of the XML language itself, which means SQO can be used to optimize
queries on XML documents, and consequently on XML data streams.
The work in [36,38] focuses on SQOs specific to XML streams instead of
XML documents. To our knowledge, their work has proposed a comprehensive
solution for XML stream-specific SQO techniques. Earlier stream-specific
work handles only limited queries (i.e., Boolean XPath or XQuery match)
with one type of constraint. In contrast, the work in [36,38] first handles
more complex query types, namely a subset of XQuery, and second supports
the most commonly used constraints in XML schemas. However, the application
of SQO to XML data streams is still at an initial stage. Some progress can
be seen in [36,38], but it is far from complete and sound. According to the
above discussion, the techniques of SQO over XML data streams, although
still maturing, can be applied to the semantic optimization of
scientific workflows.
9.4.2 Execution of Optimized Scientific Workflow
High data and computation intensity is a fundamental feature of
scientific workflows. Scheduling scientific workflow execution means
allocating resources or services to the workflow instances effectively and
efficiently. Whatever the environment is, the goal of scheduling a
scientific workflow is lower expense and shorter time. Naturally, the
stream-like flow makes it possible to deploy optimization techniques from
DSMS to cut the system's space and time costs. When a workflow is defined,
the generated scientific workflow schema is refined by the schema
optimizer. This is a static optimization performed before the execution of
the workflow. During workflow execution, the runtime optimizer, as shown in
Figure 9.15, analyzes the information of instances and exchanges that
information with the SWFMS. Consequently, the runtime optimizer helps
the SWFMS reschedule the execution plan to improve system performance.
The optimization procedure itself is a kind of scheduling for
scientific workflow execution. So optimization research on XML data
streams will greatly enhance the scheduling of scientific workflows.
 