Global Positioning System Reference
In-Depth Information
span over several storage media. To establish quick access patterns, arrays
are partitioned in tiles or chunks of convenient size which are basic access
units during query evaluation (Baumann 1994). Additional geo indexes
help locate the tiles relevant to a query quickly.
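As a minimal sketch of how tiles can serve as access units, the following Python fragment computes which tiles a query box touches. It assumes a regular, aligned tiling with non-negative cell coordinates; the function name `tiles_for_query` and the tile size are illustrative and are not taken from any particular Array DBMS.

```python
from itertools import product


def tiles_for_query(lo, hi, tile_shape):
    """Return the coordinates of all tiles overlapping the inclusive box [lo, hi].

    Assumes a regular tiling aligned at the origin (hypothetical layout).
    """
    ranges = [
        range(low // size, high // size + 1)
        for low, high, size in zip(lo, hi, tile_shape)
    ]
    return list(product(*ranges))


# Query box over cells 150..260 (first axis) and 40..90 (second axis),
# with tiles of 100 x 100 cells.
touched = tiles_for_query((150, 40), (260, 90), (100, 100))
print(touched)
```

Only the tiles returned here would need to be read from storage; a real system would look them up through its index rather than by arithmetic alone.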
Array query languages, like SQL, give declarative access to arrays.
These queries are parsed, optimized, and executed to create, manipulate,
search, and delete arrays in flexible ways. The parser receives the query and
generates the operation tree. Then, algebraic optimization rules are applied
to the query tree where applicable. Leaving parallelism aside,
execution addresses the tiles sequentially. This tile-by-tile processing
strategy leads to an architecture that lets servers process arrays orders
of magnitude larger than main memory.
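The tile-by-tile strategy can be illustrated with a small Python sketch that computes a mean while holding only one tile in memory at a time. The generator `iter_tiles` is a hypothetical stand-in for reading tiles from storage, and the one-dimensional layout is an assumption made for brevity.

```python
def iter_tiles(n_cells, tile_size):
    """Yield (start, values) one tile at a time.

    Stands in for fetching tiles from disk; here the values are synthetic.
    """
    for start in range(0, n_cells, tile_size):
        stop = min(start + tile_size, n_cells)
        yield start, [float(i) for i in range(start, stop)]


def tiled_mean(tiles):
    """Aggregate over tiles sequentially, keeping only running totals."""
    total = 0.0
    count = 0
    for _start, values in tiles:
        total += sum(values)
        count += len(values)
    return total / count if count else float("nan")


mean = tiled_mean(iter_tiles(1_000_000, 4096))
print(mean)
```

Memory use is bounded by the tile size rather than by the array size, which is what allows arrays far larger than main memory to be processed.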
Extensions are made for achieving scalability. Normally, scalability on a
single machine is guided by parameters like the number of processor cores
and the amount of main memory. The trends in processor development are
towards increasing the number of cores in one chip, rather than increasing
the power of a single core. By processing each tile on separate nodes or cores,
parallel processing becomes a critical development paradigm for scalable
software, allowing full utilization of these new processor architectures.
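A hedged sketch of per-tile parallelism: each tile is reduced independently and the partial results are combined afterwards. The example uses Python's `concurrent.futures` with a thread pool so that it runs anywhere; for CPU-bound pure-Python work, a `ProcessPoolExecutor` would be the usual way to occupy several cores. The tile contents and the helper names are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def tile_sum(tile):
    """Per-tile work unit; independent of every other tile."""
    return sum(tile)


def parallel_sum(tiles, executor):
    """Evaluate each tile on the executor, then combine the partial results."""
    partials = executor.map(tile_sum, tiles)
    return sum(partials)


# Eight synthetic tiles of 1000 cells each (hypothetical data).
tiles = [list(range(i, i + 1000)) for i in range(0, 8000, 1000)]

with ThreadPoolExecutor(max_workers=4) as pool:
    total = parallel_sum(tiles, pool)

print(total)
```

This works because the aggregation decomposes into independent per-tile steps plus a cheap combine step; operations without that property need more coordination between workers.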
At a certain point, the hardware resources of a single machine will not be
enough to handle all tasks. It becomes necessary to distribute the data and
workload to further machines (nodes) according to some strategy. Multiple
machines present new challenges, however, such as limited connection speed
between nodes, the need to optimize data distribution, and the need to
minimize data movement between nodes. This drives database development towards distributed
and cloud computing. When data duplication would be inevitable with
the standard storage management mechanisms, the in situ processing
capability is an alternative way of adding value to legacy systems and
preserving scalability.
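To make the idea of distributing tiles "according to some strategy" concrete, here is a minimal sketch of one possible strategy: round-robin placement on the row-major tile index. It is only an assumption for illustration, not the scheme of any specific system, and the names `node_for_tile` and `plan_by_node` are hypothetical.

```python
from collections import defaultdict


def node_for_tile(tile, grid_shape, n_nodes):
    """Map a tile coordinate to a node via its row-major linear index."""
    index = 0
    for coord, extent in zip(tile, grid_shape):
        index = index * extent + coord
    return index % n_nodes


def plan_by_node(tiles, grid_shape, n_nodes):
    """Group the tiles a query needs by the node that stores them."""
    plan = defaultdict(list)
    for tile in tiles:
        plan[node_for_tile(tile, grid_shape, n_nodes)].append(tile)
    return dict(plan)


# A 2 x 2 grid of tiles spread over three nodes.
plan = plan_by_node([(0, 0), (0, 1), (1, 0), (1, 1)], grid_shape=(2, 2), n_nodes=3)
print(plan)
```

Grouping by owning node lets each node work on its local tiles, so only partial results rather than whole tiles cross the network. Round-robin spreads load evenly but ignores spatial locality; other placements trade these properties differently.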
Parallel processing
Parallel databases seek to improve performance by parallelizing all steps
involved in the query evaluation whenever possible. Parallel processing
in the context of Array DBMS specifically means data parallelism (Hahn et
al. 2002), which focuses on data distribution across different computing
nodes for parallel evaluation. Pipeline parallelism, which is widely used in
RDBMS, is not particularly suitable, as the granularity of query evaluation is
much larger with array tiles and a pipeline buffer would quickly overflow
the main memory (Hahn et al. 2002).
Parallel DBMS typically exploit one of the following three architectures.
In shared-memory systems, CPUs are interconnected and have access to a
common memory region. CPUs in a shared-disk architecture have access