Global Positioning System Reference
capabilities is the perfect candidate for this scenario. The data is registered
with the database, and typically only selection queries will be allowed,
whereas updates are left to mechanisms already in place in the data
repository. For example, rasdaman (Baumann 1994) implements in situ
processing mainly for this purpose. Registering data for in situ processing in
rasdaman is possible with a simple extension of the regular insert statement
(Baumann 2013):
    insert into collName referencing (typeName) filePath [domain], ..., filePath [domain]
For example, registering a grayscale TIFF image test.tif of size 1000x1000
would be done with:
    insert into GreyColl referencing (GreyImage) "/path/test.tif" [0:999,0:999]
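When many files have to be registered, the statement can be generated rather than typed by hand. The following Python sketch only assembles the statement text in the form shown above; the helper name build_in_situ_insert and its argument layout are hypothetical and not part of rasdaman, and sending the statement to a server is left out.

```python
def build_in_situ_insert(collection, type_name, files):
    """Assemble an in situ insert statement in the form shown above.

    Hypothetical helper, not a rasdaman API.
    files: list of (file_path, domain) pairs, where domain is a list of
    (low, high) inclusive bounds, one pair per axis.
    """
    if not files:
        raise ValueError("at least one file reference is required")
    refs = []
    for path, domain in files:
        # Render the domain as low:high per axis, e.g. 0:999,0:999
        axes = ",".join(f"{lo}:{hi}" for lo, hi in domain)
        refs.append(f'"{path}" [{axes}]')
    return f"insert into {collection} referencing ({type_name}) " + ", ".join(refs)


# The grayscale TIFF example from the text: 1000x1000 pixels.
statement = build_in_situ_insert(
    "GreyColl", "GreyImage", [("/path/test.tif", [(0, 999), (0, 999)])]
)
print(statement)
```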
In situ processing may also be preferred by regular desktop users, especially
when they often work with GIS tools that have no connectors for the particular
DBMS and that work best and fastest with the filesystem. “Locking” the data
in a database would require exporting it first, before making use of it with
such tools. When the data is registered in situ, however, it can easily be
accessed by other software, and database queries can still be performed on it.
Support for in situ in PostGIS Raster is mainly motivated by this use case, as
PostGIS Raster focuses on 2D GIS raster data.
The main disadvantage of working with data in situ is the lack of
adaptability to the variety aspect of Big Data. In the case of image timeseries
analysis this becomes particularly evident. Images are inserted into the
archive slice by slice in time, and this is how they are usually stored.
In contrast, knowing that temporal queries are important, an Array DBMS
can rearrange incoming data into time “columns” that give access to a
particular location's timeseries in a single disk access. Hence, the advantage
of circumventing data copying has to be balanced against a possible loss of
performance in query evaluation. Ingesting data into a database usually
involves translating the data into an internal format, optimized towards the
pattern of queries most commonly exercised on the particular dataset. The data
is typically broken up and stored as tiles (also called chunks); a detailed
view of how this is done in rasdaman is given in Baumann (2012). Combined with
an appropriate tiling and indexing strategy, e.g., a redundant tile schema or
an index on the in situ data, fast access to the tiles of interest can be
enabled. In situ evaluation circumvents these mechanisms, as the original data
will not be modified.
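The effect of the tiling choice on the two query patterns can be illustrated by counting how many tiles a query has to touch. The Python sketch below is a simplified model, not rasdaman's tiling code: it assumes regular tiles aligned to the origin, and the tile shapes and the 100-slice archive are made-up figures for illustration.

```python
from math import prod


def tiles_touched(query, tile_shape):
    """Count the tiles intersected by an axis-aligned query box.

    query: list of (low, high) inclusive bounds per axis.
    tile_shape: tile extent per axis; tiles are assumed aligned to the origin.
    """
    return prod(
        hi // size - lo // size + 1
        for (lo, hi), size in zip(query, tile_shape)
    )


# Axes are (x, y, t): 100 time slices of a 1000x1000 image (assumed figures).
pixel_series = [(500, 500), (500, 500), (0, 99)]   # one location, all times
single_slice = [(0, 999), (0, 999), (50, 50)]      # whole image, one time

slice_tiling = (1000, 1000, 1)    # one tile per time slice, as archived in situ
column_tiling = (100, 100, 100)   # hypothetical time "columns" built at ingestion

series_on_slices = tiles_touched(pixel_series, slice_tiling)
series_on_columns = tiles_touched(pixel_series, column_tiling)
image_on_slices = tiles_touched(single_slice, slice_tiling)
image_on_columns = tiles_touched(single_slice, column_tiling)
print(series_on_slices, series_on_columns, image_on_slices, image_on_columns)
```

Under these assumptions the timeseries query touches every slice when the data stays in its original layout but only one tile in the column layout, while the single-image query shows the opposite behaviour. This is the sense in which a tiling is optimized towards one pattern of queries.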