capabilities is the perfect candidate for this scenario. The data is registered
with the database, and typically only selection queries will be allowed,
whereas updates are left to mechanisms already in place in the data
repository. For example, rasdaman (Baumann 1994) implements in situ
processing mainly for this purpose. Registering data for in situ processing in
rasdaman is possible with a simple extension of the regular insert statement
(Baumann 2013):
insert into collName referencing
( typeName )
filePath [ domain ],
...,
filePath [ domain ]
For example, registering a grayscale TIFF image test.tif of size 1000x1000
would be done with:
insert into GreyColl referencing ( GreyImage ) "/path/test.tif"
[0:999,0:999]
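The relation between an image's size and its registered domain (a 1000x1000 image maps to [0:999,0:999]) can be illustrated with a small Python sketch. It only composes the statement text following the syntax quoted above and does not connect to rasdaman; the helper names domain_for_shape and referencing_insert are hypothetical, and zero-based domains and double-quoted paths are assumptions taken from the example.

```python
def domain_for_shape(shape):
    """Return a zero-based domain string, e.g. (1000, 1000) -> [0:999,0:999]."""
    return "[" + ",".join(f"0:{n - 1}" for n in shape) + "]"


def referencing_insert(coll_name, type_name, files):
    """Compose an 'insert ... referencing' statement as plain text.

    files is a list of (path, shape) pairs; this hypothetical helper only
    formats a string and performs no validation or database access.
    """
    entries = ",\n  ".join(
        f'"{path}" {domain_for_shape(shape)}' for path, shape in files
    )
    return f"insert into {coll_name} referencing ( {type_name} )\n  {entries}"


if __name__ == "__main__":
    print(referencing_insert("GreyColl", "GreyImage",
                             [("/path/test.tif", (1000, 1000))]))
```

Several files can be registered in one statement by passing more (path, shape) pairs, which mirrors the comma-separated list in the syntax.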
In situ may also be preferred by regular desktop users, especially when
they often work with GIS tools that do not have connectors for the particular
DBMS, and work best and fastest with the filesystem. “Locking” the data
in a database would require exporting it first, before making use of it with
such tools. By registering the data in situ, however, it can be easily accessed
by other software, and database queries can still be performed on it. Support
for in situ in PostGIS Raster is mainly motivated by this use case, as PostGIS
Raster focuses on 2D GIS raster data.
The main disadvantage of working with data in situ is the lack of
adaptability to the variety aspect of Big Data. In the case of image timeseries
analysis this becomes particularly evident. Images are inserted into the
archive slice by slice in time, and this is how they are usually stored.
In contrast, knowing that temporal queries are important, an Array
DBMS can rearrange incoming data into time “columns” that give access
to a particular location's timeseries in a single disk access. Hence, the advantage of
circumventing data copying has to be balanced against a possible loss of
performance in query evaluation. Ingesting data into a database
usually involves translating the data into an internal format, optimized
towards the pattern of queries most commonly run against
the particular dataset. The data is typically split into tiles (also
called chunks) and stored as such; a detailed view of how this is done in rasdaman is given
in Baumann (2012). Combined with an appropriate tiling and indexing
strategy, e.g., a redundant tile schema or an index on the in situ data, fast
access to the tiles of interest can be enabled. In situ evaluation circumvents these
mechanisms, as the original data will not be modified.
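The trade-off between time-slice storage and time “columns” can be made concrete by counting how many tiles a query has to touch. The following Python sketch assumes a regular, axis-aligned tile grid anchored at the origin; the cube dimensions, tile shapes, and the function name tiles_touched are hypothetical and are not taken from rasdaman or any other system.

```python
def tiles_touched(query, tile_shape):
    """Count regular-grid tiles intersected by an inclusive query box.

    query: one (lo, hi) pair of inclusive cell indices per axis.
    tile_shape: tile extent in cells per axis; the grid is assumed to start at 0.
    """
    count = 1
    for (lo, hi), extent in zip(query, tile_shape):
        # Number of tile indices spanned along this axis.
        count *= hi // extent - lo // extent + 1
    return count


# Hypothetical cube: 365 daily images of 1000x1000 cells, axes (t, y, x).
pixel_timeseries = [(0, 364), (500, 500), (250, 250)]  # one location, all time steps
single_image = [(10, 10), (0, 999), (0, 999)]          # one time step, full extent

slice_tiling = (1, 1000, 1000)  # one tile per image, as the files arrive in situ
column_tiling = (365, 50, 50)   # assumed time-"column" tiling chosen at ingestion

if __name__ == "__main__":
    for name, tiling in [("slices", slice_tiling), ("columns", column_tiling)]:
        print(name,
              tiles_touched(pixel_timeseries, tiling),
              tiles_touched(single_image, tiling))
```

Under these assumptions the timeseries query touches 365 tiles in the slice layout but only 1 in the column layout, while retrieving a single full image touches 1 tile in the slice layout and 400 in the column layout, which is why the tiling has to match the dominant query pattern.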