PDS system information and configure policies. The entry points may be called
directly or via web services to enable flexible and platform-independent use of
PDS. The PDS interfaces aim to be abstract and technology-independent, and to
survive implementation replacements. The entry points may throw various
exceptions, which are also defined as PDS interfaces.
The main functions PDS provides are:
1. Ingest and access: various methods to ingest and access AIPs packaged in
XFDU [193] or SAFE formats. The ingest operation consists of unpacking the
AIP, assigning an AIP identifier, validating and computing its fixity, updating
its provenance and reference, and storing each section separately for future
access and manipulation. Access includes fetching and validating the data and
“metadata” of the AIP. Each section of the AIP (content data, RepInfo, fixity,
provenance, etc.) may be accessed separately. However, PDS encapsulates data
and “metadata” at the storage level and attempts to physically co-locate them on
the same media.
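The ingest and access steps above can be sketched as follows. This is a minimal illustration, not the PDS implementation: the functions `ingest` and `access`, the dict used as a stand-in for the storage layer, the choice of SHA-256, and the key names are all assumptions made for the example, and the package is assumed to be already unpacked into named sections.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def ingest(package, store):
    """Ingest an already-unpacked package (section name -> bytes) into `store`.

    `store` is a plain dict standing in for the storage layer (an assumption).
    """
    aip_id = uuid.uuid4().hex  # assign an AIP identifier
    # Compute one fixity value per section (SHA-256 chosen for illustration).
    fixity = {name: hashlib.sha256(data).hexdigest()
              for name, data in package.items()}
    # Record the ingest itself as a provenance event.
    event = {"event": "ingest", "aip": aip_id,
             "time": datetime.now(timezone.utc).isoformat()}
    # Store each section under its own key so it can be fetched separately.
    for name, data in package.items():
        store[(aip_id, name)] = data
    store[(aip_id, "fixity")] = fixity
    store[(aip_id, "provenance_events")] = [event]
    return aip_id


def access(store, aip_id, section):
    """Fetch one section and validate it against the fixity recorded at ingest."""
    data = store[(aip_id, section)]
    if hashlib.sha256(data).hexdigest() != store[(aip_id, "fixity")][section]:
        raise ValueError(f"fixity mismatch in section {section!r} of AIP {aip_id}")
    return data
```

Keying the store by (AIP identifier, section name) is one simple way to model separate access to each section; it says nothing about how PDS co-locates data and “metadata” on physical media.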
2. AIP generation: generation of preservation “metadata” and creation of AIPs for
cases where only bare content data is ingested into PDS.
3. “Metadata” enrichment: automatic extraction of “metadata” from the submitted
content data and addition of representation information and/or PDI to the
stored AIP. Third-party “metadata” extractors for different data types can be
easily added via an API that PDS provides.
4. RepInfo management: allows sharing, search and categorization of RepInfo
[194]. Given the expected vast amount of RepInfo, the RepInfo manager employs
a sharing architecture in which RepInfo is grouped into expandable categories,
and the AIPs point to the categories rather than directly to their associated
RepInfo. This architecture allows the categories to be updated and expanded
without the need to update existing RepInfo. In addition to storing the
RepInfo of the content data, PDS also stores RepInfo of the “metadata” (fixity,
provenance, etc.) so that these “metadata” can be interpreted when accessed in
the future.
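The category indirection described above can be illustrated with a small sketch. The class `RepInfoManager`, its method names, and the identifiers used are hypothetical and chosen only for the example; they are not the PDS API.

```python
class RepInfoManager:
    """Toy model: AIPs reference a category, and categories hold RepInfo."""

    def __init__(self):
        self._categories = {}    # category name -> list of RepInfo identifiers
        self._aip_category = {}  # AIP identifier -> category name

    def add_repinfo(self, category, repinfo_id):
        # Expanding a category does not touch any AIP record.
        self._categories.setdefault(category, []).append(repinfo_id)

    def assign_category(self, aip_id, category):
        # The AIP stores only a pointer to the category.
        self._aip_category[aip_id] = category

    def repinfo_for(self, aip_id):
        # Resolve the AIP's RepInfo through its category at lookup time.
        category = self._aip_category[aip_id]
        return list(self._categories.get(category, []))
```

In this sketch, RepInfo added to a category after an AIP was assigned to it is returned by the next `repinfo_for` call for that AIP, because the lookup goes through the category rather than through a per-AIP list.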
5. Fixity management: the calculation of fixity and its documentation in the AIP
ensure that the particular content data object has not been altered in an
undocumented manner. PDS enables one to compute and validate fixity (data
integrity) within the storage component. This reduces the risk of data loss and
frees up network bandwidth otherwise required for transferring the data. PDS
provides an extensible mechanism to compute fixity values based on specified
algorithms, and the computations are performed separately on various parts of
the AIP. The resulting fixity values are stored in the fixity section of the
AIP in a standard PREMIS (v2) format [139]. Each calculation may later be
validated by accessing the given AIP and running a complementary fixity
validation routine. New fixity algorithms can be easily added by uploading an
execution module (storlet) via an API that PDS provides.
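A pluggable fixity mechanism of the kind described above might look like the following sketch. The registry `FIXITY_ALGORITHMS` and the functions `register_algorithm`, `compute_fixity` and `validate_fixity` are hypothetical names introduced for illustration; the real storlet upload API and the PREMIS serialization of the results are not shown.

```python
import hashlib

# Hypothetical registry of fixity algorithms: name -> function(bytes) -> str.
FIXITY_ALGORITHMS = {
    "SHA-256": lambda data: hashlib.sha256(data).hexdigest(),
    "SHA-512": lambda data: hashlib.sha512(data).hexdigest(),
}


def register_algorithm(name, func):
    """Add a new algorithm (stands in for uploading a fixity module)."""
    FIXITY_ALGORITHMS[name] = func


def compute_fixity(sections, algorithm):
    """Compute a separate fixity value for each part of the AIP."""
    digest = FIXITY_ALGORITHMS[algorithm]
    return {name: digest(data) for name, data in sections.items()}


def validate_fixity(sections, algorithm, recorded):
    """Recompute fixity and compare it with the recorded values, per part."""
    current = compute_fixity(sections, algorithm)
    return {name: current.get(name) == value for name, value in recorded.items()}
```

Because values are computed per section, a later validation can report which part of the AIP changed rather than only that the AIP as a whole no longer matches.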
6. Data transformations: provides the ability to load transformation modules
(storlets) and apply them to AIPs at the storage level. When a transformation is