If we had a different definition for our Designated Community, for example a
present-day professional astronomer, then such a person would not need to be provided
with all of this Representation Information. However, in the future, say 30 years
ahead, a professional astronomer may not be familiar with, for the sake of example,
XML. This is a reasonable possibility when one considers that XML did not exist
30 years ago and might not be in use in 30 years' time. Therefore one must be able
to supply that piece of Representation Information at that future time.
We link the end of the recursion to the Knowledge Base of the Designated
Community. The CEDARS [26] project, however, referred to Gödel ends. They
argued by analogy with Gödel's Theorem, which states “any logical system has
to be incomplete”, that “representation nets must have ends corresponding to
formats that are understood without recourse to information in the archive, e.g. plain
text using the ASCII character set, the Posix API.” Although the analogy is quite
nice, the difference is that it is hard to see where the net ends without using the
concept of a Designated Community. Without that concept the repository would not be
testable, because one would not know whom to use as a test subject (a 3-year old? a
bushman?).
Moreover, a problem with Representation Information is that the amount needed
for a particular object could be vast, too large to handle in practice.
It is for that reason that the concept of the Designated Community is so important.
It allows us to limit the Representation Information required to be captured at any
one time, and it makes the judgement of how much is enough testable.
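The way the Knowledge Base ends the recursion can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the network
REP_INFO_NET, the object name catalogue.xml, the contents of the two Knowledge
Bases and the function required_rep_info are all hypothetical and do not describe
any particular repository.

```python
# Hypothetical Representation Information network (illustrative names only):
# each item maps to the further items needed to understand it.
REP_INFO_NET = {
    "catalogue.xml": ["XML", "catalogue schema"],
    "catalogue schema": ["XML", "English"],
    "XML": ["Unicode"],
    "Unicode": [],
    "English": [],
}


def required_rep_info(item, knowledge_base, net=REP_INFO_NET):
    """Collect the Representation Information that must be supplied for
    `item`, stopping at anything already in the Knowledge Base."""
    needed = set()
    stack = list(net.get(item, []))
    while stack:
        current = stack.pop()
        if current in knowledge_base or current in needed:
            continue  # already understood, or already collected
        needed.add(current)
        stack.extend(net.get(current, []))
    return needed


# Assumed Knowledge Bases for two Designated Communities.
astronomer_today = {"XML", "catalogue schema", "Unicode", "English"}
astronomer_future = astronomer_today - {"XML"}  # XML no longer familiar

print(sorted(required_rep_info("catalogue.xml", astronomer_today)))
print(sorted(required_rep_info("catalogue.xml", astronomer_future)))
```

Under these assumptions nothing has to be supplied to today's astronomer, while
the future astronomer needs the XML item. With an empty Knowledge Base every item
in the network would be collected, which is the unbounded case that the Designated
Community is meant to avoid.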
6.3.2 Preservation Issues
Given a file or a stream of bits, how does one know what Representation Information
is needed? This question applies to Representation Information itself as well as to
the digital objects we are primarily interested in preserving and using; how does one
know, for example, whether this thing is in FITS format?
1. Someone may simply know what it is and how to deal with it, i.e. the bits are
within the Knowledge Base.
2. One may have a pointer to the appropriate Representation Information.
3. One may be able to recognise the format by looking for various types of patterns,
as the UNIX file command does.
4. One may feed the bits into all available interpreters to see which ones accept the
data as valid.
5. Other means.
Of the above, if (1) does not apply then only (2) is reliable, because (3) and (4)
rely on some form of pattern recognition and there is no guarantee that any pattern
is unique. Even if the File Format can be identified uniquely (perhaps using the UNIX
file command), the associated semantics will almost certainly not be guessable with
any real certainty.
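Approaches (3) and (4), and the reason they are not reliable, can be illustrated
with a minimal Python sketch. The signature table and the three validators are
assumptions made for this example and are far smaller than what a tool such as the
UNIX file command uses; the function names guess_by_signature and
guess_by_validation are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Approach (3): look for known leading byte patterns. These are the usual
# opening bytes of the named formats; the table is illustrative only.
SIGNATURES = {
    "FITS": b"SIMPLE  =",
    "XML": b"<?xml",
    "PDF": b"%PDF-",
}


def guess_by_signature(data):
    """Return every format whose signature matches the start of `data`."""
    return [name for name, magic in SIGNATURES.items() if data.startswith(magic)]


def _accepts(parse, data, errors):
    """Run one interpreter and report whether it accepts the data."""
    try:
        parse(data)
        return True
    except errors:
        return False


# Approach (4): feed the bits to every available interpreter.
VALIDATORS = {
    "XML": lambda d: _accepts(ET.fromstring, d, ET.ParseError),
    "JSON": lambda d: _accepts(json.loads, d, ValueError),
    "ASCII text": lambda d: _accepts(lambda b: b.decode("ascii"), d, UnicodeDecodeError),
}


def guess_by_validation(data):
    """Return every format whose interpreter accepts `data` as valid."""
    return [name for name, accepts in VALIDATORS.items() if accepts(data)]


fits_start = b"SIMPLE  =" + b" " * 20 + b"T"  # start of a FITS primary header
print(guess_by_signature(fits_start), guess_by_validation(fits_start))
print(guess_by_signature(b"123"), guess_by_validation(b"123"))
```

In this sketch the FITS header is recognised by its signature but is also accepted
as ASCII text, and the bytes 123 match no signature yet are accepted as both JSON
and ASCII text. Neither result says anything about what the values mean, which is
why a pointer to the appropriate Representation Information remains necessary.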