PRONOM [ 54 ] would suggest such a file is a Plain Text File, although clearly this
provides just a suggestion for the file type since a file is easily renamed.
The MIME-type [ 55 ] is a more positive declaration of the file type in internet
messaging.
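As a minimal sketch of how weak an extension-based suggestion is, the following uses the mimetypes module from the Python standard library, which maps a file name to a MIME-type by looking only at the extension. The function name suggest_mime_from_name and the file names are hypothetical, chosen purely for illustration; PRONOM itself is not consulted here.

```python
import mimetypes
from typing import Optional


def suggest_mime_from_name(filename: str) -> Optional[str]:
    """Return the MIME-type suggested by the file name alone, or None.

    The file is never opened, so the answer depends entirely on the
    extension and changes as soon as the file is renamed.
    """
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type


# Hypothetical file names: the same image data, before and after renaming.
print(suggest_mime_from_name("holiday_photo.jpg"))
print(suggest_mime_from_name("holiday_photo.txt"))
```

Because the contents are never inspected, renaming the file is enough to change the suggestion, which is why the extension can only ever be a hint.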
Many binary (i.e. non-text) files start with a bit sequence, often known as a
“magic” number [ 56 ], which can be used to suggest the file type. Some amusing
examples are:
Compiled Java class files (bytecode) start with the hexadecimal code CAFEBABE.
Old MS-DOS .exe files and the newer Microsoft Windows PE (Portable
Executable) .exe files start with the ASCII string “MZ” (4D 5A), the initials of
the designer of the file format, Mark Zbikowski.
The Berkeley Fast File System superblock format is identified as either 19 54 01
19 or 01 19 54 depending on version; both represent the birthday of the author,
Marshall Kirk McKusick.
8BADF00D is used by Apple as the exception code in iPhone crash reports when
an application has taken too long to launch or terminate.
The magic number is again not definitive, since a particular short pattern could
be present at the start of a file by coincidence.
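A check of this kind can be sketched in a few lines of Python. The signature table below is an assumption made for illustration and holds only the two file-start signatures from the examples above (CAFEBABE and "MZ"); the names MAGIC_SIGNATURES, suggest_type and suggest_file_type are hypothetical, and a real identification tool would consult a far larger database.

```python
from typing import Optional

# Illustrative table only: (leading bytes, suggested file type).
MAGIC_SIGNATURES = [
    (bytes.fromhex("CAFEBABE"), "Compiled Java class file"),
    (b"MZ", "MS-DOS or Windows PE executable"),
]


def suggest_type(data: bytes) -> Optional[str]:
    """Suggest a file type from the leading bytes, or None if unknown."""
    for signature, label in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return label
    return None


def suggest_file_type(path: str) -> Optional[str]:
    """Read only the first few bytes of the file at `path` and suggest a type."""
    with open(path, "rb") as handle:
        return suggest_type(handle.read(8))


# A plain text note that merely happens to begin with the letters "MZ"
# is also reported as an executable, so the result is only a suggestion.
print(suggest_type(b"MZ is how this text note happens to begin"))
```

The last line illustrates the caveat: a short signature matches any data that happens to start with the same bytes, so the result should be treated as a hint to be confirmed by other means.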
Well known to Unix/Linux users, but not to Windows users, the file command
is used to determine the file type of digital objects using more sophisticated
algorithms. The file command uses the “magic” database [ 57 ] which allows it to
identify many thousands of file types. A summary of file identification techniques is
available [ 58 ]. Tools such as DROID [ 59 ] and JHOVE [ 60 ] provide file type
identification, albeit for a more limited number of file types (a few hundred at the time of
writing), but they do provide additional Provenance for these formats.
7.5 Semantic Representation Information
Semantic (Representation) Information supplements Structure (Representation)
Information by adding meaning to the data elements which the latter allows one
to extract. Chapter 8 provides a much extended view of semantics but here it is
worth providing a few basic techniques.
7.5.1 Simple Semantics
Data Dictionaries provide fairly simple definitions. A largely self-explanatory
example using the CCSDS/ISO Data Entity Dictionary Specification Language
(DEDSL) [ 61 ] is: