values. Then eight of these four bit integers could be packed into a standard 32 bit
integer. The alternative would be to have eight 8 or 16 bit integers (depending on
what the programming language natively supported). The fact remains that a set of
bits can be used to represent any information.
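The packing described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and placing the first value in the least significant four bits is an assumed convention, not one stated in the text.

```python
def pack_nibbles(values):
    """Pack eight 4-bit integers (each 0..15) into one 32-bit integer."""
    if len(values) != 8:
        raise ValueError("expected exactly eight values")
    word = 0
    for i, value in enumerate(values):
        if not 0 <= value <= 0xF:
            raise ValueError("each value must fit in four bits")
        # Assumed convention: value i occupies bits 4*i .. 4*i + 3.
        word |= value << (4 * i)
    return word


def unpack_nibbles(word):
    """Recover the eight 4-bit integers from a packed 32-bit integer."""
    return [(word >> (4 * i)) & 0xF for i in range(8)]
```

Packing in this way stores eight values in 4 octets, whereas eight separate 8 bit integers would take 8 octets and eight 16 bit integers would take 16.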
7.3.2 Logical Structure Information
Strings and text files have been discussed above; in the case of structured
strings, their structure can be broken down into sub-structures (sub-strings).
Similarly, any binary file can be broken down into sub-structures, ending in
individual DVs of a given PDT. We will now concentrate on the logical structure
of binary files. Binary (non-text) files can, however, also contain strings, which
are usually a fixed number of characters of a given character set. These strings
may also have structure, which can be further described by a BNF-type description
or by regular expressions.
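As a rough illustration of a fixed-length string embedded in binary data, the sketch below reads a record and then describes the structure of the string with a regular expression. The record layout (an 8-character ASCII field followed by a big-endian 32 bit unsigned integer), the label pattern and all names are hypothetical, chosen only for this example.

```python
import re
import struct

# Hypothetical record layout: 8 ASCII characters, then a big-endian
# 32-bit unsigned integer (12 octets in total, no padding).
RECORD = struct.Struct(">8sI")

# Hypothetical sub-structure of the string: three uppercase letters,
# a hyphen, then four decimal digits (e.g. "ABC-0042").
LABEL_PATTERN = re.compile(r"(?P<prefix>[A-Z]{3})-(?P<number>[0-9]{4})")


def parse_record(data):
    """Split one binary record into (prefix, number, count)."""
    raw_label, count = RECORD.unpack(data[:RECORD.size])
    label = raw_label.decode("ascii")
    match = LABEL_PATTERN.fullmatch(label)
    if match is None:
        raise ValueError("label does not match the expected structure")
    return match.group("prefix"), int(match.group("number")), count
```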
We can view binary data as just a stream of DVs of a given PDT. But this simple
view is not usually helpful, as it does not allow us to locate DVs that may be of
particular interest, nor does it allow us to logically group together DVs that belong
together such as, for example, a column of data values from a table of data. With
binary data, DVs or groups of DVs can usually be located exactly if the logical
structure is known in advance. The next sections show the common methods used
in binary data that facilitate the logical structuring of DVs.
7.3.2.1 Location of Data Values
Numerous data file formats use offsets to locate DVs or sub-structures in binary
data. For example, TIFF [ 44 ] image files contain an octet (byte) offset of the first
Image File Directory (IFD) sub-structure, where an IFD contains information about
an image and further offsets to the image data. The offset in this case is a 32 bit
integer which gives the number of octets from the beginning of the file. Offsets
are usually expressed in data as integers, but the actual value may count bits,
octets or some other unit, which must be known in order to calculate the location
exactly. Offsets may also be calculated from one or more DVs in the data, which
requires the expression for the calculation to be stated in the structure RepInfo.
In NetCDF [ 45 ] the location of the DVs for a given variable (collection of DVs) is
calculated from a few DVs in the file, i.e. the initial offset of the variable in octets
from the start of the file, the size in bits of the DVs and the dimensions of the
variable (a one-, two- or three-dimensional array, etc.).
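Both kinds of offset can be sketched in a few lines. The first function reads the stored offset of the first IFD from the 8-octet header of a classic TIFF file (byte-order mark, the number 42, then the 32 bit offset). The second computes an offset from a starting offset, an element size in bits and the array dimensions; it assumes a simple row-major layout with octet-aligned elements, which is a simplification and not a full account of how NetCDF lays out its variables. The function names are hypothetical.

```python
import struct


def first_ifd_offset(header):
    """Return the octet offset of the first IFD from a classic TIFF header."""
    if len(header) < 8:
        raise ValueError("a TIFF header is 8 octets long")
    byte_order = header[:2]
    if byte_order == b"II":      # little-endian file
        endian = "<"
    elif byte_order == b"MM":    # big-endian file
        endian = ">"
    else:
        raise ValueError("unknown byte-order mark")
    # 16-bit magic number followed by the 32-bit offset of the first IFD.
    magic, offset = struct.unpack(endian + "HI", header[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF header")
    return offset


def element_offset(start_octets, element_bits, shape, index):
    """Octet offset of one element, assuming a row-major array layout."""
    if len(shape) != len(index):
        raise ValueError("index and shape must have the same rank")
    linear = 0
    for size, i in zip(shape, index):
        if not 0 <= i < size:
            raise IndexError("index out of range")
        linear = linear * size + i   # last dimension varies fastest
    bit_offset = linear * element_bits
    if bit_offset % 8:
        raise ValueError("element does not start on an octet boundary")
    return start_octets + bit_offset // 8
```

For instance, with a variable starting at octet 100, 32 bit DVs and dimensions 3 by 4, the element at index (1, 2) is the seventh element and so starts 24 octets after the start of the variable.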
Markers may also be used to locate DVs or sub-structures and also to indicate the
type of sub-structure. The FITS file format [ 46 ] uses markers to indicate the type of a
given sub-structure. For example, a FITS file can contain several types of data
structure (as described in Sect. 4.1 ) such as table data, image data etc. Each of these
sub-structures is indicated with a marker; in the case of table data the marker is an
ASCII string with the value “TABLE”. The end of the data sub-structure corresponding to