string structure complexity increases making them very difficult to understand and
interpret.
The two main reasons (there are others) that languages such as BNF and regular
expressions are required become obvious when the task of storing data in text files
is considered. Data values in text files, such as floating point values, can exist
as variable length strings (variable number of characters/precision) and they can
be separated by delimiters and variable numbers of white spaces (spaces, tabs etc.).
Defining the exact location and size (in terms of the number of bits) of a given
floating point value in text data is usually not possible. In contrast, for non-text
data files, the exact size in bits and the location (typically measured as an offset
in bits from the start of the file or the last occurring value) of the data value are
usually known (or can be calculated) exactly; see the discussion of logical structure
below for details.
For strings and text data, therefore, a mechanism is needed for specifying that a
data value can contain a variable number of characters and is separated from its
neighbours by zero or more white spaces and a delimiter. BNF and regular
expressions meet this need because they allow such statements to be made formally.
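As an illustration of such a formal statement, the following is a minimal sketch in Python of a regular expression describing a line of floating point values. It assumes a comma as the delimiter and spaces or tabs as white space; the names FLOAT, RECORD, VALUE and parse_record are hypothetical and chosen only for this example.

```python
import re

# Assumed pattern for one decimal floating point value: optional sign,
# digits with an optional fraction (or a bare fraction), optional exponent.
FLOAT = r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?"

# A record is one or more values separated by a comma delimiter, where the
# delimiter may be surrounded by any number of spaces or tabs.
RECORD = re.compile(rf"[ \t]*{FLOAT}(?:[ \t]*,[ \t]*{FLOAT})*[ \t]*")
VALUE = re.compile(FLOAT)


def parse_record(line: str) -> list[float]:
    """Return the values in `line`, or raise ValueError if it does not match."""
    if RECORD.fullmatch(line) is None:
        raise ValueError(f"line does not match the record pattern: {line!r}")
    # Each value may have a different number of characters, so its position
    # is found by matching the pattern rather than by a fixed offset.
    return [float(token) for token in VALUE.findall(line)]


if __name__ == "__main__":
    print(parse_record("1.5,  -2.25e3 ,\t.5"))
```

The pattern states only what a value looks like and how values are separated; nothing in it fixes the size or offset of any individual value.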
Strings and text data cannot normally be treated in the same way as other binary
data, even though at their lowest level they are indeed bit sequences (just a sequence
of characters of a given character set). Strings and text data are some of the most
complex forms of data to describe structurally. Research into formal grammars and
languages is still ongoing and is far too complex a topic to be described in detail
here. Nevertheless, when looking for structure RepInfo for string and text data,
some formal grammar should be sought. In the case of very simple text data it may
be sufficient to have a document describing the string structure.
The length of a string may also be dynamic: it may be given by the value of
another DV in the data file, or it may be calculated via an expression using one or
more DVs in the data file.
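Both cases can be sketched in Python as follows. The layouts are assumptions made for this example only: in the first, a 2-byte big-endian unsigned integer holds the string length in bytes; in the second, two 1-byte values (here called count and width) are multiplied to give the length. The function names are hypothetical, and ASCII text is assumed.

```python
import struct


def read_prefixed_string(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Read a string whose length is the value of the preceding data value.

    Returns the decoded string and the offset of the byte that follows it.
    """
    (length,) = struct.unpack_from(">H", buf, offset)  # assumed 2-byte length
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("declared length runs past the end of the data")
    return buf[start:end].decode("ascii"), end


def read_computed_string(buf: bytes, offset: int = 0) -> tuple[str, int]:
    """Read a string whose length is calculated from two data values."""
    count, width = struct.unpack_from("BB", buf, offset)  # two 1-byte values
    length = count * width  # the expression that yields the string length
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("calculated length runs past the end of the data")
    return buf[start:end].decode("ascii"), end


if __name__ == "__main__":
    data = struct.pack(">H", 5) + b"helloXYZ"
    print(read_prefixed_string(data))
```

In both functions the string cannot be located or sized until the other data values have been read, which is why the structure description has to record that dependency.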
7.3.1.10 Boolean
A Boolean data value is binary in the sense that it can represent only true or false.
Boolean data values can have many different representations in data. The simplest is
to have a single bit which can be either zero or one. A string such as “true” or
“false” could also be used, or an integer (of any bit size), as long as the values of
the integer that represent true and false are specified. This makes the Boolean data
type potentially a derived data type, but with restrictions on the values of the data
type it is derived from.
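The three representations mentioned above can be sketched in Python as shown here. The function names and the default true/false values are assumptions for illustration; a real format would state its own values in its structure description.

```python
def bool_from_bit(byte: int, bit_index: int) -> bool:
    """Single-bit representation; bit_index 0 is the least significant bit."""
    return ((byte >> bit_index) & 1) == 1


def bool_from_string(text: str, true_value: str = "true",
                     false_value: str = "false") -> bool:
    """String representation, restricted to the two specified strings."""
    if text == true_value:
        return True
    if text == false_value:
        return False
    raise ValueError(f"not a Boolean string: {text!r}")


def bool_from_int(value: int, true_value: int = 1,
                  false_value: int = 0) -> bool:
    """Integer representation, restricted to the two specified values."""
    if value == true_value:
        return True
    if value == false_value:
        return False
    raise ValueError(f"not a Boolean integer: {value!r}")


if __name__ == "__main__":
    print(bool_from_bit(0b0100, 2))
    print(bool_from_string("false"))
    print(bool_from_int(255, true_value=255))
```

The string and integer decoders reject any value other than the two that are specified, which reflects the restriction placed on the type the Boolean is derived from.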
7.3.1.11 Custom
Some data can take advantage of the fact that software languages allow the
manipulation of data values at the bit level. In some data formats, particularly older data
formats, bit packing was the norm due to memory and storage space constraints.
For example, it is perfectly possible to create a four bit integer with sixteen possible