Why is some information formally recorded while other information is
not? One needs to go back to the basic underpinnings of information
systems to throw some light on at least one aspect of where the formal
selection takes place, and how it emerges. At the risk of some
simplification, information systems come into play when the people who
need information become physically separated from its sources. For
example, managers need to know what is in the warehouse without
physically visiting the place. They need only turn to the inventory report.
As long as the formal information presented in that report represents the
reality in the warehouse, the system compensates for the separation. It
must be noted, however, that this information covers only certain
agreed-upon vectors; in this example, the stock levels. The report
does not tell the manager how clean the aisles are, or which bulbs in
the overhead lighting are not working. A choice must be made: stock
levels, and not cleanliness of aisles. This choice results in
attenuation, and that attenuation must be recognized. That
the company ends up with a database of stock levels and not a database
of cleanliness of aisles is because this “choice” was made to address a
certain requirement at a certain time. The data that is not captured is in
many cases lost.
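The attenuation described above can be illustrated with a minimal sketch. The field names and values below are hypothetical, chosen only to mirror the warehouse example: the report keeps the agreed-upon vector (stock levels) and everything else observable in the warehouse is dropped.

```python
# Hypothetical snapshot of what a person standing in the warehouse could observe.
warehouse_reality = {
    "stock_levels": {"widgets": 120, "gaskets": 45},
    "aisle_cleanliness": "aisle 3 needs sweeping",
    "broken_bulbs": [7, 12],
}

# The agreed-upon vectors: only these are formally recorded (an assumption
# made for this illustration).
RECORDED_VECTORS = {"stock_levels"}


def inventory_report(reality):
    """Keep only the vectors the system was designed to record."""
    return {key: value for key, value in reality.items() if key in RECORDED_VECTORS}


def attenuated(reality):
    """List the aspects of reality that never reach the formal record."""
    return sorted(key for key in reality if key not in RECORDED_VECTORS)


report = inventory_report(warehouse_reality)
lost = attenuated(warehouse_reality)
print(report)
print(lost)
```

The set of recorded vectors is fixed when the system is designed, so adding a new vector later cannot recover observations that were never captured.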
Raw Data
One often hears the term “raw data.” It is not clear what it means. A
trivial definition is that raw data is unprocessed data. It is the original
data captured by a server or a device. Most systems do not deal with raw
data. Systems that require a higher degree of proof about what they
store, such as those in criminal, legal, or pharmaceutical settings,
may refer to raw data more often than a typical business application does. Access
to raw data may also be required for data quality improvement reasons.
In a system with data flowing across many companies and partners, trying
to improve data quality requires clear trails of where the “bad” data was
introduced. If one is informing or blaming any external (or internal) party
for introducing errors, one must be sure that the error was introduced by
that party, and not earlier or later, due to faulty logic in the processing.
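One way to keep such a trail, sketched here under stated assumptions, is to retain the record as it looked after every stage, starting with the raw input, so that the first stage producing an invalid value can be identified. The stage names, the transformations, and the validity check are all hypothetical and stand in for whatever a real pipeline would do.

```python
from dataclasses import dataclass


@dataclass
class TrailEntry:
    stage: str  # name of the party or processing step
    value: str  # the record as it looked after this stage


def run_with_trail(raw, stages):
    """Apply each stage in order, recording the record after every step."""
    trail = [TrailEntry("raw", raw)]
    current = raw
    for name, step in stages:
        current = step(current)
        trail.append(TrailEntry(name, current))
    return trail


def first_bad_stage(trail, is_valid):
    """Return the first stage whose output fails the check, or None."""
    for entry in trail:
        if not is_valid(entry.value):
            return entry.stage
    return None


# Hypothetical pipeline: an external feed followed by two internal steps.
stages = [
    ("partner_feed", lambda s: s.strip()),
    ("internal_cleanup", lambda s: s.replace("-", "")),  # faulty: drops the sign
    ("load", lambda s: s),
]

trail = run_with_trail(" -42 ", stages)
# Validity rule assumed for this example: the amount must stay negative.
culprit = first_bad_stage(trail, lambda s: s.strip().startswith("-"))
print(culprit)
```

In this sketch the error is traced to the internal cleanup step and not to the partner, which is the kind of evidence needed before assigning responsibility. The trail is only useful because the raw input is kept as its first entry.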
If raw data loses its raw status on the slightest change or transformation,
then most data that we deal with is not raw. For example, conversion of
data from EBCDIC to ASCII could make the data “not raw.” A screen that
posts data to a database may not be sending through all the keystrokes
(backspaces, overwrites) as entered. Any physical representation needs
some form of delimiters. When dealing with “Joe, Smith, 123 Sesame
Street, Box Town, CA,” what is the raw data of the address? Is it the
content or the entire string? If the raw data is of fixed length, then the