Information and Data
When you are considering logging events, it's worth thinking about the information you are
storing. Sometimes storage space is limited, or the cost of remotely transmitting data is
high. The power available to the computer might be limited, or the computer might be in a
hostile environment, so the data is at risk of corruption.
When logging data you may need to consider compression, encryption and error detection
and correction.
How much information does a piece of data contain? How do you measure information, for
example, and how much information does a web page contain? Is there more information
on the Internet than is contained in the DNA that describes you? How much can you compress
a file? Information theory is a branch of computer science concerned with storing and
processing information and data, and it provides answers to many of these questions. It can
lead to some deep philosophical discussions and mind-blowing concepts. Do you need to
read every page of this topic to get all the information from it?
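One common way to put a number on "how much information" is Shannon entropy, measured in bits per symbol. The sketch below is illustrative only: it estimates entropy from byte frequencies alone, which ignores patterns between neighbouring symbols, and the function name `entropy_bits_per_byte` is made up for this example.

```python
import math
from collections import Counter


def entropy_bits_per_byte(data: bytes) -> float:
    """Estimate Shannon entropy from byte frequencies, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    # Sum p * log2(1/p) over every distinct byte value that occurs.
    return sum((n / total) * math.log2(total / n) for n in counts.values())


# One repeated symbol carries no information per symbol; two equally
# likely symbols carry one bit each; 256 equally likely byte values
# carry the full eight bits each.
for sample in (b"aaaaaaaa", b"abababab", bytes(range(256))):
    print(len(sample), "bytes:", entropy_bits_per_byte(sample), "bits per byte")
```

A low figure suggests that the data could be stored in fewer bits than it currently occupies, which is the idea behind compression.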
Computing is all about taking data in, processing it and outputting it. As such it follows that
information theory, which is all about how data is stored and processed, is important to
computer science.
Thinking about information theory leads to some fascinating questions: how much information
is contained in an English sentence? Text speak shows that it is possible to communicate
without needing all the words and letters in a sentence.
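The redundancy in English can be seen by handing some ordinary prose to a general-purpose lossless compressor. This is a rough sketch using Python's standard `zlib` module; the sample text and variable names are arbitrary choices for illustration, and the exact sizes depend on the text and the zlib version.

```python
import zlib

# Sample English prose; any ordinary paragraph would do.
text = (
    "When you are collecting data with computers, maybe in a bird box in "
    "your garden, maybe in the middle of Africa or maybe on a satellite in "
    "a distant part of the solar system, information theory provides tools "
    "and reasoning to ensure that the valuable research information gets "
    "back safely."
).encode("ascii")

compressed = zlib.compress(text, 9)   # 9 = strongest compression level
restored = zlib.decompress(compressed)

print("original bytes:  ", len(text))
print("compressed bytes:", len(compressed))
print("lossless:        ", restored == text)
```

The compressed copy is smaller, yet the original text can be recovered exactly, so the bytes that were removed carried no extra information.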
When you are collecting data with computers, maybe in a bird box in your garden, maybe in the
middle of Africa or maybe on a satellite in a distant part of the solar system, information theory
provides tools and reasoning to ensure that the valuable research information gets back safely.
Compression and Checksums
Think of the information stored on a CD. If the CD is slightly scratched, it still plays with no
loss of sound quality. If it becomes more scratched, there comes a point when it will not
play. Effectively, there is an amount of data that can be lost from a CD without mattering,
so this data must be redundant in terms of the information the CD contains. Information theory
is useful in calculating the amount of extra data needed for error correction on CDs and in
communication links, such as radio, which may be unreliable.
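The simplest use of extra data is error detection: a few checksum bytes are added so that the receiver can tell when something has been damaged. The sketch below uses CRC-32 from Python's standard `zlib` module. It only detects errors and cannot correct them, so it is not the scheme a CD uses; the helper names `add_checksum` and `is_intact` and the sample log record are made up for illustration.

```python
import zlib


def add_checksum(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 of the payload to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def is_intact(frame: bytes) -> bool:
    """Return True if the trailing CRC-32 matches the rest of the frame."""
    if len(frame) < 4:
        return False
    payload, stored = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == stored


# Hypothetical log record, used only as sample data.
frame = add_checksum(b"bird entered box")

damaged = bytearray(frame)
damaged[5] ^= 0x01  # flip a single bit, as a scratch or radio noise might

print(is_intact(frame))           # True
print(is_intact(bytes(damaged)))  # False
```

Four extra bytes are enough to notice the damage, but repairing it would need more redundancy, which is the trade-off an error-correcting code makes.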
Now think about converting an audio file from a CD to an MP3 file. Most people cannot tell
the difference between playing a CD and an MP3 file, yet the MP3 file size (the amount of
data) is typically about a tenth of the original. Nine tenths of the data has apparently been thrown