Biomedical Engineering Reference
the largest amount of information in the most compact form
possible. All aspects of data management were by necessity
done by highly specialized technical personnel. Data
retrieval frequently required knowledge of programming
techniques; decisions about additions and changes to
databases and modifications of data codes were major
considerations; and a changeover to a new computer (even
a new model of the same brand) was likely to be a seriously
disruptive process.
Today, however, small desktop computers are powerful
enough to manage large databases using inexpensive
database software that incorporates the same sophisticated
data manipulation techniques found on large mainframe
applications. Because these packages have been implemented
for a broad, nontechnical clientele, they have been
made considerably easier to use than much of the older
software. The dramatic decline in the cost of data storage
has also reduced the need for cryptic data coding, meaning
that information can be stored in a form that is interpretable
to the eye. With many of the technical barriers removed,
data management may now be in the hands of those who
know the data best, and data requirements per se can be the
sole determinant of issues in colony record keeping. These
advances make it possible for even the smallest colonies to
enjoy the advantages of computer management of their
data.
However, these changes have raised new issues: a lack of
technical background among records personnel often limits
the quality control of data input and the
sophistication of data retrievals. The role of technical
personnel has therefore changed but not diminished. In fact,
with significantly increased and increasingly complex
reporting requirements for regulatory monitoring, reporting
from the records has become as important as the ability to
store the information accurately.
modern spreadsheet, data fields are the information filling
the cells with a different field in each column, a record
might constitute a line of data for an animal, and a table
would be the worksheet or file itself, a collection of lines or
records.
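As a minimal sketch of this terminology, the Python fragment below models a table as a list of records and each record as a mapping from field names to values. The field names (animal_id, sex, birth_date, weight_kg), the helper functions, and all values are hypothetical, chosen only for illustration.

```python
# Hypothetical registry table: each dict is one record (one spreadsheet row),
# and each key is a field (one spreadsheet column).
registry_table = [
    {"animal_id": "A001", "sex": "F", "birth_date": "2015-03-02", "weight_kg": 6.4},
    {"animal_id": "A002", "sex": "M", "birth_date": "2016-07-19", "weight_kg": 8.1},
]


def field_names(table):
    """Return the field (column) names of the first record, or [] if empty."""
    return list(table[0].keys()) if table else []


def find_record(table, animal_id):
    """Return the record whose key field matches animal_id, or None."""
    for record in table:
        if record["animal_id"] == animal_id:
            return record
    return None


if __name__ == "__main__":
    print(field_names(registry_table))
    print(find_record(registry_table, "A002"))
```

Looking a record up by animal_id mirrors the way colony records are organized around a single key field.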
Next is the issue of structuring databases, or collections
of tables. At the most basic level, two types of database
exist: flat databases and relational databases. In a flat
database, each table stands alone and contains all of the
information needed to use the data within it, such as animal
age, sex, weight, and so on. The classic example is again the
spreadsheet: much colony data is stored in Microsoft Excel or
other spreadsheet files. In contrast, a relational database is
explicitly a collection of tables, with no one table or file
designed to be useful on its own. Instead, the tools
available in the database software (e.g. Access, FoxPro,
Oracle, MySQL) make it possible to link or draw data
transparently from multiple tables, keyed on a primary
or key field (e.g. animal ID), and to produce a new table,
often termed a view, that serves the final requirement.
Therefore, a design feature of relational database tables is
to keep the data in each table unique to that table and never
to repeat information that is stored elsewhere, as long as all
of the information can ultimately be connected by the
identical key field in each table. This provides a very
efficient, more compact method of storage, but requires the
skills to link tables appropriately to retrieve the information
needed for each view. It is critical in these systems to make
sure that each key field is unique and unchanging. In most
NIH-supported systems, each animal is assigned a unique
identifier, or ID, which remains with the animal throughout
its life and is not reused. This works well as a key field and is
used in most modern systems. However, if there is a risk
that an identifier, such as a name, might be reused, it is
important to use an artificial, internally generated record
identifier as the key field, with the animals' IDs (names)
stored simply as data within the system.
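The linking of tables through a key field can be sketched with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not a recommended schema: the table names (registry, weights), the view name (weight_report), the column names, and the sample rows are all hypothetical. The sketch uses an internally generated record_id as the key field, so a reused animal name remains ordinary data.

```python
import sqlite3


def build_colony_db():
    """Create an in-memory database with two linked tables and one view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Registry data: one row per animal, keyed on a generated record_id.
        -- animal_name is ordinary data and may be reused over time.
        CREATE TABLE registry (
            record_id   INTEGER PRIMARY KEY,
            animal_name TEXT NOT NULL,
            sex         TEXT NOT NULL
        );
        -- Weight data: many rows per animal, linked by record_id only.
        CREATE TABLE weights (
            record_id  INTEGER NOT NULL REFERENCES registry(record_id),
            weigh_date TEXT NOT NULL,
            weight_kg  REAL NOT NULL
        );
        -- The view draws on both tables without storing any data twice.
        CREATE VIEW weight_report AS
            SELECT r.record_id, r.animal_name, r.sex, w.weigh_date, w.weight_kg
            FROM registry AS r
            JOIN weights AS w ON w.record_id = r.record_id;
        """
    )
    conn.executemany(
        "INSERT INTO registry (record_id, animal_name, sex) VALUES (?, ?, ?)",
        [(1, "Rex", "M"), (2, "Rex", "M"), (3, "Luna", "F")],  # name "Rex" reused
    )
    conn.executemany(
        "INSERT INTO weights (record_id, weigh_date, weight_kg) VALUES (?, ?, ?)",
        [(1, "2010-01-05", 7.2), (2, "2020-01-07", 6.8), (3, "2020-01-07", 5.9)],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_colony_db()
    query = "SELECT * FROM weight_report ORDER BY record_id, weigh_date"
    for row in conn.execute(query):
        print(row)
```

Because sex is stored only in the registry table, a correction there is reflected in every row of the view.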
An intermediate or hybrid form of structure has become
available with modern desktop computing power.
Unlimited access to easy-to-use database software allows
the merging of these forms: flat tables with repetitive fields
of information which can nonetheless be linked if necessary
to other tables through key fields (if they exist). These are
powerful tools, but they can lead to sloppy table
designs and the wasteful and inefficient storage of duplicated
data (consider having to go through multiple files to
correct an error), while still requiring the skills to link files
into needed views.
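The cost of duplicated data can be illustrated with a short Python sketch. It assumes two hypothetical flat tables (housing_table and clinical_table) that both repeat the animal's sex; the helper names and all values are invented for illustration only.

```python
# Two hypothetical flat tables that both store a copy of the animal's sex.
housing_table = [
    {"animal_id": "A001", "sex": "F", "cage": "C12"},
    {"animal_id": "A002", "sex": "M", "cage": "C07"},
]
clinical_table = [
    {"animal_id": "A001", "sex": "F", "weight_kg": 6.4},
    {"animal_id": "A002", "sex": "F", "weight_kg": 8.1},  # conflicting copy
]


def find_conflicts(table_a, table_b, key, field):
    """Return key values whose duplicated field differs between two tables."""
    values_a = {rec[key]: rec[field] for rec in table_a}
    conflicts = []
    for rec in table_b:
        k = rec[key]
        if k in values_a and values_a[k] != rec[field]:
            conflicts.append(k)
    return conflicts


def correct_everywhere(tables, key, key_value, field, new_value):
    """Set the field in every table holding a copy; return records touched."""
    touched = 0
    for table in tables:
        for rec in table:
            if rec[key] == key_value and field in rec:
                rec[field] = new_value
                touched += 1
    return touched


if __name__ == "__main__":
    print(find_conflicts(housing_table, clinical_table, "animal_id", "sex"))
    tables = [housing_table, clinical_table]
    print(correct_everywhere(tables, "animal_id", "A002", "sex", "M"))
```

A single correction has to visit every table that holds a copy, which is the maintenance burden a strictly relational design avoids.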
Finally, the specific format of animal colony data falls
into two classes, each of which requires a somewhat
different approach to record keeping and analysis. Single-
entry data apply to events or measures that arise only once
DATA AND DATABASE FORMAT
Database Structure
There are a number of issues involved in deciding how to
format colony records and the databases which contain
these records, and many of these issues fall outside the
realm of this chapter. But there are some basic issues to
consider: terminology, database structure types, and the
form of databases within types.
First, terminology: data, or data fields, refer to the
individual pieces of information associated with an animal.
A record refers to the collection of these fields for a given
animal within a table (universally in colony record keeping,
data are organized around a single key field, the unique
animal identification code). A table is a collection of records
of a given type of data (e.g. registry information, genetic
markers, clinical data). In the well-known example of the