university or government personnel archive the large online public databases, the archiving of locally
generated data is a personal or corporate responsibility. Regardless of who takes responsibility for
the process, the issues associated with archiving are numerous, as suggested by Table 2-5.
The archiving stage of the data life cycle usually involves making decisions about the most
appropriate software, hardware, storage medium and archiving process to use. There are the obvious
issues of media cost and longevity, security standards, the type of hardware to use to store the data,
and the software that will facilitate storage and later retrieval. For example, selecting the optimal
storage medium for the archiving process is a function of the frequency with which archived data are
accessed, the budget, and the volume of data involved.
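The trade-off described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Medium` class, the `choose_medium` function, and every figure in `CATALOG` are hypothetical placeholders, not measured costs or access times for real products.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Medium:
    name: str
    cost_per_gb: float     # hypothetical cost units per gigabyte
    access_seconds: float  # hypothetical time to reach one archived item


# Placeholder figures for illustration only; substitute real quotes and timings.
CATALOG = [
    Medium("magnetic tape", 0.01, 60.0),
    Medium("optical disc", 0.05, 20.0),
    Medium("hard drive", 0.10, 0.01),
]


def choose_medium(volume_gb: float, budget: float, accesses_per_day: float,
                  catalog: List[Medium] = CATALOG) -> Optional[Medium]:
    """Pick the affordable medium with the least total daily wait time.

    Ties (for example, data that are never accessed) go to the cheaper
    medium. Returns None when no medium fits the budget.
    """
    affordable = [m for m in catalog if m.cost_per_gb * volume_gb <= budget]
    if not affordable:
        return None
    return min(
        affordable,
        key=lambda m: (m.access_seconds * accesses_per_day, m.cost_per_gb),
    )


if __name__ == "__main__":
    # Frequently accessed data with a generous budget.
    print(choose_medium(volume_gb=1000, budget=200, accesses_per_day=100))
    # The same volume on a tight budget.
    print(choose_medium(volume_gb=1000, budget=20, accesses_per_day=5))
```

A real selection would also weigh the longevity and security issues listed in Table 2-5; the sketch covers only the three factors named in the paragraph above.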
Table 2-5. Archiving Issues. Key issues in the bioinformatics archival
process range from the scalability of the initial solution to how best to
provide for security.
Issue                         Description
Indexing                      Vocabulary, metadata, language, completeness, efficiency
Space Requirements            Index space versus data space
Hardware Requirements         Hard drives, network
Scalability                   Ability to expand functionality without investing in new hardware and software
Database Design               Data model
Archival Process              Responsibilities for overseeing the process
Space Requirements            Current and projected archival capacity
Completeness                  Relative quantity of total data that are archived
Media Selection               Compatibility, speed, capacity, data density, cost, volatility, durability, and stability
Location                      Local, server-based, or network
Infrastructure Requirements   Network and computer hardware
Relative Value                Value of data vs. archival overhead
Hardware Configuration        RAID and other configurations
Longevity                     Technical obsolescence of media and MTBF rating of related equipment
Security                      Limited access to data
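To make the MTBF entry under Longevity concrete, the following Python sketch converts an MTBF rating into the chance of a failure over a planning period. It assumes a constant failure rate (the exponential model), which is a common simplification rather than something the table specifies; the function name and the example figures are illustrative only.

```python
import math


def failure_probability(mtbf_hours: float, period_hours: float,
                        devices: int = 1) -> float:
    """Probability that at least one of `devices` identical, independent
    devices fails within `period_hours`.

    Assumes a constant failure rate of 1 / MTBF per device (exponential
    model), so the chance that all devices survive is
    exp(-devices * period / MTBF).
    """
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if period_hours < 0 or devices < 0:
        raise ValueError("period_hours and devices must be non-negative")
    return 1.0 - math.exp(-devices * period_hours / mtbf_hours)


if __name__ == "__main__":
    hours_per_year = 24 * 365
    # Illustrative rating only, not a figure for any particular product.
    p = failure_probability(mtbf_hours=500_000, period_hours=hours_per_year)
    print(f"Chance of a failure within one year: {p:.2%}")
```

Under this model, an archive spread across several devices is more likely to see at least one failure in a given period, which is one reason redundant configurations such as RAID appear in the same table.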
The hardware involved in the archiving process may include a PC-based CD-ROM burner, a large
database server that's networked to a number of workstations and routinely backed up onto
magnetic tape, or a network-based storage system that may be located offsite. As discussed later in this
chapter, each option has security, cost, and performance issues. The software tools selected for
archiving data also define the usability and performance of the data archive, especially regarding
data indexing and retrieval functions.
After data have been created and, if necessary, modified for use, and before they can be archived, they are
typically named, indexed, and filed to facilitate locating them in the future. As such, the filing system,
naming conventions, and accuracy and specificity of indexing limit the efficiency with which the data