low, it is feasible to generate images representing each version of the system.
This can be critical for a long-running project that collects samples and data
over the course of months or years. It is important that all of the data from
the project be analyzed in precisely the same manner. By using only a single
AMI version to carry out the analysis, the user can be sure that all of the results
are comparable.
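As a rough illustration of pinning an analysis environment to a single image, the sketch below launches a compute node from one fixed AMI using the boto3 library; the AMI ID, region, and instance type are placeholders rather than values from any particular project.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Launch one analysis node from a single, fixed AMI so that every run of
    # the pipeline uses exactly the same software stack.
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",   # placeholder ID of the pinned analysis AMI
        InstanceType="m5.large",           # placeholder instance type
        MinCount=1,
        MaxCount=1,
    )
    print(response["Instances"][0]["InstanceId"])

Keeping the ImageId constant for the life of the project is what guarantees that samples collected in the first month and in the final month pass through identical software.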
Another advantage of using a publicly available stored AMI is that it allows
multiple groups to have access to precisely the same analysis platform, allowing
it to serve as a standard for comparing results between the groups. Other
groups can save a copy of the AMI to their own Amazon Simple Storage Service
(S3) storage area or even download it to local storage, thereby removing any
dependence on the originating group. They can then make changes to their copy
of the AMI and either keep these changes private or return them for public
use, while the public version remains unchanged. Running a tool as an Amazon
virtual computer also has desirable security features. Some groups may be
reluctant to upload their data to a third-party website for analysis. This could
be because the data are proprietary or relate to human patients. In essence,
the virtual computer created from the AMI is the property
of the group that instantiated it, not the group that developed it. If the AMI
was properly created, then the group that developed it does not have any
access to the data analyzed by the AMI. Since all data transferred to and from
the AMI and the user's private S3 storage area can be encrypted, the data being
analyzed should be as secure as if the analysis were taking place in the user's
home data center.
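To make these points concrete, the following sketch (again using boto3, with placeholder image IDs, bucket names, and file paths) saves a private copy of a shared AMI into the group's own account and uploads input data to the group's own S3 bucket with server-side encryption; the transfers themselves go over HTTPS by default.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")
    s3 = boto3.client("s3", region_name="us-east-1")

    # Save a private, encrypted copy of a publicly shared analysis AMI so the
    # group no longer depends on the original developers keeping it available.
    copy = ec2.copy_image(
        Name="our-analysis-ami-v1",               # placeholder name
        SourceImageId="ami-0abcdef1234567890",    # placeholder public AMI ID
        SourceRegion="us-east-1",
        Encrypted=True,                           # encrypt the copy at rest
    )

    # Upload input data to the group's private bucket; boto3 transfers use
    # HTTPS, and S3 is asked to encrypt the object at rest.
    s3.upload_file(
        Filename="samples/run_042.fastq.gz",      # placeholder local file
        Bucket="our-private-analysis-bucket",     # placeholder bucket
        Key="input/run_042.fastq.gz",
        ExtraArgs={"ServerSideEncryption": "AES256"},
    )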
There are other advantages to cloud-based analysis compared to analysis
carried out in local data centers. If a group required a number of different
analysis tools as part of its research data workflow, then it might have to set
up and maintain a separate server for each tool. This can require a significant
investment of time and resources for tools that may be used only sporadically.
Moreover, when a tool is needed, there may be a large quantity of data to
analyze, which could mean either a significant delay in obtaining the results or
the use of multiple computers to carry out the work. With local machines, these
extra servers have
to be prepared and maintained in advance. With virtual computers, the required
number of nodes can be instantiated as soon as the work arises. Since billing
is done by node-hours used, it costs the same to carry out an analysis for 100
hours on 1 node as for 1 hour on 100 nodes. This gives even small groups access
to large-scale computing resources on demand.
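A minimal sketch of this scale-out, with the same placeholder AMI ID and an assumed hourly price, is shown below; the point is only that 1 node for 100 hours and 100 nodes for 1 hour both consume 100 node-hours.

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Request 100 identical worker nodes in a single call (placeholder values;
    # real accounts may need a limit increase for this many instances).
    fleet = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",
        InstanceType="c5.xlarge",
        MinCount=100,
        MaxCount=100,
    )

    # Billing is by node-hours, so splitting the work changes the wall-clock
    # time but not the cost.
    price_per_node_hour = 0.17                        # assumed illustrative rate (USD)
    serial_cost = 1 * 100 * price_per_node_hour       # 1 node for 100 hours
    parallel_cost = 100 * 1 * price_per_node_hour     # 100 nodes for 1 hour
    assert serial_cost == parallel_cost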
Data loss is also less likely with cloud-based storage. Locally stored data
are subject to disk failure. Usually this can be addressed using tape backup,
but for this to be effective the tape backups have to be tested and stored
off-site. Cloud-stored data are replicated multiple times and stored in different
geographic locations, and even across multiple continents. This ensures not only
that the data are protected against loss but also that timely access to them will
not be interrupted by a single point of failure. Since users can set access policies