use of computer systems easier, and decrease the time for data analysis. Tools
developed specifically for computational biology research include scientific
workflow systems such as Galaxy (Goecks et al. 2010); web portals for analyzing
and sharing genomic data such as expression-package (EXP-PAC) (Church,
Goscinski, and Lefèvre 2012); and dedicated sequence-processing software
such as Bowtie (Langmead et al. 2009). Running these scientific applications,
in many cases, requires a huge amount of computational power to execute
complex algorithms or to process big data. High-performance computing
(HPC) can provide computer facilities that perform the large and complex
simulations and database searches required for research within reasonable
time frames. However, using HPC scientific systems and applications is difficult
for many scientists who are not computing specialists. These discipline
specialists naturally expect packages and tools that do not require deep
knowledge of programming and system management and that let them draw on
their own specialist backgrounds; such tools should resemble the easy-to-use
software packages already available.
HPC requires powerful and expensive computational hardware, data storage,
advanced middleware, and sophisticated distributed discipline-oriented
applications. The process of managing HPC resources requires in-depth
system administration skills, for which many scientists are not prepared.
Furthermore, due to their high initial purchase price and maintenance costs,
HPC resources are affordable only for well-funded institutions. As a result,
these resources are shared by many researchers, which leads to long waiting
times for application execution. Thus, many researchers cannot access HPC
infrastructures when needed, and they often scale down their applications to
reduce waiting times. These barriers have kept many researchers from making
innovative discoveries that depend on HPC resources.
A response to the problem faced by discipline specialists lies in cloud
computing (Goscinski, Brock, and Church 2011). Clouds promise to relieve
the pressure created by the demand for affordable, scalable, and on-demand
HPC resources that can give users faster turnaround times on their
experiments. Delivering such turnaround times using clouds is one of the
major issues that a new version of A Grid and Virtualized Environment
(AGAVE) (2012) promises to address. Public
cloud vendors, including Amazon with its Elastic Compute Cloud (EC2) (Amazon
2010), have provided solutions specifically designed for running HPC
applications. EC2 is an excellent example of an infrastructure-as-a-service (IaaS)
cloud offering raw processing and storage services. Other vendors provide
platform-as-a-service (PaaS) clouds where users can access an integrated
software platform for both building and running HPC applications on cloud
resources. Examples include Microsoft's Azure (Chappell 2009) and Google's
App Engine (Gibbs 2008). Furthermore, these clouds can scale on demand as
users' requirements change, accelerating the discovery of new knowledge in
various fields of research. Clouds can also provide software on demand; examples