standard EC2. For the Linpack benchmark, they saw an 8.5x performance improvement compared to similar clusters of standard EC2 instances. On an 880-instance CC1 cluster, Linpack achieved a performance of 41.82 Tflops, placing EC2 at #146 in the June 2010 Top 500 rankings.

Cloud User Scenario: Astronomic Data Processing on Amazon EC2
The following cloud user scenario has been taken
from (Ahronovitz 2010): Gaia is a mission of the
European Space Agency (ESA) that will conduct
a survey of one billion stars in our galaxy (Gaia
2010). It will monitor each of its target stars about
70 times over a five-year period, precisely chart-
ing their positions, distances, movements, and
changes in brightness. It is expected to discover
hundreds of thousands of new celestial objects,
such as extra-solar planets and failed stars called
brown dwarfs.
This mission will collect a large amount of
data that must be analyzed. The ESA decided to
prototype a cloud-based system to analyze the
data. The goals were to determine the technical
and financial aspects of using cloud computing to
process massive datasets. The prototype system
contains the scientific data and a whiteboard used
to publish compute jobs. A framework for distrib-
uted computing (developed in house) is used for
job execution and data processing. The framework
is configured to run AGIS (Astrometric Global
Iterative Solution). The process runs a number
of iterations over the data until it converges.
For processing, each working node gets a job
description from the database, retrieves the data,
processes it and sends the results to intermediate
servers. The intermediate servers update the data
for the following iteration.
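As a rough sketch of this worker pattern (not the ESA in-house framework itself; the Whiteboard class and the publish_jobs, fetch_job, load_data, process, push_result, and update_reference names below are hypothetical placeholders), the iteration loop could look like this in Python:

```python
# Minimal sketch of the iterative worker pattern described above.
# All names are hypothetical placeholders; the real in-house ESA
# framework and the AGIS code are not reproduced here.
from queue import Queue, Empty

class Whiteboard:
    """Toy stand-in for the database that publishes compute jobs."""
    def __init__(self):
        self.jobs = Queue()

    def publish_jobs(self, job_descriptions):
        for job in job_descriptions:
            self.jobs.put(job)

    def fetch_job(self):
        try:
            return self.jobs.get_nowait()
        except Empty:
            return None

def worker_loop(whiteboard, load_data, process, push_result):
    """One working node: pull jobs, retrieve data, process, push partials."""
    while True:
        job = whiteboard.fetch_job()      # job description from the whiteboard
        if job is None:                   # no jobs left in this iteration
            break
        data = load_data(job)             # retrieve the observational data
        result = process(job, data)       # one AGIS computation step
        push_result(job, result)          # send partials to the intermediate servers

def run_iterations(whiteboard, make_jobs, load_data, process,
                   push_result, update_reference, max_iters=24, tol=1e-9):
    """Repeat iterations until the global solution converges or max_iters is hit."""
    for _ in range(max_iters):
        whiteboard.publish_jobs(make_jobs())
        worker_loop(whiteboard, load_data, process, push_result)
        change = update_reference()       # intermediate servers update the data
        if change < tol:                  # convergence criterion
            break
```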
The prototype evaluated 5 years of data for
2 million stars, a small fraction of the total data
that must be processed in the actual project.
The prototype went through 24 iterations of 100
minutes each, equivalent to running a Grid of
20 Virtual Machines (VMs) for 40 hours. For
the full billion-star project, 100 million primary
stars will be analyzed along with 6 years of data,
which will require running the 20-VM cluster for 16,200 hours. To evaluate the elasticity of a cloud-based solution, the prototype ran a second test with 120 high-CPU extra-large VMs.
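A quick back-of-the-envelope check of these figures, expressed in VM-hours (the 800 and 324,000 VM-hour totals are derived here rather than quoted in the source):

```python
# Back-of-the-envelope check of the run times quoted above.

iterations = 24
minutes_per_iteration = 100
cluster_size = 20                       # number of VMs in the prototype Grid

prototype_hours = iterations * minutes_per_iteration / 60
print(f"Prototype wall-clock time: {prototype_hours:.0f} hours")         # 40 hours
print(f"Prototype usage: {prototype_hours * cluster_size:.0f} VM-hours") # 800 VM-hours

full_project_hours = 16_200             # quoted runtime of the 20-VM cluster
print(f"Full project usage: {full_project_hours * cluster_size:,} VM-hours")  # 324,000 VM-hours
```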
MATLAB on Amazon Cluster
Compute Instances
Another recent example of HPC on EC2 CCI comes from the MATLAB team at MathWorks (MATLAB 2010), which tested performance scaling of the backslash (“\”) matrix division operator to solve for x in the equation A * x = b. In their testing, matrix A occupies far more memory (290 GB) than is available in a single high-end desktop machine (typically a quad-core processor with 4-8 GB of RAM, supplying approximately 20 Gigaflops).
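To put the 290 GB figure in perspective, a dense double-precision matrix of that size would be roughly 190,000 x 190,000; the estimate below is an assumption-based illustration (MathWorks does not state the exact dimension here):

```python
import math

# Rough estimate of the dimension of a dense double-precision matrix
# occupying ~290 GB (illustrative assumption: 1 GB = 1e9 bytes).

matrix_bytes = 290e9
bytes_per_element = 8          # double precision (float64)

n = math.isqrt(int(matrix_bytes // bytes_per_element))
print(f"A is roughly {n:,} x {n:,}")                      # ~190,000 x 190,000

# Compare with the upper end of a typical desktop of the time (8 GB of RAM):
desktop_bytes = 8e9
m = math.isqrt(int(desktop_bytes // bytes_per_element))
print(f"An 8 GB desktop holds at most a {m:,} x {m:,} dense matrix")  # ~31,600 x 31,600
```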
Therefore, they spread the calculation across machines. To solve linear systems of equations, they need to be able to access all of the elements of the array even when the array is spread across multiple machines. This problem requires significant amounts of network communication, memory access, and CPU power. They scaled up to a cluster on EC2, giving them the ability to work with larger arrays and to perform calculations at up to 1.3 Teraflops, a 60x improvement over the single-desktop baseline. They were able to do this without making any changes to the application code.
Each Cluster Compute instance runs 8 workers (one per core on the instance's 8 cores). Each doubling of the worker count corresponds to a doubling of the number of Cluster Compute instances used (scaling from 1 up to 32 instances). They saw near-linear scaling of overall throughput (measured in Gigaflops) as they increased the matrix size and successively doubled the number of instances.
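Under an idealized linear-scaling model (an illustration, not their measured data), the worker counts and the quoted speedup line up roughly as follows:

```python
# Idealized view of the scaling experiment described above (illustrative only).

workers_per_instance = 8
desktop_gflops = 20            # quoted single-desktop baseline
peak_gflops = 1300             # ~1.3 Teraflops quoted for the full cluster

for instances in (1, 2, 4, 8, 16, 32):
    workers = instances * workers_per_instance
    print(f"{instances:2d} instances -> {workers:3d} MATLAB workers")

# ~65x, consistent with the ~60x improvement quoted above
print(f"Speedup over the desktop baseline: ~{peak_gflops / desktop_gflops:.0f}x")
```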