standard EC2. For the Linpack benchmark, they saw an 8.5x performance improvement compared to similar clusters of standard EC2 instances. On an 880-instance CC1 cluster, Linpack achieved a performance of 41.82 Tflops, placing EC2 at #146 in the June 2010 Top 500 rankings.

Cloud User Scenario: Astronomic Data Processing on Amazon EC2
The following cloud user scenario has been taken
from (Ahronovitz 2010): Gaia is a mission of the
European Space Agency (ESA) that will conduct
a survey of one billion stars in our galaxy (Gaia
2010). It will monitor each of its target stars about
70 times over a five-year period, precisely chart-
ing their positions, distances, movements, and
changes in brightness. It is expected to discover
hundreds of thousands of new celestial objects,
such as extra-solar planets and failed stars called
brown dwarfs.
This mission will collect a large amount of
data that must be analyzed. The ESA decided to
prototype a cloud-based system to analyze the
data. The goals were to determine the technical
and financial aspects of using cloud computing to
process massive datasets. The prototype system
contains the scientific data and a whiteboard used
to publish compute jobs. A framework for distrib-
uted computing (developed in house) is used for
job execution and data processing. The framework
is configured to run AGIS (Astrometric Global
Iterative Solution). The process runs a number
of iterations over the data until it converges.
For processing, each working node gets a job
description from the database, retrieves the data,
processes it and sends the results to intermediate
servers. The intermediate servers update the data
for the following iteration.
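As a rough sketch of this worker pattern (not the ESA in-house framework itself; the Whiteboard class and the publish_jobs, fetch_job, load_data, process, push_result, and update_reference names below are hypothetical placeholders), the iteration loop could look like this in Python:

```python
# Minimal sketch of the iterative worker pattern described above.
# All names are hypothetical placeholders; the real in-house ESA
# framework and the AGIS code are not reproduced here.
from queue import Queue, Empty

class Whiteboard:
    """Toy stand-in for the database that publishes compute jobs."""
    def __init__(self):
        self.jobs = Queue()

    def publish_jobs(self, job_descriptions):
        for job in job_descriptions:
            self.jobs.put(job)

    def fetch_job(self):
        try:
            return self.jobs.get_nowait()
        except Empty:
            return None

def worker_loop(whiteboard, load_data, process, push_result):
    """One working node: pull jobs, retrieve data, process, push partials."""
    while True:
        job = whiteboard.fetch_job()      # job description from the whiteboard
        if job is None:                   # no jobs left in this iteration
            break
        data = load_data(job)             # retrieve the observational data
        result = process(job, data)       # one AGIS computation step
        push_result(job, result)          # send partials to the intermediate servers

def run_iterations(whiteboard, make_jobs, load_data, process,
                   push_result, update_reference, max_iters=24, tol=1e-9):
    """Repeat iterations until the global solution converges or max_iters is hit."""
    for _ in range(max_iters):
        whiteboard.publish_jobs(make_jobs())
        worker_loop(whiteboard, load_data, process, push_result)
        change = update_reference()       # intermediate servers update the data
        if change < tol:                  # convergence criterion
            break
```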
The prototype evaluated 5 years of data for
2 million stars, a small fraction of the total data
that must be processed in the actual project.
The prototype went through 24 iterations of 100
minutes each, equivalent to running a Grid of
20 Virtual Machines (VMs) for 40 hours. For
the full billion-star project, 100 million primary
stars will be analyzed along with 6 years of data,
which will require running the 20-VM cluster for 16,200 hours. To evaluate the elasticity of a cloud-based solution, the prototype ran a second test with 120 high-CPU extra-large VMs.
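A quick back-of-the-envelope check of these figures, expressed in VM-hours (the 800 and 324,000 VM-hour totals are derived here rather than quoted in the source):

```python
# Back-of-the-envelope check of the run times quoted above.

iterations = 24
minutes_per_iteration = 100
cluster_size = 20                       # number of VMs in the prototype Grid

prototype_hours = iterations * minutes_per_iteration / 60
print(f"Prototype wall-clock time: {prototype_hours:.0f} hours")         # 40 hours
print(f"Prototype usage: {prototype_hours * cluster_size:.0f} VM-hours") # 800 VM-hours

full_project_hours = 16_200             # quoted runtime of the 20-VM cluster
print(f"Full project usage: {full_project_hours * cluster_size:,} VM-hours")  # 324,000 VM-hours
```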
MATLAB on Amazon Cluster
Compute Instances
Another recent example of HPC on EC2 CCI comes from the MATLAB team at MathWorks (MATLAB 2010), which tested performance scaling of the backslash (“\”) matrix division operator to solve for x in the equation A * x = b. In their testing, matrix A occupies far more memory (290 GB) than is available in a single high-end desktop machine (typically a quad-core processor with 4-8 GB of RAM, supplying approximately 20 Gigaflops).
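To put the 290 GB figure in perspective, a dense double-precision matrix of that size would be roughly 190,000 x 190,000; the estimate below is an assumption-based illustration (MathWorks does not state the exact dimension here):

```python
import math

# Rough estimate of the dimension of a dense double-precision matrix
# occupying ~290 GB (illustrative assumption: 1 GB = 1e9 bytes).

matrix_bytes = 290e9
bytes_per_element = 8          # double precision (float64)

n = math.isqrt(int(matrix_bytes // bytes_per_element))
print(f"A is roughly {n:,} x {n:,}")                      # ~190,000 x 190,000

# Compare with the upper end of a typical desktop of the time (8 GB of RAM):
desktop_bytes = 8e9
m = math.isqrt(int(desktop_bytes // bytes_per_element))
print(f"An 8 GB desktop holds at most a {m:,} x {m:,} dense matrix")  # ~31,600 x 31,600
```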
Therefore, they spread the calculation across machines. To solve linear systems of equations, they need to be able to access all of the elements of the array even when the array is spread across multiple machines. This problem requires significant amounts of network communication, memory access, and CPU power. They scaled up to a cluster on EC2, giving them the ability to work with larger arrays and to perform calculations at up to 1.3 Teraflops, a 60x improvement over the single-desktop baseline. They were able to do this without making any changes to the application code.
Each Cluster Compute instance runs 8 workers (one per core on the instance's 8 cores). Each doubling of the worker count corresponds to a doubling of the number of Cluster Compute instances used (scaling from 1 up to 32 instances). They saw near-linear scaling of overall throughput (measured in Gigaflops) as they increased the matrix size and successively doubled the number of instances.
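Under an idealized linear-scaling model (an illustration, not their measured data), the worker counts and the quoted speedup line up roughly as follows:

```python
# Idealized view of the scaling experiment described above (illustrative only).

workers_per_instance = 8
desktop_gflops = 20            # quoted single-desktop baseline
peak_gflops = 1300             # ~1.3 Teraflops quoted for the full cluster

for instances in (1, 2, 4, 8, 16, 32):
    workers = instances * workers_per_instance
    print(f"{instances:2d} instances -> {workers:3d} MATLAB workers")

# ~65x, consistent with the ~60x improvement quoted above
print(f"Speedup over the desktop baseline: ~{peak_gflops / desktop_gflops:.0f}x")
```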