Diagnosis and Performance Monitoring - Deploying and Managing a Cloud Infrastructure

TABLE 5.1 Application Workload Profile

Characteristic        Value                 Description
--------------------  --------------------  ------------------------------------------------------------
I/O size              Bytes/kilobytes       It is best if this value matches or is close to the file system's block size.
Access pattern        Sequential or random  The most common read or write access pattern used.
File access profile   Data or attribute     Determine if the app performs I/O operations on many small files.
Bandwidth             Mbps                  The bandwidth requirement of the app.
Latency sensitivity   Milliseconds          Is the app sensitive to read or write latency?
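
The profile in Table 5.1 could be captured as a small record so that it can be checked against a candidate file system. The following Python sketch is only an illustration: the class name, field names, and the 25 percent tolerance used to decide whether an I/O size is "close to" the block size are all assumptions, not values taken from the text.

```python
from dataclasses import dataclass


@dataclass
class WorkloadProfile:
    """Hypothetical record mirroring the rows of Table 5.1."""
    io_size_bytes: int         # typical size of one read or write request
    access_pattern: str        # "sequential" or "random"
    file_access_profile: str   # "data" or "attribute"
    bandwidth_mbps: float      # bandwidth requirement of the app
    max_latency_ms: float      # latency the app can tolerate

    def io_size_near_block(self, block_size: int, tolerance: float = 0.25) -> bool:
        """Return True if the I/O size matches or is close to the block size.

        The tolerance is an assumed relative difference, not a standard value.
        """
        return abs(self.io_size_bytes - block_size) / block_size <= tolerance


profile = WorkloadProfile(
    io_size_bytes=4096,
    access_pattern="sequential",
    file_access_profile="data",
    bandwidth_mbps=100.0,
    max_latency_ms=5.0,
)
```

On POSIX systems, the block size to compare against can be read from os.statvfs(path).f_bsize, which reports the file system's preferred block size.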

I/O Size The I/O size refers to the size of the chunks of data that the application constantly reads from or writes to disk in a single operation. It plays a large role in how the file system can be optimized. I/O size matters because I/O devices, owing to their inherent limitations, are less efficient with small I/Os. It is therefore a best practice to group small adjacent I/Os into a bigger I/O using a buffer, so that there is only one large operation rather than many small ones that each go through the same process, which takes longer and uses more resources. Figure 5.2 shows a comparison between commonly used file systems as benchmarked by Vanninen and Wang from Clemson University in their paper “On Benchmarking Popular File Systems.”
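
The buffering idea described above can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the CoalescingWriter class, its flush_count counter, and the 4 KB buffer size are hypothetical names and values chosen for the example.

```python
import io


class CoalescingWriter:
    """Collects small sequential writes and issues them as one large write."""

    def __init__(self, raw, buffer_size=64 * 1024):
        self._raw = raw                  # any object with a write(bytes) method
        self._buffer_size = buffer_size  # flush once this many bytes are pending
        self._pending = bytearray()
        self.flush_count = 0             # writes actually issued to the target

    def write(self, data: bytes) -> None:
        self._pending += data
        if len(self._pending) >= self._buffer_size:
            self.flush()

    def flush(self) -> None:
        if self._pending:
            self._raw.write(bytes(self._pending))
            self.flush_count += 1
            self._pending.clear()


# 1,024 small 16-byte writes go in; far fewer large writes come out.
sink = io.BytesIO()
writer = CoalescingWriter(sink, buffer_size=4096)
for _ in range(1024):
    writer.write(b"x" * 16)
writer.flush()  # push out anything still pending
print(writer.flush_count)  # 4 large writes instead of 1,024 small ones
```

Python's standard library already provides this behavior through io.BufferedWriter; the hand-written class is shown only to make the grouping step visible.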

Access Pattern The access pattern of an application describes how it seeks data on the storage media: it can read or write a file either sequentially or in random order. It is much easier to tune the file system if the application does a lot of sequential I/O, because these small adjacent I/Os can simply be grouped into a single large one. A third access pattern is called strided access, and it is typically used by scientific applications. However, it can largely be considered a type of sequential access with some characteristics of random access, such as caching.
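
To make the three patterns concrete, the sketch below generates the byte offsets an application might request under each one. The function names, the fixed seed, and the is_contiguous helper are illustrative assumptions; real applications rarely follow a single pattern this cleanly.

```python
import random


def sequential_offsets(count, io_size):
    """Each request starts where the previous one ended."""
    return [i * io_size for i in range(count)]


def strided_offsets(count, stride):
    """Requests are a fixed distance (stride, in bytes) apart."""
    return [i * stride for i in range(count)]


def random_offsets(count, io_size, file_size, seed=0):
    """Requests land on arbitrary io_size-aligned positions in the file."""
    rng = random.Random(seed)
    blocks = file_size // io_size
    return [rng.randrange(blocks) * io_size for _ in range(count)]


def is_contiguous(offsets, io_size):
    """True if every request is adjacent to the one before it,
    so the whole run could be grouped into one large I/O."""
    return all(b - a == io_size for a, b in zip(offsets, offsets[1:]))


seq = sequential_offsets(4, 4096)    # [0, 4096, 8192, 12288]
strided = strided_offsets(4, 16384)  # [0, 16384, 32768, 49152]
rnd = random_offsets(4, 4096, file_size=1 << 20)
```

Only the sequential run passes is_contiguous with a 4 KB I/O size. The strided run is regular and predictable but leaves gaps between requests, which is why it sits between the sequential and random cases.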

File Access Profile Applications can be either data intensive or attribute intensive. Data-intensive applications move a lot of data around but create or delete few files. At the other end are attribute-intensive applications, which create and delete a lot of files yet read and write only a fraction of each. A good example of a data-intensive application that deals with large amounts of data (typically 100 MB or larger) is a big data application like Apache Hadoop, which utilizes a form of Google's MapReduce architecture and programming model. Attribute-intensive applications check a lot of metadata and attributes to perform operations. Major examples are revision control systems such as Git, SVN, and CVS.
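
One rough way to tell the two profiles apart is to compare how much data an application moves with how many metadata operations it performs. The Python sketch below is a hypothetical heuristic only: the AccessTally class, the classify function, and the 1 MB-per-operation threshold are assumptions made for illustration and do not come from the text or from any standard tool.

```python
from dataclasses import dataclass


@dataclass
class AccessTally:
    """Hypothetical counters for one application's file activity."""
    bytes_transferred: int = 0  # data read or written
    metadata_ops: int = 0       # creates, deletes, attribute lookups

    def record_data(self, nbytes: int) -> None:
        self.bytes_transferred += nbytes

    def record_metadata(self, count: int = 1) -> None:
        self.metadata_ops += count

    def bytes_per_metadata_op(self) -> float:
        if self.metadata_ops == 0:
            return float("inf")
        return self.bytes_transferred / self.metadata_ops


def classify(tally: AccessTally, threshold_bytes: int = 1_000_000) -> str:
    """Label the workload 'data' or 'attribute' using an assumed threshold."""
    if tally.bytes_per_metadata_op() >= threshold_bytes:
        return "data"
    return "attribute"


# A job that creates one file and streams 100 MB through it.
big_data_job = AccessTally()
big_data_job.record_metadata()
big_data_job.record_data(100 * 1024 * 1024)

# A tool that touches 1,000 files but moves only 4 KB per file.
many_small_files = AccessTally()
many_small_files.record_metadata(1000)
many_small_files.record_data(1000 * 4096)
```

With these inputs, classify labels the first tally "data" and the second "attribute". The threshold would need tuning against real measurements before being used for any decision.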
