17 Cloudware Application Development
17.1 Reliability Conundrum
The cloud, with its tendency to use commodity hardware and virtualiza-
tion and with the potential for enormous scale, presents many additional
challenges to designing reliable applications. In all engineering disciplines,
reliability is the ability of a system to perform its required functions under
stated conditions for a specified period of time. In software, for application
reliability, this becomes the ability of a software application and all the com-
ponents it depends on (operating system, hypervisor, servers, disks, network
connections, power supplies, etc.) to execute without faults or halts all the
way to completion. But completion is defined by the application designer.
Even with perfectly written software and no known bugs in any underlying
software system, an application that uses thousands of servers will
eventually hit the mean time to failure of some piece of hardware, and some
number of those instances will fail. Unless it is designed to tolerate such
failures, the application depending on those instances will also fail.
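The effect of scale can be made concrete with a rough calculation. The
sketch below is illustrative only: it assumes that servers fail independently
and that each server's lifetime is exponentially distributed with a given
mean time to failure (MTTF). The function name, the 50,000-hour MTTF, and the
24-hour job length are hypothetical values chosen for the example, not
figures from this chapter.

```python
import math


def prob_any_failure(n_servers: int, mttf_hours: float, job_hours: float) -> float:
    """Probability that at least one of n_servers fails during the job.

    Assumes independent servers with exponentially distributed lifetimes,
    so one server survives the job with probability exp(-job_hours / MTTF).
    """
    per_server_survival = math.exp(-job_hours / mttf_hours)
    # All servers must survive for the job to see no hardware failure.
    return 1.0 - per_server_survival ** n_servers


if __name__ == "__main__":
    # Hypothetical numbers: 50,000-hour MTTF per server, 24-hour job.
    for n in (1, 100, 1_000, 10_000):
        p = prob_any_failure(n, mttf_hours=50_000, job_hours=24)
        print(f"{n:>6} servers: P(at least one failure) = {p:.4f}")
```

Under these assumptions the chance of seeing at least one failure grows
quickly with the number of servers, even though each individual server is
very reliable.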
Many design techniques for achieving high reliability depend upon redun-
dant software, hardware, and data. For redundant software components, this
may consist of double- or triple-redundant software components (portions of
your application) running in parallel with common validation checks. One
idea is to have the components developed by different teams based on the
same specifications. This approach costs more, but extreme reliability may
require it. Because each component is designed to perform the same function,
a failure in one component shows up as a disagreement with the others, so it
is easily discovered and corrected during quality-assurance testing.
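A quality-assurance check of this kind might look like the following minimal
sketch. It assumes two teams have each implemented the same hypothetical
specification ("clamp a value to the range 0 to 100"); the function names and
the deliberate bug in the second version are invented for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def team_a_clamp(x: int) -> int:
    """Team A's implementation of the (hypothetical) specification."""
    return max(0, min(100, x))


def team_b_clamp(x: int) -> int:
    """Team B's implementation; it forgets the lower bound (deliberate bug)."""
    return min(100, x)


def find_disagreements(
    impl_a: Callable[[int], int],
    impl_b: Callable[[int], int],
    inputs: Iterable[int],
) -> List[Tuple[int, int, int]]:
    """Run both implementations on every input and collect mismatches."""
    mismatches = []
    for x in inputs:
        result_a, result_b = impl_a(x), impl_b(x)
        if result_a != result_b:
            mismatches.append((x, result_a, result_b))
    return mismatches


if __name__ == "__main__":
    for x, a, b in find_disagreements(team_a_clamp, team_b_clamp, range(-5, 106)):
        print(f"input {x}: team A returned {a}, team B returned {b}")
```

A disagreement only shows that at least one version is wrong; deciding which
one still requires checking the result against the specification.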
Although redundant software components provide the quality-assurance
process with a clever way to validate service accuracy, certain applications
may want to deploy component redundancy into the production environ-
ment. In such conditions, multiple parallel application processes can provide
validity checks on each other and let the majority rule. Although the
redundant software components consume extra resources, the gain in
reliability may be worth the cost of the extra hardware.
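Majority rule among redundant components can be sketched as a simple voter.
The example below is a minimal illustration, not a production design: it runs
the replicas one after another in a single process rather than in parallel,
assumes their results are hashable values that can be compared for equality,
and uses hypothetical replica functions (one of them deliberately faulty).

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def majority_vote(replicas: Sequence[Callable[[int], Hashable]], x: int) -> Hashable:
    """Return the result that a strict majority of all replicas agree on."""
    results = []
    for replica in replicas:
        try:
            results.append(replica(x))
        except Exception:
            # A crashed replica simply casts no vote.
            continue
    if not results:
        raise RuntimeError("all replicas failed")
    value, votes = Counter(results).most_common(1)[0]
    # Require more than half of ALL replicas, including the crashed ones.
    if votes * 2 <= len(replicas):
        raise RuntimeError("no majority among replicas")
    return value


# Three hypothetical versions of the same component (squaring a number).
def square_v1(n: int) -> int:
    return n * n


def square_v2(n: int) -> int:
    return n ** 2


def square_faulty(n: int) -> int:
    return n * 2  # deliberate bug: doubles instead of squaring


if __name__ == "__main__":
    print(majority_vote([square_v1, square_v2, square_faulty], 7))
```

With three replicas the voter masks a single faulty or crashed component;
tolerating more simultaneous faults requires more replicas, which is where
the extra hardware cost comes from.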