a “good” chance of suffering a big “hit” from attacks. Because of data
sharing, interdependencies, and interoperability between business processes
and applications, the hit could greatly “amplify” its damage by causing
catastrophic cascading effects, which may “force” an application to shut
itself down for hours or even days before it recovers from the hit. (Note
that the high-speed Internet, e-commerce, and the global economy have greatly
increased the speed and scale of damage spreading.) The cascading damage and
loss of business continuity (i.e., DoS) may pose unacceptable risk. Because
not all intrusions can be prevented, DQR is an indispensable part of the
corresponding security solution, and a quality DQR scheme can have a
significant impact on risk management, business continuity, and assurance.
Secondly, because of several fundamental differences between failure recovery
and attack recovery, the DQR problem cannot be solved by failure recovery
technologies, mature as they are in handling random failures. (a) Failure
recovery generally assumes fail-stop semantics, whereas attack recovery
generally cannot assume attack-stop semantics: to achieve the adversary's
goal, most attacks (except DoS) avoid simply crashing the system; they prefer
hidden damage and live zombies, spyware, bots, etc. Under the fail-stop
assumption, quarantine is not really a problem for failure recovery; in
attack recovery, however, intrusion/damage quarantine is a challenging
research topic and can make a big difference. (b) Failure recovery assumes
that all operations (e.g., transactions) have equal rights to be recovered,
whereas attack recovery can never assume “equal rights,” because neither
malicious operations nor corrupted operations should be recovered.
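To make point (b) concrete, the following is a minimal sketch of how recovery
without “equal rights” might look, under strong simplifying assumptions: the
log is a list of committed transactions in commit order, each with known read
and write sets, and the set of malicious transactions is already known. The
names Txn, classify, and tainted are hypothetical and are not taken from any
system discussed in this article; a real scheme would also have to handle
detection latency, undo ordering, and concurrent new transactions.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    """One committed transaction in a hypothetical, simplified log."""
    tid: str
    reads: set
    writes: set


def classify(log, malicious):
    """Split a commit-ordered log into (discard, keep) transaction ids.

    Assumption: a transaction is corrupted if it read an item whose latest
    value was written by a malicious or corrupted transaction. Failure
    recovery would redo every entry; here only clean ones are kept.
    """
    tainted = set()          # items whose current value carries damage
    discard, keep = [], []
    for t in log:
        if t.tid in malicious or t.reads & tainted:
            discard.append(t.tid)
            tainted |= t.writes   # damage spreads through its writes
        else:
            keep.append(t.tid)
            tainted -= t.writes   # a clean write replaces a damaged value
    return discard, keep


log = [
    Txn("T1", set(), {"x"}),   # the malicious transaction
    Txn("T2", {"x"}, {"y"}),   # reads damaged x
    Txn("T3", {"z"}, {"w"}),   # touches only clean data
    Txn("T4", {"y"}, {"z"}),   # reads damaged y
    Txn("T5", {"w"}, {"x"}),   # clean; overwrites damaged x
    Txn("T6", {"x"}, {"u"}),   # reads x after the clean overwrite
]
discard, keep = classify(log, malicious={"T1"})
print(discard)  # ['T1', 'T2', 'T4']
print(keep)     # ['T3', 'T5', 'T6']
```

In this sketch T2 and T4 are not malicious themselves, yet they are excluded
from recovery because the damage reached them through the data they read,
which is the kind of cascading effect described above.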
Towards understanding and solving the DQR problem, the rest of the article
is organized as follows. In Section 2, we present a comprehensive yet
tangible description of the DQR problem. In Section 3, we discuss in depth
the limitations of traditional fault tolerance and failure recovery
techniques in solving the DQR problem. In Section 4, we present a systematic
review of how the DQR problem is being solved. In Section 5, we propose a set
of remaining research issues in fully solving the DQR problem and conclude
the article.
2 Overview of the DQR Problem
We are concerned with the DQR needs of mission/life/business-critical
information systems. Because those information systems have been designed,
implemented, deployed, and upgraded over several decades, they run both
conventional applications, which typically use proprietary user interfaces
and application-level client-server protocols [1], and modern applications,
which are typically web-based and run standard Web Services protocols.
Nevertheless, both conventional and modern mission/life/business-critical
applications share some common characteristics: they are typically part of a
large-scale, semantically rich, networked, interoperable information system;