session failures between ASes, leading to another round of rerouting. Similarly,
the dynamic redistribution of traffic loads may disconnect other pairs of BGP
sessions. Meanwhile, the previously 'failed' sessions may re-establish, since the
links are no longer congested once all the traffic has been rerouted around them.
A single fault in a router or link can trigger a sequence of route changes on a
global scale. This process is what we call cascading failures in the inter-domain
routing system.
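The cut-and-restore cycle described above can be illustrated with a minimal
sketch. It is not the model proposed in this paper: it assumes one unit of
traffic per AS pair, hop-count shortest paths instead of policy routing, a
uniform link capacity of 3, and synchronous rounds. The four-AS ring and the
names link, shortest_path, link_loads and simulate are hypothetical, chosen only
for illustration.

```python
from collections import deque
from itertools import combinations


def link(a, b):
    """Canonical key for an undirected AS link."""
    return tuple(sorted((a, b)))


def shortest_path(adj, up, src, dst):
    """BFS over links that are currently up; returns a node list or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in sorted(adj[node]):
            if nxt not in prev and link(node, nxt) in up:
                prev[nxt] = node
                queue.append(nxt)
    return None


def link_loads(adj, up):
    """Assumed load model: one unit of traffic per AS pair on a shortest path."""
    loads = {l: 0 for l in up}
    for src, dst in combinations(sorted(adj), 2):
        path = shortest_path(adj, up, src, dst)
        if path:
            for a, b in zip(path, path[1:]):
                loads[link(a, b)] += 1
    return loads


def simulate(adj, capacity, failed, rounds):
    """Return, for each round, the sorted list of links cut by congestion."""
    up = set(capacity) - set(failed)
    history = []
    for _ in range(rounds):
        loads = link_loads(adj, up)
        congested = {l for l in up if loads[l] > capacity[l]}
        history.append(sorted(congested))
        # Links cut earlier carried no traffic this round, so they come back up;
        # newly congested links are cut; the faulty link stays down.
        up = set(capacity) - congested - set(failed)
    return history


# Hypothetical ring of four ASes, every link with capacity 3.
adj = {"A": {"B", "D"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"A", "C"}}
capacity = {link(a, b): 3 for a in adj for b in adj[a]}
history = simulate(adj, capacity, failed=[link("C", "D")], rounds=4)
print(history)  # [[('A', 'B')], [], [('A', 'B')], []]
```

In this toy run, the single fault on link C-D pushes the load on link A-B above
its capacity, so A-B is cut. With A-B down it carries no traffic and is restored
in the next round, only to be overloaded again. The link is never removed for
good; it oscillates between the cut and restored states.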
Previous work on the robustness of BGP in congested networks studies the
relationship between traffic overload factors (such as queueing delays, packet
sizes and TCP retransmission parameters) and the lifetime of a BGP session [1], [2].
This is a micro-level view of the survivability of BGP, focusing on a single
component of the system. However, the inter-domain routing system is a complex
network, whose behaviour is better characterized by the dynamics induced by the
interactions of BGP routers across the whole Internet. On the other hand, studies
on cascading failures
in complex networks have shown that networks with a highly heterogeneous
distribution of loads, such as the Internet and electrical power grids, are
particularly vulnerable to attacks, in that a large-scale cascade may be triggered
by disabling a single key node [3], [4]. But in these models, overloaded nodes are
either removed or avoided, so they are not suitable for describing the unique
'virtual cut' and 'automatic restoration' characteristics of BGP links in a
dynamically congested state. Recently, the CXPST attack was presented, which
exploits this property of BGP to create control plane instability using only data
plane traffic [5]. Beyond this specific attack technique, it is also important to
further analyse the factors that affect the scope of the instability and the
difference between random breakdowns and intentional attacks.
In this paper, we try to answer the following questions: given the distinctive
properties of the inter-domain routing system, under what conditions can a global
cascade take place? And who will be affected most severely? Our contributions with
respect to previous work are summarized as follows.
(1) We propose a model for studying the inter-domain routing process under
a cascade of congested and resumed link states. In this model, overloaded
links are not removed from the network; they have a chance to be restored in
the future. Moreover, we apply the prefer-customer and valley-free policies to the
routing process instead of simply using a shortest-path algorithm, in order to
better reflect actual practice.
(2) We characterize the survivability of the inter-domain routing system by
reachability and the number of rerouting messages. The most critical ability of a
routing system is making routing decisions. Reachability evaluates how an
incomplete topology affects this capability, and the number of rerouting messages
evaluates the effect of instability on the routing system, because a surge of BGP
updates generated by large-scale rerouting may exceed the computational capacity
of the affected routers, degrading that capability.
(3) In our model, survivability depends on the congested state of AS links,
which is decided by comparing their loads with their capacities. Therefore,
we study the relationship between the survivability of the inter-domain routing
system and the capacity of AS links. This examination reveals that when the tolerance
 