Information Technology Reference
In-Depth Information
other. Furthermore, let
m
,
m
be machines,
op
1
,
op
2
∈{
create
,
delete
,
update
}
,
then we define the orthogonality relation
⊥
on actions by:
op
2
(
m, π
)if
π
π
op
1
(
m, π
)
⊥
⊥
In general, actions are considered orthogonal if they reside on different machines,
thus:
op
2
(
m
,π
)if
m
=
m
op
1
(
m, π
)
⊥
∧
op
1
,op
2
∈{
create
,
delete
,
update
,
rename
}
We overload the use of
to also capture idem-potency of actions. Actions that
can be considered idempotent, such as two consecutive updates to the same file,
are added to the relation.
Partial order reduction, based on
⊥
⊥
, is implemented by representing the vo-
cabulary of actions
L
action
=
{
a
1
,...,a
m
}
using an mapping
por
from
{
1
,...,m
}
into 2
{
1
,...m}
such that
a
i
⊥
por
(
i
). The sets
por
(
i
)areimple-
mented as bit-vectors, as
m
is relatively small and of fixed size. Depth first search
then prunes action sequences containing the pair
a
i
a
j
if
a
i
⊥
a
j
, if and only if
j
∈
a
j
(that would be
j
∈
por
(
i
)) and
j
≤
i
.
8.6 Experiments
We ran simulation relatively early in the development process. As the product
got more stable we ran a two week experiment, distributing the search over a
cluster of 200 machines each exploring a different portion of the search space.
This helped us covering slightly more than
2
trillion scenarios, for checking main
consistency properties.
Early on, simulation caught a large number of bugs that may have been caught
later in stress. On the other hand, simulation served as a pretty good regression
test as the implementation was evolving at a rapid pace. Some of the bugs found
during simulation had the traits of being extremely dicult for a stress test to
identify. For instance, the trace below exposed a corner case in the interplay
between the components that recycled unique identifiers and those that walked
directories.
create
(
m
2
, noshare/p
)
rename
(
m
2
, noshare/p, share/p
)
read
journal
(
m
2
,ω
)
sync
(
m
1
,m
2
)
rename
(
m
2
, share/p, noshare/p
)
rename
(
m
2
, noshare/p, share/p
)
read
−
journal
(
m
2
,ω
)
update
(
m
1
, share/p
)
sync
(
m
1
,m
2
)
In comparison, the bugs found in stress were predominantly in the components
that are abstracted away during simulation. There were a couple of exceptions,
though. Stress exposed a divergence bug that simulation was blind to, it also
exposed a protocol error exposed by asynchronous message passing.
−
Search WWH ::
Custom Search