Databases Reference
In-Depth Information
where appropriate. We also express the number of constraints of each type as
a function of N, the number of nodes in the propagation graph.
A1: Path Safety. We assume that most paths from a source to a sink pass
through at least one sanitizer. For example, we believe that if ReadData1 is
a source, and WriteData is a sink, at least one of Prop1 or Cleanse is a
sanitizer. This is stated using the set of constraints A1 shown in Figure 11.3.
While constraints A1 model our core beliefs accurately, they are inecient if
used directly: A1 requires one constraint per path, and the number of acyclic
paths could be exponential in the number of propagation graph nodes.
B1: Triple Safety. In order to abstract the constraint set A1 with a poly-
nomial number of constraints, we add a safety constraint B1 for each triple
of nodes as shown in Figure 11.3. The number of B1 constraints is O(N 3 ). In
Section 11.5 we prove that the constraints B1 are a probabilistic abstraction
of constraints A1 under suitable choices of parameters.
11.2.4 Auxiliary Constraints
In practice, the set of constraints B1 does not limit the solution space
enough. We have found empirically that just using this set of constraints
allows too many possibilities, several of which are incorrect classifications. By
looking at results over several large benchmarks we have come up with four
sets of auxiliary constraints|B2, B3, B4, and B5|which greatly enhance
the precision.
B2: Pairwise Minimization. The set of constraints B1 allows the solver
flexibility to consider multiple sanitizers along a path. In general, we want to
minimize the number of sanitizers we infer. Thus, if there are several solu-
tions to the set of constraints B1, we want to favor solutions that infer fewer
sanitizers, while satisfying B1, with higher probability.
For instance, consider the path ReadData1 , Prop1 , Cleanse , WriteData
in Example 2. Suppose ReadData1 is a source and WriteData is a sink. B1
constrains the triple
h ReadData1 ; Prop1 ; WriteData i
so that the probability of Prop1 not being a sanitizer is low; B1 also constrains
the triple
h ReadData1 ; Cleanse ; WriteData i
such that the probability of Cleanse not being a sanitizer is low. One solution
to these constraints is to infer that both Prop1 and Cleanse are sanitizers. In
reality, programmers do not add multiple sanitizers on a path and we believe
that only one of Prop1 or Cleanse is a sanitizer. Thus, we add a constraint B2
that for each pair of potential sanitizers it is unlikely that both are sanitizers,
as shown in Figure 11.3. The number of B2 constraints is O(N 2 ).
 
Search WWH ::




Custom Search