Databases Reference
In-Depth Information
The last several years have seen a proliferation of static and runtime analy-
sis tools for finding security violations that are caused by explicit information
flow in programs. Much of this interest has been caused by the increase in the
number of vulnerabilities such as cross-site scripting and SQL injection. In
fact, these explicit information flow vulnerabilities commonly found in Web
applications now outnumber vulnerabilities such as buffer overruns common
in type-unsafe languages such as C and C++. Tools checking for these vulner-
abilities require a specification to operate. In most cases the task of providing
such a specification is delegated to the user. Moreover, the ecacy of these
tools is only as good as the specification. Unfortunately, writing a compre-
hensive specification presents a major challenge: Parts of the specification are
easy to miss, leading to missed vulnerabilities; similarly, incorrect specifica-
tions may lead to false positives.
This paper proposesMerlin, a new approach for automatically inferring
explicit information flow specifications from program code. Such specifications
greatly reduce manual labor, and enhance the quality of results, while using
tools that check for security violations caused by explicit information flow.
Beginning with a data propagation graph, which represents interprocedural
flow of information in the program,Merlinaims to automatically infer an
information flow specication.Merlinmodels information flow paths in the
propagation graph using probabilistic constraints. A na ve modeling requires
an exponential number of constraints, one per path in the propagation graph.
For scalability, we approximate these path constraints using constraints on
chosen triples of nodes, resulting in a cubic number of constraints. We char-
acterize this approximation as a probabilistic abstraction, using the theory
of probabilistic refinement developed by McIver and Morgan. We solve the
resulting system of probabilistic constraints using factor graphs, which are a
well-known structure for performing probabilistic inference.
We experimentally validate theMerlinapproach by applying it to 10 large
business-critical Web applications that have been analyzed withCat.Net,
a state-of-the-art static analysis tool for .NET. We find a total of 167 new
confirmed specifications, which result in a total of 322 additional vulnerabil-
ities across the 10 benchmarks. More accurate specifications also reduce the
false positive rate: In our experiments,Merlin-inferred specifications result
in 13 false positives being removed; this constitutes a 15% reduction in the
Cat.Netfalse positive rate on these 10 programs. The final false positive rate
forCat.Netafter applyingMerlinin our experiments drops to under 1%.
 
Search WWH ::




Custom Search