                                       G                   F
Benchmark                         Nodes    Edges      Vars     Nodes
Alias Management Tool                59    1,209         3         3
Chat Application                    156      187        25        33
Bicycle Club App                    176      246        70       317
Software Catalog                    190      455        73       484
Sporting Field Management Tool      268      320        50        50
Commitment Management Tool          356      563       107     1,781
New Hire Tool                       502    1,101       116     1,917
Expense Report Approval Tool        811    1,753       252     2,592
Relationship Management           3,639   22,188       874   391,221
Customer Support Portal           3,881   11,196       942   181,943

FIGURE 11.13: Size statistics for the propagation graph G and factor graph
F used by Merlin.
the ultimate initial specification is also fairly large, consisting of a total of 111
methods.
To provide a metric for the scale of the benchmarks relevant for Cat.Net
and Merlin analyses, Figure 11.13 provides statistics on the sizes of the
propagation graph G computed by Merlin, and the factor graph F constructed in
the process of constraint generation. We sort our benchmarks by the number
of nodes in G. With propagation graphs containing thousands of nodes, it is
not surprising that we had to develop a polynomial approximation in order
for Merlin to scale, as Section 11.5 describes.
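As a rough illustration of the scale discussed above, the following sketch transcribes the rows of Figure 11.13 and checks two things: that the benchmarks are indeed ordered by the number of nodes in G, and how large F is relative to G for each benchmark. The names ROWS, is_sorted_by_g_nodes, and factor_to_propagation_ratio are hypothetical, and the ratio is only a descriptive statistic computed from the table, not a measure used by Merlin itself.

```python
# Rows transcribed from Figure 11.13:
# (benchmark, G nodes, G edges, F variables, F nodes).
ROWS = [
    ("Alias Management Tool", 59, 1209, 3, 3),
    ("Chat Application", 156, 187, 25, 33),
    ("Bicycle Club App", 176, 246, 70, 317),
    ("Software Catalog", 190, 455, 73, 484),
    ("Sporting Field Management Tool", 268, 320, 50, 50),
    ("Commitment Management Tool", 356, 563, 107, 1781),
    ("New Hire Tool", 502, 1101, 116, 1917),
    ("Expense Report Approval Tool", 811, 1753, 252, 2592),
    ("Relationship Management", 3639, 22188, 874, 391221),
    ("Customer Support Portal", 3881, 11196, 942, 181943),
]


def is_sorted_by_g_nodes(rows):
    """Return True if rows are in non-decreasing order of G node count."""
    g_nodes = [row[1] for row in rows]
    return g_nodes == sorted(g_nodes)


def factor_to_propagation_ratio(row):
    """Return (F nodes) / (G nodes) for one table row (illustrative only)."""
    _, g_nodes, _, _, f_nodes = row
    return f_nodes / g_nodes


if __name__ == "__main__":
    print("sorted by G nodes:", is_sorted_by_g_nodes(ROWS))
    for row in ROWS:
        print(f"{row[0]:32s} F/G node ratio = {factor_to_propagation_ratio(row):.1f}")
```

On this data the two largest benchmarks stand apart: their factor graphs have tens to hundreds of times more nodes than their propagation graphs, whereas the ratio stays in the single digits for the smaller benchmarks.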
11.6.2 Merlin Findings
Figure 11.14 provides information about the specifications discovered by
Merlin. Columns 2–16 provide information about how many correct and
false positive items in each specification category have been found. Note that
in addition to "good" and "bad" specifications, as indicated by ✓ and ✗, we
also have a "maybe" column denoted by ?. This is because often what
constitutes a good specification is open to interpretation. Even in consultations with
Cat.Net developers we found many cases where the classification of a
particular piece of the specification is not clear cut. The column labeled Rate
gives the false positive rate for Merlin: the percentage of "bad"
specifications that were inferred. Overall, Merlin infers 381 specifications, out of
which 167 are confirmed and 127 more are potential specifications. The
Merlin false positive rate, looking at the discovered specifications, is 22%,
computed as (7+31+49)/381. This is decidedly better than the average
state-of-the-art false positive rate of over 90% [5]. The area in which Merlin does the
worst is identifying sanitizers (with a 38% false positive rate). This is because
despite the extra constraints described in Section 11.2.4, Merlin still flags
 