Table 1. Running times (STREAM / MonPoly extension / PostgreSQL) in seconds

        time span
policy  400             800              1200             1600              2000
(P1)    8 / 9 / 76      9 / 19 / 279     11 / 29 / 610    12 / 39 / 1065    14 / 48 / 1650
(P2)    21 / 10 / 247   23 / 20 / 1646   24 / 30 / 5233   26 / 40 / 11989   28 / 50 / 23260
(P3)    / 22 / 168      / 44 / 604       / 66 / 1230      / 88 / 2251       / 110 / 3458
(P4)    12 / 9 / 75     15 / 19 / 280    15 / 29 / 612    17 / 38 / 1068    19 / 48 / 1650
(P5)    24 / 76 / 83    33 / 157 / 337   41 / 234 / 745   49 / 313 / 1351   59 / 395 / 2099

8GB of RAM and an Intel Core i5 CPU with 2.67GHz. The SQL queries for
PostgreSQL and the CQL queries for STREAM were manually obtained from
the corresponding MFOTL_Ω formulas. For the considered policies and logs, the
semantic differences between the languages are not substantial. In particular,
the tools output the same violations. PostgreSQL's running times only account
for the query evaluation, performed once per log file, and not for populating the
database. For MAX aggregations, STREAM aborts with a runtime error; the affected
STREAM entries in Table 1 carry no running time.
Note that the formulas in Figure 3 vary in their complexity: e.g., they contain
different numbers of aggregations and temporal operators, with time windows
of different sizes. STREAM and our tool scale linearly on these examples with
respect to the time spans of the logs. This is not the case for PostgreSQL.
Overall, our tool's performance is between STREAM's and PostgreSQL's on these
examples.
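
The scaling claim can be checked directly against the (P1) row of Table 1. The following is a minimal sketch, not part of the original evaluation: it computes the increase in running time between consecutive time spans. Roughly constant increments indicate linear growth, while steadily growing increments indicate faster-than-linear growth. The helper name `increments` is our own.

```python
# Running times in seconds for policy (P1), copied from Table 1.
# The time spans are 400, 800, 1200, 1600, 2000, i.e. equally spaced.
p1_times = {
    "STREAM": [8, 9, 11, 12, 14],
    "MonPoly extension": [9, 19, 29, 39, 48],
    "PostgreSQL": [76, 279, 610, 1065, 1650],
}


def increments(times):
    """Differences between running times at consecutive time spans."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


for tool, times in p1_times.items():
    print(f"{tool}: {increments(times)}")
# STREAM: [1, 2, 1, 2]
# MonPoly extension: [10, 10, 10, 9]
# PostgreSQL: [203, 331, 455, 585]
```

For STREAM and the MonPoly extension the increments stay nearly constant, whereas for PostgreSQL each additional 400 units of time span costs more than the previous one.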
We first focus on the performance of our tool. (P2) is only slightly slower to
monitor than (P1) because the relations for the additional subformula are not
large: they contain around 50 tuples, as the limit flag is toggled for each user,
on average, every 10 days. (P3) takes longer to monitor for two reasons. First,
it contains a significantly larger time window. Second, the join of two relations
is computed, which is also the case for (P5). For (P3), the two input relations
and the output relation each have size $n$, where $n$ is the number of users. For
(P5), the size of the input relations is approximately $31mn$, where $m$ is the
average number of withdrawals per day of a user, while the output relation is
approximately of size $31^2 m^2 n$. This explains why (P5) takes longer to monitor
than (P3). Since aggregating over a relation does not increase its size, the nesting
of aggregation operators has only a minor impact on the running times; compare
(P1) and (P4).
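
The size estimate for (P5) can be reproduced on a toy instance. The sketch below is illustrative only and does not use the actual (P5) formula, which is given in Figure 3. It assumes that the join is on the user attribute and that each user makes exactly m withdrawals per day over a 31-day window; the function names `withdrawals` and `join_on_user` are hypothetical.

```python
def withdrawals(n_users, per_day, days=31):
    """Relation of (user, event) tuples with days * per_day events per user."""
    return [(u, e) for u in range(n_users) for e in range(days * per_day)]


def join_on_user(left, right):
    """Hash join on the user attribute: (user, left_event, right_event)."""
    events_by_user = {}
    for u, e in right:
        events_by_user.setdefault(u, []).append(e)
    return [(u, e1, e2) for u, e1 in left for e2 in events_by_user.get(u, [])]


n, m = 4, 2  # assumed toy values: 4 users, 2 withdrawals per user per day
rel = withdrawals(n, m)
joined = join_on_user(rel, rel)

print(len(rel), 31 * m * n)            # 248 248
print(len(joined), 31**2 * m**2 * n)   # 15376 15376
```

Each user contributes 31m tuples to an input relation and (31m)^2 tuples to the join, so the output grows quadratically in m while the inputs grow only linearly.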
PostgreSQL performs worst in these experiments. This is not surprising as
PostgreSQL is not designed for this application domain. In particular,
PostgreSQL has no support for temporal reasoning and we must treat time as just
another data value. In more detail, we load log files into database tables that
have two additional attributes to represent the time point and the timestamp
of an event occurrence, and we adapt the standard embedding of temporal logic
into first-order logic to represent MFOTL_Ω formulas as SQL queries. Treating
time as data has the following disadvantages. First, it is not suited for online
processing of events: query evaluation does not scale, because the query must be
 