Database Reference
All reported query performance numbers are the average of five executions with the highest and the
lowest values removed. The rationale is that the first reading of each query is always considerably
more expensive than, and inconsistent with, the other readings. This is because the relational database
uses buffer pools as a caching mechanism. The initial period during which the database spends its time
loading pages into the buffer pools is known as the warm-up period. During this period the response
time of the database is worse than its normal response time. For all metrics: the lower the metric
value, the better the approach.
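The averaging rule described above can be sketched as follows. This is a minimal illustration rather than the authors' measurement harness: the function name `trimmed_mean` and the sample readings are hypothetical, chosen so that the first (cold-cache) reading is the outlier.

```python
def trimmed_mean(readings):
    """Average the readings after dropping one highest and one lowest value."""
    if len(readings) < 3:
        raise ValueError("need at least three readings to drop the extremes")
    ordered = sorted(readings)
    kept = ordered[1:-1]  # discard the minimum and the maximum
    return sum(kept) / len(kept)


# Hypothetical response times in milliseconds for five runs of one query.
# The first run is slow because the buffer pool is still cold.
readings_ms = [950.0, 120.0, 118.0, 125.0, 121.0]

plain_mean = sum(readings_ms) / len(readings_ms)
reported = trimmed_mean(readings_ms)

print(f"plain mean:   {plain_mean:.1f} ms")
print(f"trimmed mean: {reported:.1f} ms")
```

With these made-up readings the warm-up run dominates the plain mean, while the trimmed value stays close to the steady-state readings.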
Experimental Results
Table 3 summarizes the loading times for shredding the different datasets into the alternative relational
representations. The RS scheme is the fastest because it requires the fewest tuple insert operations.
Similarly, TS requires less loading time than BS since, for each triple, the number of inserted tuples
and updated tables is smaller.
Table 4 summarizes the storage cost for the alternative relational representations. The RS scheme
is the cheapest approach because of its normalized design and the absence of any data redundancy.
Because sparsity in the DBLP dataset is limited, the PS scheme does not introduce any additional
storage cost apart from a small overhead due to the redundancy of the object identification
attributes in the decomposed property tables. The BS scheme is the most expensive approach due to
the redundancy of the ID attributes in each binary table. It should also be noted that the storage
costs of TS and BS are affected by the additional sizes of their associated indexes.
Table 5 summarizes the query performance for the SP2Bench benchmark queries over the alternative
relational representations using the different sizes of the dataset. Remarks about the results of this
experiment are as follows:
1. There is no clear winner between the triple store (TS) and the binary table (BS) encoding schemes.
Despite its simple storage and the huge number of tuples in its encoding relation, the triple store
(TS) remains very competitive with the binary table encoding scheme because of the full set of B-tree
physical indexes over the permutations of the three encoding fields (subject, predicate, object).
2. The query performance of the BS encoding scheme degrades badly as the number of predicates in the
input query increases. It is also affected by subject-object or object-object joins, for which no
index information is available. This problem could be solved by building materialized views over
the columns of the most frequently referenced pairs of attributes.
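The two encodings compared in these remarks can be illustrated with a small sketch using Python's built-in `sqlite3` module. This is not the experimental setup from the text: the sample triples, the table names (`triples`, `bs_<predicate>`), and the index names are all hypothetical, and SQLite merely stands in for the relational database. The triple store gets one B-tree index per permutation of (subject, predicate, object), while the binary-table scheme gets one two-column table per predicate with no index on the object column.

```python
import sqlite3
from itertools import permutations

# Hypothetical sample data in (subject, predicate, object) form.
triples = [
    ("article1", "creator", "alice"),
    ("article1", "title", "Storing RDF"),
    ("article2", "creator", "alice"),
    ("article2", "cites", "article1"),
]

conn = sqlite3.connect(":memory:")

# Triple store (TS): a single three-column relation, indexed on all six
# permutations of its columns (spo, sop, pso, pos, osp, ops).
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
for cols in permutations(("subject", "predicate", "object")):
    name = "idx_" + "".join(c[0] for c in cols)
    conn.execute(f"CREATE INDEX {name} ON triples ({', '.join(cols)})")
conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)

# Binary tables (BS): one (subject, object) table per predicate.
# Predicate names are interpolated directly, which is only acceptable
# here because the sample predicates are trusted identifiers.
for s, p, o in triples:
    conn.execute(f"CREATE TABLE IF NOT EXISTS bs_{p} (subject TEXT, object TEXT)")
    conn.execute(f"INSERT INTO bs_{p} VALUES (?, ?)", (s, o))

# The same object-to-subject join ("titles of cited articles") in both schemes.
ts_rows = conn.execute(
    """SELECT t2.object
       FROM triples AS t1 JOIN triples AS t2 ON t1.object = t2.subject
       WHERE t1.predicate = 'cites' AND t2.predicate = 'title'"""
).fetchall()

bs_rows = conn.execute(
    """SELECT t.object
       FROM bs_cites AS c JOIN bs_title AS t ON c.object = t.subject"""
).fetchall()

index_count = conn.execute(
    "SELECT count(*) FROM sqlite_master WHERE type = 'index' AND tbl_name = 'triples'"
).fetchone()[0]

print(ts_rows, bs_rows, index_count)
```

In the TS query the join can be served by the permutation indexes, whereas the BS query adds one table to the join per predicate and joins on an unindexed object column, which mirrors the two weaknesses noted for BS above.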
Table 3. A comparison between the alternative relational RDF storage techniques in terms of their
loading times

Loading Time (in seconds)

Dataset   Triple Stores   Binary Tables   Traditional Relational   Property Tables
500K      282             306             212                      252
1M        577             586             402                      521
2M        1242            1393            931                      1176
4M        2881            2936            1845                     2406
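
To make the comparison in Table 3 easier to read, the loading times can be expressed relative to the fastest scheme. The sketch below copies the values from the table; the mapping of column headings to the abbreviations TS, BS, RS and PS, and the reading of the largest dataset label as 4M, are assumptions made here for illustration.

```python
# Loading times in seconds, copied from Table 3.
# Assumed mapping: TS = Triple Stores, BS = Binary Tables,
# RS = Traditional Relational, PS = Property Tables.
# The largest dataset label is assumed to be 4M.
loading_times = {
    "500K": {"TS": 282, "BS": 306, "RS": 212, "PS": 252},
    "1M": {"TS": 577, "BS": 586, "RS": 402, "PS": 521},
    "2M": {"TS": 1242, "BS": 1393, "RS": 931, "PS": 1176},
    "4M": {"TS": 2881, "BS": 2936, "RS": 1845, "PS": 2406},
}


def relative_to(baseline, row):
    """Express every scheme's time as a multiple of the baseline scheme's time."""
    base = row[baseline]
    return {scheme: round(seconds / base, 2) for scheme, seconds in row.items()}


# Scheme with the smallest loading time for each dataset size.
fastest = {size: min(row, key=row.get) for size, row in loading_times.items()}

# Loading times as multiples of the RS time for the same dataset size.
ratios = {size: relative_to("RS", row) for size, row in loading_times.items()}

for size, row in ratios.items():
    print(size, "fastest:", fastest[size], row)
```

The output lists, for each dataset size, the fastest scheme and each scheme's loading time as a multiple of the RS time.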
 