Information Technology Reference
In-Depth Information
Table 3 presents the values in millisecond achieved for each performance test rea-
lised for all options related to multi-value dimensions and double counting, using
Pentaho as our OLAP testing tool. Once again avoiding the need for a bridge table,
overcoming at the same time any double-counting derived from non-strictness.
Table 3. Main outcomes with different techniques to calculate the metric vale for PrecoLot
Me tr i c
(PrecoLote)
Agregation
Operator
Execution
Time (ms )
Comparis on
(Scenario1)
Double-
Counts
Grain of
Data
Multi-Value
Grand Total
Sce nari o1
SUM
1571
100%
Ye s
Granul ar
No
Not A ccurate
Sce nari o2
SUM
1548
99%
No
Granul ar
No
Not A ccurate
Scenario3
SUM
35293
2247%
No
Aggregated
Yes
Not Accurate
Sce nari o4
SUM
2268
144%
No
A ggre gate d
Ye s
A ccurate
Scenario 1 , in this scenario the value of PrecoLot was calculated using the de-
fault features of Pentaho-Mondrian (i.e., possibility to configure the standard
behaviour of the SUM operator). Main drawback: only provides correct values at the
lowest granularity. The tested results originate double-counting and faulty data for
Grand Total and PrecoLote, if not all dimensions are considered (i.e., foreign keys
DEA_ID, DPR_ID or DLOT_ID). This situation can fall into a summarizability
problem since there is a granularity mismatch in the instances of the fact causing
inconsistent totals;
Scenario 2 , in this scenario we used Pentaho MDX expressions to calculate Preco-
Lot for multi-value dimensions. The expression “ Sum ([DEA].[All],
[Measures].[PrecoLote]) ” calculates the metric value evaluated over the entire
set of Contracting Entities . In this case we had to assign a value of 0 (zero) to
each fact row with duplicates. This basic solution requires the end-user to have a deep
knowledge of the DW schema when performing OLAP queries. Main drawback: multi-
valued dimension are not supported because facts are defined independently of the di-
mension (e.g., the value of PrecoLote is assigned always with the same value for
contracting entities). The ETL also becomes very complex to ensure that the price of
each Lot is specified only once in the fact table.
Scenario 3 , in this scenario the approach used in scenario 2 was reviewed to the
following formula “ IIf([DLOT.H].CurrentMemberis[DLOT.H].[All],
Sum([DLOT.H].CurrentMeber.Children,[Measures].[PrecoLote3
Max]), [Measures].[PrecoLoteMax]) ”. This approach returns SUM ( Pre-
coLoteMax ) for each aggregated Lot , otherwise it returns the value of PrecoLo-
teMax . PrecoLoteMax is a hidden metric that was configured in the same way as
PrecoLote in scenario 1 but using the MAX operator instead of the SUM operator.
When applied to a single Lot, the output of PrecoLoteMax is correct because
MAX(X,X)= X , even if the price of the lot is repeated across several rows; it also
avoids double counting. Main drawback: extremely underperformance and still
provides inaccurate results for Grand Total.
Search WWH ::




Custom Search