Information Technology Reference
In-Depth Information
The technologies used to support the OOP-DW was the Pentaho as the OLAP tool
and the Microsoft SQL Server 2008 R2 as the Database Management System
(DBMS) and the Extract/Transform/Load (ETL) tool. Because of data confidentiality,
the data presented in Fig. 1 are just test data mapping 100.000 procedures, where each
fact row relates with two Lots, each one with a base price of 1
€ . From an academic
perspective it is assumed that each procedure also includes two Contracting
Entities (i.e. Adjudicantes), corresponding to a total of 400.000 fact rows. Using
.Net C#, we implemented an extension to the way aggregator operators, such as SUM,
COUNT, and AVG are executed and interpreted by the Pentaho tool. The main goal
was to extend, at the physical level, the way aggregation functions were computed by
the Pentaho tool, avoiding double-counting for multi-value dimensions.
For instance, the standard SUM operator was extended to include two parameters:
the foreign key of the target dimension and the metric of the fact table. The extended
SUM operator is invoked as presented by the following SQL statement:
SELECT dbo . DistinctSum ( f . DLOT_ID , f . PrecoLote )
FROM Factos f
WHERE f . DPR_ID = 1;
Note that as presented in Fig. 1, DPR_ID =1 contains two Lots ( DLot_ID=1 and
DLot_ID=2 ), and DPR_ID =1 also has two Contracting Entities there-
fore the report will include one fact row for each combination DPR_ID & DEA_ID
& DLot_ID . However, the Pentaho tool does not recognize the extended SUM opera-
tor ( dbo.DistinctSum ). The engineering challenge was to find a way to force the
Pentaho tool to recognize and execute the extended SUM operator as if it was a stan-
dard SUM operator.
Rewriting the Pentaho opensource code to overcome the problem was not an op-
tion. Therefore we directed our effort to the native Measure Expression capability of
Pentaho-Mondrian, a feature that enables users to write a standard SQL expression.
The SQL expression that is generated and passed to Microsoft SQL Server 2008 R2
for execution is listed next:
SELECT "DPR" . "DPR_ID" AS "c0" , "DPR" . "DPR_Desc" AS "c1" ,
SUM ( dbo . DistinctSum ( "Factos" . "DLOT_ID" , "Factos" . "PrecoLote" ))
AS "m0"
FROM "dbo" . "DPR" AS "DPR" , "dbo" . "Factos" AS "Factos"
WHERE "Factos" . "DPR_ID" = "DPR" . "DPR_ID"
GROUP BY "DPR" . "DPR_ID" , "DPR" . "DPR_Desc";
As expected Microsoft SQL Server 2008 R2 is not able to execute this SQL ex-
pression, triggering the following error message “ Cannot perform an aggre-
gate function on an expression containing an aggregate or
a subquery ”. Interpreting this message we realised that a minor adjustment to the
way the extended SUM operator is invoked would comply with the execution of any
aggregation function in Microsoft SQL Server 2008 R2. Therefore the reasoning was
to invoke the SUM operator with an input parameter that could mask the execution of
a standard SUM operator.
Search WWH ::




Custom Search