Database Reference
In-Depth Information
found in the Northwind data warehouse. In parent-child hierarchies, we
normally want to aggregate measures of child members into parent members.
For example, when considering an employee Andrew Fuller who is head
of the Northwind company (see Fig. 6.2 ), we may want to report not
only his sales amount but his sales amount plus the sum of the sales
amount of the employees that directly and indirectly report to him (Nancy
Davolio, Janet Leverling, and so on). Mondrian provides a special structure
called a closure table , which basically contains the transitive closure of
the hierarchy. This table has schema ( SupervisorKey , EmployeeKey , Distance ),
where the third attribute contains the distance from the two employees in
the hierarchy. For the hierarchy in Fig. 6.2 , the closure table will contain,
for example, the tuples ( 2 , 2 , 0 ), ( 2 , 5 , 1 ), and ( 2 , 6 , 2 ), which correspond
to ( Andrew Fuller , Andrew Fuller , 0 ), ( Andrew Fuller , Steven Buchanan , 1 ), and
( Andrew Fuller , Michael Suyama , 2 ). It follows from the above that a closure
table is similar to an aggregate table in the sense that it contains a redundant
copy of the data in the database. Note that while an aggregate table speeds
up aggregation, a closure table makes the computation of rollups along a
parent-child hierarchy more ecient.
When a query matches the definition of an aggregate table, Mondrian uses
such a table to answer the query instead of computing the aggregate from
scratch, basically applying the theoretical concepts studied in this chapter.
If more than one aggregate table matches a particular query, Mondrian must
choose between them. This is done as follows: If there is an aggregate table of
the same granularity as the query, Mondrian uses it. If there is no aggregate
table at the desired granularity, Mondrian picks an aggregate table of lower
granularity and rolls up from it. In general, Mondrian chooses the aggregate
table with the fewest rows (the heuristic we have called “smallest parent”).
7.10.2 Caching
Another feature provided by Mondrian to speed up query performance is
caching data in main memory, to avoid accessing the database to retrieve
schemas, dimension members, and facts. Mondrian provides three different
kinds of caches:
￿ The schema cache , which keeps schemas in memory to avoid reading them
each time a cube is loaded. This cache stores the schema in memory after
it has been read for the first time and keeps it in memory until the cache
is cleared. Each time the schema is updated, the cache must be cleared.
￿ The member cache , which stores dimension members in memory. The
member cache must also be synchronized with the underlying data. A
Service Provider Interface is used to flush the members from the cache.
The member cache is populated when members of a dimension are first
read, and then members are retrieved as needed. Like it is the case in
Search WWH ::




Custom Search