Conclusion - Data Warehouse Systems: Design and Implementation


them, which has a collaboration frequency attribute x. For every conference

in every year, we may have a coauthor graph describing the collaboration

patterns among researchers. Thus, each graph can be viewed as a snapshot

of the overall collaboration network. These graphs can be aggregated in

an OLAP style. For instance, we can aggregate graphs in order to obtain

collaborations by conference type and year for all pairs of authors. For this,

we must aggregate the nodes and edges in each snapshot graph according to

the conference type (like database conferences) and the year. For example, if

there is a link between two authors in the SIGMOD and VLDB conferences,

the nodes and the edge will be in the aggregated graph corresponding to

the conference type Databases. More complex patterns can be obtained,

for example, by merging the authors belonging to the same institution,

which makes it possible to obtain patterns of collaboration between

researchers of the same institutions.
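The snapshot aggregation described above can be sketched as follows. This is a minimal illustration, assuming each snapshot is stored as a mapping from an author pair to its collaboration frequency x. The lookup table `CONFERENCE_TYPE`, the function `rollup_snapshots`, and the sample data are hypothetical names introduced here, and summing the frequencies is only one possible choice of aggregate function.

```python
from collections import defaultdict

# Hypothetical mapping from conference to conference type (illustrative only).
CONFERENCE_TYPE = {"SIGMOD": "Databases", "VLDB": "Databases", "KDD": "Data Mining"}


def rollup_snapshots(snapshots):
    """Aggregate coauthor snapshots by (conference type, year).

    `snapshots` maps (conference, year) to a dict {(author_a, author_b): x},
    where x is the collaboration frequency. Frequencies of the same author
    pair are summed, which is an assumed choice of aggregate function.
    """
    aggregated = defaultdict(lambda: defaultdict(int))
    for (conference, year), edges in snapshots.items():
        key = (CONFERENCE_TYPE[conference], year)
        for (a, b), x in edges.items():
            pair = tuple(sorted((a, b)))  # treat edges as undirected
            aggregated[key][pair] += x
    return {key: dict(edges) for key, edges in aggregated.items()}


# Made-up snapshots: one coauthor graph per conference and year.
snapshots = {
    ("SIGMOD", 2012): {("Ann", "Bob"): 2},
    ("VLDB", 2012): {("Bob", "Ann"): 1, ("Ann", "Eve"): 1},
    ("KDD", 2012): {("Bob", "Eve"): 3},
}
result = rollup_snapshots(snapshots)
```

In this sketch, the link between Ann and Bob that appears in both SIGMOD and VLDB ends up as a single edge in the aggregated graph for the conference type Databases.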

Taking the above concepts into account, in Graph OLAP, dimensions

are classified as informational and topological. The former are close to the

traditional OLAP dimension hierarchies using information of the snapshot

levels, for example, Conference → All. They can be used to aggregate

and organize snapshots as explained above. On the other hand, topological

dimensions can be used for operating on nodes and edges within individual

networks. For example, a hierarchy for authors like AuthorId → Field → Institution

will belong to a topological dimension since author institutions do not

define snapshots. These definitions yield two different kinds of Graph OLAP

operations. A roll-up over an informational dimension overlays and joins

snapshots (but does not change the objects), while a roll-up over a topological

dimension merges nodes in a snapshot, modifying its structure.
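A roll-up over a topological dimension can be sketched in the same spirit. The code below is a minimal illustration, assuming a snapshot is a mapping from author pairs to collaboration frequencies. The function `rollup_topological`, the `institution` table, and the sample data are hypothetical, and summing the frequencies of merged edges is an assumed aggregate.

```python
from collections import defaultdict


def rollup_topological(edges, node_to_group):
    """Merge the nodes of one snapshot according to `node_to_group`.

    `edges` maps (author_a, author_b) to a collaboration frequency. The
    result maps (group_a, group_b) to the summed frequency. Pairs that fall
    inside the same group become a self-loop on the merged node.
    """
    merged = defaultdict(int)
    for (a, b), x in edges.items():
        pair = tuple(sorted((node_to_group[a], node_to_group[b])))
        merged[pair] += x
    return dict(merged)


# Hypothetical author-to-institution assignment and a single snapshot.
institution = {"Ann": "Univ A", "Bob": "Univ A", "Eve": "Univ B"}
snapshot = {("Ann", "Bob"): 2, ("Ann", "Eve"): 1, ("Bob", "Eve"): 4}
merged = rollup_topological(snapshot, institution)
```

Unlike the informational roll-up, the node set changes here: three author nodes collapse into two institution nodes, and the edges are redirected accordingly.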

Graph Cube [ 239 ] is a model for graph data warehouses that supports

OLAP queries on large multidimensional networks, accounting for both

attribute aggregation and structure summarization of the networks. A

multidimensional network consists of a collection of vertices, each containing a set

of multidimensional attributes describing the nodes' properties. For example,

in a social network, the nodes can represent persons, and multidimensional

attributes may include UserID, Gender, City, etc. Thus, multidimensional

attributes in the graph vertices define the dimensions of the graph cube.

Measures are aggregated graphs summarized according to some criteria.
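As a minimal sketch of such a summarization, assume the network is given as a dictionary of vertices keyed by UserID, each with its attribute values, plus a list of undirected edges. The function `summarize`, the attribute values, and the edges below are hypothetical, and counting nodes and connections is only one possible summarization criterion.

```python
from collections import Counter


def summarize(vertices, edges, dimension):
    """Summarize a network along one vertex attribute.

    Returns (node_counts, edge_counts): how many original vertices fall into
    each attribute value, and how many original edges connect each pair of
    attribute values.
    """
    node_counts = Counter(attrs[dimension] for attrs in vertices.values())
    edge_counts = Counter(
        tuple(sorted((vertices[u][dimension], vertices[v][dimension])))
        for u, v in edges
    )
    return dict(node_counts), dict(edge_counts)


# Made-up network: keys play the role of UserID.
vertices = {
    1: {"Gender": "male", "City": "City X"},
    2: {"Gender": "male", "City": "City Y"},
    3: {"Gender": "female", "City": "City X"},
}
edges = [(1, 2), (1, 3), (2, 3)]
node_counts, edge_counts = summarize(vertices, edges, "Gender")
```

Calling `summarize` with `"City"` instead of `"Gender"` would produce a different aggregated graph from the same network, which is the sense in which the vertex attributes act as dimensions.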

Note that the problem here is different from Graph OLAP, where there are

several snapshots. In Graph Cube, we have only one large network; thus, we

have a graph summarization problem. For example, suppose that we have

a small social network with three nodes. Two of them correspond to male

individuals in the network, while the third corresponds to a female. A graph

that summarizes the connections between genders will have two nodes, one

labeled male and the other labeled female. The edges between them will be

annotated with the number of connections of some kind. For instance, if in the

original graph there were two connections between two male persons (in both
