person residing in Amsterdam who committed an offence on September 1, 2010.
Then, it is likely that both records concern the same person. If, on the other hand,
HKS showed that the date of the official report is unknown because it was not entered
correctly, the probability would be considerably lower. Note that in the example
the residence of the suspect is not very selective: it is quite possible that
multiple residents of Amsterdam commit an offence on the same day. If this is the
case, additional or different attributes are needed to ensure that the records are
combined properly. After all, the more attributes agree, the higher the probability
that the two records denote the same person.
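The reasoning above can be sketched as a simple attribute-agreement count. This is only a minimal illustration, not the matching procedure actually used in the data warehouse. The attribute names (residence, offence_date, birth_year), the sample records, and the function agreement_score are hypothetical, and the sketch assumes that a missing value never counts as agreement.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def agreement_score(a: Record, b: Record, attributes: List[str]) -> float:
    """Return the fraction of the given attributes on which two records agree.

    A missing (None) value never counts as agreement, so an unknown
    date lowers the score instead of supporting a match.
    """
    if not attributes:
        return 0.0
    agreeing = sum(
        1
        for name in attributes
        if a.get(name) is not None and a.get(name) == b.get(name)
    )
    return agreeing / len(attributes)


# Hypothetical records from two sources (illustrative values only).
attrs = ["residence", "offence_date", "birth_year"]
source_a = {"residence": "Amsterdam", "offence_date": "2010-09-01", "birth_year": "1985"}
source_b = {"residence": "Amsterdam", "offence_date": "2010-09-01", "birth_year": "1985"}
source_b_no_date = {"residence": "Amsterdam", "offence_date": None, "birth_year": "1985"}

full_score = agreement_score(source_a, source_b, attrs)
reduced_score = agreement_score(source_a, source_b_no_date, attrs)
```

With these sample records, the score drops when the offence date is unknown, and it can only rise when further agreeing attributes are added to the comparison.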
The data in a data warehouse can be made available through data marts. An
example based on the offender-oriented data warehouse is the Drug Crime
Data Mart, which consists of a selection of the data concerning drug-related
crime. 9 This data mart can be used for analysis and reporting purposes, such as
National Drug Monitor publications.
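A data mart as a selection from the warehouse can be illustrated with a database view. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table offences, its columns, the sample rows, and the view name drug_crime_mart are all hypothetical and do not reflect the actual schema of the data warehouse.

```python
import sqlite3

# In-memory stand-in for the warehouse (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE offences (person_id INTEGER, offence_type TEXT, offence_date TEXT)"
)
conn.executemany(
    "INSERT INTO offences VALUES (?, ?, ?)",
    [
        (1, "drug", "2010-09-01"),
        (2, "theft", "2010-09-01"),
        (3, "drug", "2010-10-15"),
    ],
)

# The data mart exposes only the drug-related selection of the warehouse.
conn.execute(
    "CREATE VIEW drug_crime_mart AS "
    "SELECT person_id, offence_date FROM offences WHERE offence_type = 'drug'"
)

rows = conn.execute(
    "SELECT person_id, offence_date FROM drug_crime_mart ORDER BY person_id"
).fetchall()
```

Reports and analyses would then query the view only, without needing to know how the underlying warehouse tables are organized.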
10.3.2 A Dataspace Approach to Combining Judicial Data
A dataspace approach 10 also distinguishes three layers (see Figure 10.3): a
dataspace layer, a space manager layer, and an interface layer. The dataspace layer
contains a set of (cleaned) databases that complement each other and may
be related. Although these databases are related, there is no need for data
reconciliation. Instead, the relations that exist between the databases are stored in a
relationship manager in the space manager layer. This layer maintains data quality
(the plausibility and consistency of the data) by providing rules to which the data
must adhere. For this purpose, the relationship manager contains different types of
rules:
1. Rules to handle similar data coming from different sources.
2. Rules to deal with missing data.
3. Rules to allow for incomplete or tentative data.
4. Rules to record semantic changes in attributes.
5. Rules to filter out results that should not be shown to the user.
6. Rules to determine whether large deviations exist between past and future data
or between values from the same or different databases.
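Two of the rule types listed above (rules for missing data and rules for large deviations) can be sketched as small checking functions that return a warning or nothing. This is an illustrative sketch only. The names missing_value_rule, deviation_rule, and apply_rules, the field names, the sample figures, and the 25% threshold are assumptions made for the example, not part of the described relationship manager.

```python
from typing import Callable, Dict, List, Optional

Observation = Dict[str, Optional[float]]
Rule = Callable[[Observation], Optional[str]]


def missing_value_rule(field: str) -> Rule:
    """Rule type 2: warn when a value is absent."""
    def check(obs: Observation) -> Optional[str]:
        if obs.get(field) is None:
            return f"{field} is missing"
        return None
    return check


def deviation_rule(field_a: str, field_b: str, max_fraction: float) -> Rule:
    """Rule type 6: warn when two values differ by more than max_fraction
    of the larger absolute value."""
    def check(obs: Observation) -> Optional[str]:
        a, b = obs.get(field_a), obs.get(field_b)
        if a is None or b is None:
            return None  # absent values are left to the missing-value rule
        if abs(a - b) > max_fraction * max(abs(a), abs(b)):
            return f"{field_a} and {field_b} deviate strongly"
        return None
    return check


def apply_rules(obs: Observation, rules: List[Rule]) -> List[str]:
    """Run every rule and collect the warnings, in rule order."""
    warnings = []
    for rule in rules:
        message = rule(obs)
        if message is not None:
            warnings.append(message)
    return warnings


# Hypothetical yearly figures from two sources (illustrative values only).
observation = {"suspects_source_a": 1000.0, "suspects_source_b": 1600.0, "convictions": None}
rules = [
    missing_value_rule("convictions"),
    deviation_rule("suspects_source_a", "suspects_source_b", 0.25),
]
warnings = apply_rules(observation, rules)
```

Keeping each rule as an independent function means rules can be added or removed without touching the others, which suits a rule set that grows as new databases enter the dataspace.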
All in all, a combined set of rules in the relationship manager serves to complete
incomplete data sets, determine whether they are acceptable, and warn users when
they are less reliable. The relationship manager also serves to minimize the
chances of misinterpreting data. To do so, this layer maintains the relations between
attributes in the different databases and keeps track of changes in the meaning
of these attributes. Based on this history of changes, the space manager may
decide to reorganize or convert a database, in particular if the semantics of major
attributes changed considerably over the years.
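Keeping track of changes in the meaning of an attribute can be sketched as a small history structure that records from which year each meaning applies. The class AttributeHistory, its methods, and the placeholder meanings "definition A" and "definition B" are hypothetical; the sketch only illustrates the idea of a change history and says nothing about how the space manager stores it.

```python
from bisect import bisect_right
from typing import List, Tuple


class AttributeHistory:
    """History of the meanings of one attribute, ordered by starting year."""

    def __init__(self) -> None:
        # Each entry is (year from which the meaning applies, meaning).
        self._changes: List[Tuple[int, str]] = []

    def record(self, year: int, meaning: str) -> None:
        """Register that the attribute has the given meaning from this year on."""
        self._changes.append((year, meaning))
        self._changes.sort()

    def meaning_in(self, year: int) -> str:
        """Return the meaning that applied in the given year."""
        years = [start for start, _ in self._changes]
        index = bisect_right(years, year) - 1
        if index < 0:
            raise KeyError(f"no meaning recorded for {year}")
        return self._changes[index][1]

    def change_count(self) -> int:
        """Number of times the meaning changed after its first definition."""
        return max(len(self._changes) - 1, 0)


# Hypothetical history of one attribute (placeholder meanings).
history = AttributeHistory()
history.record(2000, "definition A")
history.record(2006, "definition B")

meaning_2003 = history.meaning_in(2003)
meaning_2006 = history.meaning_in(2006)
```

A count such as change_count could serve as one input when deciding whether a database should be reorganized or converted, although the actual decision criteria are not specified in the text.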
9 Choenni, S. & Meijer, R. (2011); Meijer, R., van Dijk, J., Leertouwer, E. & Choenni, S. (2008).
10 Franklin, M., Halevy, A. & Maier, D. (2005).