sense that a common data model is not required and that there is no need to link
data based on unique identifiers. As a result, in a dataspace approach not only
microdata but also aggregate data may be used. This does not alter the fact that a
dataspace layer may contain a data warehouse as a data source.
The worked-out examples from the Dutch criminal justice chain illustrate that
data integration can be executed in a variety of ways. For instance, depending on
the needs of the users or the availability of the data, parts of this process may
have to be altered. The next section shows how potential problems associated
with linking (crime) data affect the data integration process and the choices made
in it.
10.4 Challenges in Combining Judicial Data
The main problem with data integration in the field of justice is that, although it
can be automated to a large extent, a significant amount of manual effort is still
required. The main reason for this is the nature of crime data: redundancy,
inconsistencies, dependencies, and semantic changes are not uncommon. The remainder
of this section describes these potential problems and their consequences for the
data integration process in detail.
Taking care of quantitative and qualitative dependencies
One of the problems with reconciling judicial data is that quantitative
dependencies exist between different data sources. For example, the date on which a
crime is reported is usually the same as the date on which the crime is committed,
and the output of the police is usually greater than the input at the prosecutorial
level. Although some of this knowledge can be exploited for data reconciliation (to
compare records from different sources), doing so requires manual effort and the
participation of domain experts.
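The two example dependencies can be written down as simple consistency checks.
The following is a minimal sketch under stated assumptions, not the procedure used
in the Dutch criminal justice chain: the record fields (committed_on, reported_on),
the function names, and the 30-day tolerance are hypothetical and chosen only for
illustration. Anything the checks flag would still go to a domain expert for review.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CrimeRecord:
    # Hypothetical fields; real source systems will use different names.
    case_id: str
    committed_on: date
    reported_on: date


def suspicious_report_dates(records, max_gap_days=30):
    """Return records whose report date looks implausible.

    Assumption: a report normally falls on or shortly after the offence date,
    so a negative gap or a gap above `max_gap_days` is worth a manual look.
    """
    flagged = []
    for r in records:
        gap = (r.reported_on - r.committed_on).days
        if gap < 0 or gap > max_gap_days:
            flagged.append(r)
    return flagged


def years_with_excess_inflow(police_outflow, prosecution_inflow):
    """Return the years in which prosecution inflow exceeds police outflow.

    Both arguments map a year to a case count. Years missing from either
    source are skipped because no comparison is possible.
    """
    shared_years = sorted(set(police_outflow) & set(prosecution_inflow))
    return [y for y in shared_years if prosecution_inflow[y] > police_outflow[y]]
```

Both functions only report candidates; deciding whether a flagged record or year
is a genuine error remains a judgment call for someone who knows the sources.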
Qualitative dependencies also exist within databases. For instance, it is
generally assumed that the value of a certain attribute does not change dramatically
in a few years. Therefore, it is recommended to compare the value of an attribute in
a certain year to its value in preceding years in order to detect large deviations.
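A comparison against preceding years could look roughly like the sketch below. It
is an illustration only: the function name, the three-year window, and the 25%
threshold are assumptions, not values taken from the text, and a suitable threshold
would have to be chosen per attribute.

```python
from statistics import mean


def large_deviations(series, window=3, threshold=0.25):
    """Flag years whose value deviates strongly from the preceding years.

    `series` maps a year to a numeric value. Each year is compared with the
    mean of up to `window` preceding years; it is flagged when the relative
    change exceeds `threshold`. Returns a list of (year, relative_change).
    """
    years = sorted(series)
    flagged = []
    for i, year in enumerate(years):
        previous = [series[y] for y in years[max(0, i - window):i]]
        if not previous:
            continue  # the first year has nothing to compare against
        baseline = mean(previous)
        if baseline == 0:
            continue  # relative change is undefined for a zero baseline
        change = (series[year] - baseline) / baseline
        if abs(change) > threshold:
            flagged.append((year, change))
    return flagged
```

A flagged year is not necessarily wrong: a change in legislation or registration
practice can cause a real jump, which is why the result is a list to inspect rather
than a correction.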
Thus, when data from different sources are combined, both quantitative and
qualitative dependencies have to be managed in order to avoid unreliable data. In a
data warehouse this has to be done manually by domain experts. In a dataspace
approach it can be fully automated using dynamic rules that check the reliability
of the data and detect deviations.
Managing semantic dependencies
Besides quantitative and qualitative dependencies, semantic dependencies also
exist in and between judicial databases. These arise because different organizations
in the criminal law chain store data about the same events, but often label or
classify these data differently. For example, in the case of a robbery a victim may
classify it as a violent crime, while the police may classify it as a crime against
property.
Additionally, for a single case in court that contains several offences, the severest