added to the outer join to pick up only data that is still current. This selection
filters out data carrying a marker that indicates it has been deleted or updated. This
mechanism protects the private data in case a system reuses rowids. Thus, when
using enriched queries, only currently valid private data is retrieved. The user can
still look up past private data, which may help in understanding the
evolution of the database.
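
The filtering described above can be sketched with a small in-memory SQLite example. The table names (measurements, private_notes), the status column, and its marker values are assumptions made for illustration; they are not taken from the system discussed here. The sketch places the currency test in the ON clause of the outer join, so base rows that have no current private data still appear in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE measurements (id INTEGER PRIMARY KEY, value REAL);
    -- Hypothetical no-overwrite private table: old versions are kept and
    -- marked 'updated' or 'deleted' instead of being removed.
    CREATE TABLE private_notes (
        base_id INTEGER, owner TEXT, note TEXT, status TEXT
    );
    INSERT INTO measurements VALUES (1, 9.5), (2, 12.0), (3, 7.25);
    INSERT INTO private_notes VALUES
        (1, 'ann', 'units look wrong', 'updated'),
        (1, 'ann', 'units are mg/L',   'current'),
        (2, 'ann', 'outlier',          'deleted');
""")

def enriched_query(conn, owner):
    """Join base data with the owner's private notes, current versions only."""
    return conn.execute(
        """
        SELECT m.id, m.value, n.note
        FROM measurements AS m
        LEFT OUTER JOIN private_notes AS n
          ON n.base_id = m.id
         AND n.owner = ?
         AND n.status = 'current'   -- the added selection on the marker
        ORDER BY m.id
        """,
        (owner,),
    ).fetchall()

def note_history(conn, owner, base_id):
    """Past and present versions of a note, in the order they were written."""
    return conn.execute(
        "SELECT note, status FROM private_notes "
        "WHERE owner = ? AND base_id = ? ORDER BY rowid",
        (owner, base_id),
    ).fetchall()

rows = enriched_query(conn, "ann")
history = note_history(conn, "ann", 1)
```

Here rows contains one line per measurement, with a note only where a current one exists, while history still exposes the superseded version of the note on measurement 1.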
If users are allowed to share personal data, maintaining the central
repository created for that purpose presents challenges of its own. Since
the data in the central repository is only a copy of the private user data, whenever
such data is updated on the user's side, it should also be updated in the central
repository. Such a central repository could be considered a materialized view,
and hence techniques for maintaining such views (like incremental computation)
could be used here [9]. In particular, note that it may be useful to limit the sharing
to current data in the database, if only to ensure that all users can see and
understand the referents. Thus, the system should add to the central table only
currently valid data.
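
As a rough illustration of incremental maintenance, the sketch below propagates each private change to a central copy as a single-key delta instead of recomputing the copy from scratch. The class PrivateStore, its methods, and the marker values are hypothetical names chosen for this example, and the central repository is reduced to a plain dictionary.

```python
class PrivateStore:
    """No-overwrite store of one user's notes (hypothetical structure)."""

    def __init__(self, owner, central):
        self.owner = owner
        self.rows = []          # every version ever written: [base_id, text, status]
        self.central = central  # shared copy: (owner, base_id) -> current text

    def _retire(self, base_id, marker):
        # Mark the current version (if any) instead of removing it.
        for row in self.rows:
            if row[0] == base_id and row[2] == "current":
                row[2] = marker

    def write(self, base_id, text):
        """Insert or update a note and push the delta to the central copy."""
        self._retire(base_id, "updated")
        self.rows.append([base_id, text, "current"])
        self.central[(self.owner, base_id)] = text

    def delete(self, base_id):
        """Mark a note as deleted and withdraw it from the central copy."""
        self._retire(base_id, "deleted")
        self.central.pop((self.owner, base_id), None)

    def current_view(self):
        """Full recomputation of what the central copy should hold."""
        return {(self.owner, b): t for b, t, s in self.rows if s == "current"}


central = {}
ann = PrivateStore("ann", central)
ann.write(1, "units look wrong")
ann.write(1, "units are mg/L")   # supersedes the first version
ann.write(2, "outlier")
ann.delete(2)                    # withdrawn from sharing, kept privately
```

After these operations the central copy holds only the current note on item 1, which matches ann.current_view(), while the private store still keeps all three versions.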
One intriguing application of this idea is that it can be applied to the metadata
(schema) of the database. A no-overwrite mechanism could keep information
about the old schema around, while presenting the new schema to applications
and users. This could help users understand database evolution. In particular, if
users comment on changes in formatting, units, etc., they could provide very
valuable metadata which is usually absent from the database (see next section for
more on this).
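
One way to picture a no-overwrite schema is sketched below. It is only an illustration under assumed names (ColumnVersion, SchemaHistory, and the example column and units are all invented here): each redefinition of a column is appended as a new version, applications see only the current definitions, and user comments stay attached to the version they describe.

```python
from dataclasses import dataclass, field


@dataclass
class ColumnVersion:
    name: str
    unit: str
    version: int
    current: bool = True
    comments: list = field(default_factory=list)


class SchemaHistory:
    """Append-only record of column definitions (hypothetical structure)."""

    def __init__(self):
        self.versions = []

    def define(self, name, unit, comment=None):
        """Record a new definition of a column without erasing older ones."""
        previous = [v for v in self.versions if v.name == name]
        for v in previous:
            v.current = False
        new = ColumnVersion(name, unit, version=len(previous) + 1)
        if comment:
            new.comments.append(comment)
        self.versions.append(new)
        return new

    def current_schema(self):
        """What applications and users are shown."""
        return {v.name: v.unit for v in self.versions if v.current}

    def evolution(self, name):
        """All versions of one column, oldest first, with user comments."""
        return [(v.version, v.unit, v.comments)
                for v in self.versions if v.name == name]


schema = SchemaHistory()
schema.define("concentration", "ppm")
schema.define("concentration", "mg/L", comment="units changed after lab move")
```

Here schema.current_schema() exposes only the mg/L definition, while schema.evolution("concentration") returns both versions together with the comment explaining the change.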
7.6 Discussion
There are several potential objections to allowing user content to be part of the
database:
• Data quality: if users can generate content without any supervision (there is no
editing or curation process), the quality of the result may be quite uneven and
questionable. Database contents (data) are valuable because they are curated;
they are obtained through a predefined process and have some guarantees of
quality.
• Data ownership: data in the database is usually owned by the organization that
created the database (funded it), with control delegated to the DBA and/or a small
group of people in charge. If users create content and add it to the database, who
owns it? The user or the organization?
• Data control: without some central control, databases could grow in an
uncontrolled manner (not only the data itself, but also the schema). While dealing with
large volumes of data is something that databases do well, increasing the
complexity of the schema may have negative effects in many areas (including,
ironically, user access, which can become more complicated).