material need to be found and their permissions sought in order to
publish the newly derived dataset, and this is a costly and onerous
process. If one of the original data or content owners refuses permission
(and they may well), the entire mash-up cannot be published on the
internet.

The use of software robots that crawl the internet to collect scientific
information, which is then aggregated and processed to arrive at new
insights, or even to create new knowledge, presents another problem. A
good example is the work of Peter Murray-Rust, who 'spent six months
going through the [chemistry] literature and came home with several
hundred datapoints. Each datapoint was the product of a visit to the
library to find a single piece of information in a journal.' Murray-Rust
realized that a great many discoveries rely on information in the
literature, but that using it requires an (exceptional) human being to digest a
huge mass of seemingly unrelated data. After creating software robots
to undertake this task, Murray-Rust was able to do in minutes what had
taken months in the library. For this method of data mining to work,
publishers must allow the robots onto their servers to crawl the literature.
Some publishers are reluctant to do so, arguing that they own everything
in an article they publish: the text and the embedded data. Arguably the
'facts' extracted are not subject to copyright, but nonetheless a lack of
clarity in copyright law can inhibit the publication and sharing of this
new 'aggregated' knowledge. This problem applies not only to chemistry
but also to many sorts of intellectual property, such as map data, climate
data, traffic data and historic texts (Pounder, 2008).

Publishers of digital material who are concerned with protecting
their intellectual property have two methods at their disposal. The first
of these is the licence agreement; this is a contract between the library
and the publisher setting out how the content in an electronic resource
can be used and what the restrictions are. Usually such licences permit
fair use, and in some cases grant the user even more than is required in
copyright law. In other instances, publishers' licences can seek to restrict
use beyond what is considered fair in the print world. The issue of
contracts undermining copyright law is an area of tension, and one
where the academic sector and the publishing industry (and probably
legislators) need to find consensus.

The second method is the use of digital rights management
(DRM): technological tools used to regulate access to and use of digital