histories from different viewpoints, for example, the history of women in Japan or
the history of education in high schools in the last century. Such
domain-oriented historical summaries have so far been produced only for selected
topics of particular interest, as their manual creation requires much effort
and time.
In addition, historical archives can be used not only for writing historical
summaries but also for evaluating the credibility of existing historical
knowledge. According to the meta-history view, history is not credible and requires
a constant process of revision 11. We believe that easy access to archives and the
development of text mining and reasoning technologies will make the
automatic verification of history possible in the future.
3.1 Primary and Secondary Sources
According to historiography, historical evidence can be divided into three
classes: primary, secondary and tertiary 12. Suppose an event e occurred in the past at
time t. Documents about e that were created around t are regarded as primary
sources on e, while documents relevant to e but produced some time after t are
considered secondary sources. Secondary sources concerning historical events
are often created on the basis of primary sources.
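
The distinction above can be sketched as a simple comparison of a document's creation date with the event time t. This is only an illustrative sketch: the names Document and classify_source, and the one-year tolerance used to interpret "around t", are assumptions made here and are not specified in the text.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Document:
    """Hypothetical record for a document relevant to some event e."""
    title: str
    created: date  # creation date of the document


def classify_source(doc: Document, event_time: date,
                    window: timedelta = timedelta(days=365)) -> str:
    """Label a document about an event as 'primary' or 'secondary'.

    `window` is an assumed tolerance for "created around t";
    the text itself does not fix a value.
    """
    if abs(doc.created - event_time) <= window:
        # Created around the event time t: primary source.
        return "primary"
    if doc.created > event_time + window:
        # Produced some time after t: secondary source
        # (tertiary sources are treated as secondary, as in footnote 12).
        return "secondary"
    # Created well before t: the definition does not cover this case.
    raise ValueError("document predates the event; classification undefined")


if __name__ == "__main__":
    t = date(1995, 1, 17)  # arbitrary example event date
    docs = [
        Document("news report", date(1995, 1, 18)),
        Document("retrospective article", date(2005, 1, 17)),
    ]
    for d in docs:
        print(d.title, "->", classify_source(d, t))
```

In practice the width of the window would depend on the kind of event and on the granularity of the available timestamps.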
The authors of secondary sources usually have a more distant view of the events and
have access to more varied and complete information about them than
the authors of primary sources. This is because certain implications of
an event, as well as its context, can be noticed and understood only some time after the
event. On the other hand, there is a risk of missing important details of the event
because of the passage of time, or even of distorting the view of the event, especially if other
secondary sources have been used in creating the document. Historians
generally believe that the closer a source is to the event, the more reliable it is.
In general, the web is not a self-preserving medium but a self-updating one. Due to
constant pressure for new, up-to-date content, the stale parts of the web
become neglected, less densely linked and, in consequence, less frequently visited,
which forces their authors to keep the content up to date. This seems to correspond to
a characteristic of our society, which places great value on freshness
and novelty, while the old fades and is cherished in only a few cases (e.g., wines,
antiques, historical buildings).
Web archives and news archives are examples of collections of primary sources
regarding the time period when they were created. On the other hand, the current web
can be viewed as a mixture of primary sources regarding the current time and
secondary sources on the past, especially the distant past 13. Figure 3 shows a
conceptual view of the primary-secondary distinction between the web and web/news archives.
Historians usually face the issue of incomplete data when dealing with primary
sources. Similarly, in the case of digital document archives, there is often a lack of
complete information on document evolution over time or some documents are
11 http://en.wikipedia.org/wiki/Historical_revisionism
12 In this paper, we treat tertiary sources simply as secondary sources.
13 Naturally, online web and news archives or other online primary sources physically also
belong to the web; however, for the sake of clarity we treat them separately here.