This approach, described in [PAL 10a, PAL 11], therefore uses a specific existing
index (first-level index) and generates a new standardized index (Figure 1.5,
section 1.4.2, Chapter 1). Thus, we apply a one-dimensional tiling to a set of
one-dimensional representations, a two-dimensional tiling to a set of
two-dimensional representations, and so forth, up to n dimensions. Note that the
dimensions in question are those of the spatial and temporal domains, that is, the
number of dimensions (1, 2 or more) needed to represent the information. For example (Figure 3.3), the
specific temporal index references calendar representations viewable on the time line
(one dimension). The corresponding standardized index contains tiles (months)
materialized on the same time line. Similarly, the specific spatial index references
geometric representations viewable on a map (two dimensions). The corresponding
standardized index contains tiles (communes) materialized on the same map.
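
As an illustration of the temporal case, the sketch below maps entries of a specific temporal index onto month tiles. It is a minimal sketch under stated assumptions, not the implementation of [PAL 10a, PAL 11]: the function `month_tiles`, the entry identifiers and the dates are hypothetical, and each calendar representation is assumed to be a closed date interval.

```python
from datetime import date


def month_tiles(start: date, end: date) -> list[str]:
    """Return the month tiles ('YYYY-MM') overlapped by the closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not precede start")
    tiles = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        tiles.append(f"{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return tiles


# Hypothetical first-level (specific) temporal index: one interval per entry.
specific_index = {
    "e1": (date(1850, 3, 20), date(1850, 5, 2)),
    "e2": (date(1850, 4, 1), date(1850, 4, 30)),
}

# Standardized index: each month tile lists the entries that overlap it.
standardized_index: dict[str, list[str]] = {}
for entry_id, (start, end) in specific_index.items():
    for tile in month_tiles(start, end):
        standardized_index.setdefault(tile, []).append(entry_id)
```

The spatial case would follow the same pattern, with a geometric intersection test between each representation and each commune polygon in place of the month comparison.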
3.4.1.3. Standardization by tiling: from the choice of tiling to the weighting of the tiles
Indexing comprises two steps: establishing the tiling and weighting the tiles.
The first step generates the tilings at their different ranges. The second assigns
a weight to each tile according to its frequency in a document. We have identified
two types of tiling, “regular” and “explicit”, both described and illustrated with
examples in [PAL 09], [PAL 10d] and [PAL 10a].
- Type of tiling: Regular tiling splits the zone/period covered by the corpus into
tiles of similar size. This approach is comparable to truncation. The size of the
tiles and their borders are adjustable. This type of tiling is presented in
[PHA 07] for visterms.
Explicit tiling reuses an already defined and meaningful tiling. This approach is
similar to lemmatization. It relies on a split that we call “significant” because
it is built on human criteria (common sense).
Consequently, we associate two tilings with temporal information: regular tiling
and calendar tiling. Calendar tiling is an explicit tiling that uses the standard
divisions of time: days, weeks, months, seasons, years, centuries, etc. These
divisions allow several indexes to be defined, one for each level of calendar
precision. Likewise, we associate two tilings with spatial information: regular
tiling and administrative tiling. Administrative tiling is an explicit tiling that
uses the standard divisions of space: districts, cities, townships, counties,
regions, countries, etc. These divisions allow several indexes to be defined, one
for each level of spatial precision.
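
The difference between the two types of tiling can be sketched for temporal information as follows. This is an illustrative sketch only: the function names, the tile size and the origin date are assumptions, as is the convention that years 1801 to 1900 form the 19th century.

```python
from datetime import date


def regular_tile(d: date, origin: date, size_days: int) -> int:
    """Regular tiling: index of the fixed-size tile containing d (tile 0 starts at origin)."""
    return (d - origin).days // size_days


def calendar_tile(d: date, level: str) -> str:
    """Explicit (calendar) tiling: label of the tile containing d at the given level."""
    if level == "day":
        return d.isoformat()
    if level == "month":
        return f"{d.year:04d}-{d.month:02d}"
    if level == "year":
        return f"{d.year:04d}"
    if level == "century":
        # Assumed convention: years 1801-1900 belong to century 19.
        return f"C{(d.year - 1) // 100 + 1}"
    raise ValueError(f"unknown level: {level}")


d = date(1850, 4, 15)
regular = regular_tile(d, origin=date(1850, 1, 1), size_days=30)
by_level = {level: calendar_tile(d, level) for level in ("day", "month", "year", "century")}
```

With a regular tiling, the tile size and the origin are free parameters, so tile borders need not coincide with month or year boundaries. With a calendar tiling, each level yields its own index, and the borders are fixed by convention.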
- Weighting of tiles: Once the type of tiling has been chosen, we can weight the
tiles using approaches based on raw frequencies. To calculate the frequency of a
tile, we propose two discrete approaches (see the example illustrated in Table 3.1).
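
As a generic illustration of a raw-frequency weighting, the sketch below counts, for one document, how many representations touch each tile. It does not reproduce either of the two approaches of Table 3.1, which are not detailed here; the function name, the input layout and the rule that a representation counts once per tile are all assumptions.

```python
from collections import Counter


def raw_tile_frequencies(doc_tiles: list[list[str]]) -> Counter:
    """Count how many representations of one document touch each tile.

    doc_tiles holds one list of tile labels per representation found in the document.
    """
    counts: Counter = Counter()
    for tiles in doc_tiles:
        # Assumption: a representation contributes at most once to a given tile.
        counts.update(set(tiles))
    return counts


# Hypothetical document with three temporal representations mapped to month tiles.
document = [
    ["1850-03", "1850-04"],
    ["1850-04"],
    ["1850-04", "1850-04"],
]
weights = raw_tile_frequencies(document)
```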