The Structure and Dynamics of Scientific Knowledge - Mapping Scientific Frontiers: The Quest for Knowledge Visualization


network techniques such as self-organized maps to depict patterns and trends derived from text. See (Lin 1997; Noyons and van Raan 1998) for example.

The pioneering software for concept mapping is Leximappe, developed in the 1980s. It organizes a network of concepts based on associations determined by the co-word method. In the 1980s, Leximappe turned co-word analysis into an instrumental tool for social scientists, who used it to carry out numerous studies originating from the famous actor-network theory (ANT).

Key concepts in Leximappe include poles and their positions in concept maps. The position of a pole is determined by its centrality and density. Centrality indicates the pole's capacity for structuring; density reflects the internal coherence of the pole.

Leximappe is used to create structured graphic representations of concept networks. In such networks, vertices represent concepts; the strength of the connection between two vertices reflects the strength of their co-occurrence. In the early days, an important step was to tag each word in the text as a noun, a verb, or an adjective. Algorithms used in information visualization systems such as ThemeScape (Wise et al. 1995) have demonstrated some promising capabilities of filtering out nouns from the source text.
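
The network structure described above, with concepts as vertices and co-occurrence strength as edge weight, can be sketched in a few lines of Python. This is only an illustration, not Leximappe's actual procedure: the function name `cooccurrence_edges`, the toy keyword lists, and the choice of a raw co-occurrence count as the edge weight are all assumptions made here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs):
    """Count, for each pair of terms, the documents in which both appear.

    `docs` is an iterable of keyword lists, one list per document.
    The result maps a sorted pair (a, b) to its co-occurrence count,
    which serves here as the (assumed) weight of the edge a -- b.
    """
    edges = Counter()
    for terms in docs:
        # A term is counted once per document, however often it repeats.
        for a, b in combinations(sorted(set(terms)), 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical index terms for three documents (illustrative data only).
docs = [
    ["actor", "network", "translation"],
    ["network", "translation"],
    ["actor", "network"],
]
edges = cooccurrence_edges(docs)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight}")
```

Sorting each pair gives every undirected edge a single canonical key, so (a, b) and (b, a) are never counted separately.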

5.2.2 Inclusion Index and Inclusion Maps

Inclusion maps and proximity maps are two types of concept maps resulting from co-word analysis. Co-word analysis measures the degree of inclusion and proximity between keywords in scientific documents and automatically draws maps of scientific areas as inclusion maps and proximity maps, respectively.

Metrics for co-word analysis have been extensively studied. Given a corpus of $N$ documents, each document is indexed by a set of unique terms, and a term can occur in multiple documents. If two terms, $t_i$ and $t_j$, appear together in a single document, it counts as a co-occurrence. Let $c_k$ be the number of occurrences of term $t_k$ in the corpus, that is, the number of documents indexed by $t_k$, and let $c_{ij}$ be the number of co-occurrences of terms $t_i$ and $t_j$, which is the number of documents indexed by both terms. The inclusion index $I_{ij}$ is essentially a conditional probability. Given the occurrence of one term, it measures the likelihood of finding the other term in documents of the corpus:

$$I_{ij} = \frac{c_{ij}}{\min(c_i, c_j)}$$
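
One way to spell out the conditional-probability reading is to treat document frequencies as probability estimates. The estimates $\hat{P}$ are notation introduced here for illustration and do not appear in the original text:

$$\frac{c_{ij}}{c_i} \approx \hat{P}(t_j \mid t_i), \qquad \frac{c_{ij}}{c_j} \approx \hat{P}(t_i \mid t_j), \qquad \frac{c_{ij}}{\min(c_i, c_j)} = \max\bigl(\hat{P}(t_j \mid t_i),\ \hat{P}(t_i \mid t_j)\bigr)$$

Under this reading, the index is the larger of the two conditional estimates: the chance of finding the more frequent term in a document already indexed by the rarer one. The index is symmetric in $i$ and $j$.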

For example, Robert Louis Stevenson's Treasure Island has a total of 34 chapters. Among them, the word map occurs in 5 chapters, $c_{\text{map}} = 5$, and the word treasure occurs in 20 chapters, $c_{\text{treasure}} = 20$. The two terms co-occur in 4 chapters, thus $c_{\text{map},\text{treasure}} = 4$ and $I_{\text{map},\text{treasure}} = 4/\min(5, 20) = 4/5 = 0.8$. In this way, we can construct an inclusion matrix of terms based on their co-occurrence. This matrix defines a network. An interesting step described in the original version of co-word analysis is to remove certain types of links from this network.
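
The construction of an inclusion matrix can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `inclusion_matrix` is hypothetical, and the `chapters` list is a synthetic stand-in that only reproduces the counts quoted above (34 chapters, map in 5, treasure in 20, both in 4); it is not derived from the actual text of the novel.

```python
from itertools import combinations


def inclusion_matrix(docs):
    """Return {(t_i, t_j): I_ij} for every pair of terms that co-occurs.

    I_ij = c_ij / min(c_i, c_j), where c_k is the number of documents
    indexed by term t_k and c_ij the number indexed by both terms.
    """
    term_counts = {}  # c_k
    pair_counts = {}  # c_ij, keyed by alphabetically sorted pair
    for terms in docs:
        terms = set(terms)  # each term counts once per document
        for t in terms:
            term_counts[t] = term_counts.get(t, 0) + 1
        for a, b in combinations(sorted(terms), 2):
            pair_counts[(a, b)] = pair_counts.get((a, b), 0) + 1
    return {
        (a, b): c_ab / min(term_counts[a], term_counts[b])
        for (a, b), c_ab in pair_counts.items()
    }


# Synthetic chapters matching only the counts in the example (assumption).
chapters = (
    [{"map", "treasure"}] * 4    # both words present
    + [{"map"}] * 1              # map only      -> c_map = 5
    + [{"treasure"}] * 16        # treasure only -> c_treasure = 20
    + [set()] * 13               # neither word  -> 34 chapters in total
)
index = inclusion_matrix(chapters)
print(index[("map", "treasure")])  # 0.8
```

Only pairs that co-occur at least once get an entry; every other cell of the matrix is implicitly zero, which keeps the resulting network sparse.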
