Autism and Vaccine
The most cited retracted article in the Web of Science is the 1998 Lancet article
by Wakefield et al. A citation burst of 0.05 was detected for this article. The
article was partially retracted in 2004 and fully retracted in 2010.
The Lancet's retraction notice in February 2010 noted that several elements of the
1998 paper are incorrect, contrary to the findings of an earlier investigation, and that
the paper falsely claimed “approval” by the local ethics committee.
To find out exactly what was said when researchers cited the controversial
article, we studied citation sentences, that is, the sentences that contain references
to the Wakefield paper. A set of full text articles was obtained from Elsevier's
Content Syndication (ConSyn), which contains 3,359 titles of scholarly journals
and 6,643 non-serial titles. Since the Wakefield paper is concerned with a claimed
causal relation between a combined MMR vaccine and autism, we searched for full
text journal articles on autism and vaccine in ConSyn and found 1,250 relevant
articles. The Wakefield paper was cited by 156 of these 1,250 full text articles
from the ConSyn collection. A total of 706 citation sentences were found in the 156
citing articles. We used the Lingo clustering method provided by Carrot2, an open
source framework for building search clustering engines, to cluster these citation
sentences into 69 clusters.
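
The extraction step described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study, which the text does not spell out. It assumes that the citing article names the paper in author-year form (for example, "Wakefield et al. (1998)") and that sentences can be split with a naive punctuation rule. The names CITATION_PATTERN, SENTENCE_BOUNDARY, and citation_sentences are hypothetical.

```python
import re

# Assumption: the citing text uses an author-year mention of the 1998 paper.
# Numeric or footnote reference styles would need a different pattern.
CITATION_PATTERN = re.compile(r"Wakefield\s+et\s+al\.?,?\s*\(?1998\)?", re.IGNORECASE)

# Naive sentence boundary: ., ! or ? followed by whitespace and an uppercase letter.
# Because "et al. (1998)" is followed by "(", it is not split at "al.".
SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")


def citation_sentences(text):
    """Return the sentences of `text` that mention the cited paper."""
    sentences = SENTENCE_BOUNDARY.split(text)
    return [s.strip() for s in sentences if CITATION_PATTERN.search(s)]


if __name__ == "__main__":
    sample = (
        "Autism prevalence has risen. Wakefield et al. (1998) suggested a link "
        "between the MMR vaccine and autism. Later studies found no such association."
    )
    for sentence in citation_sentences(sample):
        print(sentence)
```

A production pipeline would more likely resolve in-text reference markers against each article's reference list, since many journals cite by number instead of by author and year.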
Figure 8.15 is a visualization of the 69 clusters formed by 706 sentences that
cited the 1998 Lancet paper. The visualization is called Foam Tree in Carrot2. See
Chap. 9 for more details on Carrot2. Clusters with the largest areas represent the
most prominent clusters of phrases used when researchers cited the 1998 paper. For
example, inflammatory bowel disease, mumps and rubella, and association between
MMR vaccine and autism are the central topics of the citations. These topics indeed
characterize the role of the retracted Lancet paper, although in this study we did
not differentiate positive and negative citations. Identifying the orientation of an
instance of citation from a citation context, for example, the citing sentence and
its surrounding sentences, is a very challenging task even for an intelligent reader
because the position of the argument becomes clear only when a broader context is
taken into account, for example, after reading the entire paragraph in many cases.
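
To make the idea of phrase-labelled clusters ranked by size more concrete, the following Python sketch groups sentences under shared two-word phrases and orders the groups from largest to smallest. This is a deliberately simple stand-in, not the Lingo algorithm and not the Carrot2 API. The stopword list, the restriction to bigrams, and the names STOPWORDS, tokens, and phrase_clusters are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumption: a tiny hand-picked stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "between", "to", "in",
             "with", "that", "was", "is", "by", "for", "no"}


def tokens(sentence):
    """Lowercase the sentence and keep alphabetic word tokens only."""
    return re.findall(r"[a-z]+", sentence.lower())


def phrase_clusters(sentences, min_size=2):
    """Group sentence indices under each stopword-free bigram they share.

    A sentence may fall into several clusters. Clusters are returned
    largest first, with ties broken alphabetically by label.
    """
    members = defaultdict(set)
    for idx, sentence in enumerate(sentences):
        words = tokens(sentence)
        for first, second in zip(words, words[1:]):
            if first not in STOPWORDS and second not in STOPWORDS:
                members[f"{first} {second}"].add(idx)
    clusters = {label: sorted(ids) for label, ids in members.items()
                if len(ids) >= min_size}
    return dict(sorted(clusters.items(), key=lambda item: (-len(item[1]), item[0])))


if __name__ == "__main__":
    example = [
        "The study linked inflammatory bowel disease to the vaccine.",
        "Inflammatory bowel disease was reported in children with autism.",
        "The MMR vaccine protects against measles, mumps and rubella.",
    ]
    for label, ids in phrase_clusters(example).items():
        print(label, ids)
```

In a foam-tree style display, the number of sentences in each group would determine the area of the corresponding cell. As the text notes, such grouping says nothing about whether a citation is supportive or critical.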
In addition to aggregating citation sentences into clusters at a higher level of
abstraction, we further developed a timeline visualization that can be used to depict
year-by-year flows of topics to facilitate analytics to discern changes associated with
