Relating Subjective and Objective Pharmacovigilance Association Measures - Clustering Challenges in Biological Network


[2]. In particular, since $N_{ab} \le \min\{N_a, N_b\}$, it follows that $R_{ab}$ is bounded above by:

$$R_{ab} \le \frac{N \cdot \min\{N_a, N_b\}}{\min\{N_a, N_b\} \cdot \max\{N_a, N_b\}} = \frac{N}{\max\{N_a, N_b\}} \tag{15.4}$$

Further, note that this upper bound is achieved whenever $N_{ab}$ is equal to its maximum possible value, $\min\{N_a, N_b\}$. Thus, in cases where both $N_a$ and $N_b$ are small, $R_{ab}$ can become quite large. This situation commonly arises in practice when an unusual drug name (e.g., a misspelling) occurs only once in the database (implying $N_{ab} = N_a = 1$), in a record that also lists an unusual outcome (implying $N_b \ll N$). As a specific example, one "drug" listed in the portion of the AERS database considered here is "unspecified weed killer," which appears only once, in a record that lists the rare adverse event "murder." Since this adverse event appears only $N_b = 147$ times in $N = 462{,}936$ records, this combination has the huge reporting ratio value $R_{ab} = N/\max\{N_a, N_b\} = N/N_b \approx 3149$.
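The arithmetic of this example can be checked with a minimal sketch. It assumes the reporting ratio has the form $R_{ab} = N \cdot N_{ab} / (N_a N_b)$, which is consistent with the bound (15.4) but is not restated in this excerpt. The function names are illustrative only.

```python
def reporting_ratio(n_ab, n_a, n_b, n_total):
    """Reporting ratio, ASSUMED here to be N * N_ab / (N_a * N_b)."""
    return (n_ab * n_total) / (n_a * n_b)


def reporting_ratio_bound(n_a, n_b, n_total):
    """Upper bound from Eq. (15.4): N / max{N_a, N_b}."""
    return n_total / max(n_a, n_b)


# "Unspecified weed killer" (N_a = N_ab = 1) with "murder" (N_b = 147)
# in N = 462,936 records, as quoted in the text.
r_ab = reporting_ratio(n_ab=1, n_a=1, n_b=147, n_total=462_936)
bound = reporting_ratio_bound(n_a=1, n_b=147, n_total=462_936)
print(round(r_ab), round(bound))  # both round to 3149
```

Because $N_{ab}$ equals $\min\{N_a, N_b\}$ in this record, the ratio coincides with its upper bound.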

Several approaches have been proposed to overcome this difficulty. One is the Bayesian shrinkage estimator of DuMouchel mentioned earlier [2]. Another is the use of the proportional reporting ratio $P_{ab}$ with an associated $\chi^2$ significance measure and minimum $N_{ab}$ limits to down-weight small samples [4, 9]. A different approach is taken here [10], based on the reporting ratio $R_{ab}$ and the statistical unexpectedness $u_{ab}$, defined as follows. Model the adverse event dataset as an urn of $N$ balls, with the $N_a$ balls corresponding to records that list Drug A colored black and the others colored white. In the absence of any association between Drug A and Adverse Event B, the records listing Adverse Event B may be viewed as a random sample of $N_b$ balls drawn from the urn, of which $N_{ab}$ are black. It is a standard result that this number should follow the hypergeometric probability distribution [11, Ch. 6]. If $R_{ab} > 1$, then $N_{ab}$ is larger than the value expected under this independence assumption, and the significance of this difference can be assessed by computing the probability of observing a value as large as or larger than $N_{ab}$ from the hypergeometric distribution defined by $N$, $N_a$ and $N_b$. Conversely, if $R_{ab} < 1$, then $N_{ab}$ is smaller than expected and the significance of this difference can be assessed by computing the probability of observing a value as small as or smaller than $N_{ab}$. The corresponding probability $\rho_{ab}$ is easily computed via standard routines for the cumulative hypergeometric distribution, and the statistical unexpectedness is defined as the reciprocal of this probability, $u_{ab} = 1/\rho_{ab}$.
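The urn construction above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the routine used by the author: it sums exact hypergeometric probabilities with the standard library instead of calling a cumulative-distribution routine, it assumes $R_{ab} = N \cdot N_{ab} / (N_a N_b)$, and it treats the case $R_{ab} = 1$ (which the text does not address) as an upper-tail case. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, n_total, n_a, n_b):
    """P(exactly k black balls among n_b draws from an urn of n_total
    balls, n_a of which are black), as an exact fraction."""
    return Fraction(comb(n_a, k) * comb(n_total - n_a, n_b - k),
                    comb(n_total, n_b))


def unexpectedness(n_ab, n_a, n_b, n_total):
    """Return (rho_ab, u_ab) with u_ab = 1 / rho_ab."""
    k_min = max(0, n_a + n_b - n_total)   # smallest feasible count
    k_max = min(n_a, n_b)                 # largest feasible count
    ratio = Fraction(n_ab * n_total, n_a * n_b)  # ASSUMED form of R_ab
    if ratio >= 1:
        # Upper tail: probability of a count as large as or larger than N_ab.
        # Treating ratio == 1 this way is an assumption of this sketch.
        ks = range(n_ab, k_max + 1)
    else:
        # Lower tail: probability of a count as small as or smaller than N_ab.
        ks = range(k_min, n_ab + 1)
    rho = sum(hypergeom_pmf(k, n_total, n_a, n_b) for k in ks)
    return float(rho), float(1 / rho)


# Weed killer / murder example from the text.
rho_ab, u_ab = unexpectedness(n_ab=1, n_a=1, n_b=147, n_total=462_936)
print(rho_ab, u_ab)
```

For this particular record $N_a = 1$, so the upper-tail probability reduces to $N_b/N$ and $u_{ab}$ takes the same value as $R_{ab}$. Exact summation is convenient for small counts; for large $N_a$ and $N_b$ a library routine for the cumulative hypergeometric distribution would be the more practical choice.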

The reason for using $u_{ab}$ instead of $\rho_{ab}$ is that large $u_{ab}$ values merit our attention, which is advantageous in the graphical display used here. Also, since the $u_{ab}$ values span a wide numerical range, it is convenient to work instead with
