[2]. In particular, since $N_{ab} \le \min\{N_a, N_b\}$, it follows that $R_{ab}$ is
bounded above by:

$$
R_{ab} \le \frac{N \cdot \min\{N_a, N_b\}}{\min\{N_a, N_b\}\,\max\{N_a, N_b\}}
       = \frac{N}{\max\{N_a, N_b\}}. \tag{15.4}
$$

Further, note that this upper bound is achieved whenever $N_{ab}$ is equal to its
maximum possible value, $\min\{N_a, N_b\}$. Thus, in cases where both $N_a$ and $N_b$
are small, $R_{ab}$ can become quite large. This situation commonly arises in practice
when an unusual drug name (e.g., a misspelling) occurs only once in the database
(implying $N_{ab} = N_a = 1$), in a record that also lists an unusual outcome
(implying $N_b \ll N$). As a specific example, one “drug” listed in the portion of the
AERS database considered here is “unspecified weed killer,” which appears only
once, in a record that lists the rare adverse event “murder.” Since this adverse
event appears only $N_b = 147$ times in $N = 462{,}936$ records, this combination
has the huge reporting ratio value $R_{ab} = N/N_b \approx 3149$.
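
The arithmetic in this example can be checked with a short sketch. It assumes the reporting ratio is defined as R_ab = N * N_ab / (N_a * N_b), which is the form consistent with the bound in Eq. (15.4); the function and variable names are illustrative and do not come from the text.

```python
def reporting_ratio(n_ab: int, n_a: int, n_b: int, n: int) -> float:
    """Reporting ratio, assuming the form R_ab = N * N_ab / (N_a * N_b)."""
    return n * n_ab / (n_a * n_b)


def reporting_ratio_bound(n_a: int, n_b: int, n: int) -> float:
    """Upper bound on R_ab from Eq. (15.4): N / max(N_a, N_b)."""
    return n / max(n_a, n_b)


# Counts quoted in the text for the "unspecified weed killer" / "murder" pair.
N = 462_936       # total number of records
N_A = 1           # records listing the drug
N_B = 147         # records listing the adverse event
N_AB = 1          # records listing both

r_ab = reporting_ratio(N_AB, N_A, N_B, N)
bound = reporting_ratio_bound(N_A, N_B, N)

print(f"R_ab  = {r_ab:.1f}")    # R_ab  = 3149.2
print(f"bound = {bound:.1f}")   # bound = 3149.2
```

Because N_ab equals min{N_a, N_b} here, the ratio sits exactly on its upper bound, which is why a single record can produce such a large value.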
Several approaches have been proposed to overcome this difficulty. One is
the Bayesian shrinkage estimator of DuMouchel mentioned earlier [2]. Another
is the use of the proportional reporting ratio $P_{ab}$ with an associated $\chi^2$
significance measure and minimum $N_{ab}$ limits to down-weight small samples [4, 9]. A
different approach is taken here [10], based on the reporting ratio $R_{ab}$ and the
statistical unexpectedness $u_{ab}$, defined as follows. Model the adverse event dataset
as an urn of $N$ balls, with the $N_a$ balls corresponding to records that list Drug
A colored black and the others colored white. In the absence of any association
between Drug A and Adverse Event B, the records listing Adverse Event B may
be viewed as a random sample of $N_b$ balls drawn without replacement from the urn,
of which $N_{ab}$ are black. It is a standard result that this number should follow the
hypergeometric probability distribution [11, Ch. 6]. If $R_{ab} > 1$, then $N_{ab}$ is
larger than the value expected under this independence assumption, and the
significance of this difference can be assessed by computing the probability of
observing a value as large as or larger than $N_{ab}$ from the hypergeometric
distribution defined by $N$, $N_a$ and $N_b$. Conversely, if $R_{ab} < 1$, then $N_{ab}$
is smaller than expected and the significance of this difference can be assessed by
computing the probability of observing a value as small as or smaller than $N_{ab}$.
The corresponding probability $\rho_{ab}$ is easily computed via standard routines for
the cumulative hypergeometric distribution, and the statistical unexpectedness is
defined as the reciprocal of this probability, $u_{ab} = 1/\rho_{ab}$.
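
The urn calculation above can be sketched as follows. This is a minimal illustration using exact rational arithmetic from the Python standard library, not the routine used by the authors. The function names are hypothetical, the reporting ratio is assumed to be R_ab = N * N_ab / (N_a * N_b), and using the upper tail when R_ab is exactly 1 is an assumption, since the text does not cover that case.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k: int, n: int, n_a: int, n_b: int) -> Fraction:
    """P(exactly k black balls) when n_b balls are drawn without
    replacement from an urn of n balls, n_a of which are black."""
    return Fraction(comb(n_a, k) * comb(n - n_a, n_b - k), comb(n, n_b))


def unexpectedness(n_ab: int, n_a: int, n_b: int, n: int) -> float:
    """Statistical unexpectedness u_ab = 1 / rho_ab.

    rho_ab is the upper-tail probability P(X >= n_ab) when R_ab > 1 and
    the lower-tail probability P(X <= n_ab) when R_ab < 1.
    """
    # Assumed definition of the reporting ratio: N * N_ab / (N_a * N_b).
    r_ab = Fraction(n * n_ab, n_a * n_b)
    k_min = max(0, n_a + n_b - n)   # smallest feasible count of black balls
    k_max = min(n_a, n_b)           # largest feasible count of black balls
    if r_ab >= 1:
        # Assumption: R_ab == 1 is treated like R_ab > 1 (upper tail).
        ks = range(n_ab, k_max + 1)
    else:
        ks = range(k_min, n_ab + 1)
    rho_ab = sum(hypergeom_pmf(k, n, n_a, n_b) for k in ks)
    return float(1 / rho_ab)


# Counts quoted in the text for the "unspecified weed killer" / "murder" pair.
u_weed_killer = unexpectedness(n_ab=1, n_a=1, n_b=147, n=462_936)
print(f"u_ab = {u_weed_killer:.1f}")
```

Summing exact probability mass terms is slow when the tail covers many values of k; for large counts, a library routine for the cumulative hypergeometric distribution would be the more practical choice.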
The reason for using $u_{ab}$ instead of $\rho_{ab}$ is that the combinations that
merit our attention then correspond to large values rather than small ones, which is
advantageous in the graphical display used here. Also, since the $u_{ab}$ values
span a wide numerical range, it is convenient to work instead with