Biology Reference
In-Depth Information
12.5.1 run-length Properties
As mentioned previously, the behavior of the CUSUM depends very much
on the choice of the threshold
g
. In order to guide the choice of
g
, we will
look at the run-length distribution of
T
A
|τ
=
∞ under the fitted negative bino-
mial model. Prediction of
μ
0,
t
requires knowledge of all involved covariates
during the monitoring period, e.g., in model (6), this would be the popula-
tion size. For the monitored period of 65 weeks (2007-W40 - 2008-W52), these
values are available, but if monitoring exceeded this period, we would have
needed to predict covariate values as well before being able to compute
μ
0,
t
.
Hence, it is practically more feasible to look at
P
(
T
A
≤
t
A
| τ = ∞) for a small
t
A
than to estimate, for example,
E
(
T
A
|τ = ∞) as here many more time points
might be needed if the expectation is large. Furthermore, the distribution of
T
A
is also often skewed, which makes the expectation a bad summary of the
central tendency.
Specifically, we want to choose
g
such that
P
(
T
A
≤
6 5|τ = ∞) is below some
acceptable value, for example, 10%. In other words, the probability of a false
alarm within the 65 weeks of our
μ
0,
t
versus
μ
1,
t
monitoring should be below
10%. To compute the probability under the selected model, two approaches
exist: direct Monte Carlo estimation or a Markov chain approximation.
In the first approach, we use Monte Carlo estimation of
P
(
T
A
≤ 6 5|τ = ∞). For
each realization
j
, a time series of length 65 is simulated from the estimated
negative binomial model with mean
μ
0,
t
,
t
=1, … , 65, and dispersion param-
eter α. Then the negative binomial CUSUM is applied to this time series and
one checks if
T
A
j
≤65
. The probability of interest using
k
such realizations
can then be estimated as
∑
j
1
6( )
, where
I
(·) is the indicator func-
tion. Code-wise this can be done for
k
= 1000 for a grid of
g
's as follows.
IT
≤ /
k
k
=
A
R> simone.TAleq65 <- function(sts, g) {
observed(sts)[phase2,]<-rnbinom(length(mu0), mu = mu0, size =
mtheta)
one <-glrnb(sts, control = modifyList(control(s.nb),
list(c.ARL = g)))
return(any(alarms(one)))
}
R> g.grid <-seq(1.8, by = 0.5)
R> pMC <-sapply(g.grid, function(g) {
mean(replicate(1000,simone)))
})
Figure 12.7 shows the result. We note that
g
≈ 4.75 ensures that the false-
alarm probability within the monitoring period drops below the desired
level of 10%. If one is interested in
P
(
T
A
≤
6 5|τ
= 1) instead,
μ
1,
t
has to be used
as argument mu in
rnbinom
.
A different option to compute this false-alarm probability for a likeli-
hood ratio-based CUSUM is to use a Markov chain approximation to deter-
mine the PMF of the run-length variable. This approach implemented in