ICM To Iterative proportional fitting (Statistics)


ICM:

Abbreviation for iterated conditional modes algorithm.

IDA:

Abbreviation for initial data analysis.

Idempotent matrix:

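A square matrix, A, with the property that $A = A^2$, i.e. pre- or post-multiplication by itself leaves the matrix unchanged. In statistical applications such matrices are usually also symmetric. An example is the identity matrix. The trace of an idempotent matrix equals its rank.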
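As a quick numerical check of the definition, the sketch below squares a small symmetric matrix and confirms that it is unchanged. The matrix and the helper name `matmul` are illustrative choices, not part of the original entry.

```python
from fractions import Fraction

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

half = Fraction(1, 2)
A = [[half, half],
     [half, half]]          # symmetric projection onto the line y = x

A_squared = matmul(A, A)
is_idempotent = A_squared == A
trace = sum(A[i][i] for i in range(len(A)))   # 1, which is also the rank of A
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.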



Identification:

The degree to which there is sufficient information in the sample observations to estimate the parameters in a proposed model. An unidentified model is one in which there are too many parameters in relation to the number of observations to make estimation possible. A just identified model corresponds to a saturated model. Finally, an overidentified model is one in which parameters can be estimated and there remain some degrees of freedom to allow the fit of the model to be assessed.

Identification keys:

Devices for identifying samples from a set of known taxa, which have a tree-structure where each node corresponds to a diagnostic question of the kind ‘which one of a named set of attributes does the specimen to be identified possess?’ The outcome determines the branch of the tree to follow and hence the next diagnostic question leading ultimately to the correct identification. Often there are only two attributes, concerning the presence and absence of a particular character or the response to a binary test, but multi-state characters or tests are permissible. See also decision tree.

Identity matrix:

A diagonal matrix in which all the elements on the leading diagonal are unity and all other elements are zero.

Ill conditioned matrix:

A matrix X for which $XX'$ has at least one eigenvalue near zero so that numerical problems arise in computing $(XX')^{-1}$.

Image restoration:

Synonym for segmentation.

Immigration-emigration models:

Models for the development of a population that is augmented by the arrival of individuals who found families independently of each other.

Immune proportion:

The proportion of individuals who are not subject to death, failure, relapse, etc., in a sample of censored survival times. The presence of such individuals may be indicated by a relatively high number of individuals with large censored survival times. Finite mixture models allowing for immunes can be fitted to such data and data analysis similar to that usually carried out on survival times performed. An important aspect of such analysis is to consider whether or not an immune proportion does in fact exist in the population.

Imperfect detectability:

A problem characteristic of many surveys of natural and human populations, that arises because even when a unit (such as a spatial plot) is included in the sample, not all individuals in the selected unit may be detected by the observer. For example, in a survey of homeless people, some individuals in the selected units may be missed. To estimate the population total in a survey in which this problem occurs, both the sampling design and the detection probabilities must be taken into account.

Improper prior distribution:

A prior distribution which is not a proper probability distribution since it does not integrate to one.


Imputation:

A process for estimating missing values using the non-missing information available for a subject. Many methods have been developed, most of which can be put into the following multiple regression framework:

$$\hat{y}_{mi} = b_{r0} + \sum_{j} b_{rj} x_{mij} + \hat{e}_{mi}$$

where $\hat{y}_{mi}$ is the imputed value of y for subject i for whom the value of y is missing, the $x_{mij}$ are the values of other variables for subject i and the $b_{rj}$ are the estimated regression coefficients for the regression of y on the x variables obtained from the subjects having observed y values; $\hat{e}_{mi}$ is a residual term. A basic distinction between such methods involves those where the residual terms are set to zero and those where they are not. The former may be termed deterministic imputation methods and the latter stochastic imputation methods. Deterministic methods produce better estimates of means but produce biased estimates of shape parameters. Stochastic methods are generally preferred. See also multiple imputation and hot deck.
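The distinction between deterministic and stochastic imputation can be sketched with a single explanatory variable. The data are hypothetical, and drawing the residual term at random from the observed residuals is only one possible choice, assumed here for illustration.

```python
import random

# Hypothetical data: y is missing (None) for the last two subjects.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, None, None]

obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
n = len(obs)
x_bar = sum(xi for xi, _ in obs) / n
y_bar = sum(yi for _, yi in obs) / n

# Least-squares coefficients from the subjects with observed y.
b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in obs)
      / sum((xi - x_bar) ** 2 for xi, _ in obs))
b0 = y_bar - b1 * x_bar
residuals = [yi - (b0 + b1 * xi) for xi, yi in obs]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

def impute(xi, stochastic):
    """Return the imputed y for a subject with covariate value xi."""
    e = rng.choice(residuals) if stochastic else 0.0   # residual set to zero if deterministic
    return b0 + b1 * xi + e

deterministic = [impute(xi, False) for xi, yi in zip(x, y) if yi is None]
stochastic = [impute(xi, True) for xi, yi in zip(x, y) if yi is None]
```

The deterministic values lie exactly on the fitted line, which is why they understate the spread of y; the stochastic values restore some of that variability.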


IMSL:

A Fortran subprogram library of useful statistical and mathematical procedures.


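Incidence:

A measure of the rate at which people without a disease develop the disease during a specific period of time. Calculated as

$$\text{incidence} = \frac{\text{number of new cases of a disease over a period of time}}{\text{population at risk of the disease in the time period}}$$

it measures the appearance of disease.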

Inclusion and exclusion criteria:

Criteria that define the subjects that can be accepted into a study, particularly a clinical trial. Inclusion criteria define the population of interest, exclusion criteria remove people for whom the study treatment is contraindicated.

Incomplete block design:

An experimental design in which not all treatments are represented in each block. See also balanced incomplete block design.

Incomplete contingency tables:

Contingency tables containing structural zeros.

Incubation period:

The time elapsing between the receipt of infection and the appearance of symptoms. The length of the incubation period depends on the disease, ranging from days in, for example, malaria, to years in something like AIDS. Estimation of the incubation period is important in investigations of how a disease may spread and in the projection of the evolution of an epidemic.


Independence:

Essentially, two events are said to be independent if knowing the outcome of one tells us nothing about the other. More formally the concept is defined in terms of the probabilities of the two events. In particular two events A and B are said to be independent if

$$\Pr(A \text{ and } B) = \Pr(A) \times \Pr(B)$$

where Pr(A) and Pr(B) represent the probabilities of A and B. See also conditional probability and Bayes’ theorem.
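The product rule can be verified by enumeration. The sketch below uses two fair dice, an example chosen here for illustration, and checks whether Pr(A and B) equals Pr(A) × Pr(B) for two pairs of events.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely outcomes

def pr(event):
    """Probability of an event given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def independent(a, b):
    """True when Pr(A and B) == Pr(A) * Pr(B)."""
    return pr(lambda o: a(o) and b(o)) == pr(a) * pr(b)

first_even = lambda o: o[0] % 2 == 0
sum_is_7 = lambda o: o[0] + o[1] == 7
sum_is_8 = lambda o: o[0] + o[1] == 8

even_and_seven = independent(first_even, sum_is_7)   # independent
even_and_eight = independent(first_even, sum_is_8)   # not independent
```

Knowing that the first die is even does not change the chance of a total of 7, but it does change the chance of a total of 8.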

Independent component analysis (ICA):

A method for analyzing complex measured quantities thought to be mixtures of other more fundamental quantities, into their underlying components. Typical examples of the data to which ICA might be applied are:

• electroencephalogram (EEG) signal, which contains contributions from many different brain regions,

• person’s height, which is determined by contributions from many different genetic and environmental factors.

Index of clumping:

An index used primarily in the analysis of spatial data, to investigate the pattern of the population under study. The index is calculated from the counts, $x_1, x_2, \ldots, x_n$, obtained from applying quadrat sampling as

$$\mathrm{ICS} = \frac{s^2}{\bar{x}} - 1$$

where $\bar{x}$ and $s^2$ are the mean and variance of the observed counts. If the population is ‘clustered’, the index will be large, whereas if the individuals are regularly spaced the index will be negative. The sampling distribution of ICS is unknown even for simple models of the underlying mechanism generating the population pattern.

Index of dispersion:

A statistic most commonly used in assessing whether or not a random variable has a Poisson distribution. For a set of observations $x_1, x_2, \ldots, x_n$ the index is given by

$$D = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\bar{x}}$$

If the population distribution is Poisson, then D has approximately a chi-squared distribution with $n - 1$ degrees of freedom. See also binomial index of dispersion. [Biometrika, 1966, 53, 167-82.]
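A minimal sketch of both indices, assuming the usual forms D = Σ(x_i − x̄)²/x̄ and ICS = s²/x̄ − 1 with s² the sample variance. The counts are hypothetical.

```python
from statistics import mean, variance

def index_of_dispersion(counts):
    """D = sum((x_i - mean)^2) / mean."""
    m = mean(counts)
    return sum((x - m) ** 2 for x in counts) / m

def index_of_clumping(counts):
    """ICS = s^2 / mean - 1, with s^2 the sample variance."""
    return variance(counts) / mean(counts) - 1

counts = [0, 2, 1, 7, 0, 0, 5, 1]   # hypothetical quadrat counts
D = index_of_dispersion(counts)
ICS = index_of_clumping(counts)
```

Under these definitions the two are linked by ICS = D/(n − 1) − 1, so a value of D well above its degrees of freedom corresponds to a positive index of clumping.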

Index number:

A measure of the magnitude of a variable at one point (in time, for example) relative to its value at another. An example is the consumer price index. The main application of such numbers is in economics.

Index plot:

A plot of some diagnostic quantity obtained after the fitting of some model, for example, Cook’s distances, against the corresponding observation number. Particularly suited to the detection of outliers. [ARA Chapter 10.]

Indicator variable:

A term generally used for a manifest variable that is thought to be related to an underlying latent variable in the context of structural equation models.

Indirect standardization:

The process of adjusting a crude mortality or morbidity rate for one or more variables by using a known reference population. It might, for example, be required to compare cancer mortality rates of single and married women with adjustment being made for the likely different age distributions in the two groups. Age-specific mortality rates in the reference population are applied separately to the age distributions of the two groups to obtain the expected number of deaths in each. These can then be combined with the observed number of deaths in the two groups to obtain comparable mortality rates.
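The calculation can be sketched as follows. All numbers are hypothetical, and combining observed and expected deaths through the standardized mortality ratio (observed/expected) is the usual convention, assumed here rather than stated in the entry.

```python
# Hypothetical age-specific death rates (per person-year) in the reference population.
reference_rates = {"<45": 0.001, "45-64": 0.005, "65+": 0.020}

# Hypothetical study group: person-years at risk by age band, and total observed deaths.
group_person_years = {"<45": 10_000, "45-64": 6_000, "65+": 2_000}
observed_deaths = 96

# Expected deaths if the group had experienced the reference rates.
expected_deaths = sum(reference_rates[a] * group_person_years[a]
                      for a in reference_rates)

# Standardized mortality ratio (observed / expected).
smr = observed_deaths / expected_deaths

# Indirectly standardized rate: SMR times the crude rate of the reference population.
reference_crude_rate = 0.004   # hypothetical
standardized_rate = smr * reference_crude_rate
```

Repeating the calculation for a second group with the same reference rates gives two rates that can be compared without the distortion caused by differing age distributions.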

Individual differences scaling:

A form of multidimensional scaling applicable to data consisting of a number of proximity matrices from different sources, i.e. different subjects. The method allows for individual differences in the perception of the stimuli by deriving weights for each subject that can be used to stretch or shrink the dimensions of the recovered geometrical solution.


INDSCAL:

Acronym for individual differences scaling.

Infant mortality rate:

The ratio of the number of deaths during a calendar year among infants under one year of age to the total number of live births during that year. Often considered as a particularly responsive and sensitive index of the health status of a country or geographical area. The table below gives the rates per 1000 births in England, Wales, Scotland and Northern Ireland in both 1971 and 1992.

                    1971    1992
England             17.5     6.5
Wales               18.4     5.9
Scotland            19.9     6.8
Northern Ireland    22.7     6.0

Infectious period:

A term used in describing the progress of an epidemic for the time following the latent period and during which a patient infected with the disease is able to discharge infectious matter in some way and possibly communicate the disease to other susceptibles.


Inference:

The process of drawing conclusions about a population on the basis of measurements or observations made on a sample of individuals from the population. See also frequentist inference and Bayesian inference. [KA1 Chapter 8.]

Infertile worker effect:

The observation that working women may be relatively infertile since having children may keep women away from work. See also healthy worker effect.

Infinitely divisible distribution:

A probability distribution f(x) with corresponding characteristic function $\phi(t)$, which has the property that for every positive integer n there exists a characteristic function $\phi_n(t)$ such that

$$\phi(t) = [\phi_n(t)]^n$$

This implies that for each n the distribution can be represented as the distribution of the convolution (sum) of n independent random variables with a common distribution. The chi-squared distribution is an example. [KA1 Chapter 4.]


Influence:

A term used primarily in regression analysis to denote the effect of each observation on the estimated regression parameters. One useful index of the influence of each observation is provided by the diagonal elements of the hat matrix.
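For a straight-line regression with an intercept, the diagonal elements of the hat matrix have the closed form h_ii = 1/n + (x_i − x̄)²/Σ(x_j − x̄)². The sketch below uses hypothetical data to show how a point far from the others receives a large value.

```python
def leverages(x):
    """Diagonal elements of the hat matrix for y = a + b*x fitted by least squares."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]

x = [1.0, 2.0, 3.0, 4.0, 20.0]      # hypothetical data; the last point is far from the rest
h = leverages(x)
most_influential = max(range(len(x)), key=lambda i: h[i])
```

The values always sum to the number of fitted parameters (two here), so one value close to 1 means that a single observation largely determines its own fitted value.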

Influence statistics:

A range of statistics designed to assess the effect or influence of an observation in determining the results of a regression analysis. The general approach adopted is to examine the changes that occur in the regression coefficients when the observation is omitted. The statistics that have been suggested differ in the particular regression results on which the effect of deletion is measured and the standardization used to make them comparable over observations. All such statistics can be computed from the results of the single regression using all data. See also Cook’s distance, DFFITS, DFBETAS, COVRATIO and hat matrix.

Influential observation:

An observation that has a disproportionate influence on one or more aspects of the estimate of a parameter, in particular, regression coefficients. This influence may be due to differences from other subjects on the explanatory variables, an extreme value for the response variable, or a combination of these. Outliers, for example, are often also influential observations.

Information theory:

A branch of applied probability theory applicable to many communication and signal processing problems in engineering and biology. Information theorists devote their efforts to quantitative examination of the following three questions:

• What is information?

• What are the fundamental limitations on the accuracy with which information can be transmitted?

• What design methodologies and computational algorithms yield practical systems for communication and storing information that perform close to the fundamental limits mentioned previously?

Informative censoring:

Censored observations that occur for reasons related to treatment, for example, when treatment is withdrawn as a result of a deterioration in the physical condition of a patient. This form of censoring makes most of the techniques for the analysis of survival times strictly invalid.

Informative prior:

A term used in the context of Bayesian inference to indicate a prior distribution that reflects empirical or theoretical information regarding the value of an unknown parameter. [KA1 Chapter 8.]

Informed consent:

The consent required from each potential participant prior to random assignment in a clinical trial as specified in the 1996 version of the Helsinki declaration intended to guide physicians conducting therapeutic trials, namely: in any research on human beings, each potential subject must be adequately informed of the aims, methods, anticipated benefits and potential hazards of the study and the discomfort it may entail. He or she should be informed that he or she is at liberty to abstain from participation in the study and he or she is free to withdraw his or her consent to participation at any time. The physician should then obtain the subject’s freely-given informed consent, preferably in writing. [Journal of the American Medical Association, 1997, 277, 925-6.]

Initial data analysis (IDA):

The first phase in the examination of a data set which consists of a number of informal steps including

• checking the quality of the data,

• calculating simple summary statistics and constructing appropriate graphs.

The general aim is to clarify the structure of the data, obtain a simple descriptive summary, and perhaps get ideas for a more sophisticated analysis.


A facetious term for inliers.


Inliers:

A term used for the observations most likely to be subject to error in situations where a dichotomy is formed by making a ‘cut’ on an ordered scale, and where errors of classification can be expected to occur with greatest frequency in the neighbourhood of the cut. Suppose, for example, that individuals are classified on a hundred point scale that indicates degree of illness. A cutting point is chosen on the scale to dichotomize individuals into well and ill categories. Errors of classification are certainly more likely to occur in the neighbourhood of the cutting point.

Instantaneous count procedure:

A sampling method used in biological research for estimating population numbers.

Instantaneous death rate:

Synonym for hazard rate.

Institutional surveys:

Surveys in which the primary sampling units are institutions, for example, hospitals. Within each sampled institution, a sample of patient records is selected. The main purposes of the two-stage design (compared to a simple random sample) are to lessen the number of institutions that need to be subsampled, and to avoid constructing a sampling frame of patient records for the entire population. Stratified sampling involving selecting institutions with differing probabilities based on some institutional characteristic (e.g. size) is typically used to lessen the variability of estimators. See also cluster sampling.

Instrumental variable:

A variable corresponding to an explanatory variable, $x_i$, that is correlated with $x_i$ but has no effect on the response variable except indirectly through $x_i$. Such variables are useful in deriving unbiased estimates of regression coefficients when the explanatory variables contain measurement error. See also regression dilution.

Integrated hazard function:

Synonym for cumulative hazard function.

Intention-to-treat analysis:

A procedure in which all patients randomly allocated to a treatment in a clinical trial are analysed together as representing that treatment, whether or not they completed, or even received it. Here the initial random allocation not only decides the allocated treatment, it decides there and then how the patient’s data will be analysed, whether or not the patient actually receives the prescribed treatment. This method is adopted to prevent disturbances to the prognostic balance achieved by randomization and to prevent possible bias from allowing compliance, a factor often related to outcome, to determine the groups for comparison.


Interaction:

A term applied when two (or more) explanatory variables do not act independently on a response variable. Figure 74 shows an example from a 2 x 2 factorial design. See also additive effect.


Intercept:

The parameter in an equation derived from a regression analysis corresponding to the expected value of the response variable when all the explanatory variables are zero.

Intercropping experiments:

Experiments involving growing two or more crops at the same time on the same piece of land. The crops need not be planted nor harvested at exactly the same time, but they are usually grown together for a significant part of the growing season. Used extensively in the tropics and subtropics, particularly in developing countries where people are rapidly depleting scarce resources but not producing enough food.

Interim analyses:

Analyses made prior to the planned end of a clinical trial, usually with the aim of detecting treatment differences at an early stage and thus preventing as many patients as possible receiving an ‘inferior’ treatment. Such analyses are often problematical particularly if carried out in a haphazard and unplanned fashion.

Interior analysis:

A term sometimes applied to analysis carried out on the full model in a regression problem. The basic aim of such analyses is the identification of problem areas with respect to the original least squares fit of the full model. Of particular interest is the disposition of individual data points and their relative influence on global measures used as guides in subsequent stages of analysis or as estimates of parameters in subsequent models. Outliers from the fitted or predicted response, multivariate outliers among the explanatory variables, and points in the space of explanatory variables with great leverage on the full model should be identified, and their influence evaluated before further analysis is undertaken.


Interpolation:

The process of determining a value of a function between two known values without using the equation of the function itself.

Interquartile range:

A measure of spread given by the difference between the first and third quartiles of a sample.
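A minimal sketch using the Python standard library. Several conventions exist for computing sample quartiles; the one used here is the default (`'exclusive'`) method of `statistics.quantiles`, so other software may give slightly different values for the same sample.

```python
from statistics import quantiles

def interquartile_range(sample):
    """Third quartile minus first quartile."""
    q1, _median, q3 = quantiles(sample, n=4)   # three cut points for quartiles
    return q3 - q1

sample = list(range(1, 12))                   # 1, 2, ..., 11
iqr = interquartile_range(sample)
```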

Interrupted time series design:

A study in which a single group of subjects is measured several times before and after some event or manipulation. Often also used to describe investigations of a single subject. See also longitudinal data and N of 1 clinical trial.

Fig. 74 Interaction in a 2 x 2 x 2 design.

Interruptible designs:

Experimental designs that attempt to limit the information lost if an experiment is prematurely ended.

Interval-censored observations:

Observations that often arise in the context of studies of time elapsed to a particular event when subjects are not monitored continuously. Instead the prior occurrence of the event of interest is detectable only at specific times of observation, for example, at the time of medical examination.

Interval variable:

Synonym for continuous variable.

Intervened Poisson distribution:

A probability distribution that can be used as a model for a disease in situations where the incidence is altered in the middle of a data collection period due to preventive treatments taken by health service agencies. The mathematical form of the distribution is

$$\Pr(X = x) = \frac{[(1 + \rho)^x - \rho^x]\,\theta^x}{x!\, e^{\theta\rho}(e^{\theta} - 1)}$$

where $x = 1, 2, \ldots$. The parameters $\theta$ ($> 0$) and $\rho$ ($0 \le \rho \le 1$) measure incidence and intervention, respectively. A zero value of $\rho$ is indicative of completely successful preventive treatments, whereas $\rho = 1$ is interpreted as a status quo in the incidence rate even after the preventive treatments are applied.
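A short numerical check, assuming the probability function takes the form Pr(X = x) = [(1 + ρ)^x − ρ^x] θ^x / {x! e^{θρ}(e^θ − 1)} for x = 1, 2, …. The parameter values are arbitrary.

```python
from math import exp, factorial

def intervened_poisson_pmf(x, theta, rho):
    """Pr(X = x) for x = 1, 2, ... with theta > 0 and 0 <= rho <= 1."""
    norm = exp(theta * rho) * (exp(theta) - 1.0)
    return ((1.0 + rho) ** x - rho ** x) * theta ** x / (factorial(x) * norm)

theta, rho = 2.0, 0.5
total = sum(intervened_poisson_pmf(x, theta, rho) for x in range(1, 60))

# With rho = 0 the formula reduces to the zero-truncated Poisson distribution.
truncated = intervened_poisson_pmf(3, theta, 0.0)
```

The probabilities sum to one (up to a negligible truncated tail), which is a useful sanity check on the normalizing constant.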

Intervention analysis in time series:

An extension of autoregressive integrated moving average models applied to time series allowing for the study of the magnitude and structure of changes in the series produced by some form of intervention. An example is assessing how effective a preventive programme is in decreasing the monthly number of accidents.

Intervention study:

Synonym for clinical trial.

Interviewer bias:

The bias that occurs in surveys of human populations because of the direct result of the action of the interviewer. This bias can arise for a variety of reasons including failure to contact the right persons and systematic errors in recording the answers received from the respondent.

Intraclass contingency table:

A table obtained from a square contingency table by pooling the frequencies of cells corresponding to the same pair of categories. Such tables arise frequently in genetics when the genotypic distribution at a single locus with r alleles, $A_1, A_2, \ldots, A_r$, is observed. Since $A_iA_j$ is indistinguishable from $A_jA_i$, $i \neq j$, only the total frequency of the unordered pair $A_iA_j$ is observed. Thus the data consist of the frequencies of homozygotes and the combined frequencies of heterozygotes.

Intraclass correlation:

Although originally introduced in genetics to judge sibling correlations, the term is now most often used for the proportion of variance of an observation due to between-subject variability in the ‘true’ scores of a measuring instrument. Specifically if an observed value, x, is considered to be true score (t) plus measurement error (e), i.e.

$$x = t + e$$

the intraclass correlation is

$$R = \frac{\sigma_t^2}{\sigma_t^2 + \sigma_e^2}$$

where $\sigma_t^2$ is the variance of t and $\sigma_e^2$ the variance of e. The correlation can be estimated from a study involving a number of raters giving scores to a number of patients.
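One common way to estimate the ratio of true-score variance to total variance from such a study is the one-way analysis of variance estimator (MSB − MSW)/(MSB + (k − 1)MSW), where MSB and MSW are the between- and within-patient mean squares and k is the number of ratings per patient. The sketch below assumes that estimator and uses hypothetical scores.

```python
def intraclass_correlation(scores):
    """One-way ANOVA estimate; scores[i][j] is rater j's score for patient i."""
    n = len(scores)              # number of patients
    k = len(scores[0])           # number of ratings per patient
    grand = sum(map(sum, scores)) / (n * k)
    means = [sum(row) / k for row in scores]
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(scores, means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

scores = [[8, 9], [5, 5], [2, 3], [6, 8]]   # hypothetical: 4 patients, 2 raters
icc = intraclass_correlation(scores)
```

Other estimators exist, for example ones that treat raters as a second factor; the choice depends on the study design.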

Intrinsic error:

A term most often used in a clinical laboratory to describe the variability in results caused by the innate imprecision of each analytical step.

Intrinsic rate of natural increase:

Synonym for Malthusian parameter.


Invariance:

A property of a set of variables or a statistic that is left unchanged by a transformation. The variance of a set of observations is, for example, invariant under linear transformations of the data.

Inverse Bernoulli sampling:

A series of Bernoulli trials that is continued until a preassigned number, r, of successes have been obtained; the total number of trials necessary to achieve this, n, is the observed value of a random variable, N, having a negative binomial distribution.

Inverse distribution function:

A function $G(\alpha)$ such that the probability that the random variable X takes a value less than or equal to it is $\alpha$, i.e.

$$\Pr[X \le G(\alpha)] = \alpha$$


Inverse Gaussian distribution:

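Synonym for inverse normal distribution.

Inverse normal distribution:

The probability distribution, f(x), given by

$$f(x) = \left(\frac{\lambda}{2\pi x^3}\right)^{1/2} \exp\left\{-\frac{\lambda (x - \mu)^2}{2\mu^2 x}\right\}, \quad x > 0$$

where $\mu$ and $\lambda$ are both positive. The mean, variance, skewness and kurtosis of the distribution are as follows:

mean = $\mu$
variance = $\mu^3/\lambda$
skewness = $3(\mu/\lambda)^{1/2}$
kurtosis = $3 + 15\mu/\lambda$

A member of the exponential family which is skewed to the right. Examples of the distribution are shown in Fig. 75.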

Inverse polynomial functions:

Functions useful for modelling many dose-response relationships in biology. For a particular dose or stimulus x, the expected value of the response variable, y, is defined by

$$y = \frac{x + \alpha}{\sum_{i=0}^{d} \beta_i (x + \alpha)^i}$$

The parameters, $\beta_1, \beta_2, \ldots, \beta_d$, define the shape of the dose-response curve and $\alpha$ defines its position on the x axis. A particularly useful form of the function is obtained by setting $\alpha = 0$ and $d = 1$. The resulting curve is

$$y = \frac{x}{\beta_0 + \beta_1 x}$$

which can be rewritten as

$$y = \frac{k_1 x}{k_2 + x}$$

where $k_1 = 1/\beta_1$ and $k_2 = \beta_0/\beta_1$. This final equation is equivalent to the Michaelis-Menten equation. [Biometrics, 1966, 22, 128-41.]

Inverse sine transformation:

Synonymous with arc sine transformation.

Fig. 75 Examples of inverse normal distributions: $\mu = 1.0$; $\lambda = 2.0, 6.0$.

Inverse survival function:

The quantile, $Z(\alpha)$, that is exceeded by the random variable X with probability $\alpha$, i.e.

$$\Pr[X > Z(\alpha)] = \alpha$$

$Z(\alpha) = G(1 - \alpha)$ where G is the inverse distribution function.
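The relationship between the inverse distribution function and the inverse survival function can be illustrated with the standard normal distribution, chosen here only as a convenient example. `NormalDist.inv_cdf` from the Python standard library plays the role of G.

```python
from statistics import NormalDist

X = NormalDist(mu=0.0, sigma=1.0)    # standard normal, for illustration only

def G(alpha):
    """Inverse distribution function: Pr[X <= G(alpha)] = alpha."""
    return X.inv_cdf(alpha)

def Z(alpha):
    """Inverse survival function: Pr[X > Z(alpha)] = alpha."""
    return G(1.0 - alpha)

upper_5_percent_point = Z(0.05)      # about 1.645 for the standard normal
```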

Inversion theorem:

A theorem showing that a probability distribution, f(x), is uniquely determined by its characteristic function, $\phi(t)$. The theorem states that

$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-itx} \phi(t)\, dt$$


Inverted Wishart distribution:

A positive definite matrix, A, has an inverted Wishart distribution if and only if $A^{-1}$ has a Wishart distribution.


IRLS:

Abbreviation for iteratively reweighted least squares.

Irreducible chain:

A Markov chain in which all states intercommunicate.

Irwin-Hall distribution:

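The probability distribution of the sum, S, of n independent random variables each with a uniform distribution in (0,1). The distribution is given by

$$f(s) = \frac{1}{(n-1)!} \sum_{j=0}^{k} (-1)^j \binom{n}{j} (s - j)^{n-1}, \quad k \le s \le k + 1,\ 0 \le k \le n - 1$$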
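The density can be evaluated directly from the standard piecewise-polynomial form f(s) = (1/(n − 1)!) Σ_{j=0}^{k} (−1)^j C(n, j)(s − j)^{n−1} for k ≤ s ≤ k + 1. The sketch below assumes that form; the function name is illustrative.

```python
from math import comb, factorial, floor

def irwin_hall_pdf(s, n):
    """Density of the sum of n independent uniform(0, 1) variables at s."""
    if s < 0 or s > n:
        return 0.0
    k = min(floor(s), n - 1)          # index of the interval [k, k + 1] containing s
    total = sum((-1) ** j * comb(n, j) * (s - j) ** (n - 1) for j in range(k + 1))
    return total / factorial(n - 1)

triangular_peak = irwin_hall_pdf(1.0, 2)   # n = 2 gives the triangular density
three_uniforms = irwin_hall_pdf(1.5, 3)    # density at the centre for n = 3
```

For n = 1 the function returns the uniform density, and for n = 2 it returns the familiar triangular density on (0, 2).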


Irwin, Joseph Oscar (1898-1982):

Born in London, Irwin was awarded a mathematics scholarship to Christ’s College, Cambridge in December 1917, but because of the war did not graduate until 1921, when he immediately joined Karl Pearson’s staff at University College, London. In 1928 he joined Fisher at Rothamsted Experimental Station, working there until 1931, when he joined the Medical Research Council, where he stayed until 1965. He worked on a variety of statistical problems arising from areas such as animal carcinogenicity, accident proneness, vaccines and hot environments for soldiers in the tropics. Received the Royal Statistical Society’s Guy Medal in silver in 1953 and acted as President of the British Region of the Biometric Society in 1958 and 1959. Irwin died on 27 July 1982 in Schaffhausen, Switzerland.

Ising-Stevens distribution:

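The probability distribution of the number of runs, X, of either of two types of objects ($n_1$ of one kind and $n_2$ of the other) arranged at random in $n = n_1 + n_2$ positions around a circle. Given by

$$\Pr(X = x) = \frac{\binom{n_1}{x} \binom{n_2 - 1}{x - 1}}{\binom{n_1 + n_2 - 1}{n_2}}$$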
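A small check, assuming the probability function Pr(X = x) = C(n1, x) C(n2 − 1, x − 1) / C(n1 + n2 − 1, n2). The values of n1 and n2 are arbitrary.

```python
from fractions import Fraction
from math import comb

def ising_stevens_pmf(x, n1, n2):
    """Pr(X = x): probability of x runs of either kind around the circle."""
    return Fraction(comb(n1, x) * comb(n2 - 1, x - 1), comb(n1 + n2 - 1, n2))

n1, n2 = 4, 3
pmf = {x: ising_stevens_pmf(x, n1, n2) for x in range(1, min(n1, n2) + 1)}
total = sum(pmf.values())
```

For two objects of each kind the formula gives probabilities 2/3 and 1/3 for one and two runs, which agrees with direct enumeration of the six possible arrangements.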



Isobologram:

A diagram used to characterize the interactions among jointly administered drugs or chemicals. The contour of constant response (i.e. the isobole), is compared to the ‘line of additivity’, i.e. the line connecting the single drug doses that yield the level of response associated with that contour. The interaction is described as synergistic, additive, or antagonistic according to whether the isobole is below, coincident with, or above the line of additivity. See Fig. 76 for an example.

Fig. 76 An example of an isobologram.

Isotonic regression:

A form of regression analysis that minimizes a weighted sum of squares subject to the condition that the regression function is order preserving.
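A standard way to compute such a fit is the pool-adjacent-violators algorithm, which repeatedly replaces neighbouring values that violate the ordering by their weighted mean. The sketch below assumes a non-decreasing regression function and is not taken from the original entry.

```python
def isotonic_regression(y, w=None):
    """Weighted least-squares fit of a non-decreasing sequence to y."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []                      # each block: [weighted mean, total weight, count]
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge backwards while adjacent blocks violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])
    fitted = []
    for m, _, c in blocks:
        fitted.extend([m] * c)       # every point in a block gets the block mean
    return fitted

fit = isotonic_regression([1.0, 3.0, 2.0, 4.0, 3.5, 5.0])
```

Here the out-of-order pairs (3, 2) and (4, 3.5) are each pooled to their means, giving a fitted sequence that never decreases.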

Item non-response:

A term used about data collected in a survey to indicate that particular questions in the survey attract refusals, or responses that cannot be coded. Often this type of missing data makes reporting of the overall response rate for the survey less relevant.

Item-response theory:

The theory that states that a person’s performance on a specific test item is determined by the amount of some underlying trait that the person has.

Item-total correlation:

A widely used method for checking the homogeneity of a scale made up of several items. It is simply Pearson’s product moment correlation coefficient of an individual item with the scale total calculated from the remaining items. The usual rule of thumb is that an item should correlate with the total above 0.20. Items with lower correlation should be discarded.
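The calculation can be sketched as follows, with each item correlated against the total of the remaining items and the 0.20 rule of thumb applied. The response data and function names are hypothetical.

```python
from math import sqrt

def pearson(u, v):
    """Pearson product moment correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)

def item_total_correlations(responses):
    """responses[s][i] is subject s's score on item i."""
    n_items = len(responses[0])
    result = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]   # total of the remaining items
        result.append(pearson(item, rest))
    return result

responses = [          # hypothetical scores: 5 subjects, 3 items
    [1, 2, 5],
    [2, 3, 1],
    [3, 3, 4],
    [4, 5, 2],
    [5, 4, 3],
]
r = item_total_correlations(responses)
flagged = [i for i, ri in enumerate(r) if ri < 0.20]   # candidates for removal
```

Excluding the item itself from the total avoids inflating the correlation, which matters most for short scales.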

Iterated bootstrap:

A two-stage procedure in which the samples from the original bootstrap population are themselves bootstrapped. The technique can give confidence intervals of more accurate coverage than simple bootstrapping.

Iterated conditional modes algorithm (ICM):

A procedure analogous to Gibbs sampling, with the exception that the mode of each conditional posterior distribution is determined at each update, rather than sampling a value from these conditional distributions.


Iteration:

The successive repetition of a mathematical process, using the result of one stage as the input for the next. Examples of procedures which involve iteration are iterative proportional fitting, the Newton-Raphson method and the EM algorithm.

Iteratively reweighted least squares (IRLS):

A weighted least squares procedure in which the weights are revised or re-estimated at each iteration. In many cases the result is equivalent to maximum likelihood estimation. Widely used when fitting generalized linear models.
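As an illustration, the sketch below fits a logistic regression with one explanatory variable. At each iteration the weights μ(1 − μ) and a working response are recomputed from the current fit, and a weighted least squares problem is solved. The data, the fixed iteration count and the function name are assumptions made for the example.

```python
from math import exp

def irls_logistic(x, y, n_iter=25):
    """Fit logit Pr(y = 1) = b0 + b1*x by iteratively reweighted least squares."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        eta = [b0 + b1 * xi for xi in x]
        mu = [1.0 / (1.0 + exp(-e)) for e in eta]
        w = [m * (1.0 - m) for m in mu]                  # weights revised each iteration
        z = [e + (yi - m) / wi                           # working response
             for e, yi, m, wi in zip(eta, y, mu, w)]
        # Weighted least squares of z on (1, x): solve the 2 x 2 normal equations.
        sw = sum(w)
        swx = sum(wi * xi for wi, xi in zip(w, x))
        swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        swz = sum(wi * zi for wi, zi in zip(w, z))
        swxz = sum(wi * xi * zi for wi, xi, zi in zip(w, x, z))
        det = sw * swxx - swx ** 2
        b0 = (swxx * swz - swx * swxz) / det
        b1 = (sw * swxz - swx * swz) / det
    return b0, b1

x = [0, 1, 2, 3, 4, 5, 6, 7]          # hypothetical data
y = [0, 0, 1, 0, 1, 0, 1, 1]
b0, b1 = irls_logistic(x, y)
```

At convergence the estimates satisfy the likelihood equations Σ(y − μ) = 0 and Σx(y − μ) = 0, which is the sense in which the procedure is equivalent to maximum likelihood here. A practical implementation would stop on a convergence criterion rather than after a fixed number of iterations.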

Iterative proportional fitting:

A procedure for the maximum likelihood estimation of the expected frequencies in log-linear models, particularly for models where such estimates cannot be found directly from simple calculations using relevant marginal totals. [The Analysis of Contingency Tables, 2nd edition, B.S. , Chapman and Hall/CRC Press, London.]
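A minimal sketch for a two-way table: starting values are scaled alternately so that their row sums and then their column sums match the target margins. For the independence model shown here the answer is available directly (row total × column total / grand total), so the example only demonstrates the mechanics; the procedure is needed for models, such as no three-factor interaction in a three-way table, where no such closed form exists. The counts are hypothetical.

```python
def ipf(seed, row_totals, col_totals, n_iter=50):
    """Scale `seed` alternately so its row and column sums match the targets."""
    table = [row[:] for row in seed]
    for _ in range(n_iter):
        for i, target in enumerate(row_totals):            # match row margins
            s = sum(table[i])
            table[i] = [v * target / s for v in table[i]]
        for j, target in enumerate(col_totals):            # match column margins
            s = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / s
    return table

observed = [[10, 20], [30, 40]]          # hypothetical 2 x 2 table of counts
rows = [sum(r) for r in observed]        # 30, 70
cols = [sum(c) for c in zip(*observed)]  # 40, 60
fitted = ipf([[1.0, 1.0], [1.0, 1.0]], rows, cols)
```

Starting from a table of ones gives the expected frequencies under independence; starting from another seed table preserves that table's association structure while matching the margins.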
