Survey Data Collection and Processing - Sampling Spatial Units for Agricultural Surveys

surveys based on a geographical definition of the statistical unit, because it is

assumed that their weight has very little variability in the target population.

It should be stated that no data-editing program can automatically detect and

impute every error in the data. In general, only the errors that violate some rules

(identifiable errors) can be detected and subjected to appropriate processing to

resolve the inconsistencies. Such an imputation does not necessarily restore the

true information, but changes it to a value that we estimate to be closer to the true

value using a set of logical rules that we believe are valid for the collected data.

Therefore, the automatic editing process may be seen as a way to increase the

quality of the data by constraining them to some prior knowledge.
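
The idea that only rule-violating (identifiable) errors can be detected may be illustrated with a minimal sketch. The variable names (`total_area`, `crop_area`, `production`), the three edit rules, and the yield threshold are hypothetical and chosen only for illustration; they do not come from any particular survey.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]

# Hypothetical edit rules: each function returns True when the record
# is consistent with the rule. The 15.0 yield limit is an assumed value.
EDIT_RULES: Dict[str, Callable[[Record], bool]] = {
    "areas_non_negative": lambda r: r["total_area"] >= 0 and r["crop_area"] >= 0,
    "crop_within_total": lambda r: r["crop_area"] <= r["total_area"],
    "yield_plausible": lambda r: r["crop_area"] == 0
    or r["production"] / r["crop_area"] <= 15.0,
}


def failed_edits(record: Record) -> List[str]:
    """Return the names of the edit rules that the record violates."""
    return [name for name, rule in EDIT_RULES.items() if not rule(record)]


# An identifiable error: the crop area exceeds the total area.
flagged = {"total_area": 10.0, "crop_area": 12.0, "production": 40.0}
print(failed_edits(flagged))  # ['crop_within_total']

# A record that may still be wrong (say the true crop area is 6.0),
# but violates no rule, so the error cannot be detected.
undetected = {"total_area": 10.0, "crop_area": 8.0, "production": 40.0}
print(failed_edits(undetected))  # []
```

The second record shows why passing all edits does not guarantee that the data are correct: the rules only encode prior knowledge about which combinations of values are admissible.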

For this reason, we should only correct the data if we decide that the errors

reduce the quality of the information to below a predefined level, and if we think

that the available set of auxiliary information can correct the data if applied in the

form of compatibility rules. Generally, the problem is to correctly identify this

information. In fact, if we define inappropriate logical rules or apply inadequate

procedures we can introduce a serious bias into the estimates.

By incorrectly defining a set of edit rules we can cause further problems instead

of detecting errors. In fact, we can introduce biases by only partially addressing the

errors, for example, by accurately treating some errors and ignoring others.

Additionally, many edit rules can be defined for a single survey and they may conflict

with each other, leading to inconsistencies. We may also define redundant edit

rules. Even if they are consistent, they can result in too many corrections, which

conflicts with the principle of correcting the data as little as possible.
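
Conflicting and redundant edit rules can be checked mechanically when the variables take few values. The following is a minimal sketch based on exhaustive enumeration, assuming a small hypothetical domain (`has_livestock`, `n_animals`, `pasture_ha`) and hypothetical rules. A rule set is treated as conflicting if no record satisfies all rules, and a rule as redundant if removing it does not change the set of accepted records. Enumeration is only feasible for small discrete domains; it is not presented as the method of any specific editing system.

```python
from itertools import product

# Hypothetical discrete domain of a three-variable record.
DOMAIN = {
    "has_livestock": [False, True],
    "n_animals": [0, 1, 50],
    "pasture_ha": [0, 5],
}

# Hypothetical edit rules; each returns True for an admissible record.
RULES = {
    "animals_iff_livestock": lambda r: (r["n_animals"] > 0) == r["has_livestock"],
    "no_animals_without_livestock": lambda r: r["has_livestock"] or r["n_animals"] == 0,
    "pasture_needs_livestock": lambda r: r["pasture_ha"] == 0 or r["has_livestock"],
}


def all_records():
    """Enumerate every record in the (small) domain."""
    keys = list(DOMAIN)
    for values in product(*(DOMAIN[k] for k in keys)):
        yield dict(zip(keys, values))


def accepted(rules):
    """Records that satisfy every rule in the set."""
    return [r for r in all_records() if all(rule(r) for rule in rules.values())]


def is_conflicting(rules):
    """True if no record in the domain can pass all the rules."""
    return len(accepted(rules)) == 0


def redundant_rules(rules):
    """Rules whose removal leaves the set of accepted records unchanged."""
    full = accepted(rules)
    redundant = []
    for name in rules:
        others = {k: v for k, v in rules.items() if k != name}
        if accepted(others) == full:
            redundant.append(name)
    return redundant


print(is_conflicting(RULES))   # False
print(redundant_rules(RULES))  # ['no_animals_without_livestock']
```

Here the second rule is implied by the first, so it adds no information. Each rule is tested against all the others, so two rules that duplicate each other would both be reported and only one of them should be dropped.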

Problems can also arise if we treat some errors with improper methods. Treating

deterministic errors with imputation methods suited to random errors may introduce

significant biases into the data. Additionally, it may not be optimal to defer the

correction of errors to the automatic editing phase. When a controlled recording of

the collected data is possible, it is good practice to correct recording errors at that

stage. Automatic data editing will probably identify the errors caused by an

incorrect recording, but will impute them inefficiently. By correcting these errors when they are generated, we will

obtain a better approximation of the correct values.
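
The difference between treating a deterministic error with a deterministic rule and treating it as if it were random can be shown with a small worked example. It assumes a hypothetical systematic error in which some respondents report areas in square metres instead of hectares; the data, the plausibility limit `MAX_PLAUSIBLE_HA`, and the function names are illustrative assumptions only.

```python
from statistics import mean

M2_PER_HA = 10_000.0
MAX_PLAUSIBLE_HA = 1_000.0  # assumed upper bound for a farm area in hectares

# Hypothetical reported areas; the third and fifth values are assumed to
# have been recorded in square metres (true values 15.0 and 9.5 hectares).
reported = [12.0, 8.5, 150_000.0, 20.0, 95_000.0]


def fix_unit_error(value: float) -> float:
    """Deterministic correction: convert implausible values from m2 to ha."""
    return value / M2_PER_HA if value > MAX_PLAUSIBLE_HA else value


def mean_impute(values):
    """Treatment suited to random errors: replace flagged values by the
    mean of the values that pass the plausibility check."""
    fill = mean(v for v in values if v <= MAX_PLAUSIBLE_HA)
    return [v if v <= MAX_PLAUSIBLE_HA else fill for v in values]


deterministic = [fix_unit_error(v) for v in reported]
imputed = mean_impute(reported)

print(deterministic)  # [12.0, 8.5, 15.0, 20.0, 9.5]
print(imputed)        # [12.0, 8.5, 13.5, 20.0, 13.5]
```

Under these assumptions the deterministic rule recovers the true values, while mean imputation replaces both erroneous values with 13.5 and discards the information they carried.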

In the case of interactive data editing, a serious problem may occur if one or

more operators do not comply with the established procedures. The effect of any

bias introduced in this way may be even greater than in other cases, because

information should be restored close to reality in interactive corrections. Indeed, this

mode of operation is usually applied to very influential units, for example, large

farms. In this case, the first step is to return to the questionnaire, or to consult

administrative archives or other sources. If the information therein is not considered

reliable, it must be collected again by returning to the farm (Berthelot and Latouche

1993).
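
Restricting interactive editing to the most influential units can be sketched with a simple score that ranks units for manual review. The score used here (sampling weight times the absolute difference between the reported value and an anticipated value) is an illustrative assumption and is not claimed to be the one proposed in the cited work; the class `Unit`, its fields, and the `budget` parameter are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Unit:
    farm_id: str
    weight: float    # sampling weight of the unit
    reported: float  # value reported in the current survey
    expected: float  # anticipated value, e.g. from a previous survey


def score(unit: Unit) -> float:
    """Assumed influence score: weighted absolute deviation from the
    anticipated value."""
    return unit.weight * abs(unit.reported - unit.expected)


def select_for_review(units: List[Unit], budget: int) -> List[Unit]:
    """Return the `budget` units with the highest scores."""
    return sorted(units, key=score, reverse=True)[:budget]


units = [
    Unit("A", weight=1.0, reported=5000.0, expected=1200.0),  # large farm
    Unit("B", weight=40.0, reported=12.0, expected=10.0),
    Unit("C", weight=25.0, reported=30.0, expected=29.0),
]

for u in select_for_review(units, budget=2):
    print(u.farm_id, score(u))
```

With these illustrative figures the large farm A ranks first even though its sampling weight is the smallest, because its deviation has the largest potential effect on the estimated total.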

Automatic editing procedures should be designed to prevent the introduction of

errors and biases during implementation. Thus, the plan should first carefully assess

if an imputation process is actually required, rather than simply identifying and

counting incompatibilities in the data. In general, it is a good practice to give

priority to methods that have well-known theoretical and statistical properties,
