GUTTMAN SCALE (Social Science)

Guttman scaling is a method of scale construction developed by Louis Guttman (1916-1987) in the 1940s. Widely used in the measurement of attitudes and public opinion, Guttman scaling aims to establish unidimensional measurement instruments. A typical Guttman scale includes a series of items or statements tapping progressively higher levels of a single attribute. For instance, a scale designed to measure attitudes toward immigrants might ask people to indicate whether or not they would accept immigrants as (a) citizens in the country, (b) coworkers in the same company, (c) neighbors, (d) close friends, and (e) close relatives by marriage. A tolerant individual would probably endorse every statement in the scale. A less tolerant individual might be willing to accept immigrants as citizens in the country but not as neighbors or close family members. Thus, knowing that a person is not willing to accept immigrants as neighbors allows us to infer that this person probably would not want immigrants as close friends either; accepting them as family members would be even less likely for this person. By knowing the last statement endorsed by a respondent, one can easily reproduce his or her pattern of responding to the rest of the statements. In a perfect Guttman scale, the pattern of responding across the scale can be reproduced without any errors.
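
The cumulative logic described above can be illustrated with a short Python sketch. It assumes a perfect scale whose items are ordered from easiest to hardest to endorse; the item labels come from the immigrant example, while the names ITEMS and reproduce_pattern are hypothetical and chosen only for illustration.

```python
# Items ordered from the easiest to the hardest to endorse (from the example above).
ITEMS = [
    "citizens in the country",
    "coworkers in the same company",
    "neighbors",
    "close friends",
    "close relatives by marriage",
]


def reproduce_pattern(score, n_items=len(ITEMS)):
    """Return the ideal cumulative response pattern for a total score.

    Assumes a perfect Guttman scale: a respondent who endorses `score`
    items endorses exactly the `score` easiest ones (1 = endorsed, 0 = not).
    """
    if not 0 <= score <= n_items:
        raise ValueError("score must be between 0 and the number of items")
    return [1] * score + [0] * (n_items - score)


if __name__ == "__main__":
    # A respondent whose last endorsed statement is "coworkers" has a score of 2.
    for item, answer in zip(ITEMS, reproduce_pattern(2)):
        print(f"{item}: {'accept' if answer else 'reject'}")
```

In practice, observed responses rarely match this ideal pattern exactly, which is why the construction procedure counts reproduction errors.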

When the issue is relatively concrete, such as the amount of smoking, or when the construct is hierarchical in nature, such as social distance, Guttman scales work quite well; however, when it comes to measuring relatively more abstract attributes such as attitudes toward immigrants or marriage, it is not always easy to construct such highly reproducible scales.

As in all methods of scale construction, the first step in Guttman scaling is to generate a large set of items representing the construct. Then, a group of judges (80 to 100) evaluates each of the statements. Depending on the nature of the construct, the judges might indicate whether or not they agree with each statement, or whether the statement reflects the absence or presence of the phenomenon. For instance, if the construct in question is attitudes toward gun control, the judges will try to determine whether each item in the scale (e.g., "future manufacture of handguns should be banned") is in favor of or against gun control. In the next step, the responses of each judge are tabulated. In this table, the columns represent items in the scale, and the rows represent the judges. Once all the entries are laid out, the affirmative responses are summed to create a total score for each judge, and the table is sorted so that judges who agree with more statements are listed at the top and those agreeing with fewer are at the bottom. If the set of items forms a cumulative scale, then it should be possible to reproduce the responses of each judge from his or her total score. Errors in reproduction indicate deviation from the optimum form and call for revision of the scale, which can be done by changing or dropping existing items or even adding new items. The scale is revised until it becomes possible to reproduce over 90 percent of the responses from the total score. At this point, some statistical analyses can be conducted to examine deviations from a perfectly cumulative scale. These statistics can also be used to estimate a scale score for each item, just as in Thurstone scaling; ultimately, these scale scores are used in the calculation of a respondent's final score. A final set of items emerges after several revisions. Once the final set of items is established, the order of the items is mixed again to derive the final form of the scale to be shared with actual respondents.
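
The 90 percent criterion is usually expressed as a coefficient of reproducibility: one minus the number of reproduction errors divided by the total number of responses. The Python sketch below shows one way this could be computed. It assumes a particular error-counting convention, namely that every response differing from the ideal pattern implied by a judge's total score is one error, with items ordered from most to least endorsed. Other conventions exist and can give different values. The function name and the sample data are hypothetical.

```python
def coefficient_of_reproducibility(responses):
    """Compute 1 - errors / total responses for a table of 0/1 answers.

    `responses` has one row per judge and one column per item.
    Assumed convention: an error is any response that differs from the
    ideal cumulative pattern implied by the judge's total score.
    """
    n_judges = len(responses)
    n_items = len(responses[0])

    # Order items from the most to the least frequently endorsed.
    endorsements = [sum(row[j] for row in responses) for j in range(n_items)]
    order = sorted(range(n_items), key=lambda j: -endorsements[j])

    errors = 0
    for row in responses:
        score = sum(row)  # total score = number of affirmative responses
        observed = [row[j] for j in order]
        ideal = [1] * score + [0] * (n_items - score)
        errors += sum(o != i for o, i in zip(observed, ideal))

    return 1 - errors / (n_judges * n_items)


if __name__ == "__main__":
    # Made-up data: five judges, four items; the last row breaks the pattern.
    table = [
        [1, 1, 1, 1],
        [1, 1, 1, 0],
        [1, 1, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 1, 0],
    ]
    print(round(coefficient_of_reproducibility(table), 3))
```

Under this convention, a set of items that falls short of the threshold would be revised and the coefficient recomputed on the new responses.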

The primary area of application has been attitudes and public opinion; Guttman constructed a number of attitude scales during World War II using these procedures. Currently, however, Guttman scaling is almost never used as a method of constructing attitude scales. Investigators prefer simpler methods of scale construction, such as Likert scaling and the semantic differential technique.
