The Standard Cross-Cultural Sample, or SCCS (Murdock and White 1969), is a cumulative and collaborative database of coded variables on maximally diverse and ethnographically best-described societies used by scholars in the social sciences. The champion of modern cross-cultural and statistical methods, George P. Murdock, in preparation for a standard sample, had classified the 1,267 societies in his coded Ethnographic Atlas into 200 distinctive world cultural provinces (Murdock 1962-1967, 1968). Douglas R. White (1968) had compiled a database of coded cross-cultural studies and done a concordance of previous samples (repeated by Ember 1992) that showed the fruitlessness of testing hypotheses that involved variables from different studies because there was little overlap between randomly drawn or ad hoc samples. White (1969), linked to the Columbia-Michigan historical-evolutionary successor to the Boasian school, had just completed the first comparative historical study to use Murdock’s Ethnographic Atlas codes in a regional study. The dual-authored approach to the SCCS signified a rapprochement between the Yale school of evolutionary functionalism and the historical anthropological schools at Columbia, Michigan, and Berkeley. Their founding of the Cumulative Cross-Cultural Coding Center (CCCCC) for the SCCS at the University of Pittsburgh (1968-1973), like their 1969 authorship, was coequal, although they were born nearly fifty years apart.
Both Murdock and White (1969) were advocates of multiple competing hypotheses (Chamberlain 1897). They designed the SCCS to be sufficiently large to test multivariate and competing hypotheses and sufficiently small to allow different investigators to code all the cases and so contribute to a cumulative database. Their evaluation of the literature on the 1,267 Ethnographic Atlas societies and other ethnographies led to their selection of the best and preferably earliest described representatives for 186 of Murdock’s cultural provinces that were most independent of one another. Each society was pinpointed to specific dates and communities (White and Murdock 2006). Bibliographic recommendations for ethnographic sources (White 1986) were classified by focus and pertinence. The SCCS, the goal of which was to represent the cultural diversity of well-described human societies, ranges from contemporary hunter-gatherers such as the Kung to historical cities (e.g., Babylon, 1750 BCE; Rome, 110 CE; the Khmer capital of Angkor, 1292; Erevan, 1880; Abomey, 1890) to communities of industrial nations (e.g., an Irish village, 1932; a Russian commune,
Construction of the SCCS reflected Murdock and White’s concerns for sorting out alternative hypotheses. Correlations among variables in the SCCS cannot be taken as necessarily causal or functional relationships. As with similarities between the SCCS societies, they are affected by many factors, and the challenge is to separate out the different strands of influence, including measurement biases that may be studied by data quality controls (Naroll 1962; Whyte 1978b). Correlations or similarities may be influenced by functional associations, diffusion or borrowing from one society to another, or shared characteristics passed along through common ancestry. It is only the assumption of parallel independent inventions that supports the interpretation of a correlation between traits as functional adhesions in cultural systems. Murdock (1956), Harold E. Driver (1956), White (1968, 1975), and many other social scientists have long recognized the complementarities of different processes and the problems of inferences from correlational tests. History and science are integrally connected in this respect (Lyman and O’Brien 2004). The interpretation of intersocietal similarities and correlational analysis and of comparative distributional findings necessarily involves the different kinds of networks that mediate the effects of diffusion or borrowing on correlations or similarities: Intermarriage, migration, economic exchange, political, and other interactions may be involved. Likewise, shared "ancestry" may include common "stock" that is linguistic, political, or the result of outmigration or genetic origin, such as mitochondrial genes passed entirely through the maternal line, which have strong adaptive correlates to environment (Mishmar et al. 2003). A similar principle applies to strictly paternal genetic origin passing through the Y chromosome. Assortative genes also may be implicated in selection for correlates or similarities. 
Without large samples of cases, such as the SCCS, or even larger, such as Murdock’s Ethnographic Atlas, the data are likely to be insufficient to separate or tease out different kinds of effects. From a statistician’s point of view, many of the alternative kinds of effects fall under the general term of historical nonindependence of cases. Only one major attempt has been made so far, however, to implement a diachronic coding of variables such as those studied by Eric Wolf (1982), but there is no intrinsic constraint against doing so for researchers ambitious enough to try. Such an approach would remedy the defect of using the SCCS exclusively for a synchronic or snapshot approach to comparison rather than coding changes over time at spatially pinpointed and related sites.
Edward B. Tylor, although very familiar with diffusion and issues of common ancestry, was rightly critiqued for asserting that cross-cultural correlations represented functional evolutionary parallelisms in how traits are linked. Following Tylor’s 1889 paper, statistician Francis Galton objected that because his cases were not historically independent, Tylor was not warranted to apply the usual tests of statistical significance to his correlational results. The reason is that when duplicates of the same originals are included in a sample (which does not necessarily change correlations, since that would depend on which kinds of cases are duplicated), the variance among the originals is much greater than it would appear from computing ordinary standard deviations. Variance is an average: the sum of squared deviations divided by the number of cases. Dividing the observed deviations by the smaller number of originals therefore yields a larger variance, and the statistical significance of any correlation so affected is diminished accordingly. This critique, however, allows for a sample-size adjustment to recompute the effective variance for a smaller estimated number of independent cases. All this was well understood in the 1880s. In the 1889 oral discussion, H. W. Flower made an additional important comment: that any cross-cultural method "depended entirely upon the units of comparison being of equivalent value" (Tylor 1889, p. 272) as, for example, the contrast between individual communities and larger religious, political, or regional units. The Galton and Flower problems may be linked in that if the traits of the smaller communities reflect those of the larger units, then the statistical significance of correlations of traits for the communities must be adjusted by reducing the effective n (estimated number of independent cases) to that of the larger units.
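The sample-size adjustment described above can be illustrated with a minimal sketch. It assumes a Fisher z normal approximation for the significance of a correlation, which is one common choice and not necessarily the test used by any of the authors cited here. The correlation of 0.30 and the effective n of 40 are hypothetical values chosen only for illustration; 186 is the number of SCCS societies.

```python
import math

def fisher_z_pvalue(r, n):
    """Two-sided p-value for a correlation r computed from n independent
    cases, using the Fisher z normal approximation (needs n > 3, |r| < 1)."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) is the two-sided tail area of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical illustration: the same observed correlation, evaluated
# first as if all 186 cases were independent, then with an assumed
# effective n of 40 historically independent cases.
r = 0.30
n_nominal = 186
n_effective = 40  # assumption for illustration, not an estimate from data

p_nominal = fisher_z_pvalue(r, n_nominal)
p_effective = fisher_z_pvalue(r, n_effective)

print(f"p with n = {n_nominal}: {p_nominal:.5f}")
print(f"p with effective n = {n_effective}: {p_effective:.5f}")
```

The correlation itself is unchanged in the two calls; only the number of cases credited as independent differs, and with it the strength of the evidence the test reports.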
Galton’s problem, as named by Raoul Naroll (1961), is the problem of historically nonindependent cases, and it is not limited to cross-cultural research. It entails foundational statistical problems endemic to non-experimental research in which the variance of statistics is underestimated when common historical factors or network interactions influence similarities of cases. Samples with historically nonindependent cases can still yield unbiased statistics (means and correlations, for example), but the incorrect variance estimates of those statistics (and thus the significance tests) require appropriate adjustment for Galton’s problem, independently of the question of how to interpret statistics in the light of alternative hypotheses. Random sampling provides no escape.
There remains an unfortunate Tylorian tendency in cross-cultural research today, however, to try to interpret cross-cultural correlations as evidence of functional associations among traits if the sample is both small and random. Carol R. Ember and Melvin Ember (1998), for example, incorrectly argue that the independence of cases is strictly a matter of independent selection of cases in the sample, as if historical independence did not matter. Their view (1998, p. 678) that "independence of cases means only that the choice of one case is not influenced by the choice of any other case (which random sampling guarantees)" is fundamentally mistaken. Simple random or cluster sampling of one or multiple cases from a sampling frame of well-described societies (e.g., Naroll 1967; Lagace 1979) does not solve nonindependence problems. Nor does it solve the problem of representation of diversity where data for comparable descriptive coverage are lacking for the vast majority of cases in the underlying universe, as in cross-cultural research. The mistaken view that random sampling solves Galton’s problem by guarding against sampling bias ignores the real problems, those of variance underestimates that skew significance and other tests in favor of the theory being tested. Malcolm Dow (1993) discusses the effects of ignoring Galton’s problem on unwarranted saving of incorrect hypotheses, and shows that appropriate statistical controls can also help identify results that would otherwise be rejected. Without appropriate variance estimates, two independent tests of the same correct hypothesis can easily fail to replicate if confidence limits are underestimated.
Some researchers now refer to nonindependence of cases as "Galton’s Opportunity" (Witkowski 1974) or "Galton’s Asset" (Korotayev and de Munck 2003) because historical nonindependence and network interactions invite further research into alternative hypotheses. More recently, cultural anthropologists have used the SCCS, along with methods from evolutionary biology, to address common historical ancestry, horizontal transmission, environmental adaptation, and functional interrelations in the distributions of cultural traits (Mace and Pagel 1994; Borgerhoff Mulder et al. 2001). Methods of independent contrasts (Nunn, Borgerhoff Mulder, and Langley 2006), for example, are sensitive to even small amounts of horizontal transmission in cultural datasets.
The SCCS was designed to provide some of the appropriate measurements for Galton’s problem controls and for adjusting estimates of variance and significance tests accordingly. Murdock and White (1969) provided simple tests, following those proposed by Naroll (1961, 1965), to detect similarities among societies that depended on their relative geographic closeness and overall cultural affinities. They also provided a provisional phylogenetic language classification to help detect one of the types of common origin that might account for similarities among nonindependent cases. These allowed for estimates of the effective sample size of different variables. Benefits could increase for multivariate analysis if some of the new coding studies would expand ethnographic coverage to a new Extended SCCS, yet to be designed, with double the number of cases.
Hundreds of cross-cultural studies have by now contributed new codes for the pinpointed societies in the SCCS. Those resulting from Murdock and White’s 1968-1973 CCCCC research projects are published along with others in Barry and Schlegel (1980). The thousands of authors who have used SCCS data for their research cover virtually every area in which cross-ethnographic comparisons are useful, including a great many subdisciplines of the social and related sciences. Assuming that a researcher avoids spurious findings that result from mistaken strategies such as cherry-picking high correlations and significance or ignoring Galton-type problems and opportunities, cumulativity can have exponential benefits when researchers, typically coding thirty or more new variables, can test relationships in a database with thousands of variables.
A sample of some findings of authors who treat their subject comprehensively at book length, using the SCCS, and who added coded data on their specialties, will illustrate how the SCCS is sufficiently large to test multivariate hypotheses. Sociologist Orlando Patterson (1982) carried out a magisterial study of the internal dynamics of slavery based on his own codings of slavery variables for the sixty-six slave societies in the SCCS. This was a first-of-its-kind study on the nature of slavery over time, world and historical-comparative in scope: tribal, ancient, pre-modern, and modern. Slavery is shown to be "a parasitic relationship between master and slave, invariably entailing the violent domination of a natally alienated, or socially dead, person," and its internal dynamics to involve "a single process of recruitment, incorporation on the margin of society, and eventual manumission or death."
Economist Fred Pryor (2005a) carried out a similarly broad program of research in his comparative study of world economic systems. In his article (2005b) on the forty-one agricultural societies in the SCCS, he used clustering analysis of variables to cover a full range of variation in production, property, and distribution. He found evidence for only four basic agricultural systems among thirty-six clusterable cases and apart from the five that were unclusterable: herding plus, egalitarian farming, individualized, and semimarketized farming. Although "many anthropologists and historians consider agricultural systems to be the outcome of environmental, social, social-structural, and political variables, a statistical analysis indicates that very few such variables are correlated with the derived economic systems. The systems are thus revealed to stand as independent entities and worthy of more intensive study" (2005b, p. 2).
Anthropologist and Islamic specialist Andrey Korotayev (2004) was the first to code world religion for the SCCS. Two of the most powerful of his fertile set of findings are that world religion is the best predictor of large regional similarities in social structure, and that many of the major types of social structure (like Pryor, identified by a cluster analysis on the relevant variables) closely follow enduring regional boundaries such as the extent of the eighth-century caliphates resulting from the Arab/Islamic expansion.
Karen Paige and Jeffrey Paige (1981), teamed as sociologist and political scientist, succeeded in identifying systemic patterns in their study of gender roles by restricting their focus to the 108 prestate societies in the SCCS. They sharpened their hypotheses to focus on three determinative levels of resources: low-value, unstable, and stable. The hypotheses they tested showed how resource levels affect women and the womanly interests of men either in identification with females or in surveillance over female reproduction. With low material resources, females tend to be food producers and highly valued for the reproduction of children who add labor and enlarge the kin group and its prestige. With stable resources, property and inheritance become major issues for men as regards women: Children’s paternity comes to be at stake, and men often form conflicting fraternal interest groups. The findings of these authors show high coherence with respect to their theory of the politics of reproductive issues and the effects of these issues on social organization generally.
Evolutionary biologist Laura Betzig (1986) focused on the starkest of Darwinian issues, power and the differential extremes of open or sub rosa control over female reproduction in harems and among concubines and mistresses, as documented by her coding of the 186 cases in the SCCS. Like Paige and Paige, she regarded men in societies with property as strongly concerned with the fidelity of their wives, but she went further to explore the vicious circle of links between differences in power and differences in reproductive success that are virtually without limit for the most powerful males in the historical era. Controversial and starkly sociobiological, her explanation as to why modern states become less despotic is that to attract mercenaries, specialists in defense, craftsmen, and those who run the state, people in power are forced to make concessions to others who still serve, directly or indirectly, to contribute to the reproductive efforts of men in power. Her philosophical predilection is to reject theories of further checks and balances in favor of an extreme: that the powerful dictate the laws in their own (reproductive) interests even in the absence of absolutist despotism.
Peggy Sanday (1981), exploring feminist issues, rejected arguments of universal female subordination, and after coding variables for different measures of relative male domination and female power, argued that dominance is not inherent in human relations but is socially constructed through deep symbolic mechanisms and not only as instituted in a people’s secular power roles and behavior. Symbolic sources of male dominance, she argued, derive partly from ancient concepts of power, as exemplified by origin myths. Her hypotheses were designed to test the extent to which female power and male dominance are further determined by a people’s adaptation to their environment, social conflict, and emotional stress. She illustrated her thesis through case studies of the effects of European colonialism, migration, and food stress, supported by statistical associations between aspects of sexual inequality and diverse forms of cultural stress.
The advantage of a database like the SCCS is that, whatever theories authors are hoping to test, in testing them they contribute coded data and statistical hypothesis tests that can be revisited and challenged by others, using both new data and data cumulated from the past. Some researchers are intent on coding variables that reflect the range of variability in the phenomena they study and on working more inductively from their findings, guided by theoretical questions. An example of a strongly inductive approach is that of Martin K. Whyte, who instructed his researchers to code half the SCCS societies for each of hundreds of gender-related variables relevant to the literature on gender roles. He summarized how his findings on male dominance contrast with those of Sanday, noting that his variables have divergent cross-cultural distributions. Some, such as items for political leadership, are highly skewed in favor of men; others, such as property inheritance …, are more moderately skewed toward men; still others, such as the elaborateness of funerals or final authority over infants, show little or no male bias cross-culturally. … [Further], these different indicators are not associated with each other. … [and] some things that have been assumed in the … literature to have status implications for women may not. For example, there now seem to be no grounds for assuming that the relative subsistence contribution of women has any general status implications. (Whyte 1978a, p. 169)
Many of these authors address the Galton problem of controlling for nonindependence of cases. How prevalent is autocorrelation among the variables studied in cross-cultural research? Econometrician Anthon Eff (2004) tested 1,700 variables in the SCCS database to measure Moran’s I for spatial autocorrelation (distance), linguistic autocorrelation (common descent), and autocorrelation in cultural complexity (mainline evolution). "The results suggest that … it would be prudent to test for spatial and phylogenetic autocorrelation when conducting regression analyses with the Standard Cross-Cultural Sample" (Eff 2004, p. 153). He illustrated the use of autocorrelation tests in exploratory data analysis, showing how all variables in a given study can be evaluated for nonindependence of cases in terms of distance, language, and cultural complexity. He explained the methods for estimating these autocorrelation effects, illustrated ordinary least squares regression using the Moran’s I significance measure of autocorrelation (options for Durbin-Watson tests are commonly available as an alternative), and showed how, when autocorrelation is present, it can often be removed so as to get proper estimates of regression coefficients and their variances. This is done by constructing a respecified dependent variable that is "lagged," that is, formed by weighting the values of the dependent variable at other locations, where the weights reflect degree of relationship. Ordinary least squares regression will still bias the estimated coefficients when the dependent variable is respecified, but maximum likelihood methods (Anselin 1988) will give unbiased statistics and variances in which the effects of autocorrelation have been removed.
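The two quantities named above, Moran’s I and a "lagged" variable, can be sketched in a few lines. The example below is a minimal illustration and not Eff’s procedure: the four cases, their values, and the binary neighbor matrix are hypothetical, whereas the weight matrices in an actual study would be built from geographic distance, language relationship, or cultural complexity. The function names are likewise illustrative.

```python
def morans_i(x, w):
    """Moran's I for values x and a weight matrix w (list of lists),
    where w[i][j] is the strength of relationship between cases i and j
    and the diagonal is zero."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    s0 = sum(sum(row) for row in w)  # sum of all weights
    cross = sum(w[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)

def lagged(x, w):
    """Weighted average of the values at related cases, using
    row-normalized weights (each case must have at least one neighbor)."""
    return [sum(wij * xj for wij, xj in zip(row, x)) / sum(row)
            for row in w]

# Hypothetical toy data: four cases arranged along a line, each related
# only to its immediate neighbors.
w = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

trend = [1, 2, 3, 4]        # neighbors resemble one another
alternating = [1, 4, 1, 4]  # neighbors differ as much as possible

print(morans_i(trend, w))        # positive autocorrelation
print(morans_i(alternating, w))  # negative autocorrelation
print(lagged(trend, w))          # neighbor averages for each case
```

A positive value of the index indicates that related cases tend to have similar values, which is the pattern Galton’s problem warns about. Assessing its significance and estimating a lagged regression by maximum likelihood require more machinery than is shown here.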
Use of the SCCS seems to encourage good research practices. Other methodological advances that would not have been made without the shared SCCS database include statistical entailment analysis, for example, of the sexual division of labor (White, Burton, and Brudner 1977; White 2000) and Murdock’s (1980) use of this discrete-structure statistical method in his study of cultural theories of illness and their sociological entailments.
All of the cross-cultural articles and data published in the journal Ethnology for the SCCS and all the bibliographic, pinpointing, and coded data sets of the SCCS are in the public domain so as to facilitate scientific research. The journal World Cultures, edited by White from 1985 to 1990, has continued to publish SCCS cross-cultural codes and analytical articles. Google Scholar as of 2006 lists 413 online citations to articles referencing the SCCS, and the number unreferenced is perhaps four to six times that estimate. These works address a huge variety of topics. Their diversity, and their common references to a framework of variables and sample cases, along with the agreements and relatively clear bases for disagreements among authors, can be taken as indicators of success in the research design and cumulativity of the SCCS.
The SCCS is not about statistics or method but about science, and about broadly encompassing anthropological-cum-historical science that encompasses contending and often complementary theories of the social, biological, and physical sciences as they interact with questions about human societies and culture. The SCCS does not represent a narrowly conceived school of thought about what the assumptions or methods of this science ought to be, other than a good and far-ranging combination of science, history, and humanities. If one draws today from new findings about mitochondrial inheritance of energetic-environmentally adaptive genes in the maternal line, for example, and reconstructs the human matriline in its geographic migrations (and similarly for the Y chromosome patriline), is the SCCS a place to try to develop approaches to understanding the complexities and testing hypotheses about how human evolution has proceeded to the present? Or, taking Wolf’s (1982) approach to world sociopolitical comparisons through a diachronic lens of world system histories and interactions: Is the SCCS not a suitable sample for entirely different kinds of codes that compare what is known about these societies through time, and through networks of interaction, not with one another but with larger entities of the global system and through the larger networks of sociopolitical and military interactions?
Recognizing that "each society is a process in time" (quoting Edmund Leach), Robert McC. Adams (2004, p. 353) reviews the vexing problems of using a mix of textual sources and archaeological data in coding or process modeling. To code or comprehend through space and time, however, also involves attention to the subjective positioning of the different texts that provide perspectives on history, and so opens into a whole set of other classical problems in anthropology that remain to be successfully integrated, whether at the level of the individual investigator, the research team, or a larger and more cumulative enterprise. Part of the success of the SCCS collectivity of researchers, in relation to those who publish the data from different investigators and sources, lies in not trying to edit out or reedit data but in respecting the integrity of the original data, in correct form, as originally presented from a particular standpoint. Thus different streams of data coming from different sources and investigators are not compromised. Rather, these independent streams themselves can be compared for indicators of what might be missing, hinted at, biased, or interpreted as reliable through cross-validation and triangulation of methods of analysis.
Mindful of Flower’s observation that any cross-cultural method depends entirely on the units of comparison and the problem under study, Murdock and White did not regard the SCCS as a unique touchstone for theory testing but only as a worthy example of what could be accomplished with collaborative construction of shared databases and efforts at comparable codings from ethnographic materials to facilitate new understandings in the human sciences.