COMPUTER APPLICATIONS IN SOCIOLOGY

Most sociologists, both professionals and students, now have their own computer with direct access to a printer for writing and to the Internet for electronic mail (e-mail). Beyond the basic tasks of writing and e-mailing are a variety of other computer-supported research applications, both quantitative and qualitative. This article describes how sociologists and other social scientists use these applications and what resources are available.

The data and modeling requirements of social research have united sociologists with computers for over a hundred years. It was the 1890 U. S. census that inspired Herman Hollerith, a census researcher, to construct the first automated data processing machinery. Hollerith’s punchcard system, while not a true computer by today’s definitions, provided the foundation for contemporary computer-based data management.

In 1948 the U. S. Bureau of the Census, anticipating the voluminous tabulating requirements of the 1950 census, contracted for the building of Univac I, the first commercially produced electronic computer. The need to count, sort, and analyze the 1950 census data on this milestone computer led to the development of the first high-speed magnetic tape storage system, the first sort-merge software package, and the first statistical package, a set of matrix algebra programs.

Only a decade later many social scientists were exploring ways to use computers in their research. Not only were social scientists writing about how to apply computers, they were designing and developing new software. Some of the most popular statistical software packages, e.g., SPSS (Nie, Bent, and Hull 1975), were developed by social scientists.

During the 1980s, universities and colleges began to acquire microcomputers, accepting the premise that all researchers needed their own desktop computing equipment. The American Council of Learned Societies (ACLS) survey in 1985 (Morton and Price 1986) reported that 50 percent of sociologists had a computer for exclusive use. A survey of academic departments supported by the American Sociological Association (Koppel, Dowdall, and Shostak 1985) found that slightly less than half of the sociology faculty reported having immediate access to microcomputers. To put these findings into a more complete perspective, of the approximately 9,000 sociologists in 1985, about 4,500 had their own computers and about 5,200 reported routine computer use. Now it is hard to find a sociologist’s office without at least one computer, and in many countries most students in sociology have a computer for writing papers and accessing online resources.

Sociology and the Web. The Internet may be one of the largest and probably the most rapidly growing peaceful social movements in history. It is not just a technology, or a family of technologies, but a rapidly evolving socio-cultural phenomenon often called “cyberspace” or “cyberculture.” No matter how this phenomenon is defined, it is changing the way sociologists conduct their work.

By the mid-1990s sociology, like most other academic disciplines, had come to depend upon e-mail. In addition, a rapidly growing number of sociologists had begun to use the World Wide Web (WWW), commonly called the Web (Babbie 1996). Bainbridge (1995) claimed that the Web is “a significant medium of communication for sociologists, and extrapolation of present trends suggests it may swiftly become the essential fabric of sociology’s existence.” In January 1999, the author searched Web sites with the Alta Vista search engine for the word “sociology” and found 750,000 instances. Two years earlier a search yielded only 250,000. Sociology is indeed rapidly building its presence on the Web.

The Internet and the Web are often described as a medium of communication because e-mail, electronic conferencing, online chats (synchronous discussion), groupware, and data exchange imply social interaction. Another major metaphor of the Web is that of a database for information search and retrieval. However, new roles are emerging for the Web. For the individual, the Web has become an important “presentation of self,” an opportunity for publishing personal and professional resumes and other such information. For the organization, the Web has become an opportunity for advertising, recruiting, communicating with the public, and conducting commerce itself. The rapid proliferation of personal Web sites in the form of various home pages suggests that many now see the Web primarily as a medium for personal and organizational impression management.

In the next few years the Web and its technologies will offer many new opportunities for conducting research. Already there are software packages developed for Web-assisted surveys and other types of data collection. The amount of data accumulating on the Web is enormous, and the graphical content of the Web poses a particular challenge to sociologists conducting computerized content analyses. The pace of these new developments will continue to challenge sociologists not only to stay current with the new tools for research but also to conduct research on the rapidly growing cultures of the Internet.

Publications on Sociological Computing. The primary source for articles on social science computer applications is the Social Science Computer Review, a quarterly publication of Sage Publications, Inc. Other relevant software reviews periodically appear in such journals as Educational and Psychological Measurement, the Journal of Marketing Research, The American Statistician, and Simulation and Games. In addition, JAI Press publishes an occasional series volume on “Computers and the Social Sciences.”

In the late 1980s, the American Sociological Association (ASA) formed a “Section on Microcomputing.” Over 350 sociologists joined this new section in 1990, making it the fastest-growing new section in the history of the ASA. The section publishes a quarterly newsletter and organizes sessions at annual meetings. In 1993, the section name was officially changed to “Sociology and Computers.” A regular newsletter called SCAN (Sociology and Computers: A Newsletter) is published by members.

Another indicator of growing involvement in computing was the first annual conference, “Computing in the Social Sciences,” held in 1990 at Williamsburg, Virginia. From this conference emerged a professional association, the Social Science Computing Association. The official journal of this association is the Social Science Computer Review, and the association still holds an annual conference.

Computer Applications in Sociology. The practice of computing in sociology has evolved rapidly. Computers have been applied to practically every research task, including such unlikely ones as field note-taking, interviewing, and hundreds of other tasks (Brent and Anderson 1990). The many diverse uses of computing technology in social research are difficult to categorize because applications overlap and evolve unpredictably. Nonetheless, it is necessary to discuss different categories of applications in order to describe the state of the art of computing in sociology. Since 1987, the Winter issue of the Social Science Computer Review has been devoted to an annual symposium on the “State of the Art of Social Science Computing.” The categorization of computer applications in this article reflects these discussions. First, some major types of applications are summarized in order of descending popularity. Then some of the challenges of computing for sociological research are noted.

Writing and Publishing. Once equated with the secretarial pool, word processing now is an activity of nearly every graduate student and professional in sociology. It consists not only of writing but also of preparing tables, “typesetting” mathematical equations, and resizing objects, such as three-dimensional graphs embedded within text. Social researchers are using such capabilities and moving rapidly toward workstation environments that obscure the transition between data analysis and manuscript preparation (Steiger and Fouladi 1990). Not only do researchers use their computers for writing papers, but word processing software plays a central role in the refinement of data collection instruments, especially questionnaires and codebooks, which allows for rapid production of alternative forms and multiple drafts.

Trends in text production that blur traditional distinctions between writing and publishing (Lyman 1989) may in the long term have the most impact on what sociologists do. The growing body of articles and books in electronic-text form propels scholarship toward hypertext, which is a document system that provides for nonsequential reading of text using links that automatically access other documents. Contemporary word processors can easily produce documents in HTML (Hypertext Markup Language) that are ready for installation as sites on the Web. HTML can contain hypertext links, which with a single click of the mouse can bring up a totally different document from anywhere in the world, making it a truly new form of publishing.

There are several major forms of text entry that may also change the nature of writing and publishing. These forms include scanning for optical character recognition (OCR), voice recognition, and automated language translation. The technology for scanning text documents and producing computer text files has been in use for some years and requires only a scanning device and OCR software. This technology will continue to improve and its use will yield fewer errors and require considerably less effort in the future.

Likewise, it is now possible to use voice recognition software to automatically transcribe dictation, interviews, and field notes into computer text files. One of the remaining problems in this approach is that all voice recognition software now requires considerable “training” time during which the errors made in recognizing a speaker’s word-sound pattern are corrected. The software is thus “taught” to make refined guesses in translating vocalized sounds into electronic text. Even the best voice recognition software now makes a moderate number of errors, so it is not yet a panacea for manual transcription of either the spoken word or audio recordings.

It is possible now to find software that will translate text into many different languages. However, like voice recognition software, translation software still requires considerable time to manually check, decipher, and make judgments about the text produced by such programs. Future software may yield significantly improved results, automating nearly all of the voice recognition and translation process.

Communicating Electronically (E-mail, etc.)

Networks for computer-mediated communication (CMC) continue to expand internationally following the traditional logistic diffusion curve (Gurbaxani 1990). Electronic networks now supplement most other forms of social communication. E-mail, which is asynchronous or nonsimultaneous, is still the most common form of electronic interaction, but other asynchronous forums such as Internet-based “mailing lists” and “newsgroups” are also popular, as are synchronous (simultaneous) “chat rooms” or “discussion groups.” With improvements in transmitting digital audio and video files on the Internet, it is expected that some new forms of video conferencing will become commonplace. At the turn of the millennium, desktop video conferencing is available “off the shelf,” but suffers from extraneous noise and rough motion. While individual sociologists vary greatly in how they utilize e-mail, nearly all sociologists in economically developed countries depend upon it for certain types of communication.
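The logistic diffusion curve mentioned above can be illustrated with a minimal sketch. The function name and all parameter values below are hypothetical and chosen only for illustration; they are not taken from Gurbaxani (1990).

```python
import math

def logistic_adopters(t, ceiling, rate, midpoint):
    """Cumulative adopters at time t under a logistic (S-shaped) curve.

    ceiling  -- eventual number of adopters (assumed value)
    rate     -- steepness of the growth phase (assumed value)
    midpoint -- time at which half the ceiling is reached (assumed value)
    """
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))

if __name__ == "__main__":
    # Illustrative parameters only: slow start, rapid growth, then saturation.
    for year in range(0, 11, 2):
        adopters = logistic_adopters(year, ceiling=1000, rate=0.9, midpoint=5)
        print(year, round(adopters))
```

Growth is slow at first, fastest at the midpoint, and levels off as the number of adopters approaches the ceiling.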

While e-mail messages are generally written in plain text, “attachments” to e-mail now make it possible for formatted documents, even those including graphics and multimedia, to be shared with others around the world in a matter of minutes. This remarkable technology makes co-authoring, and other forms of collaboration, far more feasible due to reduced time and cost.

As e-mail systems continue to expand, they offer social researchers new opportunities for conducting studies using electronic networks. For instance, Gaiser (1997) explored issues of running online focus groups. Online surveys have become quite common in various forms: e-mail texts, e-mail attachments, entry forms on the Web, and programs on external storage media like diskettes and CD-ROMs. Sampling problems and low completion rates pose the greatest challenges. Ongoing methodological investigations will be necessary to determine the implications of this new mode of research.

Statistics. Hundreds of computer programs and articles have been written to address the needs of statistical computing in social research. Prior to the 1980s, all statistical work was performed on large or medium-size mainframe computers. But advances in both hardware and software for microcomputers now make it possible to conduct the statistical data analysis of most small or moderate-size research studies on microcomputers. A large share of ongoing social data analysis, like analysis of massive census files, would never get done without computer technology. For example, one use of LISREL, a computer procedure which analyzes linear structural relationships by the method of maximum likelihood, would consume weeks or months without a computer.

Not only does statistical computing save time but it offers unique views of the patterns in one’s data. Without the ability to quickly reorganize data and display it in a variety of forms, social researchers neglect important patterns and subtle relationships within complex data. Some patterns cannot be observed without special software tools. For example, Heise’s (1988) computer program called Ethno gives the researcher a framework for conceptualizing, examining, and analyzing data containing event sequences. In addition, several general-purpose statistical packages offer powerful exploratory data analysis capabilities with bidirectionality through dynamic data links (Steiger and Fouladi 1990). One type of bidirectionality puts a graph in one window and frequency distributions in another, and when the user adjusts the data in one window, it automatically changes in the other.

Finding a statistical program tailored to a particular problem or technique is often challenging as the potential “user community” may be quite small. The best sources for such software are the notices and reviews in journals such as the Social Science Computer Review, Educational and Psychological Measurement, Journal of Marketing Research, and The American Statistician. Another important source is the annual Sociological Methodology and its software list on the Web (http://weber.u.washington.edu/~socmeth2/software.html). Some of the software noted in these sources can be obtained at no cost or very low cost. No matter what the cost, one cannot assume that any complex program is free of errors. Thus it is important to run test data and to apply data to multiple programs in order to check for inaccuracies.
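The advice to run test data through more than one program can be sketched as follows. This is only an illustration under stated assumptions: the three variance routines stand in for "different programs," the function names are hypothetical, and the test data are invented so that the correct answer is known by hand.

```python
import math
import statistics

def naive_variance(xs):
    """One-pass 'sum of squares' formula; known to be numerically fragile."""
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)

def two_pass_variance(xs):
    """Two-pass formula: compute the mean first, then squared deviations."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

def programs_agree(a, b, rel_tol=1e-9):
    """Flag whether two results match within a relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol)

# Invented test data: deviations from the mean are -6, -3, 3, 6,
# so the sample variance worked out by hand is 90 / 3 = 30.
TEST_DATA = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

if __name__ == "__main__":
    results = {
        "naive": naive_variance(TEST_DATA),
        "two_pass": two_pass_variance(TEST_DATA),
        "stdlib": statistics.variance(TEST_DATA),
    }
    for name, value in results.items():
        print(name, value, "agrees with hand result:", programs_agree(value, 30.0))
```

Because the values are large relative to their spread, the one-pass formula loses precision, and the disagreement among routines is exactly the kind of inaccuracy that a cross-check on test data is meant to expose.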

Accessing, Retrieving and Managing Data.

While years ago students and researchers had to use a library or similar institution to gain access to bibliographic data files, now such services are available from one’s desktop using the Web or external storage units such as CD-ROM or DVD-ROM. Large bibliographic databases including Sociological Abstracts and Psychological Abstracts are available in these forms, as is a vast amount of data in the form of statistical tables and maps. Now that devices for “writing” onto CD-ROMs have become inexpensive, it is expected that data from even small research projects will be disseminated in this medium.

One major development is interactive access to data by means of the Web. A variety of models are used for interactive access to both preformatted text files and precoded data files. Among the systems are GSSDIRS from ICPSR at the University of Michigan (http://www.ICPSR.umich.edu/gss), IPUMS at the University of Minnesota (http://www.hist.umn.edu/~ipums), QSERVE from Queens College-CUNY (http://www.soc.qc.edu/qserve), and SDA Archive from the University of California at Berkeley (http://csa.berkeley.edu:7502/archive.htm).

The software technology for archiving and analyzing social data is less than a half-century old, and it is very plausible to expect many advancements in the next fifty years. Interactive data analysis sites on the Web hint at what these advancements might be. For instance, using the SDA Archive Web site (see the address above), one can get a large crosstabulation table for any three variables in the full General Social Survey of over 35,000 respondents in less time than it takes to type in the variable names.
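A three-variable crosstabulation of the kind described above can be sketched in a few lines. The records and variable names below are invented for illustration and are not General Social Survey data, and the sketch makes no claim about how the SDA Archive itself is implemented.

```python
from collections import Counter

# Hypothetical respondent records; not actual General Social Survey data.
RESPONDENTS = [
    {"sex": "female", "degree": "college", "happy": "very"},
    {"sex": "female", "degree": "college", "happy": "very"},
    {"sex": "female", "degree": "high school", "happy": "pretty"},
    {"sex": "male", "degree": "college", "happy": "pretty"},
    {"sex": "male", "degree": "high school", "happy": "very"},
    {"sex": "male", "degree": "high school", "happy": "very"},
]

def crosstab3(records, row_var, col_var, layer_var):
    """Count respondents in every (layer, row, column) cell."""
    return Counter((r[layer_var], r[row_var], r[col_var]) for r in records)

def print_cells(cells):
    """Print one line per non-empty cell, grouped by the layer variable."""
    for layer, row, col in sorted(cells):
        print(f"{layer:8} {row:12} {col:8} {cells[(layer, row, col)]}")

if __name__ == "__main__":
    print_cells(crosstab3(RESPONDENTS, "degree", "happy", "sex"))
```

The third variable acts as a layer: the output is in effect one row-by-column table for each of its categories.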

The refinement of such systems faces issues such as how to balance functionality with ease of use, the plausibility of standardizing interfaces for many numeric data files, and the use of gateways between the Web and third-party software such as statistical packages. It will take considerable research and development to sort out the feasibility of providing many different analytical facilities in the Web environment. A major challenge is determining the amount and type of data documentation necessary for typical users to get meaningful results in a reasonable amount of time. A considerable challenge is created when a variety of types of materials are necessary for retrieving useful data-related information. The main types are text, graphics, meta databases (data about databases), fielded text/data (such as bibliographic databases), and multimedia (audio and video) clips.

Qualitative Computing. Computer-based content analysis began with Stone (1966) and associates, and now plays an important role in the social sciences (Weber 1984; Kelle 1995). A survey of 110 qualitative-oriented researchers found three-fourths regularly used computers (Brent, Scott, and Spencer 1987). Ragin and Becker (1989, p. 54) persuasively claimed that qualitative research using these computing tools yields more systematic attention to diversity, for example, by encouraging a “more thorough examination of comparative contrasts among cases.”

This type of computing became much more common as researchers combined content analysis with other tasks associated with qualitative analysis. Several general-purpose programs for qualitative analysis have been widely distributed (Tesch 1989; Fielding and Lee 1991). These tools make the analysis of large amounts of text more accurate and efficient, and potentially direct the focus of attention to analytic procedures. The general tasks of text entry, code assignment, counting, and data organization have been extended to include special routines for improving the quality of coding and code management (Carley 1988). Hesse-Biber, Dupuis, and Kinder (1997) technically extended this methodology to include the management and analysis of audio and video segments as well as text.
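The general tasks of code assignment and counting can be sketched as follows. The codebook, the keyword-matching rule, and the interview segments are all hypothetical; the packages cited above use their own, more elaborate procedures.

```python
import re
from collections import Counter

# Hypothetical codebook: each analytic code is triggered by a set of keywords.
CODEBOOK = {
    "family": {"mother", "father", "sister", "brother", "family"},
    "work": {"job", "boss", "shift", "wages", "work"},
}

def assign_codes(segment, codebook):
    """Return the sorted codes whose keywords appear in a text segment."""
    words = set(re.findall(r"[a-z']+", segment.lower()))
    return sorted(code for code, keywords in codebook.items() if words & keywords)

def count_codes(segments, codebook):
    """Count how many segments received each code."""
    counts = Counter()
    for segment in segments:
        counts.update(assign_codes(segment, codebook))
    return counts

if __name__ == "__main__":
    # Invented field-note segments for illustration.
    notes = [
        "My mother worked the night shift for years.",
        "The boss cut our wages again.",
        "We mostly talked about the weather.",
    ]
    for note in notes:
        print(assign_codes(note, CODEBOOK), "<-", note)
    print(dict(count_codes(notes, CODEBOOK)))
```

Simple keyword matching of this kind misses synonyms and context, which is one reason routines for improving the quality of coding and code management were developed.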

Simulating and Modeling. Early in the history of sociological computing, Coleman (1962) and McPhee and Glaser (1962) designed computer simulation models and showed how they could be used to identify elusive implications of different theoretical assumptions. Other social scientists followed in their footsteps but the excitement of the pioneers was lost and few simulations and formal computer models were developed in the 1970s. With the emergence of artificial intelligence and other modeling methodologies, social researchers demonstrated renewed interest in formal computer-supported models of social processes (cf. Feinberg and Johnson 1995; Hanneman 1988; Markovsky, Lovaglia, and Thye 1997). New computer simulations for social policy analysis as well as pedagogy or instruction have emerged as well (Brent and Anderson 1990, pp. 188-210).

Neural networks combined with other techniques of artificial intelligence and expert systems have excited a number of social scientists (Garson 1990). Neural nets organize computer memory in ways that model human brain cells and their ability to process many things in parallel. Systems that use neural nets are especially good when pattern matching is required; however, the computations require high-performance computers.

Computer-Assisted Data Collection. CATI (Computer-Assisted Telephone Interviewing) is a computing system with online questionnaires or entry screens for telephone interviewers. It has become very common in sociological research, although its impact is not fully understood (Groves et al. 1988). It is used on free-standing PCs, networked PCs, or larger computers. These systems generally, but not always, have the following characteristics: centralized facilities for monitoring individual interviewer stations, instantaneous edit-checks with feedback for invalid responses, and automatic branching to different questions depending upon the respondents’ answers. Other major forms of computer-supported data collection include (1) CAPI (Computer-Assisted Personal Interviewing), the acronym used in survey research to refer to face-to-face interviewing assisted with a laptop or hand-held computing device; (2) Computerized Self-Administered Questionnaires (CSAQ), online programs designed for direct input from respondents; and (3) data-entry programs to facilitate the entry of data collected manually at a prior time.
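Two of the characteristics listed above, instantaneous edit-checks and automatic branching, can be sketched with a toy questionnaire. The question names, validity rules, and the `administer` function are assumptions made for illustration and do not describe any particular CATI product.

```python
def is_yes_no(entry):
    """Edit-check for yes/no items."""
    return entry in {"yes", "no"}

def is_weekly_hours(entry):
    """Edit-check: hours must be a whole number between 1 and 168."""
    return entry.isdigit() and 1 <= int(entry) <= 168

# Hypothetical questionnaire: each item has an edit-check and a branching rule.
QUESTIONS = {
    "employed": {
        "text": "Are you currently employed?",
        "is_valid": is_yes_no,
        "next": lambda answer: "hours" if answer == "yes" else "looking",
    },
    "hours": {
        "text": "How many hours did you work last week?",
        "is_valid": is_weekly_hours,
        "next": lambda answer: None,
    },
    "looking": {
        "text": "Are you looking for work?",
        "is_valid": is_yes_no,
        "next": lambda answer: None,
    },
}

def administer(questions, start, keyed_entries):
    """Walk the questionnaire, rejecting invalid entries and branching on answers.

    keyed_entries simulates what an interviewer types, in order.
    Returns the accepted answers and the list of rejected (question, entry) pairs.
    """
    entries = iter(keyed_entries)
    answers, rejected = {}, []
    current = start
    while current is not None:
        question = questions[current]
        for entry in entries:
            if question["is_valid"](entry):
                answers[current] = entry
                break
            rejected.append((current, entry))  # immediate feedback, then re-ask
        else:
            break  # no more keyed entries; interview left incomplete
        current = question["next"](entry)
    return answers, rejected

if __name__ == "__main__":
    print(administer(QUESTIONS, "employed", ["maybe", "yes", "400", "40"]))
```

In this run the invalid entries are rejected on the spot, and the item about looking for work is never asked because the respondent reported being employed.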

A related innovation is software built for designing online questionnaires. For example, the Questionnaire Programming Language (QPL), developed by Dooley (1989), allows the researcher to draft a questionnaire with any word processor. From embedded branching commands within the questionnaire document, the QPL software package simultaneously produces two versions of the questionnaire: one for computer administration and the other for interviewer- or self-administration. An additional bonus of the program is that it automatically produces data definition commands for SPSS or SAS to use for data analysis.

Visualization and Graphics. Many social researchers have come to rely on computer graphic systems to produce maps, charts summarizing statistical data, and network diagrams, and to retrieve data from GIS (Geographic Information Systems) databases. GIS data contain coordinates to associate a specific spatial point or area with any attribute, e.g., social density, associated with that point or collection of points. Because of the complexities of these data structures, the integration of these techniques onto sociologists’ desktops has been slow. Another constraint is the paucity of techniques for analyzing such data. Work such as that of Cleveland (1993) for analyzing and visualizing data graphically may increase the utilization of such data by sociologists. In addition, techniques for storing and delivering interactive audio and video by means of the Web may stimulate sociologists to investigate multimedia data containing sounds and moving images.

Teaching and Learning. During the 1970s, long before the microcomputer, a small group of social science instructors began to explore how to utilize computer technology in teaching (Bailey 1978). Now such use has become quite common, and several sociologically oriented instructional packages are widely used. The most popular have been Chipendale, designed by James Davis (1990), and MicroCase, developed by Roberts and Stark (cf. Roberts and Corbett 1996). These two packages have served as the basis for the exercises contained in at least a dozen published textbooks and workbooks. A variety of instructional approaches and software tools are described regularly in Teaching Sociology and the Social Science Computer Review.

ISSUES AND CHALLENGES

The practice of computing in social research has evolved rapidly. Computers have been applied to practically every research task, including such unlikely ones as interviewing, and hundreds of other tasks (Brent and Anderson 1990). This variety of computer applications will continue to evolve with the newer Internet-based technologies.

The application of computing to sociology is not without problems. Errors in data and software abound, yet social scientists rarely check their results by running more than one program on the same data. Data and software tend to be very costly, but there are many impediments to the sharing of these critical resources. Better software is needed, but graduate students often are discouraged from programming new software. Nonetheless, new breakthroughs in computer technology will continue, and major new opportunities will emerge. Many of the advances in sociological computing during the next few years undoubtedly will follow the lines of progress already described: hypertext networks; integrated, high-performance, graphic data analysis stations; software for computer-supported cooperative work; and neural networks for complex models of social systems.

Perhaps the most exciting challenge for the future involves a concert of these innovations directed at the problem of modeling and analyzing vast amounts of social data. One solution would incorporate three-dimensional, multicolored, dynamic graphical representations of complex social data structures. But new techniques for analyzing these data will require new models of dynamic social structures as well as parallel social processes. Computer representations of these models tend to require extremely fast processing. Access to such models on the Web, supplemented with audio and video displays, may evolve into an important part of the sociologist’s tool kit of the future.

SOCIOLOGY USENET NEWSGROUPS

In contrast to Internet Mailing Lists, UseNet Newsgroups do not broadcast a message to all those included on the list. Internet users must specifically access the postings (messages) for any such newsgroup. While on any given day, “sociology” will be mentioned in a variety of newsgroup discussions, the only newsgroup where academic sociological topics are regularly discussed is: alt.sci.sociology. Generally a few messages are added to this discussion area every day.
