United States Census

The U.S. Census has evolved a great deal since the first enumeration in 1790. As the population has grown, the task of collecting demographic information from every household in the country has become more complex and costly. Over time the survey itself has also changed, and it now includes questions designed to elicit more sensitive information than originally intended. As technological advances have reshaped census methodology and data dissemination practices, new privacy concerns have developed. Although privacy and confidentiality may be conceptually distinct, in the context of the U.S. Census, and in the minds of respondents, the two issues are intricately related. The Census Bureau’s privacy and confidentiality practices are important factors in determining the agency’s procedures for collecting and releasing household data. One of the primary concerns of the bureau is data protection, which requires minimizing intrusion and developing comprehensive measures to ensure that the data collected remain confidential. Reducing both real and perceived risk is essential for a successful enumeration. Because census results determine congressional redistricting and influence the distribution of billions of dollars in federal funds, government officials, policymakers, and the general public have an interest in the agency’s ability to collect accurate information.
The Census Bureau’s largest task is the decennial census. Although most of the agency’s other projects rely on nationally representative sampling techniques, the goal of the decennial census is to account for every household in the country. This undertaking required a census employee to visit each household until 1970, when the agency implemented the “mail out-mail back” survey allowing the United States Postal Service to distribute the questionnaires and respondents to mail them back to local processing offices. Each household now receives either the short form (intended to collect only basic demographic and housing information such as gender, age, race, and home ownership/rental status) or the long form (which is sent to a sample of households and contains the short form plus additional questions). The content and length of the long form have varied historically; in the 2000 census it contained more than 50 questions eliciting information on such topics as English language fluency, previous residences, the value of one’s home and cost of utilities, whether a unit is covered by a second mortgage or homeowner’s insurance, and the number of automobiles per household.
Participation in the decennial census is technically mandatory, but noncompliance is rarely penalized. Collecting household information inherently involves a level of intrusion. Because a successful enumeration requires the voluntary cooperation of respondents, addressing the public’s privacy concerns is paramount to the bureau’s success. Respondents have been particularly troubled by the perceived invasiveness of the long form. Growing intolerance for private market researchers, generally low levels of trust in the government, and greater fear of fraud and the unauthorized use of personal information create additional barriers to census participation. Political leaders have also expressed concerns. Prior to both the 1970 and 2000 censuses, Congress considered legislation to formally limit the questions that could be asked in the decennial census in order to reduce what was viewed as an undue burden and invasion of privacy imposed on the public.
In reaction to steadily declining response rates since the 1970 census, the bureau launched a research initiative in the 1990s to assess census-related privacy concerns and their effects on individuals’ willingness to provide information. The research found that the public was not well informed about the bureau’s privacy principles and that fears of data misuse were especially pronounced among historically underrepresented minority groups. To reverse participation trends in 2000, the bureau instituted a multi-million-dollar, professionally designed public relations campaign that included television, radio, and print ads. Hard-to-count populations were targeted, and the advertisements emphasized that individually identifiable census data are not subject to disclosure under the Freedom of Information Act and that even the Internal Revenue Service (IRS) and the Supreme Court do not have access to personally identifiable information collected in the decennial census.
The agency saw the success of the 2000 census campaign as evidence of the importance of addressing the public’s privacy concerns. In March 2005, the bureau appointed the agency’s first formal Chief Privacy Officer, Gerald W. Gates, whose responsibilities included strengthening data protections and educating census employees and the public about privacy issues. After the 2000 census, the agency also updated its data stewardship website, increasing emphasis on the bureau’s legal and ethical obligations to respect privacy and protect confidentiality. The new content stressed that personally identifiable information is never published and cannot be used against a person by any government agency or court, and that employees are sworn to uphold the privacy of respondents for life, pointing out that violating the census confidentiality oath carries severe fines of up to $250,000 and/or maximum prison sentences of five years.
The website also highlighted the protection that specific federal laws provide respondents, as well as the agency’s own privacy principles: the census collects only information that is necessary for federal programs; respondents have the right to know the purpose and uses of the survey or census they agree to participate in; the census is in full compliance with federal protections for research participants, and respondents have a right to respectful treatment, including reasonable limits on the number of follow-up surveys; access to private information is restricted to sworn staff; and statistical methodologies, computer technologies, and security procedures are in place to prevent identity disclosure.
When the first census enumeration was taken, there was very little concern for individual privacy. Until the 1850 census, local returns were openly posted and subject to public review. Citizens were allowed unrestricted access to original records until 1880. An important turning point in the privacy history of the U.S. Census occurred in 1952, when the statutory regulations of U.S. Code Title 13 were formalized. The statute establishes that census data may be used only for statistical purposes and cannot be published with individual identifiers such as name or Social Security number. It also restricts access to information that contains personal identifiers to sworn census employees. In legal contests brought by the attorney general, the Federal Bureau of Investigation (FBI), the Immigration and Naturalization Service, and municipal governments seeking the release of personally identifying census data or master address lists, the strict confidentiality standards set forth in Title 13 have consistently been upheld.
One of the enduring tasks of the Census Bureau is to provide greater accessibility to data while protecting respondent confidentiality. Increasing use of the Internet and PCs, combined with technological advances, has led to an increased capacity for the dissemination of census data to nongovernment users while also generating new sources of privacy risk. In the late 1990s, the Census Bureau introduced the American FactFinder website, which gives users remote access to census microdata (disaggregated data at the respondent level without personal identifiers). To make data more easily available while limiting confidentiality risks, the bureau has developed a number of disclosure limitation procedures.
Prior to 1990, the bureau relied primarily on data suppression techniques, which leave the marginal totals in tables of aggregate data intact but suppress small individual cell values, along with other random cells, so that the missing values cannot be calculated from the totals. Data-swapping techniques were introduced in the 1990 census and entail exchanging the records of a sample of households with those of households in neighboring areas that match on key variables. Public Use Microdata Samples (PUMS) may also be protected by introducing random noise (adding a random number to the data or multiplying the data by one), top- and bottom-coding (collapsing values that fall above or below a certain percentile of the distribution), and using population thresholds (prohibiting disclosure of geographic identifiers for areas with populations below 100,000). A disclosure review board reviews all requests for microdata files to assess disclosure risk before the files are released.
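The three microdata protections named above (random noise, top- and bottom-coding, and population thresholds) can be illustrated with a minimal Python sketch. It is not the bureau's actual procedure: the function names, the 5th and 95th percentile cutoffs, the uniform noise distribution, and the record layout are all assumptions made for illustration. Only the 100,000 population threshold comes from the text.

```python
import random

# Threshold cited in the text; everything else below is an illustrative assumption.
POPULATION_THRESHOLD = 100_000


def top_bottom_code(values, lower_pct=0.05, upper_pct=0.95):
    """Recode values outside the chosen percentile range to the cutoff values."""
    if not values:
        return []
    ordered = sorted(values)
    n = len(ordered)
    # Simple index-based percentile cutoffs (hypothetical choice of method).
    low = ordered[int(lower_pct * (n - 1))]
    high = ordered[int(upper_pct * (n - 1))]
    return [min(max(v, low), high) for v in values]


def add_noise(values, scale, rng):
    """Add a random number drawn uniformly from [-scale, scale] to each value."""
    return [v + rng.uniform(-scale, scale) for v in values]


def mask_geography(records, area_population):
    """Blank the geographic identifier for areas below the population threshold."""
    masked = []
    for rec in records:
        rec = dict(rec)  # copy so the caller's records are left unchanged
        if area_population[rec["area"]] < POPULATION_THRESHOLD:
            rec["area"] = None
        masked.append(rec)
    return masked


if __name__ == "__main__":
    incomes = [10, 20, 30, 40, 50, 60, 70, 80, 90, 1000]
    print(top_bottom_code(incomes))
    print(add_noise(incomes, scale=5.0, rng=random.Random(0)))
    print(mask_geography(
        [{"area": "A", "income": 10}, {"area": "B", "income": 1000}],
        {"A": 250_000, "B": 40_000},
    ))
```

With the ten sample incomes and the default cutoffs, the outlying value of 1000 is recoded to 90, so an unusually high earner no longer stands out in the released file. Cell suppression and data swapping are not covered by this sketch.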
As the census and technology continue to evolve, new privacy concerns and solutions will also arise. For instance, in the 2000 census, respondents were allowed to indicate multiple race categories. Although intended to collect better race data, the change inadvertently increased disclosure risks because of the added race detail. In 2000, respondents were also allowed to file their census short form over the Internet, which required multiple firewalls, encryption, and authentication codes to protect the submitted data. The Census Bureau also is considering within-household privacy issues. In 1998, the Census Bureau implemented a respondent identification policy that requests permission to share limited information with other household members in future waves of panel studies. The bureau also has been testing the use of the American Community Survey, based on a monthly random sample of U.S. households, which elicits information similar to that requested on the long form. If successful, this could result in the elimination of the long form, which is perceived as especially invasive.
