further sub-divided by gender. The report for 1861-1870 is mainly concerned with a
2 × 12 × 25 three-dimensional nCube: men or women; one of 12 different age groups; and 25
different cause of death categories. An even more extreme example is the first ever report on
Britain's occupational structure, from the 1841 census. No occupational classification had
been devised in advance, so the listings are a mixture of some ad hoc groupings and many
individual job titles. The occupations listed vary between counties but, ignoring purely
typographic differences, we found a total of 3647 different occupational categories. The tables
divide the population into men and women, and into those aged over and under 20, so we
have a 2 × 2 × 3647 nCube.
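As a rough illustration of how the dimensions of an nCube follow from its variables, the sketch below treats each variable as a list of categories and derives the shape and cell count from the list lengths. The function names and category labels are placeholders invented for this example; only the category counts (2 sexes, 12 age groups, 25 causes of death, two age bands and 3647 occupations) are taken from the text.

```python
from math import prod


def ncube_shape(variables):
    """Return one dimension per variable, sized by its number of categories."""
    return tuple(len(categories) for categories in variables.values())


def ncube_cell_count(variables):
    """The number of cells is the product of the dimension sizes."""
    return prod(ncube_shape(variables))


# Placeholder labels: the real reports use named age groups and causes.
mortality_1861_1870 = {
    "sex": ["male", "female"],
    "age_group": [f"age_group_{i}" for i in range(1, 13)],
    "cause_of_death": [f"cause_{i}" for i in range(1, 26)],
}

# Sex and the over/under-20 split each contribute two categories.
occupations_1841 = {
    "sex": ["male", "female"],
    "age_band": ["under_20", "20_and_over"],
    "occupation": [f"occupation_{i}" for i in range(1, 3648)],
}

print(ncube_shape(mortality_1861_1870), ncube_cell_count(mortality_1861_1870))
print(ncube_shape(occupations_1841), ncube_cell_count(occupations_1841))
```

Even with only two binary variables alongside it, the occupational listing yields well over ten thousand cells, which is why it counts as an extreme case.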
The three most important entities in the DDI aggregate data extension are variables,
which combine into nCubes and are made up of categories. However, there are a number of
other important concepts. First, while the original questionnaire may ask a simple question,
like age, an almost infinite number of different categorizations can be imposed in the
creation of aggregate statistics. For example, we have so far found 17 different sets of age
groups used in British census and vital registration reports. These we group together into
a single variable group, and we have similar collections of occupational and cause of death
classifications. The DDI specification also defines category groups as collections of categories
within a single variable that can be treated as one for analysis. nCubes can also be assembled
into groups.
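One possible way to picture these groupings is sketched below. It is not the DDI schema itself: the class names, field names, age bands and counts are all assumptions made up for illustration. A variable group collects alternative categorizations of the same underlying question, while a category group names a set of categories within one variable that can be summed and treated as one.

```python
from dataclasses import dataclass, field


@dataclass
class Variable:
    name: str
    categories: list
    # Category group name -> member categories treated as one for analysis.
    category_groups: dict = field(default_factory=dict)


@dataclass
class VariableGroup:
    name: str
    variables: list = field(default_factory=list)


def collapse(counts, variable, group_name):
    """Sum the counts of every category belonging to one category group."""
    members = variable.category_groups[group_name]
    return sum(counts[category] for category in members)


# Two hypothetical age classifications, gathered in one variable group.
age_five_year = Variable(
    name="age_five_year",
    categories=["0-4", "5-9", "10-14", "15-19", "20-24"],
    category_groups={"under_20": ["0-4", "5-9", "10-14", "15-19"]},
)
age_two_band = Variable(name="age_two_band", categories=["under_20", "20_and_over"])
age = VariableGroup(name="age", variables=[age_five_year, age_two_band])

# Invented counts, used only to show the arithmetic.
counts = {"0-4": 10, "5-9": 8, "10-14": 7, "15-19": 9, "20-24": 11}
print(collapse(counts, age_five_year, "under_20"))  # 34
```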
Second, two different nCubes can consist of exactly the same variables but not be the
same. For example, both the census and the Decennial Supplements contain tables listing
age by sex, but the census is tabulating the number of people alive on a given day while the
Supplements are tabulating deaths over a period. The DDI standard records this via an nCube
attribute called the universe, which is defined as a text string describing the sum of all values
in the nCube: all people, all deaths, all persons in employment. Two other nCube attributes
are measurement units, meaning what the numbers are counts of, such as 'persons' or 'acres',
and additivity. Most basically, this last records whether the values within an nCube do add
up to a meaningful total. For example, most historical British censuses published a single
parish-level table, the only source of information on the most detailed administrative units.
These tables are usually a mixture of information, the different columns not necessarily
being logically connected. It would be possible to wholly define one of these tables as a
single nCube, but clearly the result of adding together an acreage and a population total
is meaningless. We generally avoided this, but it was useful to combine the current total
population, the population 10 years previously and, occasionally, the population 20 years
previously into a single non-additive nCube.
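The role of these three attributes can be sketched in a few lines. The class below is an assumption for illustration, not the DDI representation: the field names and all cell values are invented. It shows how the additivity flag might guard against computing a total that has no meaning, as with a series of populations at successive censuses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NCube:
    universe: str          # text describing what the sum of all values means
    measurement_unit: str  # what the numbers are counts of
    additive: bool         # whether the cells add up to a meaningful total
    cells: dict

    def total(self) -> Optional[int]:
        """Return the sum of all cells, or None for a non-additive nCube."""
        if not self.additive:
            return None
        return sum(self.cells.values())


# Invented figures for a single hypothetical parish.
by_sex = NCube(
    universe="all people",
    measurement_unit="persons",
    additive=True,
    cells={"male": 120, "female": 130},
)
population_history = NCube(
    universe="total population at successive censuses",
    measurement_unit="persons",
    additive=False,
    cells={"current": 250, "10_years_previously": 230, "20_years_previously": 215},
)

print(by_sex.total())              # 250
print(population_history.total())  # None
```

Both nCubes share the measurement unit 'persons', so the unit alone cannot indicate whether summing is sensible; that is the job of the separate additivity flag.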
Third, the detailed structure of an nCube is defined by its Location Map. The example
below is the XML representation of a simple 2 × 2 × 2 nCube breaking down the total number
of births by both gender and legitimacy. The nCube definition itself specifies the various
attributes already mentioned as well as identifying the two variables used, defined elsewhere.
The <text> attribute of the nCube definition could hold a much longer explanation of
the data structure, and this is in fact where much of the text appearing on the Vision
of Britain web site is held. The nCube definition is then followed by the Location Map,
which records the actual location of the data. One key point is that in this terminology
the number of legitimate male births is not a variable, it is a cell defined by the coming
together of two different variables. This example was generated from the GB Historical
GIS, and as explained below, the way that the physical locations of the data are specified is
non-standard: