Database Reference
In-Depth Information
INTRODUCTION
The relational database design concepts were developed without considering missing
information in relations (Codd, 1986; Levene, 1999; Date 2000). Value of an attribute in a
tuple may be missing for several reasons: 1) Value is applicable but it is unknown, 2) Value
is not applicable, 3) Value does not exist, or 4) Other reasons, such as value is undefi ned
(Date 2000). An example of value that is not applicable is the attribute driver license number
for a ten-year-old child. If an adult does not have a driver license number, then the value
does not exist. If an adult has a driver license number, but it is unknown, then the value is
applicable but unknown.
A common method of representing missing values is using nulls (Codd, 1986). Other
methods include using default values (Date, 1990), using a subset of the attribute domain
(Lipski, 1979), and using variables or many different “null values” (Imielinski & Lipski,
1984). Missing information can create problems in querying data from relations (Imielinski
& Lipski, 1984; Date, 1990). Several methods have been proposed to extend the relational
operators to deal with missing values (Codd, 1986; Reiter, 1986; Sutton & King, 1995).
Another group of studies examined the effect of nulls on the concept of functional depen-
dency (Vassiliou, 1980; Vardi, 1986; Levene & Loizou, 1999). These studies have focussed
primarily on missing values of the type “applicable but unknown”.
The current paper examines the effect of nulls in candidate keys on normalizing a
relational schema. The nulls considered in this paper are of the type “not applicable” or
“does not exist”. Specifi cally, this paper examines the effectiveness of Boyce-Codd Normal
Form (BCNF) and Domain Key Normal Form (DKNF) in identifying insertion/deletion
anomalies if missing values in candidate keys are represented by nulls. Candidate keys
with nulls commonly are found in relations that represent information on two entities with
a one-to-one relationship between them. It is shown that the current defi nition of Boyce-
Codd Normal Form is ineffective in identifying poor designs in such relations. Domain Key
Normal Form (DKNF) also suffers from the same problem. The paper identifi es the source
of the problem and offers a solution by incorporating the concept of entity integrity rule
into the defi nitions of BCNF and DKNF. This paper also shows that incorporating the entity
integrity rule into the defi nition of either a relation or a candidate key does not provide a
satisfactory solution to the problem.
DESCRIPTION OF THE PROBLEM
To help explain the problem, we consider two entities, EMPLOYEE and COMPUTER,
that have a (zero-or-one)-to-(zero-or-one) relationship between them. Thus, a computer has
zero or one employee assigned to it at any given time. Similarly, an employee is assigned
to zero or one computer at any time. Consider a relation:
ASSIGNMENT (ID, NAME, TITLE, COMPUTER_NO, MODEL, RAM).
In the above relation, ID, NAME, and TITLE represent the identifi cation number,
the name, and the title of the employee, respectively. ID is the only unique identifi er of
the employee. COMPUTER_NO is the only unique identifi er of the computer assigned to
the employee. MODEL and RAM represent the model, and the amount of memory of the
employee's computer, respectively. Figure 1 shows a sample state of the relation.
Search WWH ::




Custom Search