Data Modeling with the Entity-Relationship Model - Database Processing: Fundamentals, Design, and Implementation

Database Reference

In-Depth Information

Summary

When databases are developed as part of a new information

systems project, the database design is accomplished in two

phases. First, a data model is constructed from forms, re-

ports, data sources, and other requirements. The data model

is then transformed into a database design. A data model is a

blueprint for a database design. Like blueprints for buildings,

data models can be altered as necessary, with little effort.

Once the database is constructed, however, such alterations

are time consuming and very expensive.

The most prominent data model in use today is the

entity-relationship, or E-R, data model. It was invented by

Peter Chen and extended by others to include subtypes. An

entity is something that users want to track. An entity class

is a collection of entities of the same type and is described by

the structure of the entities in the class. An entity instance

is one entity of a given class. Entities have attributes that

describe their characteristics. Identifiers are attributes that

name entity instances. Composite identifiers consist of two

or more attributes.

The E-R model includes relationships, which are associ-

ations among entities. Relationship classes are associations

among entity classes, and relationship instances are asso-

ciations among entity instances. Today, relationships are not

allowed to have attributes. Relationships can be given names

so that they can be identified.

The degree of a relationship is the number of entity

types that participate in the relationship. Binary relation-

ships have only two entity types. In practice, relationships

of degrees greater than two are decomposed into multiple

binary relationships.

The difference between an entity and a table is that you

can express an entity relationship without specifying foreign

keys. Working with entities reduces complexity and makes it

easier to revise the data model as work progresses.

Relationships are classified according to their cardi-

nality. Maximum cardinality is the maximum number of

instances that can participate in a relationship instance.

Minimum cardinality is the least number of entities that

must participate in a relationship.

Relationships commonly have one of three maximum

cardinalities: 1:1, 1:N, or N:M. In rare instances, a maxi-

mum cardinality might be a specific number, such as 1:15.

Relationships commonly have one of four basic minimum

cardinalities: optional to optional, mandatory to optional,

optional to mandatory, or mandatory to mandatory. In rare

cases, the minimum cardinality is a specific number.

Unfortunately, many variations of the E-R model are

in use. The original version represented relationships with

diamonds. The Information Engineering version uses a line

with a crow's foot, the IDEF1X version uses another set of

symbols, and UML uses yet another set. To add further com-

plication, many data modeling products have added their

own symbols. In this text, we will use the IE Crow's Foot

model with symbols, as summarized in Figure 5-14. Other

models and techniques are summarized in Appendices B, C,

D, and H.

An ID-dependent entity is an entity whose identifier

includes the identifier of another entity. Such entities use an

identifying relationship. In such relationships, the parent is

always required, but the child (the ID-dependent entity) may

or may not be required, depending on application require-

ments. Identifying relationships are shown with solid lines in

E-R diagrams.

A weak entity is an entity whose existence depends on

the presence of another entity. All ID-dependent entities

are weak. Additionally, some entities are weak but not ID-

dependent. Some people believe such entities are not impor-

tant; others believe they are.

A subtype entity is a special case of another entity

called its supertype. Subtypes may be exclusive or inclusive.

Exclusive subtypes sometimes have discriminators, which

are attributes that specify a supertype's subtype. The most

important (and perhaps only) reason for creating subtypes in

a data model is to avoid value-inappropriate nulls.

A relationship between an entity and itself is a recursive

relationship. Recursive relationships can be 1:1, 1:N or N:M.

Relationships among nonsubtype entities are called

HAS-A relationships. Relationships among supertype/sub-

type entities are called IS-A relationships.

The elements of a data model are constructed by analyz-

ing forms, reports, and data sources. Many forms and reports

fall into common patterns. In this text, we discussed the 1:1,

1:N, and N:M strong entity patterns. We also discussed three

patterns that use ID-dependent relationships: association,

multivalue attribute, and version/instance. Some forms in-

volve mixed identifying and nonidentifying patterns. Line

items are the classic example of mixed forms, but there are

other examples as well.

The for-use-by pattern indicates the need for subtypes.

In some cases, subtypes differ because they have different at-

tributes, but they also can differ because they have different

relationships. The data modeling process is iterative. Forms

and reports are analyzed, and the data model is created,

modified, and adjusted as necessary. Sometimes the analysis

of a form or report will require that earlier work be redone.

C'est la vie!

Database Processing: Fundamentals, Design, and Implementation

Search WWH ::

Custom Search

Home