Alphanumeric
All data models contain alphanumeric data: any data in a string format,
whether it consists of alphabetic characters or numbers (as long as the
numbers do not participate in mathematical operations). For example, names,
addresses, and phone numbers are all string, or alphanumeric, types of data. The actual
data types used for alphanumeric information are char, nchar, varchar, and
nvarchar. As you can probably tell from the names, all these char data
types store character data, such as letters, numbers, and special symbols.
For all these data types, you specify a length. Generally, the length is
the total number of characters that the specified attribute can contain. If
you are creating an attribute to contain abbreviations of U.S. state names,
for example, you might choose to specify that the attribute is a char(2).
This defines the attribute as an alphanumeric field that contains exactly
two characters; a char field always stores exactly as many characters as it
is defined to hold, and shorter values are padded with trailing spaces to
fill the defined length.
You probably noticed that there are four kinds of char data types: two
with a prefix of var, and two with an n prefix (one of which contains both
prefixes). The var prefix means that a variable-length field is being
specified. A variable-length field holds no more than the number of
characters specified in the length designation. To contrast char with
varchar, specifying char(10) results in a field that always contains ten
characters, even if a specific instance of an entity has only six characters
in that attribute; the remaining four positions are padded with trailing
spaces. If the attribute is defined as a varchar(10), only the six actual
characters are stored.
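The contrast between char(10) and varchar(10) can be modeled in a few lines of Python. This is only an illustrative sketch, not database code: the function names store_char and store_varchar are hypothetical, and rejecting an over-long value with an error is an assumption of the sketch (a real database may report an error or truncate, depending on the product and its settings).

```python
def store_char(value: str, length: int) -> str:
    """Model a fixed-length char(length) field: always exactly `length` characters."""
    if len(value) > length:
        # Assumption for this sketch: over-long values are rejected outright.
        raise ValueError(f"{value!r} does not fit in char({length})")
    return value.ljust(length)  # pad short values with trailing spaces


def store_varchar(value: str, length: int) -> str:
    """Model a variable-length varchar(length) field: at most `length` characters."""
    if len(value) > length:
        raise ValueError(f"{value!r} does not fit in varchar({length})")
    return value  # stored as-is, with no padding


fixed = store_char("Denver", 10)
variable = store_varchar("Denver", 10)

print(repr(fixed), len(fixed))        # 'Denver    ' 10
print(repr(variable), len(variable))  # 'Denver' 6
```

The six-character value occupies ten positions in the fixed-length model and six in the variable-length one, which is the storage difference the paragraph above describes.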
The n prefix specifies that the data is being stored in a Unicode format.
Unicode is an international, platform-agnostic specification for the storage
of character data. Using Unicode allows systems that work with characters
from multiple languages to have a common storage format that can be read
by any other system using the Unicode specification. If you need to store
text that goes beyond basic ASCII, such as characters from several languages
at once, use a Unicode data type.
The primary difference between Unicode and non-Unicode systems is
that Unicode requires two bytes of physical storage for every character
stored; non-Unicode systems generally use only one byte per character
(variable-length fields can need slightly more than that, because the length
of each value must also be recorded). The problem with using only one byte
for character storage is that one byte can represent at most 256 distinct
values, which is not enough for certain character data, such as Japanese
Kanji or Korean Hangul characters. There are storage and performance
trade-offs involved here, and they are covered in more depth in Chapter 3.
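The storage cost can be checked with a short Python sketch. The encodings here are stand-ins chosen for illustration and are not named in the text: cp1252 represents a typical one-byte-per-character code page, and UTF-16 (little-endian, no byte order mark) represents two-byte Unicode storage. The helper names are hypothetical.

```python
def bytes_single_byte(text: str) -> int:
    """Bytes needed in a one-byte-per-character code page (cp1252 as a stand-in)."""
    # Raises UnicodeEncodeError if a character has no slot in the code page.
    return len(text.encode("cp1252"))


def bytes_two_byte(text: str) -> int:
    """Bytes needed at two bytes per character (UTF-16 little-endian, no BOM)."""
    return len(text.encode("utf-16-le"))


print(bytes_single_byte("Tokyo"))  # 5
print(bytes_two_byte("Tokyo"))     # 10
print(bytes_two_byte("東京"))       # 4

try:
    bytes_single_byte("東京")
except UnicodeEncodeError:
    print("Kanji cannot be stored in the single-byte code page")
```

The same five-letter word doubles in size under two-byte storage, while the Kanji text cannot be represented in the single-byte code page at all. That is the trade-off between storage cost and character coverage described above.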