5.1. String Properties
One string property is whether it is binary or nonbinary:
• A binary string is a sequence of bytes. It can contain any type of information, such
as images, MP3 files, or compressed or encrypted data. A binary string is not
associated with a character set, even if you store a value such as abc that looks like
ordinary text. Binary strings are compared byte by byte using numeric byte values.
• A nonbinary string is a sequence of characters. It stores text that has a particular
character set and collation. The character set defines which characters can be stored
in the string. The collation defines the character ordering, which affects comparison
and sorting operations.
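The difference between the two kinds of comparison can be sketched outside MySQL. The following Python snippet is only an analogy, not MySQL's implementation: bytes values stand in for binary strings, str values stand in for nonbinary strings, and str.casefold() is an assumed stand-in for a case-insensitive collation. The function names are hypothetical.

```python
# Analogy only: Python bytes ~ binary string, Python str ~ nonbinary string.

def binary_equal(a: bytes, b: bytes) -> bool:
    """Compare byte by byte using numeric byte values."""
    return a == b


def ci_equal(a: str, b: str) -> bool:
    """Compare characters under a simple case-insensitive rule
    (a hypothetical stand-in for a case-insensitive collation)."""
    return a.casefold() == b.casefold()


def binary_sort(values: list) -> list:
    """Sort byte strings by their numeric byte values."""
    return sorted(values)


def ci_sort(values: list) -> list:
    """Sort text so that lettercase does not affect the ordering."""
    return sorted(values, key=str.casefold)


if __name__ == "__main__":
    print(binary_equal(b"abc", b"ABC"))        # False: 0x61 differs from 0x41
    print(ci_equal("abc", "ABC"))              # True under the case-insensitive rule
    print(binary_sort([b"b", b"A", b"a", b"B"]))
    print(ci_sort(["b", "A", "a", "B"]))
```

In the binary sort, all uppercase letters come before all lowercase letters because their byte values are smaller. In the case-insensitive sort, A and a are grouped together, as they would be under a case-insensitive collation.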
To see which character sets are available for nonbinary strings, use this statement:
mysql> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
...
The default character set in MySQL is latin1 (MySQL 8.0 and later default to utf8mb4
instead). If you must store characters from several languages in a single column,
consider using one of the Unicode character sets (such as utf8 or ucs2) because they
can represent characters from multiple languages.
Some character sets contain only single-byte characters, whereas others permit
multibyte characters. Some multibyte character sets contain characters of varying lengths.
For others, all characters have a fixed length. For example, Unicode data can be stored
using the utf8 character set in which characters take from one to three bytes or the ucs2
character set in which all characters take two bytes.
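The storage difference can be checked with a short Python sketch. It assumes that Python's utf-8 codec approximates MySQL's utf8 and that utf-16-be approximates ucs2, which holds only for characters in the Basic Multilingual Plane. The byte_lengths helper is hypothetical and is not part of MySQL.

```python
def byte_lengths(text: str) -> dict:
    """Return the encoded size of text, in bytes, under two Unicode encodings."""
    return {
        # Variable length: one to three bytes per BMP character.
        "utf8": len(text.encode("utf-8")),
        # Assumption: big-endian UTF-16 matches UCS-2 for BMP characters,
        # so every character here takes exactly two bytes.
        "ucs2": len(text.encode("utf-16-be")),
    }


if __name__ == "__main__":
    # An ASCII letter, an accented Latin letter, and the euro sign.
    for ch in ["a", "\u00e9", "\u20ac"]:
        print(f"U+{ord(ch):04X}", byte_lengths(ch))
```

For mostly ASCII text, the variable-length encoding is more compact. For text dominated by characters that need three bytes in utf8, the fixed two-byte encoding is smaller.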
In MySQL, the utf8 and ucs2 Unicode character sets include only
characters in the Basic Multilingual Plane (BMP). To use the full set
of Unicode characters, including supplemental characters that lie
outside the BMP, use utf8mb4, in which characters take from one to
four bytes. Other Unicode character sets that include supplemental
characters are utf16, utf16le, and utf32.
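To decide whether a given value needs one of these character sets, you can test whether it contains a code point above U+FFFF. The Python sketch below is one way to do that on the client side. The function names are hypothetical, and the codecs are Python's, used here as an approximation of the MySQL character sets of similar name.

```python
def is_supplemental(ch: str) -> bool:
    """True if the single character lies outside the BMP (code point > U+FFFF)."""
    return ord(ch) > 0xFFFF


def needs_utf8mb4(text: str) -> bool:
    """True if any character in text would not fit in a BMP-only character set."""
    return any(is_supplemental(ch) for ch in text)


if __name__ == "__main__":
    sample = "\U0001F600"  # U+1F600, a supplemental character
    print(needs_utf8mb4("abc"))               # False: all characters are in the BMP
    print(needs_utf8mb4("abc" + sample))      # True
    print(len(sample.encode("utf-8")))        # 4 bytes
    print(len(sample.encode("utf-16-le")))    # 4 bytes (a surrogate pair)
    print(len(sample.encode("utf-32-le")))    # 4 bytes
```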