taste; you could use underscores or dashes, or omit the word-separating formatting
altogether. As with database and table names, a column name can be at most 64
characters long.
Collation and Character Sets
Because not everyone wants to store English strings, it's important that a database
server be able to manage non-English characters and different ways of sorting
characters. When you're comparing or sorting strings, how MySQL evaluates the result
depends on the character set and collation used. Character sets define what characters can
be stored; for example, you may need to store non-English characters such as ٱ or ü. A
collation defines how strings are ordered, and there are different collations for different
languages: for example, the position of the character ü in the alphabet is different in
two German orderings, and different again in Swedish and Finnish.
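To see how the collation changes the position of ü, you can sort the same strings under several latin1 collations. The following is a minimal sketch, not part of the original examples: the table name names_demo is hypothetical, and it assumes that a default database is selected, that the server has the latin1 collations installed, and that your client's connection character set is configured correctly so that ü reaches the server intact.

```sql
-- Hypothetical scratch table; it disappears when the session ends.
CREATE TEMPORARY TABLE names_demo (
  name VARCHAR(20)
) CHARACTER SET latin1;

INSERT INTO names_demo VALUES
  ('Muffler'), ('Müller'), ('MX Systems'), ('MySQL');

-- Swedish/Finnish ordering.
SELECT name FROM names_demo ORDER BY name COLLATE latin1_swedish_ci;

-- German "dictionary" ordering.
SELECT name FROM names_demo ORDER BY name COLLATE latin1_german1_ci;

-- German "phone book" ordering.
SELECT name FROM names_demo ORDER BY name COLLATE latin1_german2_ci;
```

Comparing the three result sets should show Müller moving relative to its neighbors, because each collation assigns ü a different sort position.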
In our previous string-comparison examples, we ignored the collation and character-set
issue, and just let MySQL use its defaults; the default character set is latin1, and
the default collation is latin1_swedish_ci. MySQL can be configured to use different
character sets and collation orders at the connection, database, table, and column
levels.
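As a rough illustration of these four levels, the statements below set a character set and collation at each one. This is a sketch under assumptions: the database name collation_demo, the table name artist_demo, and the column names are invented for the example, and the specific character sets chosen are arbitrary.

```sql
-- Inspect the defaults currently in effect.
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';

-- Connection level: applies to statements sent by this client session.
SET NAMES 'utf8' COLLATE 'utf8_general_ci';

-- Database level: default for tables created in this database.
CREATE DATABASE IF NOT EXISTS collation_demo
  DEFAULT CHARACTER SET latin1
  DEFAULT COLLATE latin1_swedish_ci;

-- Table level (after the closing parenthesis) and column level
-- (inside the column definition, overriding the table default).
CREATE TABLE collation_demo.artist_demo (
  artist_name VARCHAR(100),
  german_name VARCHAR(100) CHARACTER SET latin1 COLLATE latin1_german2_ci
) DEFAULT CHARACTER SET latin1 COLLATE latin1_swedish_ci;
```

A setting at a more specific level takes precedence over the broader one, so german_name here would use latin1_german2_ci while artist_name would inherit the table default.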
You can list the character sets available on your server with the SHOW CHARACTER SET
command. This shows a short description for each character set, its default collation,
and the maximum number of bytes used for each character in that character set:
mysql> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| big5     | Big5 Traditional Chinese    | big5_chinese_ci     |      2 |
| dec8     | DEC West European           | dec8_swedish_ci     |      1 |
| cp850    | DOS West European           | cp850_general_ci    |      1 |
| hp8      | HP West European            | hp8_english_ci      |      1 |
| koi8r    | KOI8-R Relcom Russian       | koi8r_general_ci    |      1 |
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| latin2   | ISO 8859-2 Central European | latin2_general_ci   |      1 |
| swe7     | 7bit Swedish                | swe7_swedish_ci     |      1 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
| ujis     | EUC-JP Japanese             | ujis_japanese_ci    |      3 |
| sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |      2 |
| hebrew   | ISO 8859-8 Hebrew           | hebrew_general_ci   |      1 |
| tis620   | TIS620 Thai                 | tis620_thai_ci      |      1 |
| euckr    | EUC-KR Korean               | euckr_korean_ci     |      2 |
| koi8u    | KOI8-U Ukrainian            | koi8u_general_ci    |      1 |
| gb2312   | GB2312 Simplified Chinese   | gb2312_chinese_ci   |      2 |
| greek    | ISO 8859-7 Greek            | greek_general_ci    |      1 |
| cp1250   | Windows Central European    | cp1250_general_ci   |      1 |
| gbk      | GBK Simplified Chinese      | gbk_chinese_ci      |      2 |
| latin5   | ISO 8859-9 Turkish          | latin5_turkish_ci   |      1 |
| armscii8 | ARMSCII-8 Armenian          | armscii8_general_ci |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
 