Commit ca1860c3 authored by Alexander Barkov

Updating charset doc files.

Thanks to Paul for preparing the up-to-date files
reflecting 4.1 changes.
parent 585d205d
-This directory holds configuration files which allow MySQL to work with
+This directory holds configuration files that enable MySQL to work with
different character sets. It contains:
-*.conf
-Each conf file contains four tables which describe character types,
+charset_name.xml
+Each charset_name.xml file contains information for a simple character
+set. The information in the file describes character types,
lower- and upper-case equivalencies and sorting orders for the
character values in the set.
-Index
-The Index file lists all of the available charset configurations.
+Index.xml
+The Index.xml file lists all of the available charset configurations,
+including collations.
-Each charset is paired with a number. The number is stored
-IN THE DATABASE TABLE FILES and must not be changed. Always
-add new character sets to the end of the list, so that the
-numbers of the other character sets will not be changed.
+Each collation must have a unique number. The number is stored
+IN THE DATABASE TABLE FILES and must not be changed.
+The max-id attribute of the <charsets> element must be set to
+the largest collation number.
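The numbering rules for Index.xml lend themselves to a mechanical check. The following is a minimal sketch, not part of MySQL. Only the `<charsets>` element and its `max-id` attribute are named in the text above; the `<charset>` and `<collation>` elements, their `name` and `id` attributes, the sample document and the function name `check_index` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Apart from <charsets max-id="...">, the
# element and attribute names are assumptions, not taken from the README.
SAMPLE_INDEX = """
<charsets max-id="3">
  <charset name="example_a">
    <collation name="example_a_ci" id="1"/>
    <collation name="example_a_bin" id="3"/>
  </charset>
  <charset name="example_b">
    <collation name="example_b_ci" id="2"/>
  </charset>
</charsets>
"""


def check_index(xml_text):
    """Return a list of problems found in an Index.xml-like document."""
    root = ET.fromstring(xml_text)
    problems = []
    seen = {}  # collation id -> name of the first collation using it
    for coll in root.iter("collation"):
        cid = int(coll.get("id"))
        name = coll.get("name")
        if cid in seen:
            problems.append(f"id {cid} used by both {seen[cid]} and {name}")
        else:
            seen[cid] = name
    # max-id must equal the largest collation number in the file.
    max_id = int(root.get("max-id"))
    if seen and max_id != max(seen):
        problems.append(
            f"max-id is {max_id}, largest collation id is {max(seen)}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_index(SAMPLE_INDEX):
        print(problem)
```

A check like this could run before committing a change to Index.xml, since a renumbered or duplicated collation would no longer match the numbers already stored in table files.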
Compiled in or configuration file?
When should a character set be compiled in to MySQL's string library
-(libmystrings), and when should it be placed in a configuration
-file?
+(libmystrings), and when should it be placed in a charset_name.xml
+configuration file?
If the character set requires the strcoll functions or is a
multi-byte character set, it MUST be compiled in to the string
library. If it does not require these functions, it should be
-placed in a configuration file.
+placed in a charset_name.xml configuration file.
If the character set uses any one of the strcoll functions, it
must define all of them. Likewise, if the set uses one of the
@@ -30,11 +33,7 @@ Compiled in or configuration file?
more information on how to add a complex character set to MySQL.
Syntax of configuration files
-The syntax is very simple. Comments start with a '#' character and
-proceed to the end of the line. Words are separated by arbitrary
-amounts of whitespace.
-For the character set configuration files, every word must be a
-number in hexadecimal format. The ctype array takes up the first
-257 words; the to_lower, to_upper and sort_order arrays take up 256
-words each after that.
+The syntax is very simple. Words in <map> array elements are
+separated by arbitrary amounts of whitespace. Each word must be a
+number in hexadecimal format. The ctype array has 257 words; the
+other arrays (lower, upper, etc.) take up 256 words each after that.
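The new `<map>` syntax can be illustrated with a short parsing sketch. This is an assumption-laden illustration rather than MySQL's own loader: the function names `parse_map_words` and `check_array` are hypothetical, and the code only covers what the text states (whitespace-separated hexadecimal words, 257 words for ctype, 256 for the other arrays).

```python
def parse_map_words(text):
    """Split a <map> body on arbitrary whitespace and parse each word as hex."""
    return [int(word, 16) for word in text.split()]


def check_array(name, words):
    """Check the word count: 257 for ctype, 256 for every other array."""
    expected = 257 if name == "ctype" else 256
    if len(words) != expected:
        raise ValueError(
            f"{name}: expected {expected} words, got {len(words)}"
        )
    return words


if __name__ == "__main__":
    # Spaces, tabs and newlines all count as separators.
    print(parse_map_words("00 01\n\t02   1F"))
```

Because `str.split()` with no argument treats any run of whitespace as a single separator, the line layout inside a `<map>` element does not affect the parsed result.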