Unicode ======= Encoding systems ---------------- Computers deal with numbers, this means that letters and other characters are internally represented as numbers. Basically, an encoding system associates each character with a number. For example, for the encoding named ASCII the number 97 represents the character "a". So a text is just a sequence of characters, and for computers it's just a sequence of numbers. There're many different encoding systems, each one is used to represent characters from one or more languages. For example the ASCII encoding is used for english; ISO-8859-1 can be used with spanish, french or german; EUC-JP represents japanese characters; etc.. The problem is that different encodings can use the same number to represent different charecters, then they're incompatible. This is a problem for example if you want to mix different languages in the same text. Unicode ------- To solve this problem Unicode appeared. Unicode is an encoding system that is able to represent all the characters in the world. Using Unicode it's possible to mix different languages in the same text without problems. Python ------ The Python programming language provides two types of strings, normal strings and unicode strings. Internationalized software written in Python always should use unicode strings for text. Normal strings represent sequences of bytes while unicode strings represent sequences of characters. Unicode strings provide a higher abstraction layer for the programmer that lets to forget, most of the time, about the encoding issues. Encoding becomes an issue when an unicode string needs to be serialized, for example when the server response is sent to the browser. Then an specific encoding needs to be choosen. For fully multilingual applications this should be UTF-8, which is a particular representation of the Unicode character set. .. seealso:: Related links General information about Unicode: * `Official Unicode web site <http://www.unicode.org/>`_ * `UTF-8 and Unicode FAQ for Unix/Linux <http://www.cl.cam.ac.uk/~mgk25/unicode.html>`_ * `Unicode and Multilingual Support in HTML, Fonts, Web Browsers and Other Applications <http://www.alanwood.net/unicode/>`_ Python resources for Unicode: * `Python Unicode Tutorial <http://www.reportlab.com/i18n/python_unicode_tutorial.html>`_ * `Python Internationalization Special Interest Group <http://www.python.org/sigs/i18n-sig/>`_