- 16 Oct, 2010 1 commit
-
-
Victor Stinner authored
_Py_char2wchar() callers usually need the result size in characters. Since it's trivial to compute it in _Py_char2wchar() (O(1) whereas wcslen() is O(n)), add an option to get it.
-
- 15 Oct, 2010 1 commit
-
-
Victor Stinner authored
* PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if Py_FileSystemDefaultEncoding is NULL
* redecode_filenames() functions and _Py_code_object_list (issue #9630) are no longer needed: remove them
-
- 14 Oct, 2010 1 commit
-
-
Georg Brandl authored
-
- 07 Oct, 2010 1 commit
-
-
Victor Stinner authored
All unicode functions use PyObject* except PyUnicode_AsWideChar(). Fix the prototype for the new function PyUnicode_AsWideCharString().
-
- 02 Oct, 2010 2 commits
-
-
Victor Stinner authored
UTF-16 surrogate pairs by single non-BMP characters for 16-bit Py_UNICODE and 32-bit wchar_t (e.g. Linux in narrow build).
-
Victor Stinner authored
And write unit tests for PyUnicode_AsWideChar() and PyUnicode_AsWideCharString().
-
- 29 Sep, 2010 3 commits
-
-
Victor Stinner authored
-
Victor Stinner authored
Redecode the filenames of:
  - all modules: __file__ and __path__ attributes
  - all code objects: co_filename attribute
  - sys.path
  - sys.meta_path
  - sys.executable
  - sys.path_importer_cache (keys)
Keep weak references to all code objects until initfsencoding() is called, to be able to redecode co_filename attribute of all code objects.
-
Victor Stinner authored
-
- 12 Sep, 2010 3 commits
-
-
Benjamin Peterson authored
-
Victor Stinner authored
-
Benjamin Peterson authored
-
- 11 Sep, 2010 1 commit
-
-
Victor Stinner authored
a non-ASCII byte in the format string. Also document the encoding.
-
- 03 Sep, 2010 1 commit
-
-
Victor Stinner authored
-
- 01 Sep, 2010 5 commits
-
-
Victor Stinner authored
-
Victor Stinner authored
-
Victor Stinner authored
Inline its value in PyUnicode_GetDefaultEncoding(). The comment is now outdated (we will not change its value anymore).
-
Antoine Pitrou authored
are now removed, since they had no effect in 3.x (the default encoding is hardcoded to utf-8 and cannot be changed).
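The hardcoded default can be observed from Python itself. This is a minimal check, assuming a Python 3 interpreter; it only reads the value and does not show the code that was removed.

```python
import sys

# In Python 3 the default string encoding is fixed by the interpreter;
# it can be read here but not reconfigured at runtime.
default_encoding = sys.getdefaultencoding()
print(default_encoding)
```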
-
Antoine Pitrou authored
properly. Patch by Stefan Behnel.
-
- 24 Aug, 2010 1 commit
-
-
Daniel Stutzbach authored
Issue 8781: On systems with a signed 4-byte wchar_t and a 4-byte Py_UNICODE, use memcpy to convert between the two (as already done when wchar_t is unsigned)
-
- 18 Aug, 2010 1 commit
-
-
Victor Stinner authored
-
- 16 Aug, 2010 1 commit
-
-
Victor Stinner authored
The code is based on strncmp() from the libiberty library, a function in the public domain.
-
- 13 Aug, 2010 2 commits
-
-
Victor Stinner authored
It's a ParseTuple converter: decode bytes objects to unicode using PyUnicode_DecodeFSDefaultAndSize(); str objects are output as-is.
* Don't mention the surrogateescape error handler in the comments or the documentation; refer to PyUnicode_DecodeFSDefaultAndSize() and PyUnicode_EncodeFSDefault() instead, because these functions use the strict error handler for the mbcs encoding (on Windows).
* Remove the PyUnicode_FSConverter() comment in unicodeobject.c to avoid inconsistency with unicodeobject.h.
-
Victor Stinner authored
Similar to PyErr_WarnEx() but uses PyUnicode_FromFormatV() to format the warning message. Also strip some trailing spaces.
-
- 11 Aug, 2010 1 commit
-
-
Alexander Belopolsky authored
va_copy, but available on all Python platforms. Untabified a few unrelated files.
-
- 10 Aug, 2010 1 commit
-
-
Victor Stinner authored
-
- 01 Aug, 2010 2 commits
-
-
Georg Brandl authored
Revert r83395: it introduces test failures and is not necessary, since we now have to nul-terminate the string anyway.
-
Georg Brandl authored
#8821: do not rely on Unicode strings being terminated with a \u0000, rather explicitly check range before looking for a second surrogate character.
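The explicit range check can be sketched in Python. `decode_utf16_units` is a hypothetical helper written for illustration, not the C code touched by this commit: it walks a list of 16-bit code units and tests the index before reading a possible second surrogate, instead of relying on a terminator.

```python
def decode_utf16_units(units):
    """Combine surrogate pairs in a sequence of 16-bit code units.

    Hypothetical illustration only: lone surrogates are passed through.
    """
    code_points = []
    i = 0
    n = len(units)
    while i < n:
        unit = units[i]
        # Check the range first (i + 1 < n), only then look at the next unit.
        if (0xD800 <= unit <= 0xDBFF
                and i + 1 < n
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            low = units[i + 1]
            code_points.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            code_points.append(unit)
            i += 1
    return code_points
```

A high surrogate at the very end of the input is kept as-is, because the index check fails before any out-of-range read can happen.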
-
- 29 Jul, 2010 1 commit
-
-
Georg Brandl authored
-
- 19 Jul, 2010 1 commit
-
-
Stefan Krah authored
-
- 05 Jul, 2010 1 commit
-
-
Senthil Kumaran authored
-
- 03 Jul, 2010 1 commit
-
-
Ezio Melotti authored
-
- 01 Jul, 2010 1 commit
-
-
Ezio Melotti authored
1) #8271: when a byte sequence is invalid, only the start byte and all the valid continuation bytes are now replaced by U+FFFD, instead of replacing the number of bytes specified by the start byte. See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-byte-long UTF-8 sequences are now considered invalid (no changes in behavior);
3) Change the error messages "unexpected code byte" to "invalid start byte" and "invalid data" to "invalid continuation byte";
4) Add an extensive set of tests in test_unicode;
5) Fix test_codeccallbacks because it was failing after this change.
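The resulting behavior can be checked from Python. This is a small sketch, assuming a current CPython 3 interpreter; `decode_reason` is a hypothetical helper introduced here, not part of the commit.

```python
def decode_reason(data):
    """Return the reason string of the strict UTF-8 decoding error, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return None


# 0xF0 announces a 4-byte sequence, but "A" is not a continuation byte:
# only the start byte is replaced by U+FFFD and "A" survives.
replaced = b"\xf0A".decode("utf-8", "replace")

print(repr(replaced))
print(decode_reason(b"\x80"))                  # continuation byte in start position
print(decode_reason(b"\xc3("))                 # start byte, then a non-continuation byte
print(decode_reason(b"\xf8\x88\x80\x80\x80"))  # 5-byte form, rejected at its first byte
```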
-
- 27 Jun, 2010 1 commit
-
-
Georg Brandl authored
-
- 26 Jun, 2010 1 commit
-
-
Ezio Melotti authored
svn+ssh://pythondev@svn.python.org/python/trunk
........
  r82248 | ezio.melotti | 2010-06-26 21:44:42 +0300 (Sat, 26 Jun 2010) | 1 line
  Fix extra space.
........
-
- 16 Jun, 2010 1 commit
-
-
Victor Stinner authored
mode raises unicode errors. The encoder only supports "strict" and "replace" error handlers, the decoder only supports "strict" and "ignore" error handlers.
-
- 12 Jun, 2010 1 commit
-
-
Mark Dickinson authored
-
- 11 Jun, 2010 2 commits
-
-
Victor Stinner authored
filenames and enable os.fsencode().
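The beginning of this message is cut off, so the following is only a sketch of how file name round-tripping works from Python. It assumes the `surrogateescape` error handler (PEP 383), which Python uses for file names on POSIX; the commit text above does not name the handler.

```python
import os

# Assumption: 'surrogateescape' (PEP 383). Undecodable bytes are mapped to
# lone surrogates so the original bytes can be recovered when encoding.
raw_name = b"caf\xe9.txt"  # Latin-1 bytes, not valid UTF-8
text_name = raw_name.decode("utf-8", "surrogateescape")
round_trip = text_name.encode("utf-8", "surrogateescape")

# os.fsencode() / os.fsdecode() apply the file system encoding and its
# error handler; a plain ASCII name round-trips on every platform.
ascii_round_trip = os.fsdecode(os.fsencode("example.txt"))

print(ascii(text_name), round_trip == raw_name, ascii_round_trip)
```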
-
Antoine Pitrou authored
svn+ssh://pythondev@svn.python.org/python/trunk
........
  r81907 | antoine.pitrou | 2010-06-11 23:42:26 +0200 (Fri, 11 Jun 2010) | 5 lines
  Issue #8941: decoding big endian UTF-32 data in UCS-2 builds could crash the interpreter with characters outside the Basic Multilingual Plane (higher than 0x10000).
........
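To illustrate why non-BMP characters need special care on a UCS-2 build, the sketch below decodes big-endian UTF-32 data and splits the result into the two 16-bit code units such a build would have to store. `to_surrogate_pair` is a hypothetical helper, and the example assumes Python 3.3 or later, where a non-BMP character is a single `str` element.

```python
def to_surrogate_pair(code_point):
    """Split a non-BMP code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a non-BMP code point")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


# Big-endian UTF-32 bytes for U+1F600, a character outside the BMP.
decoded = b"\x00\x01\xf6\x00".decode("utf-32-be")
pair = to_surrogate_pair(ord(decoded))

print(hex(ord(decoded)), [hex(unit) for unit in pair])
```

Each such character takes two output slots instead of one, so a decoder that sizes its buffer at one slot per input character has to account for the extra unit.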
-
- 10 Jun, 2010 1 commit
-
-
Victor Stinner authored
Don't use normalize_encoding() result if it is truncated.
-