Commits · bd3b71a7bb5fcf80de1339fe71c85918a44fb7e3 · Kirill Smelkov / cpython

16 Oct, 2010 1 commit

Add an optional size argument to _Py_char2wchar() · bd3b71a7

Victor Stinner authored 14 years ago

_Py_char2wchar() callers usually need the result size in characters. Since it's
trivial to compute it in _Py_char2wchar() (O(1) whereas wcslen() is O(n)), add
an option to get it.

bd3b71a7

15 Oct, 2010 1 commit

Use locale encoding if Py_FileSystemDefaultEncoding is not set · a47a00d4

Victor Stinner authored 14 years ago

 * PyUnicode_EncodeFSDefault(), PyUnicode_DecodeFSDefaultAndSize() and
   PyUnicode_DecodeFSDefault() use the locale encoding instead of UTF-8 if
   Py_FileSystemDefaultEncoding is NULL
 * redecode_filenames() functions and _Py_code_object_list (issue #9630)
   are no longer needed: remove them

a47a00d4

14 Oct, 2010 1 commit
- #9418: first step of moving private string methods to _string module. · e9e78942
  Georg Brandl authored 14 years ago
  
  e9e78942
07 Oct, 2010 1 commit

PyUnicode_AsWideCharString() takes a PyObject*, not a PyUnicodeObject* · 90fd8770

Victor Stinner authored 14 years ago

All Unicode functions use PyObject* except PyUnicode_AsWideChar(). Fix the
prototype for the new function PyUnicode_AsWideCharString().

90fd8770

02 Oct, 2010 2 commits
- Issue #8670: PyUnicode_AsWideChar() and PyUnicode_AsWideCharString() replace · 51eca407
  Victor Stinner authored 14 years ago
```
UTF-16 surrogate pairs by single non-BMP characters for 16-bit Py_UNICODE
and 32-bit wchar_t (e.g. Linux in narrow build).
```
  51eca407
- Issue #8870: PyUnicode_AsWideCharString() doesn't count the trailing nul character · 9622410e
  Victor Stinner authored 14 years ago
```
And write unit tests for PyUnicode_AsWideChar() and PyUnicode_AsWideCharString().
```
  9622410e
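
The surrogate-pair handling described in the Issue #8670 entry above can be illustrated with a small Python sketch. It is not the C code from the commit: `combine_surrogates` is a hypothetical helper, and the list of integers stands in for a 16-bit Py_UNICODE buffer being widened to 32-bit units.

```python
def combine_surrogates(units):
    """Merge UTF-16 surrogate pairs in a list of 16-bit code units.

    Hypothetical helper for illustration only; it models the conversion
    from 16-bit units to 32-bit code points, not the CPython C code.
    """
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        # Check the range before looking at the next unit, so a lone
        # high surrogate at the end of the input is copied unchanged.
        if (0xD800 <= unit <= 0xDBFF
                and i + 1 < len(units)
                and 0xDC00 <= units[i + 1] <= 0xDFFF):
            low = units[i + 1]
            out.append(0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            out.append(unit)
            i += 1
    return out


narrow = [0x0041, 0xD83D, 0xDE00]   # "A" followed by the pair for U+1F600
wide = combine_surrogates(narrow)   # two code points: 0x41 and 0x1F600
```

The result is shorter than the input whenever a pair is merged, which is why the size reported to the caller has to be computed from the converted buffer rather than from the source length.
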
29 Sep, 2010 3 commits
- Fix PyUnicode_AsWideCharString(): set *size if size is not NULL · 59c973ee
  Victor Stinner authored 14 years ago
  
  59c973ee
- Issue #9630: Redecode filenames when setting the filesystem encoding · 82deba1f
  Victor Stinner authored 14 years ago
```
Redecode the filenames of:

 - all modules: __file__ and __path__ attributes
 - all code objects: co_filename attribute
 - sys.path
 - sys.meta_path
 - sys.executable
 - sys.path_importer_cache (keys)

Keep weak references to all code objects until initfsencoding() is called, to
be able to redecode co_filename attribute of all code objects.
```
  82deba1f
- Issue #9979: Create function PyUnicode_AsWideCharString(). · b876f4df
  Victor Stinner authored 14 years ago
  
  b876f4df
12 Sep, 2010 3 commits
- use return NULL; it's just as correct · 2f15b108
  Benjamin Peterson authored 14 years ago
  
  2f15b108
- Issue #9738, #9836: Fix refleak introduced by r84704 · 8630ada7
  Victor Stinner authored 14 years ago
  
  8630ada7
- detect non-ascii characters much earlier (plugs ref leak) · f96ec53b
  Benjamin Peterson authored 14 years ago
  
  f96ec53b
11 Sep, 2010 1 commit
- Issue #9738: PyUnicode_FromFormat() and PyErr_Format() raise an error on · cf409113
  Victor Stinner authored 14 years ago
```
a non-ASCII byte in the format string.

Also document the encoding.
```
  cf409113
03 Sep, 2010 1 commit
- Rename PyUnicode_strdup() to PyUnicode_AsUnicodeCopy() · 30af7c33
  Victor Stinner authored 14 years ago
  
  30af7c33
01 Sep, 2010 5 commits
- Create PyUnicode_strdup() function · 60fbac95
  Victor Stinner authored 14 years ago
  
  60fbac95
- Create Py_UNICODE_strcat() function · 74676426
  Victor Stinner authored 14 years ago
  
  74676426
- Remove unicode_default_encoding constant · eda3e00d
  Victor Stinner authored 14 years ago
```
Inline its value in PyUnicode_GetDefaultEncoding(). The comment is now outdated
(we will not change its value anymore).
```
  eda3e00d
- Issue #9549: sys.setdefaultencoding() and PyUnicode_SetDefaultEncoding() · dbbf653e
  Antoine Pitrou authored 14 years ago
```
are now removed, since they had no effect in 3.x (the default
encoding is hardcoded to utf-8 and cannot be changed).
```
  dbbf653e
- Issue #7415: PyUnicode_FromEncodedObject() now uses the new buffer API · a3e883f7
  Antoine Pitrou authored 14 years ago
```
properly.  Patch by Stefan Behnel.
```
  a3e883f7
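
The effect of the Issue #9549 entry above is visible from Python code. A minimal check, assuming a Python 3 interpreter that includes this change:

```python
import sys

# The default encoding is fixed; there is no setter left to change it.
default_encoding = sys.getdefaultencoding()
has_setter = hasattr(sys, "setdefaultencoding")

# str.encode() without arguments uses UTF-8.
encoded = "\u00e9".encode()
```
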
24 Aug, 2010 1 commit

Issue #8781: On systems with a signed 4-byte wchar_t and a 4-byte Py_UNICODE, use... · 8b4800c8

Daniel Stutzbach authored 14 years ago

Issue #8781: On systems with a signed 4-byte wchar_t and a 4-byte Py_UNICODE, use memcpy to convert between the two (as already done when wchar_t is unsigned)

8b4800c8

18 Aug, 2010 1 commit
- Fix PyUnicode_EncodeFSDefault() indentation · fd751a66
  Victor Stinner authored 14 years ago
  
  fd751a66
16 Aug, 2010 1 commit
- Issue #9425: Create Py_UNICODE_strncmp() function · 26fd2a01
  Victor Stinner authored 14 years ago
```
The code is based on strncmp() of the libiberty library,
function in the public domain.
```
  26fd2a01
13 Aug, 2010 2 commits

Issue #9542: Create PyUnicode_FSDecoder() function · 0cf2b810

Victor Stinner authored 14 years ago

It's a ParseTuple converter: decode bytes objects to unicode using
PyUnicode_DecodeFSDefaultAndSize(); str objects are output as-is.

 * Don't name the surrogateescape error handler in the comments or the
   documentation; refer to PyUnicode_DecodeFSDefaultAndSize() and
   PyUnicode_EncodeFSDefault() instead, because these functions use the
   strict error handler for the mbcs encoding (on Windows).
 * Remove PyUnicode_FSConverter() comment in unicodeobject.c to avoid
   inconsistency with unicodeobject.h.

0cf2b810
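
The converter behaviour described above (bytes are decoded, str passes through) can be sketched at the Python level. `fs_decoder` is a hypothetical stand-in, not the C function, and it relies on `os.fsdecode()`, which was added to the standard library later and uses the filesystem encoding; which other argument types the real converter accepts is not stated in the commit message.

```python
import os


def fs_decoder(path):
    """Hypothetical Python-level model of the converter described above."""
    if isinstance(path, str):
        # str objects are returned as-is.
        return path
    if isinstance(path, bytes):
        # bytes objects are decoded with the filesystem encoding.
        return os.fsdecode(path)
    raise TypeError("expected str or bytes, got %s" % type(path).__name__)


from_str = fs_decoder("spam.txt")
from_bytes = fs_decoder(b"spam.txt")
```
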

Issue #9425: Create PyErr_WarnFormat() function · 5c829d60

Victor Stinner authored 14 years ago

Similar to PyErr_WarnEx(), but uses PyUnicode_FromFormatV() to format the
warning message.

Also strip some trailing spaces.

5c829d60

11 Aug, 2010 1 commit
- Issue #2443: Added a new macro, Py_VA_COPY, which is equivalent to C99 · 35ce36d9
  Alexander Belopolsky authored 14 years ago
```
va_copy, but available on all python platforms.  Untabified a few
unrelated files.
```
  35ce36d9
10 Aug, 2010 1 commit
- Issue #9425: create Py_UNICODE_strrchr() function · 40cbe578
  Victor Stinner authored 14 years ago
  
  40cbe578
01 Aug, 2010 2 commits

Revert r83395, it introduces test failures and is not necessary anyway since... · 28cf3b41
Georg Brandl authored 14 years ago
```
Revert r83395, it introduces test failures and is not necessary anyway since we now have to nul-terminate the string anyway.
```
28cf3b41

#8821: do not rely on Unicode strings being terminated with a \u0000, rather... · 5fe7888e

Georg Brandl authored 14 years ago

#8821: do not rely on Unicode strings being terminated with a \u0000, rather explicitly check range before looking for a second surrogate character.

5fe7888e

29 Jul, 2010 1 commit
- Use Py_CLEAR(). · 6275527d
  Georg Brandl authored 14 years ago
  
  6275527d
19 Jul, 2010 1 commit
- Sub-issue of #9036: Fix incorrect use of Py_CHARMASK. · 917ea3ff
  Stefan Krah authored 14 years ago
  
  917ea3ff
05 Jul, 2010 1 commit
- Fix the docstrings of the capitalize method. · 9e27dd8a
  Senthil Kumaran authored 14 years ago
  
  9e27dd8a
03 Jul, 2010 1 commit
- Update comment about surrogates. · 91544b8c
  Ezio Melotti authored 14 years ago
  
  91544b8c
01 Jul, 2010 1 commit

Update PyUnicode_DecodeUTF8 from RFC 2279 to RFC 3629. · 6d1bd194

Ezio Melotti authored 14 years ago

1) #8271: when a byte sequence is invalid, only the start byte and all the
valid continuation bytes are now replaced by U+FFFD, instead of replacing
the number of bytes specified by the start byte.
See http://www.unicode.org/versions/Unicode5.2.0/ch03.pdf (pages 94-95);
2) 5- and 6-byte UTF-8 sequences are now considered invalid (no changes
in behavior);
3) Change the error messages "unexpected code byte" to "invalid start byte"
and "invalid data" to "invalid continuation byte";
4) Add an extensive set of tests in test_unicode;
5) Fix test_codeccallbacks because it was failing after this change.

6d1bd194
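
The decoding rules in points 1 to 3 above can be observed from Python. A short sketch, assuming a current Python 3 interpreter; `decode_reason` is a helper introduced here for illustration:

```python
def decode_reason(data):
    """Return the UnicodeDecodeError reason for data, or None if it decodes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return None


# 0xFF can never start a UTF-8 sequence.
bad_start = decode_reason(b"\xff")          # "invalid start byte"

# 0xE2 announces a 3-byte sequence, but 0x28 is not a continuation byte.
bad_continuation = decode_reason(b"\xe2\x28\xa1")   # "invalid continuation byte"

# Start byte plus one valid continuation byte, then an ASCII letter:
# the truncated sequence becomes a single U+FFFD and "A" is preserved.
truncated = b"\xe2\x82A".decode("utf-8", errors="replace")   # "\ufffdA"

# A 5-byte sequence in the old RFC 2279 style is rejected.
five_byte = b"\xf8\x88\x80\x80\x80".decode("utf-8", errors="replace")
```
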

27 Jun, 2010 1 commit
- #9078: fix some Unicode C API descriptions, in comments and docs. · 65708255
  Georg Brandl authored 14 years ago
  
  65708255
26 Jun, 2010 1 commit

Merged revisions 82248 via svnmerge from · 5e6ff1de

Ezio Melotti authored 14 years ago

svn+ssh://pythondev@svn.python.org/python/trunk

........
  r82248 | ezio.melotti | 2010-06-26 21:44:42 +0300 (Sat, 26 Jun 2010) | 1 line

  Fix extra space.
........

5e6ff1de

16 Jun, 2010 1 commit

Issue #850997: mbcs encoding (Windows only) handles errors argument: strict · 9ab7b8dd

Victor Stinner authored 14 years ago

mode raises unicode errors. The encoder only supports "strict" and "replace"
error handlers, the decoder only supports "strict" and "ignore" error handlers.

9ab7b8dd

12 Jun, 2010 1 commit
- Silence 'unused variable' gcc warning. Patch by Éric Araujo. · 89448c19
  Mark Dickinson authored 14 years ago
  
  89448c19
11 Jun, 2010 2 commits

Issue #8969: On Windows, use mbcs codec in strict mode to encode and decode · d86accdc
Victor Stinner authored 14 years ago
```
filenames and enable os.fsencode().
```
d86accdc

Merged revisions 81907 via svnmerge from · 3c1dbd91

Antoine Pitrou authored 14 years ago

svn+ssh://pythondev@svn.python.org/python/trunk

........
  r81907 | antoine.pitrou | 2010-06-11 23:42:26 +0200 (Fri, 11 Jun 2010) | 5 lines

  Issue #8941: decoding big endian UTF-32 data in UCS-2 builds could crash
  the interpreter with characters outside the Basic Multilingual Plane
  (0x10000 and higher).
........

3c1dbd91
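
For reference, the kind of input involved in Issue #8941 is a big-endian UTF-32 code unit above the Basic Multilingual Plane. A minimal sketch of decoding one, assuming a current Python 3 interpreter (where there is no longer a UCS-2 build):

```python
# U+1F600 as one big-endian 32-bit code unit.
data = (0x1F600).to_bytes(4, "big")

# Decoding yields a single non-BMP character.
text = data.decode("utf-32-be")
code_point = ord(text)
```
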

10 Jun, 2010 1 commit
- Fix r81869: ISO-8859-15 was seen as an alias to ISO-8859-1 · a8f37e60
  Victor Stinner authored 14 years ago
```
Don't use normalize_encoding() result if it is truncated.
```
  a8f37e60
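
The two encodings confused by the bug above are distinct codecs. A quick way to see the difference from Python, assuming the standard `codecs` module (this shows the fixed behaviour and does not reproduce the truncated name normalization itself):

```python
import codecs

latin1 = codecs.lookup("ISO-8859-1")
latin9 = codecs.lookup("ISO-8859-15")

# The euro sign exists in ISO-8859-15 (byte 0xA4) but not in ISO-8859-1.
euro_latin9 = "\u20ac".encode("ISO-8859-15")
try:
    "\u20ac".encode("ISO-8859-1")
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False
```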