Commit 07985ef3 authored by Serhiy Storchaka

Issue #22286: The "backslashreplace" error handler now works with
decoding and translating.
parent 58f02019
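As a quick orientation before the diff, the following is a minimal sketch of the behaviour this commit describes, assuming an interpreter that already contains the change (Python 3.5 or later). The inputs and expected values are taken from the documentation example and the tests added in the diff below; the variable names are only illustrative.

```python
import codecs

# Decoding: the undecodable byte 0x80 is kept as a visible \x80 escape
# instead of raising UnicodeDecodeError.
decoded = b"\x80abc".decode("utf-8", "backslashreplace")
print(decoded)

# The handler can also be called directly with a decode error ...
decode_exc = UnicodeDecodeError("ascii", b"\xff", 0, 1, "ouch")
decode_result = codecs.backslashreplace_errors(decode_exc)
print(decode_result)

# ... or with a translate error. Before this commit both calls
# raised TypeError, as the removed tests in the diff show.
translate_exc = UnicodeTranslateError("\u3042", 0, 1, "ouch")
translate_result = codecs.backslashreplace_errors(translate_exc)
print(translate_result)
```

Each call returns a `(replacement, resume_position)` tuple, which is the contract every registered error handler follows.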
...@@ -280,8 +280,9 @@ and optionally an *errors* argument. ...@@ -280,8 +280,9 @@ and optionally an *errors* argument.
The *errors* argument specifies the response when the input string can't be The *errors* argument specifies the response when the input string can't be
converted according to the encoding's rules. Legal values for this argument are converted according to the encoding's rules. Legal values for this argument are
``'strict'`` (raise a :exc:`UnicodeDecodeError` exception), ``'replace'`` (use ``'strict'`` (raise a :exc:`UnicodeDecodeError` exception), ``'replace'`` (use
``U+FFFD``, ``REPLACEMENT CHARACTER``), or ``'ignore'`` (just leave the ``U+FFFD``, ``REPLACEMENT CHARACTER``), ``'ignore'`` (just leave the
character out of the Unicode result). character out of the Unicode result), or ``'backslashreplace'`` (inserts a
``\xNN`` escape sequence).
The following examples show the differences:: The following examples show the differences::
>>> b'\x80abc'.decode("utf-8", "strict") #doctest: +NORMALIZE_WHITESPACE >>> b'\x80abc'.decode("utf-8", "strict") #doctest: +NORMALIZE_WHITESPACE
...@@ -291,6 +292,8 @@ The following examples show the differences:: ...@@ -291,6 +292,8 @@ The following examples show the differences::
invalid start byte invalid start byte
>>> b'\x80abc'.decode("utf-8", "replace") >>> b'\x80abc'.decode("utf-8", "replace")
'\ufffdabc' '\ufffdabc'
>>> b'\x80abc'.decode("utf-8", "backslashreplace")
'\\x80abc'
>>> b'\x80abc'.decode("utf-8", "ignore") >>> b'\x80abc'.decode("utf-8", "ignore")
'abc' 'abc'
......
...@@ -314,8 +314,8 @@ The following error handlers are only applicable to ...@@ -314,8 +314,8 @@ The following error handlers are only applicable to
| | reference (only for encoding). Implemented | | | reference (only for encoding). Implemented |
| | in :func:`xmlcharrefreplace_errors`. | | | in :func:`xmlcharrefreplace_errors`. |
+-------------------------+-----------------------------------------------+ +-------------------------+-----------------------------------------------+
| ``'backslashreplace'`` | Replace with backslashed escape sequences | | ``'backslashreplace'`` | Replace with backslashed escape sequences. |
| | (only for encoding). Implemented in | | | Implemented in |
| | :func:`backslashreplace_errors`. | | | :func:`backslashreplace_errors`. |
+-------------------------+-----------------------------------------------+ +-------------------------+-----------------------------------------------+
| ``'namereplace'`` | Replace with ``\N{...}`` escape sequences | | ``'namereplace'`` | Replace with ``\N{...}`` escape sequences |
...@@ -350,6 +350,10 @@ In addition, the following error handler is specific to the given codecs: ...@@ -350,6 +350,10 @@ In addition, the following error handler is specific to the given codecs:
.. versionadded:: 3.5 .. versionadded:: 3.5
The ``'namereplace'`` error handler. The ``'namereplace'`` error handler.
.. versionchanged:: 3.5
The ``'backslashreplace'`` error handler now works with decoding and
translating.
The set of allowed values can be extended by registering a new named error The set of allowed values can be extended by registering a new named error
handler: handler:
...@@ -417,9 +421,9 @@ functions: ...@@ -417,9 +421,9 @@ functions:
.. function:: backslashreplace_errors(exception) .. function:: backslashreplace_errors(exception)
Implements the ``'backslashreplace'`` error handling (for encoding with Implements the ``'backslashreplace'`` error handling (for
:term:`text encodings <text encoding>` only): the :term:`text encodings <text encoding>` only): malformed data is
unencodable character is replaced by a backslashed escape sequence. replaced by a backslashed escape sequence.
.. function:: namereplace_errors(exception) .. function:: namereplace_errors(exception)
......
...@@ -973,9 +973,8 @@ are always available. They are listed here in alphabetical order. ...@@ -973,9 +973,8 @@ are always available. They are listed here in alphabetical order.
Characters not supported by the encoding are replaced with the Characters not supported by the encoding are replaced with the
appropriate XML character reference ``&#nnn;``. appropriate XML character reference ``&#nnn;``.
* ``'backslashreplace'`` (also only supported when writing) * ``'backslashreplace'`` replaces malformed data by Python's backslashed
replaces unsupported characters with Python's backslashed escape escape sequences.
sequences.
* ``'namereplace'`` (also only supported when writing) * ``'namereplace'`` (also only supported when writing)
replaces unsupported characters with ``\N{...}`` escape sequences. replaces unsupported characters with ``\N{...}`` escape sequences.
......
...@@ -825,11 +825,12 @@ Text I/O ...@@ -825,11 +825,12 @@ Text I/O
exception if there is an encoding error (the default of ``None`` has the same exception if there is an encoding error (the default of ``None`` has the same
effect), or pass ``'ignore'`` to ignore errors. (Note that ignoring encoding effect), or pass ``'ignore'`` to ignore errors. (Note that ignoring encoding
errors can lead to data loss.) ``'replace'`` causes a replacement marker errors can lead to data loss.) ``'replace'`` causes a replacement marker
(such as ``'?'``) to be inserted where there is malformed data. When (such as ``'?'``) to be inserted where there is malformed data.
writing, ``'xmlcharrefreplace'`` (replace with the appropriate XML character ``'backslashreplace'`` causes malformed data to be replaced by a
reference), ``'backslashreplace'`` (replace with backslashed escape backslashed escape sequence. When writing, ``'xmlcharrefreplace'``
sequences) or ``'namereplace'`` (replace with ``\N{...}`` escape sequences) (replace with the appropriate XML character reference) or ``'namereplace'``
can be used. Any other error handling name that has been registered with (replace with ``\N{...}`` escape sequences) can be used. Any other error
handling name that has been registered with
:func:`codecs.register_error` is also valid. :func:`codecs.register_error` is also valid.
.. index:: .. index::
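The Text I/O paragraph above says that ``'backslashreplace'`` now applies to malformed data when reading as well as when writing. A small sketch of the reading case, assuming Python 3.5 or later; the in-memory `io.BytesIO` buffer stands in for a real file and the sample bytes are made up for illustration:

```python
import io

# 0xff can never appear in well-formed UTF-8, so it triggers the handler.
raw = io.BytesIO(b"abc\xffdef\n")

# errors="backslashreplace" keeps the bad byte visible as \xff
# rather than raising or silently dropping it.
reader = io.TextIOWrapper(raw, encoding="utf-8", errors="backslashreplace")
text = reader.read()
print(text)
```

The same `errors` argument can be passed to `open()`; the wrapper is used here only to keep the example self-contained.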
......
...@@ -118,7 +118,9 @@ Other Language Changes ...@@ -118,7 +118,9 @@ Other Language Changes
Some smaller changes made to the core Python language are: Some smaller changes made to the core Python language are:
* None yet. * Added the ``'namereplace'`` error handler. The ``'backslashreplace'``
error handler now works with decoding and translating.
(Contributed by Serhiy Storchaka in :issue:`19676` and :issue:`22286`.)
......
...@@ -127,7 +127,8 @@ class Codec: ...@@ -127,7 +127,8 @@ class Codec:
'surrogateescape' - replace with private code points U+DCnn. 'surrogateescape' - replace with private code points U+DCnn.
'xmlcharrefreplace' - Replace with the appropriate XML 'xmlcharrefreplace' - Replace with the appropriate XML
character reference (only for encoding). character reference (only for encoding).
'backslashreplace' - Replace with backslashed escape sequences 'backslashreplace' - Replace with backslashed escape sequences.
'namereplace' - Replace with \\N{...} escape sequences
(only for encoding). (only for encoding).
The set of allowed values can be extended via register_error. The set of allowed values can be extended via register_error.
...@@ -359,7 +360,8 @@ class StreamWriter(Codec): ...@@ -359,7 +360,8 @@ class StreamWriter(Codec):
'xmlcharrefreplace' - Replace with the appropriate XML 'xmlcharrefreplace' - Replace with the appropriate XML
character reference. character reference.
'backslashreplace' - Replace with backslashed escape 'backslashreplace' - Replace with backslashed escape
sequences (only for encoding). sequences.
'namereplace' - Replace with \\N{...} escape sequences.
The set of allowed parameter values can be extended via The set of allowed parameter values can be extended via
register_error. register_error.
...@@ -429,7 +431,8 @@ class StreamReader(Codec): ...@@ -429,7 +431,8 @@ class StreamReader(Codec):
'strict' - raise a ValueError (or a subclass) 'strict' - raise a ValueError (or a subclass)
'ignore' - ignore the character and continue with the next 'ignore' - ignore the character and continue with the next
'replace'- replace with a suitable replacement character; 'replace'- replace with a suitable replacement character
'backslashreplace' - Replace with backslashed escape sequences;
The set of allowed parameter values can be extended via The set of allowed parameter values can be extended via
register_error. register_error.
......
...@@ -246,6 +246,11 @@ class CodecCallbackTest(unittest.TestCase): ...@@ -246,6 +246,11 @@ class CodecCallbackTest(unittest.TestCase):
"\u0000\ufffd" "\u0000\ufffd"
) )
self.assertEqual(
b"\x00\x00\x00\x00\x00".decode("unicode-internal", "backslashreplace"),
"\u0000\\x00"
)
codecs.register_error("test.hui", handler_unicodeinternal) codecs.register_error("test.hui", handler_unicodeinternal)
self.assertEqual( self.assertEqual(
...@@ -565,17 +570,6 @@ class CodecCallbackTest(unittest.TestCase): ...@@ -565,17 +570,6 @@ class CodecCallbackTest(unittest.TestCase):
codecs.backslashreplace_errors, codecs.backslashreplace_errors,
UnicodeError("ouch") UnicodeError("ouch")
) )
# "backslashreplace" can only be used for encoding
self.assertRaises(
TypeError,
codecs.backslashreplace_errors,
UnicodeDecodeError("ascii", bytearray(b"\xff"), 0, 1, "ouch")
)
self.assertRaises(
TypeError,
codecs.backslashreplace_errors,
UnicodeTranslateError("\u3042", 0, 1, "ouch")
)
# Use the correct exception # Use the correct exception
self.assertEqual( self.assertEqual(
codecs.backslashreplace_errors( codecs.backslashreplace_errors(
...@@ -701,6 +695,16 @@ class CodecCallbackTest(unittest.TestCase): ...@@ -701,6 +695,16 @@ class CodecCallbackTest(unittest.TestCase):
UnicodeEncodeError("ascii", "\udfff", 0, 1, "ouch")), UnicodeEncodeError("ascii", "\udfff", 0, 1, "ouch")),
("\\udfff", 1) ("\\udfff", 1)
) )
self.assertEqual(
codecs.backslashreplace_errors(
UnicodeDecodeError("ascii", bytearray(b"\xff"), 0, 1, "ouch")),
("\\xff", 1)
)
self.assertEqual(
codecs.backslashreplace_errors(
UnicodeTranslateError("\u3042", 0, 1, "ouch")),
("\\u3042", 1)
)
def test_badhandlerresults(self): def test_badhandlerresults(self):
results = ( 42, "foo", (1,2,3), ("foo", 1, 3), ("foo", None), ("foo",), ("foo", 1, 3), ("foo", None), ("foo",) ) results = ( 42, "foo", (1,2,3), ("foo", 1, 3), ("foo", None), ("foo",), ("foo", 1, 3), ("foo", None), ("foo",) )
......
...@@ -378,6 +378,10 @@ class ReadTest(MixInCheckStateHandling): ...@@ -378,6 +378,10 @@ class ReadTest(MixInCheckStateHandling):
before + after) before + after)
self.assertEqual(test_sequence.decode(self.encoding, "replace"), self.assertEqual(test_sequence.decode(self.encoding, "replace"),
before + self.ill_formed_sequence_replace + after) before + self.ill_formed_sequence_replace + after)
backslashreplace = ''.join('\\x%02x' % b
for b in self.ill_formed_sequence)
self.assertEqual(test_sequence.decode(self.encoding, "backslashreplace"),
before + backslashreplace + after)
class UTF32Test(ReadTest, unittest.TestCase): class UTF32Test(ReadTest, unittest.TestCase):
encoding = "utf-32" encoding = "utf-32"
...@@ -1300,14 +1304,19 @@ class UnicodeInternalTest(unittest.TestCase): ...@@ -1300,14 +1304,19 @@ class UnicodeInternalTest(unittest.TestCase):
"unicode_internal") "unicode_internal")
if sys.byteorder == "little": if sys.byteorder == "little":
invalid = b"\x00\x00\x11\x00" invalid = b"\x00\x00\x11\x00"
invalid_backslashreplace = r"\x00\x00\x11\x00"
else: else:
invalid = b"\x00\x11\x00\x00" invalid = b"\x00\x11\x00\x00"
invalid_backslashreplace = r"\x00\x11\x00\x00"
with support.check_warnings(): with support.check_warnings():
self.assertRaises(UnicodeDecodeError, self.assertRaises(UnicodeDecodeError,
invalid.decode, "unicode_internal") invalid.decode, "unicode_internal")
with support.check_warnings(): with support.check_warnings():
self.assertEqual(invalid.decode("unicode_internal", "replace"), self.assertEqual(invalid.decode("unicode_internal", "replace"),
'\ufffd') '\ufffd')
with support.check_warnings():
self.assertEqual(invalid.decode("unicode_internal", "backslashreplace"),
invalid_backslashreplace)
@unittest.skipUnless(SIZEOF_WCHAR_T == 4, 'specific to 32-bit wchar_t') @unittest.skipUnless(SIZEOF_WCHAR_T == 4, 'specific to 32-bit wchar_t')
def test_decode_error_attributes(self): def test_decode_error_attributes(self):
...@@ -2042,6 +2051,16 @@ class CharmapTest(unittest.TestCase): ...@@ -2042,6 +2051,16 @@ class CharmapTest(unittest.TestCase):
("ab\ufffd", 3) ("ab\ufffd", 3)
) )
self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "backslashreplace", "ab"),
("ab\\x02", 3)
)
self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "backslashreplace", "ab\ufffe"),
("ab\\x02", 3)
)
self.assertEqual( self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "ignore", "ab"), codecs.charmap_decode(b"\x00\x01\x02", "ignore", "ab"),
("ab", 3) ("ab", 3)
...@@ -2118,6 +2137,25 @@ class CharmapTest(unittest.TestCase): ...@@ -2118,6 +2137,25 @@ class CharmapTest(unittest.TestCase):
("ab\ufffd", 3) ("ab\ufffd", 3)
) )
self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "backslashreplace",
{0: 'a', 1: 'b'}),
("ab\\x02", 3)
)
self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "backslashreplace",
{0: 'a', 1: 'b', 2: None}),
("ab\\x02", 3)
)
# Issue #14850
self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "backslashreplace",
{0: 'a', 1: 'b', 2: '\ufffe'}),
("ab\\x02", 3)
)
self.assertEqual( self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "ignore", codecs.charmap_decode(b"\x00\x01\x02", "ignore",
{0: 'a', 1: 'b'}), {0: 'a', 1: 'b'}),
...@@ -2194,6 +2232,18 @@ class CharmapTest(unittest.TestCase): ...@@ -2194,6 +2232,18 @@ class CharmapTest(unittest.TestCase):
("ab\ufffd", 3) ("ab\ufffd", 3)
) )
self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "backslashreplace",
{0: a, 1: b}),
("ab\\x02", 3)
)
self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "backslashreplace",
{0: a, 1: b, 2: 0xFFFE}),
("ab\\x02", 3)
)
self.assertEqual( self.assertEqual(
codecs.charmap_decode(b"\x00\x01\x02", "ignore", codecs.charmap_decode(b"\x00\x01\x02", "ignore",
{0: a, 1: b}), {0: a, 1: b}),
...@@ -2253,9 +2303,13 @@ class TypesTest(unittest.TestCase): ...@@ -2253,9 +2303,13 @@ class TypesTest(unittest.TestCase):
self.assertRaises(UnicodeDecodeError, codecs.unicode_escape_decode, br"\U00110000") self.assertRaises(UnicodeDecodeError, codecs.unicode_escape_decode, br"\U00110000")
self.assertEqual(codecs.unicode_escape_decode(r"\U00110000", "replace"), ("\ufffd", 10)) self.assertEqual(codecs.unicode_escape_decode(r"\U00110000", "replace"), ("\ufffd", 10))
self.assertEqual(codecs.unicode_escape_decode(r"\U00110000", "backslashreplace"),
(r"\x5c\x55\x30\x30\x31\x31\x30\x30\x30\x30", 10))
self.assertRaises(UnicodeDecodeError, codecs.raw_unicode_escape_decode, br"\U00110000") self.assertRaises(UnicodeDecodeError, codecs.raw_unicode_escape_decode, br"\U00110000")
self.assertEqual(codecs.raw_unicode_escape_decode(r"\U00110000", "replace"), ("\ufffd", 10)) self.assertEqual(codecs.raw_unicode_escape_decode(r"\U00110000", "replace"), ("\ufffd", 10))
self.assertEqual(codecs.raw_unicode_escape_decode(r"\U00110000", "backslashreplace"),
(r"\x5c\x55\x30\x30\x31\x31\x30\x30\x30\x30", 10))
class UnicodeEscapeTest(unittest.TestCase): class UnicodeEscapeTest(unittest.TestCase):
...@@ -2894,11 +2948,13 @@ class CodePageTest(unittest.TestCase): ...@@ -2894,11 +2948,13 @@ class CodePageTest(unittest.TestCase):
(b'[\xff]', 'strict', None), (b'[\xff]', 'strict', None),
(b'[\xff]', 'ignore', '[]'), (b'[\xff]', 'ignore', '[]'),
(b'[\xff]', 'replace', '[\ufffd]'), (b'[\xff]', 'replace', '[\ufffd]'),
(b'[\xff]', 'backslashreplace', '[\\xff]'),
(b'[\xff]', 'surrogateescape', '[\udcff]'), (b'[\xff]', 'surrogateescape', '[\udcff]'),
(b'[\xff]', 'surrogatepass', None), (b'[\xff]', 'surrogatepass', None),
(b'\x81\x00abc', 'strict', None), (b'\x81\x00abc', 'strict', None),
(b'\x81\x00abc', 'ignore', '\x00abc'), (b'\x81\x00abc', 'ignore', '\x00abc'),
(b'\x81\x00abc', 'replace', '\ufffd\x00abc'), (b'\x81\x00abc', 'replace', '\ufffd\x00abc'),
(b'\x81\x00abc', 'backslashreplace', '\\x81\x00abc'),
)) ))
def test_cp1252(self): def test_cp1252(self):
......
...@@ -10,6 +10,9 @@ Release date: TBA ...@@ -10,6 +10,9 @@ Release date: TBA
Core and Builtins Core and Builtins
----------------- -----------------
- Issue #22286: The "backslashreplace" error handler now works with
decoding and translating.
- Issue #23253: Delay-load ShellExecute[AW] in os.startfile for reduced - Issue #23253: Delay-load ShellExecute[AW] in os.startfile for reduced
startup overhead on Windows. startup overhead on Windows.
......
...@@ -864,74 +864,112 @@ PyObject *PyCodec_XMLCharRefReplaceErrors(PyObject *exc) ...@@ -864,74 +864,112 @@ PyObject *PyCodec_XMLCharRefReplaceErrors(PyObject *exc)
PyObject *PyCodec_BackslashReplaceErrors(PyObject *exc) PyObject *PyCodec_BackslashReplaceErrors(PyObject *exc)
{ {
if (PyObject_IsInstance(exc, PyExc_UnicodeEncodeError)) { PyObject *object;
PyObject *restuple; Py_ssize_t i;
PyObject *object; Py_ssize_t start;
Py_ssize_t i; Py_ssize_t end;
Py_ssize_t start; PyObject *res;
Py_ssize_t end; unsigned char *outp;
PyObject *res; int ressize;
unsigned char *outp; Py_UCS4 c;
Py_ssize_t ressize;
Py_UCS4 c; if (PyObject_IsInstance(exc, PyExc_UnicodeDecodeError)) {
if (PyUnicodeEncodeError_GetStart(exc, &start)) unsigned char *p;
if (PyUnicodeDecodeError_GetStart(exc, &start))
return NULL; return NULL;
if (PyUnicodeEncodeError_GetEnd(exc, &end)) if (PyUnicodeDecodeError_GetEnd(exc, &end))
return NULL; return NULL;
if (!(object = PyUnicodeEncodeError_GetObject(exc))) if (!(object = PyUnicodeDecodeError_GetObject(exc)))
return NULL;
if (!(p = (unsigned char*)PyBytes_AsString(object))) {
Py_DECREF(object);
return NULL; return NULL;
if (end - start > PY_SSIZE_T_MAX / (1+1+8))
end = start + PY_SSIZE_T_MAX / (1+1+8);
for (i = start, ressize = 0; i < end; ++i) {
/* object is guaranteed to be "ready" */
c = PyUnicode_READ_CHAR(object, i);
if (c >= 0x10000) {
ressize += 1+1+8;
}
else if (c >= 0x100) {
ressize += 1+1+4;
}
else
ressize += 1+1+2;
} }
res = PyUnicode_New(ressize, 127); res = PyUnicode_New(4 * (end - start), 127);
if (res == NULL) { if (res == NULL) {
Py_DECREF(object); Py_DECREF(object);
return NULL; return NULL;
} }
for (i = start, outp = PyUnicode_1BYTE_DATA(res); outp = PyUnicode_1BYTE_DATA(res);
i < end; ++i) { for (i = start; i < end; i++, outp += 4) {
c = PyUnicode_READ_CHAR(object, i); unsigned char c = p[i];
*outp++ = '\\'; outp[0] = '\\';
if (c >= 0x00010000) { outp[1] = 'x';
*outp++ = 'U'; outp[2] = Py_hexdigits[(c>>4)&0xf];
*outp++ = Py_hexdigits[(c>>28)&0xf]; outp[3] = Py_hexdigits[c&0xf];
*outp++ = Py_hexdigits[(c>>24)&0xf];
*outp++ = Py_hexdigits[(c>>20)&0xf];
*outp++ = Py_hexdigits[(c>>16)&0xf];
*outp++ = Py_hexdigits[(c>>12)&0xf];
*outp++ = Py_hexdigits[(c>>8)&0xf];
}
else if (c >= 0x100) {
*outp++ = 'u';
*outp++ = Py_hexdigits[(c>>12)&0xf];
*outp++ = Py_hexdigits[(c>>8)&0xf];
}
else
*outp++ = 'x';
*outp++ = Py_hexdigits[(c>>4)&0xf];
*outp++ = Py_hexdigits[c&0xf];
} }
assert(_PyUnicode_CheckConsistency(res, 1)); assert(_PyUnicode_CheckConsistency(res, 1));
restuple = Py_BuildValue("(Nn)", res, end);
Py_DECREF(object); Py_DECREF(object);
return restuple; return Py_BuildValue("(Nn)", res, end);
}
if (PyObject_IsInstance(exc, PyExc_UnicodeEncodeError)) {
if (PyUnicodeEncodeError_GetStart(exc, &start))
return NULL;
if (PyUnicodeEncodeError_GetEnd(exc, &end))
return NULL;
if (!(object = PyUnicodeEncodeError_GetObject(exc)))
return NULL;
}
else if (PyObject_IsInstance(exc, PyExc_UnicodeTranslateError)) {
if (PyUnicodeTranslateError_GetStart(exc, &start))
return NULL;
if (PyUnicodeTranslateError_GetEnd(exc, &end))
return NULL;
if (!(object = PyUnicodeTranslateError_GetObject(exc)))
return NULL;
} }
else { else {
wrong_exception_type(exc); wrong_exception_type(exc);
return NULL; return NULL;
} }
if (end - start > PY_SSIZE_T_MAX / (1+1+8))
end = start + PY_SSIZE_T_MAX / (1+1+8);
for (i = start, ressize = 0; i < end; ++i) {
/* object is guaranteed to be "ready" */
c = PyUnicode_READ_CHAR(object, i);
if (c >= 0x10000) {
ressize += 1+1+8;
}
else if (c >= 0x100) {
ressize += 1+1+4;
}
else
ressize += 1+1+2;
}
res = PyUnicode_New(ressize, 127);
if (res == NULL) {
Py_DECREF(object);
return NULL;
}
outp = PyUnicode_1BYTE_DATA(res);
for (i = start; i < end; ++i) {
c = PyUnicode_READ_CHAR(object, i);
*outp++ = '\\';
if (c >= 0x00010000) {
*outp++ = 'U';
*outp++ = Py_hexdigits[(c>>28)&0xf];
*outp++ = Py_hexdigits[(c>>24)&0xf];
*outp++ = Py_hexdigits[(c>>20)&0xf];
*outp++ = Py_hexdigits[(c>>16)&0xf];
*outp++ = Py_hexdigits[(c>>12)&0xf];
*outp++ = Py_hexdigits[(c>>8)&0xf];
}
else if (c >= 0x100) {
*outp++ = 'u';
*outp++ = Py_hexdigits[(c>>12)&0xf];
*outp++ = Py_hexdigits[(c>>8)&0xf];
}
else
*outp++ = 'x';
*outp++ = Py_hexdigits[(c>>4)&0xf];
*outp++ = Py_hexdigits[c&0xf];
}
assert(_PyUnicode_CheckConsistency(res, 1));
Py_DECREF(object);
return Py_BuildValue("(Nn)", res, end);
} }
static _PyUnicode_Name_CAPI *ucnhash_CAPI = NULL; static _PyUnicode_Name_CAPI *ucnhash_CAPI = NULL;
...@@ -1444,8 +1482,8 @@ static int _PyCodecRegistry_Init(void) ...@@ -1444,8 +1482,8 @@ static int _PyCodecRegistry_Init(void)
backslashreplace_errors, backslashreplace_errors,
METH_O, METH_O,
PyDoc_STR("Implements the 'backslashreplace' error handling, " PyDoc_STR("Implements the 'backslashreplace' error handling, "
"which replaces an unencodable character with a " "which replaces malformed data with a backslashed "
"backslashed escape sequence.") "escape sequence.")
} }
}, },
{ {
......
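For readers who prefer Python to C, the logic of the rewritten `PyCodec_BackslashReplaceErrors` can be approximated as follows. This is a hedged sketch, not the actual implementation: the function name `backslashreplace_sketch` and the handler name `"sketch.backslashreplace"` are hypothetical, and the C version additionally caps the length of the span it processes.

```python
import codecs


def backslashreplace_sketch(exc):
    """Rough Python equivalent of the C handler shown in the diff."""
    if isinstance(exc, UnicodeDecodeError):
        # Decoding: every offending byte becomes a 4-character \xNN escape.
        bad = exc.object[exc.start:exc.end]
        return "".join("\\x%02x" % b for b in bad), exc.end
    if isinstance(exc, (UnicodeEncodeError, UnicodeTranslateError)):
        # Encoding/translating: the escape width depends on the code point.
        parts = []
        for ch in exc.object[exc.start:exc.end]:
            cp = ord(ch)
            if cp >= 0x10000:
                parts.append("\\U%08x" % cp)
            elif cp >= 0x100:
                parts.append("\\u%04x" % cp)
            else:
                parts.append("\\x%02x" % cp)
        return "".join(parts), exc.end
    raise TypeError("don't know how to handle %s" % type(exc).__name__)


# Register under a hypothetical name so the built-in handler is untouched.
codecs.register_error("sketch.backslashreplace", backslashreplace_sketch)

decoded = b"\x80abc".decode("utf-8", "sketch.backslashreplace")
encoded = "\u3042\U0001f600\xe9".encode("ascii", "sketch.backslashreplace")
print(decoded, encoded)
```

The decode branch comes first and returns early, mirroring the structure of the C function, where encode and translate errors share the second code path.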